llvm-project

Commit Graph

Author	SHA1	Message	Date
jacquesguan	0fe5f03eeb	[RISCV][NFC] Use nested namespace definitions. Since we use C++17 now, we could use nested namespace definitions to simplify code. Differential Revision: https://reviews.llvm.org/D131751	2022-08-13 09:56:59 +08:00
Yeting Kuo	875694089d	[RISCV] Peephole optimization to fold merge.vvm and unmasked intrinsics. The patch uses a peephole to fold merge.vvm and unmasked intrinsics into masked intrinsics. A peephole is used instead of tablegen patterns to avoid a large amount of auto-generated code. Note: The patch ignores segment loads since I don't know how to test them. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D130442	2022-08-11 17:58:11 +08:00
Craig Topper	a304d70ee9	[RISCV] Reorder (and/or/xor (shl X, C1), C2) if we can form ANDI/ORI/XORI. InstCombine and DAGCombine prefer to keep shl before binops. This patch teaches isel to convert to (shl (and/or/xor X, C2 >> C1), C1) if (C2 >> C1) is a simm12. The idea was taken from X86's isel code. There's a special case implemented for a sext_inreg between the shift and the binop. Differential Revision: https://reviews.llvm.org/D130610	2022-07-27 17:35:26 -07:00
Craig Topper	31b8939ded	[RISCV] Recognize bexti from (srl (and X, 1<<C), C). This is the form we get for (zext (setne (and X 1<<C))). We only had bexti patterns for the alternative form (and (srl X, C), 1).	2022-07-20 15:03:52 -07:00
Craig Topper	79016f6eef	[RISCV] Refine the heuristics for our custom (mul (and X, C2), C1) isel. Prefer to use SLLI instead of zext.w/zext.h in more cases. SLLI might be better for compression.	2022-07-14 18:24:10 -07:00
Craig Topper	759e5e0096	[RISCV] Remove doPeepholeLoadStoreADDI. All of the cases should be handled by SelectAddrRegImm now. Reviewed By: asb, luismarques Differential Revision: https://reviews.llvm.org/D129451	2022-07-11 10:44:33 -07:00
Craig Topper	907d923a20	[RISCV] Move the custom isel for (add X, imm) into SelectAddrRegImm. This custom isel was used to split the lo12 bits of the imm so that they could be folded into load/store addresses via a post-isel peephole. This patch instead splits the immediate during isel and folds the lo12 removing the need for the post-isel peephole to do anything. After this we'll be able to remove the post-isel peephole. Reviewed By: asb, luismarques Differential Revision: https://reviews.llvm.org/D129450	2022-07-11 10:44:33 -07:00
Craig Topper	5f7641a3be	[RISCV] Modify the custom isel for (add X, imm) used by load/stores. We have custom isel that tries to select the Lo12 bits using a separate ADDI that can later be folded into the load/store address by the post-isel peephole. This patch disables this if the load/store already had a non-zero offset. A non-zero offset implies that CodeGenPrepare split several large offsets used by different loads and stores into a common large offset and multiple small offsets that could be folded. Folding more of the lo12 bits changes this common offset by increasing the small offsets. While this can save an instruction to materialize the common offset, it can also prevent the small offsets from fitting in a compressed load/store instruction. Removing this also simplifies the last piece needed to fold the custom isel for add into SelectAddrRegImm and remove the post-isel peephole.	2022-07-09 22:47:27 -07:00
Craig Topper	9c6a2200e2	[RISCV] Support folding constant addresses in SelectAddrRegImm. We already handled this by folding an ADDI in the post-isel peephole. My goal is to remove that peephole so this adds the functionality to isel.	2022-07-09 13:12:02 -07:00
Craig Topper	088bb8a328	[RISCV] Add more SHXADD patterns where the input is (and (shl/shr X, C2), C1) It might be possible to rewrite (and (shl/shr X, C2), C1) in a way that gives us a left shift that can be folded into a SHXADD.	2022-07-05 16:21:47 -07:00
Craig Topper	a1cd3f49b6	[RISCV] Use a switch statement in PreprocessISelDAG. NFC This should make it easier to add more peepholes in the future.	2022-07-05 12:25:04 -07:00
Craig Topper	c15bcad2f9	[RISCV] Update PreprocessISelDAG to use RemoveDeadNodes. Instead of deleting nodes as we go, delete all dead nodes if a change is made. This allows adding peepholes that might make multiple nodes dead.	2022-07-05 12:25:03 -07:00
Craig Topper	f27672924e	[RISCV] Replace an explicit check with an assert. Shift amounts should never be 0 or more than bitwidth - 1.	2022-07-04 23:21:54 -07:00
Craig Topper	66790b70ea	[RISCV] Rename some variables for clarity. NFC	2022-07-04 23:21:54 -07:00
luxufan	c06d0b4d02	[RISCV] Add ADDI instr for computing FrameIndex address. RVV doesn't have an immediate field for memory addressing. Currently we build MachineInstructions in PEI to compute the stack offset for RVV load/store instructions. These instructions are added too late to be optimized by passes such as CSE and LICM. This patch prevents FrameIndex SDNodes from being matched in RVV load/store instruction selection patterns, so that the FrameIndex SDNodes are selected as `ADDI GPR, targetframeindex`. There are 2 advantages to this change: 1. Stack object address computation can be optimized by machine function passes. 2. Since the ADDI instruction's destination register can be used as a temp register, we can save an emergency spill slot. Differential Revision: https://reviews.llvm.org/D128187	2022-07-04 22:13:35 +08:00
Craig Topper	d36e09cfe5	[RISCV] Add more SHXADD patterns. This handles the code we get for this. int foo(unsigned x, int *y) { return y[x >> 3]; } The srl and shl implied by the array index will be combined to form (srl (and X, C2), C1). We need to reverse this to get back the shl to fold into SHXADD.	2022-07-03 21:57:05 -07:00
Craig Topper	8eb4dcb737	[RISCV] Move some SHXADD matching cases into a ComplexPattern. NFC Some more complex cases require checking the relationship of operands on different nodes of the match. They also require additional instructions to be created. Using a ComplexPattern gives us that flexibility. I'll be adding another pattern in a future patch.	2022-07-03 21:57:05 -07:00
Craig Topper	5d787689b1	[RISCV] Match RISCVISD::ADD_LO in SelectAddrRegImm. This allows us to fold global and constant pool addresses into load/store during isel instead of in the post-isel peephole. I did not copy the alignment check for ConstantPoolSDNode because it wasn't tested. This is a step towards being able to remove the post-isel peephole. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D128738	2022-07-02 09:51:06 -07:00
Craig Topper	b2e9684fe4	[RISCV] isel (shl (and X, C2), C) -> (slli (srliw X, C3), C3+C). where C2 has 32 leading zeros and C3 trailing zeros. When the shl is used by an add C is 1,2 or 3, we end up matching (add (shl X, C), Y) first. This leaves an and with a constant that is harder to materialize.	2022-07-02 01:04:44 -07:00
Yeting Kuo	8590a35ef9	[RISCV][NFC] Simplify condition of IsTU. Just simplify code. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D128972	2022-07-02 09:22:38 +08:00
Craig Topper	188582b7e0	[RISCV] Consider existing offset in the alignment when folding ADDIs into load/store. getPointerAlignment and ConstantPoolSDNode::getAlign only consider the alignment of the object. If we already have a non-zero offset into the object, that may have reduced the alignment. Since the base pointer will become an LUI with the old offset, we need to be sure the new offset fits in the alignment of the address that will be used to create the LUI immediate. I'm not sure it is possible to have a non-zero offset in the GlobalAddressSDNode or ConstantPoolSDNode at this point today so this may only be a theoretical bug. Differential Revision: https://reviews.llvm.org/D129006	2022-07-01 11:18:40 -07:00
Craig Topper	058d521ea4	[RISCV] Avoid repeated code in SelectAddrRegImm. NFC	2022-06-30 17:22:04 -07:00
Craig Topper	5ca39a55a7	[RISCV] Remove an unnecessary copy of X0 in selectShiftMask. We know which instruction we're emitting so its ok to directly encode X0 into the instruction. We only need to create a copy when a constant 0 is selected without context of what instructions uses it.	2022-06-30 15:11:58 -07:00
Craig Topper	354e04554a	[RISCV] Make custom isel for (add X, imm) used by load/stores more selective. Only handle immediates that would produce an ADDI or ADDIW of Lo12 as the final instruction in their materialization. As the test changes show, this removes immediates that materialize with lui+addiw that is not the same as lui+addi.	2022-06-30 14:20:11 -07:00
Craig Topper	ae5f5eb2f1	[RISCV] Replace some uses of XLenVT in RISCVDAGToDAGISel::Select with the original Node VT. NFCI These should contain the same thing, but we aren't consistent about which we use. Since we call ReplaceNode, it seems more correct to use the initial VT.	2022-06-30 13:00:44 -07:00
Craig Topper	2b7b609821	[RISCV] Use getVTList to simplify creation of vleff MachineSDNode. NFC We don't need to pass the 3 VTs separately, we already have a list available to us.	2022-06-30 11:34:02 -07:00
Craig Topper	89e7e59621	[RISCV] Use the VT passed into selectImm instead of XLenVT. NFCI I think the VT passed in will always be XLen.	2022-06-30 11:15:28 -07:00
Craig Topper	7cbfb4eb7a	[RISCV] Select (srl (and X, C2), C) as (slli (srliw X, C3), C3-C). If C2 has 32 leading zeros and C3 trailing zeros.	2022-06-29 09:15:09 -07:00
Craig Topper	5dcc525492	[RISCV] Fold (add X, [-4096, -2049]) or (add X, [2048,4096]) into load/store address during isel. Previously we iseled this to a pair of ADDIs and relied on a post isel peephole to fold one of the ADDIs into the load/store. Now we split the immediate in two parts the same way isel does and fold one of the pieces. If the add has a non-memory use it will emit two isels and the larger one will CSE with the ADDI we created for the memory use. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D128741	2022-06-28 16:59:39 -07:00
Craig Topper	87077c7eb5	[RISCV] Remove repeated calls to getSExtValue. NFC	2022-06-27 13:42:58 -07:00
Craig Topper	5e944e9eb7	[RISCV] Refactor SelectAddrRegImm to not depend on SelectBaseAddr. SelectBaseAddr was a minor convenience to use since it already existed for vector load/store. D128187 is going to remove the other uses of SelectBaseAddr so it has less reason to exist. This patch removes the dependency on SelectBaseAddr and adds a new SelectAddrFrameIndex to share some code with SelectFrameAddrRegImm.	2022-06-26 11:11:41 -07:00
Craig Topper	352346fa9e	[RISCV] Refactor code to remove some small wrapper methods and merge two functions together. NFC	2022-06-22 23:04:58 -07:00
Craig Topper	8780630ded	[RISCV] Merge two similar asserts from different if/else blocks. NFC	2022-06-19 19:48:50 -07:00
Craig Topper	cef03e3dcd	[RISCV] Move creation of constant pools from isel to lowering. This simplifies the isel code by removing the manual load creation. It also improves our ability to use 0 strided loads for vector splats. There is an assumption here that Mask and ShiftedMask constants are cheap enough that they don't become constant pool loads so that our isel optimizations involving And still work. I believe those constants are 3 instructions in the worst case. The rv64zbp-intrinsic.ll change is a regression caused by intrinsics being expanded to RISCVISD also occurring during lowering. So the optimizations were only happening during the last DAGCombine, which can't see through the load. I believe we can fix this test by implementing TargetLowering::getTargetConstantFromLoad for RISC-V or by adding the intrinsic to computeKnownBitsForTargetNode to enable earlier DAG combine. Since Zbp is not a ratified extension, I don't view these as blocking this patch. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127520	2022-06-13 09:07:57 -07:00
Yeting Kuo	f68cad9087	[RISCV] Lower VLEFF/VLSEGFF SDNodes to MachineInstrs with VL outputs. The patch is a replacement of D125199. PseudoReadVL with vtype has worry for computing same vtypes of VLEFF/VLSEGFF in two different places, DAGToDAG and InsertVSETVLI. VLEFF/VLSEGFF MI with VL output still could provide the vtype of VLEFF/VLSEGFF to the users of its VL. The patch names the new pseudo as original VLEFF/VLSEGFF name suffixed "_VL" and expand them in RISCVInsertVSETVLI pass. This patch also reverts commit `4537aae0d5`, "[RISCV] Make PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF.". Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126794	2022-06-10 13:57:10 +08:00
Philip Reames	28be4b7454	[RISCV] Simplify InstrInfo access in doPeepholeMaskedRVV [nfc]	2022-06-09 17:02:40 -07:00
Craig Topper	cc3bd43533	[RISCV] Support LUI+ADDIW in doPeepholeLoadStoreADDI. This fixes an inconsistency between RV32 and RV64. Still considering trying to do this peephole during isel, but wanted to fix the inconsistency first. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126986	2022-06-03 18:06:56 -07:00
Craig Topper	170c550ca8	[RISCV] Use SelectionDAG::isBaseWithConstantOffset in scalar load/store address matching. Test changes are because isBaseWithConstantOffset uses computeKnownBits and that is able to see that an earlier AND instruction guaranteed alignment so that we can treat an OR as an ADD. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126970	2022-06-03 10:55:28 -07:00
Craig Topper	4402852002	[RISCV] Reduce scalar load/store isel patterns to a single ComplexPattern. NFCI Previously we had 3 different isel patterns for every scalar load store instruction. This reduces them to a single ComplexPattern that returns the Base and Offset, or an offset of 0 if there was no offset identified. I've done a similar thing for the 2 isel patterns that match add/or with FrameIndex and immediate. Using the offset of 0, I was also able to remove the custom handler for FrameIndex. Happy to split that to another patch. We might be able to enhance in the future to remove the post-isel peephole or the special handling for ADD with constant added by D126576. A nice side effect is that this removes nearly 3000 bytes from the isel table. Differential Revision: https://reviews.llvm.org/D126932	2022-06-03 09:00:17 -07:00
Craig Topper	dbead2388b	[RISCV] Add custom isel for (add X, imm) used by load/stores. If the imm is out of range for an ADDI, we will materialize it in a register using multiple instructions. If the ADD is used by a load/store, doPeepholeLoadStoreADDI can try to pull an ADDI from the constant materialization into the load/store offset. This only works if the ADD has a single use, otherwise the peephole would have to rebuild multiple nodes. This patch instead tries to solve the problem when the add is selected. We check that the add is only used by loads/stores and if it is we will select it to (ADDI (ADD X, Imm-Lo12), Lo12). This will enable the simple case in doPeepholeLoadStoreADDI that can bypass an ADDI used as a pointer. As a result we can remove the more complicated peephole from doPeepholeLoadStoreADDI. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126576	2022-06-02 13:45:32 -07:00
eopXD	2cadf84fc8	[RISCV] Pass OptLevel to `RISCVDAGToDAGISel` correctly. Originally, `OptLevel` isn't passed into the `MachineFunctionPass`. This lets the default parameter of `SelectionDAGISel`, which is `CodeGenOpt::Default`, be passed in. OptLevelChanger captures the optimization level from the parameter rather than from the value within `TargetMachine`. This lets the optimization level be unintentionally overwritten if a value other than `CodeGenOpt::Default` is passed. This patch fixes this by passing the optimization level rather than using the default value. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D126641	2022-05-30 17:22:50 -07:00
Craig Topper	b09e54541a	[RISCV] Use template version of SignExtend64 for constant extends. NFC We were inconsistent about which one we used.	2022-05-27 13:11:15 -07:00
Craig Topper	d2ee2c9c8d	[RISCV] Add an operand kind to the opcode/imm returned from RISCVMatInt. Instead of matching opcodes to know the format to emit, use an enum value that we can get from the RISCVMatInt::Inst class. Change the consumers to use fully covered switches so that we get a compiler warning if a new kind is added. With the opcode checks it was easier to forget to update one of the 3 consumers. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126317	2022-05-24 14:56:29 -07:00
Zakk Chen	7dfc56c107	[RISCV] Add the passthru operand for RVV unmasked segment load IR intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D125323	2022-05-13 02:16:40 -07:00
Craig Topper	5c7ec998a9	[RISCV] Fold addiw from (add X, (addiw (lui C1, C2))) into load/store address This is a followup to D124231. We can fold the ADDIW in this pattern if we can prove that LUI+ADDI would have produced the same result as LUI+ADDIW. This pattern occurs because constant materialization prefers LUI+ADDIW for all simm32 immediates. Only immediates in the range 0x7ffff800-0x7fffffff require an ADDIW. Other simm32 immediates work with LUI+ADDI. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D124693	2022-05-11 12:47:13 -07:00
Yeting Kuo	4537aae0d5	[RISCV] Make PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF. The patch makes PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF. It's useful to get the vtypes of locations of PseudoReadVL without finding the corresponding VLEFF/VLSEGFF. It could simplify optimizations in RISCVInsertVSETVLI like D123581. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D125199	2022-05-11 14:07:58 +08:00
Zakk Chen	5807e59a0a	[RISCV] Fix incorrect codegen for masked vmsge{u}.vx with mask agnostic. The result was totally wrong. We could use mask undisturbed result to emulate the mask agnostic result. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124684	2022-05-02 17:57:29 -07:00
Craig Topper	f91690f7db	[RISCV] Don't merge addi into load/store address if addi has a FrameIndex operand. This fixes a crash from D124231. We can't fold (load (add base, (addi src, off1)), off2) -> (load (add base, src), off1+off2) if the src is a FrameIndex. FrameIndex cannot be the operand of an add. There was an immediate==0 check that I think was trying to catch the common case of FrameIndex addis where the immediate is 0, but they can also appear in non-zero form. Instead explicitly check for a FrameIndex operand.	2022-04-29 18:22:20 -07:00
Hsiangkai Wang	c62b014db9	[RISCV] Merge addi into load/store as there is a ADD between them This patch adds peephole optimizations for the following patterns: (load (add base, (addi src, off1)), off2) -> (load (add base, src), off1+off2) (store val, (add base, (addi src, off1)), off2) -> (store val, (add base, src), off1+off2) Differential Revision: https://reviews.llvm.org/D124231	2022-04-29 04:33:05 +00:00
ShihPo Hung	6b55f133fb	[RISCV][RVV] Select unmasked TU RVV pseudos in a DAG post-process Following D118810 that reduced the size of ISel table, this patch optimizes allone-masked RVV pseudos with TU policy and swap them out to their unmasked TU pseudos. Since the UNDEF merge operand is not preserved, we turn it into TA pseudo regardless of the policy operand. Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D121881	2022-04-26 20:14:54 -07:00
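
Several of the SelectAddrRegImm commits above rely on splitting an immediate into a sign-extended low 12 bits (Lo12) that fits a load/store offset, plus a remainder with the low 12 bits cleared. The following is a minimal standalone sketch of that arithmetic only, not the LLVM implementation; the names `SplitImm` and `splitLo12` are hypothetical, and values close to the `int64_t` limits are assumed not to occur.

```cpp
#include <cstdint>

// Hypothetical helper type: Hi + Lo12 == Imm, with Lo12 in [-2048, 2047].
struct SplitImm {
  int64_t Hi;   // remainder; its low 12 bits are zero
  int64_t Lo12; // sign-extended low 12 bits of the immediate
};

// Split Imm so that Lo12 can be used as a simm12 load/store offset.
// Assumption: Imm is far enough from the int64_t limits that
// Imm - Lo12 cannot overflow.
constexpr SplitImm splitLo12(int64_t Imm) {
  // Take the low 12 bits, then sign-extend from bit 11.
  int64_t Lo12 = ((Imm & 0xFFF) ^ 0x800) - 0x800;
  return {Imm - Lo12, Lo12};
}
```

For example, `splitLo12(2048)` yields `Hi = 4096` and `Lo12 = -2048`: the low part wraps to a negative offset, so the high part rounds up to compensate.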


232 Commits