llvm-project

Commit Graph

Author	SHA1	Message	Date
Philip Reames	7ed16e7c51	[riscv] Fix state tracking bug on vsetvli (phi of vsetvli) peephole This fixes the first of several cases where the state computed in phase 1 and 2 of the algorithm differs from the state computed during phase 3. Note that such differences can cause miscompiles by creating disagreements about contents of the VL and VTYPE registers at block boundaries. In this particular case, we recognize that for the first vsetvli in a block, that if the AVL is a phi of GPR results from previous vsetvlis and the VTYPE field matches, we can avoid emitting a vsetvli as the register contents don't change. Unfortunately, the abstract state does change and that update was lost. As noted in the test change, this can actually improve results by preserving information until later state transitions in the block. However, this minor codegen improvement is not the motivation for the patch. The motivation is to avoid cases a case where we break a key internal correctness invariant. Differential Revision: https://reviews.llvm.org/D125133	2022-05-09 06:21:45 -07:00
Philip Reames	c7c3f58544	[riscv] Use early return to reduce nesting for InsertVSETVLI [nfc]	2022-05-06 13:10:05 -07:00
Philip Reames	99a41005fe	[riscv] Add early return to InsertVSETLI fixed point step [nfc] If the income state hasn't changed, and the step function is fixed by assumption, then the output state can't have changed. In the current algorithm, this is a very minor win and mostly allows adding tracing output without being horrible verbose.	2022-05-06 13:08:11 -07:00
Philip Reames	dee9b01d83	[riscv] Add some minimal tracing output to InsertVSETVLI Only available with -debug. Main purpose is simplifying an upcoming change, and providing tools for debugging problems.	2022-05-06 13:08:11 -07:00
Philip Reames	f486119ce9	[riscv] Add strict asserts for VSETVLI insertion algorithm to help catch bugs This assertion should hold for any reasonable data flow algorithm, but is known not to in several cases today. I'd like to go ahead and land this off-by-default, so that we can collaborate on fixes and have a common definition of success. Differential: https://reviews.llvm.org/D125035	2022-05-06 10:28:22 -07:00
wangpc	4ff5e8184c	[RISCV] Enable MachineOutliner by default under -Oz for RISCV Enable default outlining when the function has the minsize attribute. `addr-label.ll` crashed after enabling this, so a barrier is added before instruction selection as a workaround. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D122213	2022-05-06 17:37:45 +08:00
Philip Reames	042a7a5f0d	[riscv] Use X0 for destination of VSETVLI instruction if result unused If the GPR destination register of a VSETVLI instruction is unused, we can replace it with X0. This discards the result, and thus reduces register pressure. Since after the core insertion/lowering algorithm has run, many user written VSETVLIs will have their GPR result unused (as VTYPE/VLEN is now explicitly read instead), this kicks in for most tests which involve a vsetvli intrinsic for fixed length vectorization. (vscale vectorization generally uses the GPR result to know how far to e.g. advance pointers in a loop and these uses are not removed.) When inserting VSETVLIs to lower psuedos, we prefer the X0 form anyways. Differential Revision: https://reviews.llvm.org/D124961	2022-05-05 07:39:45 -07:00
Lian Wang	8bb10436ab	[RISCV][NFC] Use true_mask replace riscv_vmset_vl in defined patterns. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124660	2022-05-05 03:05:52 +00:00
Craig Topper	60cb489685	[RISCV] Use movImm went multiplying by simm12 in getVLENFactoredAmount. No reason to special case simm12, movImm handles all immediates. This also fixe a bug that we weren't passing the frame-setup/destroy flag to movImm when we were calling it.	2022-05-04 17:23:22 -07:00
Philip Reames	18ed2ee80c	[RISCV] Add a version of insertVSETVLI which uses an iterator [NFC] This is to simplify the final version of D124869.	2022-05-04 14:48:31 -07:00
Craig Topper	411bb42eed	[RISCV] Add a special case to treat riscv-v-vector-bits-min=-1 as meaning use Zvlb value. riscv-v-vector-bits-min is primarily used to opt-in to the autovectorizer. The vector width can be determined from Zvlb. This patch adds support treating -1 as meaning use Zvlb so we can still opt-in to autovectorization without needing to repeat a vector width already given by Zvlb or -mcpu. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D124960	2022-05-04 14:26:45 -07:00
Craig Topper	1d6430b9e2	[RISCV] Update isLegalAddressingMode for RVV. RVV instructions only support base register addressing. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D124820	2022-05-03 19:49:11 -07:00
Craig Topper	9cce9a126c	[RISCV] Make use of SHXADD instructions in RVV spill/reload code. We can use SH1ADD, SH2ADD, SH3ADD to multipy by 3, 5, and 9 respectively. We could extend this to 3, 5, or 9 multiplied by a power 2 by also emitting a SLLI. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D124824	2022-05-03 19:35:21 -07:00
Craig Topper	0971819740	[RISCV] Don't lookup TII in RISCVInstrInfo::getVLENFactoredAmount. NFCI We're already inside of our implementation of TII.	2022-05-03 19:35:21 -07:00
Weverything	5afd20806d	[riscv] Mark function as used to avoid unused warning.	2022-05-03 18:51:23 -07:00
Philip Reames	2982d0032b	Fix a buildbot warning [nfc]	2022-05-03 14:40:27 -07:00
Philip Reames	be50b8c185	[riscv] Add debug printing support for VSETVLIInfo class [nfc]	2022-05-03 14:00:17 -07:00
Hsiangkai Wang	eaaa31ff2c	[RISCV][TargetLowering] Special case overflow expansion for (uaddo X, C). Follow-up to D122933. Differential Revision: https://reviews.llvm.org/D124374	2022-05-03 03:51:36 +00:00
Craig Topper	72a66358f6	[RISCV] Add isCommutable to FADD/FMUL/FMIN/FMAX/FEQ. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D123972	2022-05-02 20:21:16 -07:00
Zakk Chen	5807e59a0a	[RISCV] Fix incorrect codegen for masked vmsge{u}.vx with mask agnostic. The result was totally wrong. We could use mask undisturbed result to emulate the mask agnostic result. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124684	2022-05-02 17:57:29 -07:00
Fangrui Song	2019c9b1c8	[RISCV] Lower case the first letter of LowerRISCVMachineOperandToMCOperand. NFC	2022-05-01 14:13:55 -07:00
luxufan	e098281c27	[RISCV] Don't getDebugLoc for the end node of MBB iterator Because of shrink wrapping, the block to insert epilog may don't have instructions (Only debug instructions). And the position to insert may point to MBB.end() that don't have a DebugLoc. This patch fix this problem. The test program was copied from the issue:https://github.com/llvm/llvm-project/issues/53662 Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D123679	2022-04-30 16:00:20 +08:00
Yeting Kuo	c069e37019	[RISCV] Add DAGCombine to fold base operation and reduction. Transform (<bop> x, (reduce.<bop> vec, splat(neutral_element))) to (reduce.<bop> vec, splat (x)). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122563	2022-04-30 14:07:05 +08:00
Craig Topper	f91690f7db	[RISCV] Don't merge addi into load/store address if addi has a FrameIndex operand. This fixes a crash from D124231. We can't fold (load (add base, (addi src, off1)), off2) -> (load (add base, src), off1+off2) if the src is a FrameIndex. FrameIndex cannot be the operand of an add. There was an immediate==0 check that I think was trying to catch the common case of FrameIndex addis where the immediate is 0, but they can also appear in non-zero form. Instead explicitly check for a FrameIndex operand.	2022-04-29 18:22:20 -07:00
Craig Topper	5aa1a7b307	[RISCV] Remove 'frameindex' from list for ComplexPattern. NFC Putting a node in this list allows the node to be used as the root of an isel pattern that would then call the ComplexPattern. The usual case is to use the ComplexPattern as the operand of another operator. AddrFI is never used as a root operation. frameindex is handled directly with custom code in RISCVISelDAGToDAG::Select. So adding frameindex to the list here serves no purpose.	2022-04-29 17:41:07 -07:00
Philip Reames	3ea191ed03	[RISCV] Factor repeating code into getMaskTypeFor(VT) [nfc]	2022-04-29 10:00:57 -07:00
Philip Reames	f927be0df8	[RISCV] Extract getAllOnesMask helper [nfc]	2022-04-29 09:30:18 -07:00
Craig Topper	5c38373125	[RISCV] Improve constant materialization for cases that can use LUI+ADDI instead of LUI+ADDIW. It's possible that we have a constant that isn't simm32 so we can't use LUI+ADDIW, but we can use LUI+ADDI. Because ADDI uses a sign extended constant, it's possible that after subtracting it out, we end up with a simm32 that maps to LUI. This patch detects this case after removing Lo12 and before shifting the value for SLLI. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D124222	2022-04-29 08:58:32 -07:00
LiaoChunyu	03a3654203	[RISCV] Add cost model for SK_Broadcast Add cost model for broadcast shuffle in RISCVTTIImpl::getShuffleCost with scalable vector. The cost model might not the best. For scalable vector, BasicTTIImpl::getShuffleCost return invalid cost, so this patch relies on the existing cost model in BasicTTIImpl. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124101	2022-04-29 13:28:02 +08:00
Hsiangkai Wang	c62b014db9	[RISCV] Merge addi into load/store as there is a ADD between them This patch adds peephole optimizations for the following patterns: (load (add base, (addi src, off1)), off2) -> (load (add base, src), off1+off2) (store val, (add base, (addi src, off1)), off2) -> (store val, (add base, src), off1+off2) Differential Revision: https://reviews.llvm.org/D124231	2022-04-29 04:33:05 +00:00
Craig Topper	ec11fbb1d6	[RISCV] Use default promotion for (i32 (shl 1, X)) on RV64 when Zbs is enabled. This improves opportunities to use bset/bclr/binv. Unfortunately, there are no W versions of these instrcutions so this isn't always a clear win. If we use SLLW we get free sign extend and shift masking, but need to put a 1 in a register and can't remove an or/xor. If we use bset/bclr/binv we remove the immediate materializationg and logic op, but might need a mask on the shift amount and sext.w. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D124096	2022-04-28 09:58:30 -07:00
Craig Topper	8631a5e712	[RISCV] Fix alias printing for vmnot.m By clearing the HasDummyMask flag from mask register binary operations and mask load/store. HasDummyMask was causing an extra operand to get appended when converting from MachineInstr to MCInst. This extra operand doesn't appear in the assembly string so was mostly ignored, but it prevented the alias instruction printing from working correctly. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D124424	2022-04-28 08:33:52 -07:00
Lian Wang	dc0ae8ce18	[RISCV] Support VP_SETCC mask operations Support VP_SETCC mask operations, turn it to logical operation. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124438	2022-04-28 08:52:29 +00:00
Craig Topper	c2614b31d9	[RISCV] Add isCommutable to scalar FMA instructions. The default implementation of findCommutedOpIndices picks the first two source operands. That's exactly what we want for the scalar FMA instructions. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D124463	2022-04-27 11:07:18 -07:00
Jim Lin	9de7b93bc0	[RISCV][NFC] Update and add missing closed curly bracket comment in RISCVInstrInfoZb.td	2022-04-27 15:08:51 +08:00
ShihPo Hung	6b55f133fb	[RISCV][RVV] Select unmasked TU RVV pseudos in a DAG post-process Following D118810 that reduced the size of ISel table, this patch optimizes allone-masked RVV pseudos with TU policy and swap them out to their unmasked TU pseudos. Since the UNDEF merge operand is not preserved, we turn it into TA pseudo regardless of the policy operand. Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D121881	2022-04-26 20:14:54 -07:00
Vasileios Porpodas	fa8a9fea47	Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`" This reverts commit `6a9bbd9f20`. Code review: https://reviews.llvm.org/D124202	2022-04-26 14:02:40 -07:00
Shao-Ce SUN	c59473aacc	[NFC][RISCV][CodeGen] Use ArrayRef in TargetLowering functions Based on D123467. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D123653	2022-04-26 23:53:00 +08:00
Craig Topper	40f1af4760	[RISCV] Add isCommutable to ADD/ADDW/MUL/AND/OR/XOR/MIN/MAX/CLMUL Reviewed By: reames Differential Revision: https://reviews.llvm.org/D123970	2022-04-25 10:53:41 -07:00
Zakk Chen	ffe03ff75c	[RISCV] Fix incorrect policy implement for unmasked vslidedown and vslideup. vslideup works by leaving elements 0<i<OFFSET undisturbed. so it need the destination operand as input for correctness regardless of policy. Add a operand to indicate policy. We also add policy operand for unmaksed vslidedown to keep the interface consistent with vslideup because vslidedown have only undisturbed at 0<i<vstart but user have no way to control of vstart. Reviewed By: rogfer01, craig.topper Differential Revision: https://reviews.llvm.org/D124186	2022-04-25 09:18:41 -07:00
wangpc	7a21a0525a	[RISCV] Add sched to pseudo function call instructions To fix llvm-mca's error of 'found an unsupported instruction in the input assembly sequence.' caused by the lack of scheduling info. Pseudo function call instructions will be expanded to `auipc` and `jalr`, so their scheduling info are the combination of two. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123578	2022-04-24 14:58:18 +08:00
Mohammed Nurul Hoque	5dd99f71aa	[RISCV] transform MI to W variant to remove sext.w Backwards search The sext.w removal pass (before the new patch) checks if the input to sext.w is already in sign-extended form, so it can eliminate it. It does that by checking every definition/source that reaches the sext.w is an instruction that produces a sign-extended value, either by definition (e.g. ADDW), or it propagates sign-extension (e.g. OR) so we check its sources recursively. Forward search Sometimes, one of the sources is an instruction that doesn't always produce a sign-extended value, but it has a W-version that does (e.g. ADD / ADDW). If we transform the ADD to ADDW, the sext.w can be removed (assuming other def paths are satisfied), but this transformation is sound only if every use of this ADD/W only reqruires the lower 32-bits either directly (like sll %x, 32) or they propagate dependency (lower word of output only depends on lower word of input) so we check its uses recursively. When searching backwards, if an instruction that can be replaced with W-variant is encountered, this pass runs the forward search to verify it can be replaced, then adds it to a list of fixable instructions. After verifying all paths, it replaces the instruction and removes the sext.w. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119928	2022-04-22 10:59:26 -07:00
Fraser Cormack	98db7ea262	[RISCV][NFC] Adjust some formatting in VL patterns	2022-04-22 17:19:27 +01:00
Fraser Cormack	2b0fedc2dd	[RISCV] Print human-readable VTYPE/SEW/LMUL in MIR This patch adds custom MIR operand comments to VTYPE immediate operands in VSETVLI instructions and SEW/LMUL operands in vector codegen pseudo instructions. The result is intended to be more human-readable and hopefully maintainable when working with MIR, particularly when writing or reading test cases. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124187	2022-04-22 17:13:18 +01:00
wangpc	5c3ea07848	[RISCV] Do not outline CFI instructions when they are needed in EH We saw a failure caused by unwinding with incomplete CFIs, so we can't outline CFI instructions when they are needed in EH. This is a recommit of `0d40688`, which was reverted in `ce83883` as related precommit test `360d44e` caused some errors. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D122634	2022-04-22 12:28:19 +08:00
Ping Deng	7493d9ffb6	[RISCV][NFC] Use defvar to simplify pattern definations. Reviewed By: jacquesguan, frasercrmck Differential Revision: https://reviews.llvm.org/D123839	2022-04-22 02:45:14 +00:00
Craig Topper	9534811aa8	[RISCV] Teach generateInstSeqImpl to generate BSETI for single bit cases. If the immediate has one bit set, but isn't a simm32 we can try the BSETI instruction from Zbs.	2022-04-21 12:08:34 -07:00
Craig Topper	98b866892d	[RISCV] Add special case to constant materialization to remove trailing zeros first. If there are fewer than 12 trailing zeros, we'll try to use an ADDI at the end of the sequence. If we strip trailing zeros and end the sequence with a SLLI we might find a shorter sequence. Differential Revision: https://reviews.llvm.org/D124148	2022-04-21 09:43:32 -07:00
wangpc	ce83883691	Revert "[RISCV] Do not outline CFI instructions when they are needed in EH" This reverts commit `0d40688925`.	2022-04-21 16:23:10 +08:00
wangpc	0d40688925	[RISCV] Do not outline CFI instructions when they are needed in EH We saw a failure caused by unwinding with incomplete CFIs, so we can't outline CFI instructions when they are needed in EH. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D122634	2022-04-21 16:13:22 +08:00
Fraser Cormack	3e678cb772	[RISCV] Don't emit fractional VIDs with negative steps We can't shift-right negative numbers to divide them, so avoid emitting such sequences. Use negative numerators as a proxy for this situation, since the indices are always non-negative. An alternative strategy could be to add a compiler flag to emit division instructions, which would at least allow us to test the VID sequence matching itself. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123796	2022-04-21 07:00:34 +01:00
Craig Topper	186d5c8af5	[RISCV] Make getInstSeqCost handle other Zb* instructions. We haven't been updating this as Zb* instructions have been used for immediate materialization. They will hit the default case and trigger an llvm_unreachable. Instead of trying to list them all, assume instructions that aren't explicitly listed aren't compressible. Spotted while looking at integer materialization for other reasons. I haven't seen a crash from this yet.	2022-04-20 22:08:04 -07:00
Craig Topper	6db0afb44e	[RISCV] Fold (xor (sllw 1, x), -1) -> (rolw ~1, x). There's an existing generic combine that does this for legal types. This patch adds a RISCV specific combine for W instructions. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D123983	2022-04-19 15:03:43 -07:00
Fraser Cormack	c5cac48549	[RISCV] Fix lowering of BUILD_VECTORs as VID sequences This patch fixes a bug when lowering BUILD_VECTOR via VID sequences. After adding support for fractional steps in D106533, elements with zero steps may be skipped if no step has yet been computed. This allowed certain sequences to slip through the cracks, being identified as VID sequences when in fact they are not. The fix for this is to perform a second loop over the BUILD_VECTOR to validate the entire sequence once the step has been computed. This isn't the most efficient, but on balance the code is more readable and maintainable than doing back-validation during the first loop. Fixes the tests introduced in D123785. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123786	2022-04-19 07:43:38 +01:00
jacquesguan	25445b94db	[RISCV] Add rvv codegen support for vp.fptrunc. This patch adds rvv codegen support for vp.fptrunc. The lowering of fp_round and vp.fptrunc share most code so use a common lowering function to handle those two, similar to vp.trunc. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123841	2022-04-19 01:56:18 +00:00
Lian Wang	545d353b3c	[RISCV][NFC] Refactor VL patterns for vnsrl and vnsra Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D123274	2022-04-15 07:42:59 +00:00
jacquesguan	1aa4f0bb6c	[RISCV][VP] Add RVV codegen for vp.trunc. Differential Revision: https://reviews.llvm.org/D123579	2022-04-15 02:29:53 +00:00
Lian Wang	3100893f63	[RISCV] Remove sext_inreg+riscv_grev/riscv_gorc isel patterns Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123565	2022-04-14 08:16:32 +00:00
Lian Wang	38706dd940	[RISCV][NFC] Refactor patterns for Multiply Add instructions Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D123355	2022-04-14 08:00:00 +00:00
wangpc	d0828c5af9	[RISCV][NFC] Use addExpr() instead of createExpr() It seems to be neater. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D123675	2022-04-14 10:48:25 +08:00
Liqin Weng	8265679018	[RISCV][NFC] Refactor the type promotion of fsl/fsr/becompress/bdecompress/bfp Reviewed By: asb, jrtc27, craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D123181	2022-04-13 08:52:04 +00:00
Craig Topper	057c063c9b	[RISCV] Add a encodeLMUL function to RISCVVType. NFC This moves the encoding handling out of the assembly parser. Reviewed By: khchen, frasercrmck Differential Revision: https://reviews.llvm.org/D123553	2022-04-12 13:39:47 -07:00
Craig Topper	2ce2562876	[RISCV][SelectionDAG] Add a hook to sign extend i32 ConstantInt operands of phis on RV64. Materializing constants on RISCV is simpler if the constant is sign extended from i32. By default i32 constant operands of phis are zero extended. This patch adds a hook to allow RISCV to override this for i32. We have an existing isSExtCheaperThanZExt, but it operates on EVT which we don't have at these places in the code. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122951	2022-04-11 14:38:39 -07:00
Craig Topper	76192182d0	[RISCV] Remove riscv-v-fixed-length-vector-elen-max command line option. This was added before Zve extensions were defined. I think users should use Zve32x or Zve32f now. Though we will lose support for limiting ELEN to 16 or 8, but I hope no one was using that. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D123418	2022-04-11 10:14:48 -07:00
Craig Topper	c266e50430	[RISCV] Remove ExtZvl enum from RISCVSubtarget. NFC Having an enum with names that contain the string representation of their value doesn't add any value. We can just use the numbers. Reviewed By: kito-cheng, frasercrmck Differential Revision: https://reviews.llvm.org/D123417	2022-04-11 10:01:17 -07:00
LiaoChunyu	505fce5a9e	[RISCV] Add basic code modeling for llvm.experimental.stepvector intrinsic Scalable vectors llvm.experimental.stepvector intrinsic will crash due to an invalid cost when run the code through the loopunroll. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D122782	2022-04-11 10:19:23 +08:00
Craig Topper	4e561a581f	[RISCV] Remove unnecessary cast to i8* when converting gather/scatter to strided load/store. Not sure why I thought this necessary at the time.	2022-04-09 20:05:03 -07:00
Craig Topper	70046438d0	[RISCV] Only try LUI+SHADD+ADDI for int materialization if LUI+ADDI+SHADD failed. There's an assert in LUI+SHADD+ADDI materialization that makes sure the lower 12 bits aren't zero since that case should have been handled as LUI+ADDI+SHADD. But nothing prevented the LUI+SH*ADD+ADDI checks from running after the earlier code handled it. The sequence would be the same length or longer so it wouldn't replace the earlier sequence, but the assert happened before that was checked. The vector holding the sequence also wasn't reset before the second check so that guaranteed the sequence would never be found to be shorter. This patch fixes this by only trying the second expansion when the earlier fails. Fixes PR54812. Reviewed By: benshi001 Differential Revision: https://reviews.llvm.org/D123406	2022-04-09 08:52:15 -07:00
Fraser Cormack	34e1b4774a	[RISCV] Select unmasked FP setcc insts via ISel post-process Similar to D123217 but for the floating-point patterns. No change in generated output, while reducing the generated table size. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D123291	2022-04-08 17:13:43 +01:00
Craig Topper	1903b99154	[RISCV] Always select (and (srl X, C), Mask) as (srli (slli X, C2), C3). SLLI is always compressible to C.SLLI as long as the source and dest register is the same. ANDI and SRLI are only compressible if the register is x8-x15. By using SLLI we have a better chance of generating shorter code. I had to exclude one exclusion for the BEXTI case so that it's pattern match could still fire. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D123336	2022-04-08 09:04:04 -07:00
Kito Cheng	9c5aedfbf5	[RISCV] Fixing stack offset for RVV object with vararg in stack. We found LLVM generate wrong stack offset for RVV object when stack having variable argument, that cause by we didn't count vaarg part during calculate RVV stack objects. Also update the stack layout diagram for including vaarg in the diagram. Stack layout ref: https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/riscv.cc#L3941 Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D123180	2022-04-08 12:01:16 +08:00
Kito Cheng	690085c9b7	[RISCV] Store/restore RISCVMachineFunctionInfo into MIR YAML file RISCVMachineFunctionInfo has some fields like VarArgsFrameIndex and VarArgsSaveSize are calculated at ISel lowering stage, those info are not contained in MIR files, that cause test cases rely on those field can't not reproduce correctly by MIR dump files. This patch adding the MIR read/write for those fields. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D123178	2022-04-08 11:55:48 +08:00
jacquesguan	a55c19c44b	[RISCV][NFC] Use defvar to simplify pattern definations. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D123292	2022-04-08 02:51:30 +00:00
Craig Topper	d98bea87ef	[RISCV] Add more .vx patterns for VLMax integer setccs. This patch synchronizes the structure of the templates with those in RISCVInstrInfoVVLPatterns.td so that we get patterns with .vx on the left hand side. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D123255	2022-04-07 09:17:43 -07:00
Craig Topper	82662b753d	[RISCV] Add swapped patterns to VPatIntegerSetCCVL_VIPlus1. This matches VPatIntegerSetCCVL_VI_Swappable. But as noted in the FIXME this may only be needed due to lack of canonicalization on VP_SETCC. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D123239	2022-04-07 09:17:08 -07:00
Luís Marques	d09d297c5d	[RISCV] Fix crash for section alignment with .option norvc The existing code wasn't getting the subtarget info from the fragment, so the current status of RVC would be ignored. This would cause a crash for the new test case when the target then reported it couldn't write the requested number of code alignment bytes. Differential Revision: https://reviews.llvm.org/D122236	2022-04-07 12:02:27 +01:00
Fraser Cormack	8ebc9b1560	[RISCV] Select unmasked integer setcc insts via ISel post-process This patch has no effect on the generated code, whilst mitigating the increase in ISel table size caused by the recent addition of masked patterns. I aim to do the same for floating-point patterns once D123051 lands, giving us a reason to use masked floating-point patterns. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D123217	2022-04-07 09:30:19 +01:00
Fraser Cormack	8216255c9f	[RISCV][VP] Add basic RVV codegen for vp.fcmp This patch adds the necessary infrastructure to lower vp.fcmp via ISD::VP_SETCC to RVV instructions. Most notably this patch adds cond-code legalization for VP_SETCC, reusing the existing TargetLowering::LegalizeSetCCCondCode by passing in additional SDValue parameters for the Mask and EVL. This method then uses VP operations to legalize the condcode. There is still a general lack of canonicalization on VP_SETCC as opposed to SETCC which results in worse code than is theoretically possible. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123051	2022-04-07 09:16:07 +01:00
Liqin Weng	f891123556	[RISCV] Add CMOV isel pattern for (select (setgt X, Imm), Y, Z) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122644	2022-04-07 05:55:53 +00:00
Lian Wang	1b547799c5	[RISCV] Supplement patterns for vnsrl.wx/vnsra.wx when splat shift is sext or zext Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122786	2022-04-07 02:21:41 +00:00
Craig Topper	e13a44b460	[RISCV] Add lowering for vp.sext and vp.zext. Including mask vector inputs. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D123150	2022-04-06 09:59:49 -07:00
Fraser Cormack	6be5e875be	[RISCV][VP] Add basic RVV codegen for vp.icmp This patch adds the minimum required to successfully lower vp.icmp via the new ISD::VP_SETCC node to RVV instructions. Regular ISD::SETCC goes through a lot of canonicalization which targets may rely on which has not hereto been ported to VP_SETCC. It also supports expansion of individual condition codes and a non-boolean return type. Support for all of that will follow in later patches. In the case of RVV this largely isn't a problem as the vector integer comparison instructions are plentiful enough that it can lower all VP_SETCC nodes on legal integer vectors except for boolean vectors, which regular SETCC folds away immediately into logical operations. Floating-point VP_SETCC operations aren't as well supported in RVV and the backend relies on condition code expansion, so support for those operations will come in later patches. Portions of this code were taken from the VP reference patches. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122743	2022-04-06 16:51:22 +01:00
Craig Topper	3c831c9b28	[RISCV] Add support for vp.fptosi where the result is a mask type. We can do this conversion by converting the same sized integer type, then compare the result with 0. The conversion is undefined if the converted FP value doesn't fit in an i1. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D122678	2022-04-05 09:48:04 -07:00
Craig Topper	d970e96c53	[RISCV] Add lowering for vp.fptoui and vp.uitofp. This is a straightforward extension of D122512 to unsigned integers.	2022-04-01 18:28:46 -07:00
Craig Topper	fa630e7594	[RISCV][AMDGPU][TargetLowering] Special case overflow expansion for (uaddo X, 1). If we expand (uaddo X, 1) we previously expanded the overflow calculation as (X + 1) <u X. This potentially increases the live range of X and can prevent X+1 from reusing the register that previously held X. Since we're adding 1, overflow only occurs if X was UINT_MAX in which case (X+1) would be 0. So this patch adds a special case to expand the overflow calculation to (X+1) == 0. This seems to help with uaddo intrinsics that get introduced by CodeGenPrepare after LSR. Alternatively, we could block the uaddo transform in CodeGenPrepare for this case. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D122933	2022-04-01 13:14:10 -07:00
Lian Wang	62dd3674bc	[RISCV] Supplement SDNode patterns for vfwmul/vfwadd/vfwsub Reviewed By: jacquesguan Differential Revision: https://reviews.llvm.org/D122720	2022-04-01 03:09:50 +00:00
Fraser Cormack	ee51aefba0	[RISCV][NFC] Minor formatting fix	2022-03-31 16:15:22 +01:00
Fraser Cormack	a276d1f44b	[RISCV][NFC] Fix formatting on one line	2022-03-31 13:17:37 +01:00
ShihPo Hung	2f1261abe4	[RISCV][RVV] Add Uses = [FRM] and mayRaiseFPException = true to RVV instructions This patch adds Uses = [FRM] and mayRaiseFPException = true to following instructions: VFADD, VFSUB, VFRSUB, VFMUL, VFDIV, VFRDIV VFWADD, VFWSUB, VFWMUL VFMADD, VFMACC, VFMSAC, VFMSUB VFNMADD, VFNMACC, VFNMSAC, VVFNMSUB VFWMACC, VFWMSAC, VFWNMACC, VFWNMSAC VFSQRT, VFREC7 VFREDOSUM, VFREDUSUM, VFWREDOSUM, VFWREDUSUM and only adds mayRaiseFPException = true to following instructions: VFRSQRT7, VFMIN, VFMAX, VFREDMIN, VFREDMAX VMFEQ, VMFNE, VMFLT,VMFLE, VMFGT, VMFGE Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D121087	2022-03-31 01:33:17 -07:00
Fraser Cormack	893d63fbdc	[RISCV][NFC] Fix comment to refer to correct file	2022-03-31 08:59:10 +01:00
Lian Wang	b3851e9931	[RISCV] Add VL patterns for vfwmul/vfwadd/vfwsub Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D122369	2022-03-31 07:08:58 +00:00
Craig Topper	4477500533	[RISCV] ISel (and (shift X, C1), C2)) to shift pair in more cases Previously, these isel optimizations were disabled if the AND could be selected as a ANDI instruction. This patch disables the optimizations only if the immediate is valid for C.ANDI. If we can't use C.ANDI, we might be able to compress the shift instructions instead. I'm not checking the C extension since we have relatively poor test coverage of the C extension. Without C extension the code size should be equal. My only concern would be if the shift+andi had better latency/throughput on a particular CPU. I did have to add a peephole to match SRLIW if the input is zexti32 to prevent a regression in rv64zbp.ll. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D122701	2022-03-30 11:46:42 -07:00
Craig Topper	7417eb29ce	[RISCV] Use getSplatBuildVector instead of getSplatVector for fixed vectors. The splat_vector will be legalized to build_vector eventually anyway. This patch makes it take fewer steps. Unfortunately, this results in some codegen changes. It looks like it comes down to how the nodes were ordered in the topological sort for isel. Because the build_vector is created earlier we end up with a different ordering of nodes. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D122185	2022-03-30 11:36:34 -07:00
Liqin Weng	4cb85da811	[RISCV] Add CMIX isel pattern for (xor (and (xor rs1, rs3), rs2), rs3) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122702	2022-03-30 16:51:09 +08:00
Fraser Cormack	75047577d6	[RISCV] Trim RVV isel pats matchable via DAG post-process In D122512, several masked patterns were added to support lowering of vector-predicated float-to-int and int-to-float conversions. With the introduction of these patterns, all of the old "unmasked" patterns are matchable via the DAG post-process introduced in D118810, once the relevant opcode entries are set up in the helper table. Locally this reduces the generated isel table by 4%. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D122637	2022-03-30 08:56:38 +01:00
Liqin Weng	7f81765898	[RISCV][NFC] Add immediate tests for the icmp instruction Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122651	2022-03-30 02:51:26 +00:00
Zakk Chen	b578330754	[RISCV] Use maskedoff to decide mask policy for masked compare and vmsbf/vmsif/vmsof. masked compare and vmsbf/vmsif/vmsof are always tail agnostic, we could check maskedoff value to decide mask policy rather than have a addtional policy operand. Reviewed By: craig.topper, arcbbb Differential Revision: https://reviews.llvm.org/D122456	2022-03-29 18:05:33 -07:00
Zakk Chen	10b2760da0	Revert "[RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR" This reverts commit `10fd2822b7`. I have a better implementation for those operations without the additional policy operand. masked compare and vmsbf/vmsif/vmsof are always tail agnostic so we could assume undef maskedoff is mask agnostic. Differential Revision: https://reviews.llvm.org/D122455	2022-03-29 18:05:33 -07:00
Liqin Weng	d660c0d793	[RISCV] Optimize LI+SLT to SLTI+XORI for immediates in specific range This transform will reduce one GPR. Reviewed By: craig.topper, benshi001 Differential Revision: https://reviews.llvm.org/D122051	2022-03-29 14:46:49 +08:00
Craig Topper	45e85feba6	[RISCV] Pull APInt/computeKnonwbits specifics out of computeGREVOrGORC. NFC This function now takes a uint64_t instead of an APInt. The caller is responsible for masking the shift amount, extracting and inserting into the KnownBits APInts, and inverting to compute zeros. This is less code and cleaner division of responsibilities.	2022-03-28 20:53:54 -07:00
Shao-Ce SUN	662b9fa02c	[NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D122557	2022-03-29 09:53:24 +08:00
Craig Topper	01203918d1	[RISCV] Add computeKnownBits support for RISCVISD::GORC. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D121575	2022-03-28 16:56:33 -07:00
Craig Topper	e68257fcee	[RISCV][SelectionDAG] Enable TargetLowering::hasBitTest for masks that fit in ANDI. Modified DAGCombiner to pass the shift the bittest input and the shift amount to hasBitTest. This matches the other call to hasBitTest in TargetLowering.h This is an alternative to D122454. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D122458	2022-03-28 12:46:36 -07:00
Craig Topper	cfe533da05	[RISCV] Add lowering for vp.fptosi and vp.sitofp. This as an alternative version of D120641. Starting from the code here https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/-/raw/EPI/llvm/lib/Target/RISCV/RISCVISelLowering.cpp but with some modifications to how the interim types are calculated, and adding support for f16. Still need to add fptosi for mask vectors. Lots of masked isel patterns added so we can pass the mask through the type changes. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D122512	2022-03-28 11:06:41 -07:00
Kazu Hirata	6212871968	[Target] Apply clang-tidy fixes for readability-redundant-member-init (NFC)	2022-03-27 22:22:37 -07:00
Maksim Panchenko	4ae9745af1	[Disassember][NFCI] Use strong type for instruction decoder All LLVM backends use MCDisassembler as a base class for their instruction decoders. Use "const MCDisassembler " for the decoder instead of "const void ". Remove unnecessary static casts. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D122245	2022-03-25 18:53:59 -07:00
Dávid Bolvanský	9a738c477e	[NFCI] Fix set-but-unused warning in RISCVAsmParser.cpp	2022-03-24 08:33:40 +01:00
jacquesguan	8910ac400c	[RISCV] Add patterns for vector widening integer multiply Add patterns for vector widening integer multiply instructions Differential Revision: https://reviews.llvm.org/D117385	2022-03-24 15:26:08 +08:00
Vasileios Porpodas	39aa202aff	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit `e6ead19b77`.	2022-03-23 18:32:17 -07:00
Craig Topper	6c90a654bb	[RISCV] Simplify some code in lowering vector int<->fp conversions. NFC Don't call EltVT.getSizeInBits() or SrcEltVT.getSizeInBits() a second time. They are already in EltSize or SrcEltSize variables. Refactor some comparisons to use multiply instead of division.	2022-03-23 12:09:05 -07:00
Arthur Eubanks	e6ead19b77	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash." This reverts commit `27bd8f9492`. Causes crashes, see comments in D121973	2022-03-23 10:57:45 -07:00
luxufan	5800fb41a6	[RISCV] Remove check and update test file in D121183 Differential Revision: https://reviews.llvm.org/D122290	2022-03-24 00:48:52 +08:00
luxufan	227496dc09	[RISCV] Generate correct ELF EFlags when .ll file has target-abi attribute In the past, when construct RISCVAsmBackend, MCTargetOptions.ABIName would be passed and stored in RISCVAsmBackend. But MCTargetOptions.ABIName can only be specified by -target-abi xxx in command line, if the .ll file has target-abi attribute, the codegen module will ignore it. And the generated object file would have incorrect EFlags value. https://github.com/llvm/llvm-project/issues/50591 also caused by this problem. This patch override the AsmPrinter::emitFunctionEntryLabel function and use it to set the target abi value that get from .ll file's target-abi attribute. And storing the target-abi in RISCVTargetStreamer instead of RISCVAsmBackend. Differential Revision: https://reviews.llvm.org/D121183	2022-03-24 00:48:52 +08:00
Vasileios Porpodas	27bd8f9492	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit `f7d7d2a08d`.	2022-03-22 16:41:55 -07:00
Arthur Eubanks	f7d7d2a08d	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads."" This reverts commit `79613185d3`. Causes crashes, see comments in https://reviews.llvm.org/D121973.	2022-03-22 13:33:49 -07:00
Craig Topper	51940d69cb	[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32. On RV32, we need to type legalize i64 scalar arguments to intrinsics. We usually do this by splatting the value into a vector separately. If the scalar happens to be sign extended, we can continue using a .vx intrinsic. We already special cased sign extended constants, this extends it to any sign extended value. I've only added tests for one case of vadd. Most intrinsics go through the same check. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D122186	2022-03-22 10:29:06 -07:00
Craig Topper	9b0f227d7b	[TableGen][RISCV] Add InstAliases with zero_reg to cover unmasked vnot.v, vncvt.x.x.w, vneg.v, etc. The mask being NoRegister prevented the existing aliases from matching since NoRegister isn't in the VMV0 register class. To workaround this I've added new aliases that look for zero_reg. I had to motify tablegen to generate matching code for zero_reg. And as a consequence, I had to change the EmitPriority for an ARM alias that used zero_reg that started printing. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D121496	2022-03-22 10:14:43 -07:00
Zakk Chen	10fd2822b7	[RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR intrinsics. Those operations are updated under a tail agnostic policy, but they could have mask agnostic or undisturbed. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D120228	2022-03-22 07:47:21 -07:00
Zakk Chen	9ab18cc535	[RISCV] Add policy operand for masked vid and viota IR intrinsics. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D120227	2022-03-22 02:32:31 -07:00
Zakk Chen	abb5a985e9	[RISCV] Support mask policy for RVV IR intrinsics. Add the UsesMaskPolicy flag to indicate the operations result would be effected by the mask policy. (ex. mask operations). It means RISCVInsertVSETVLI should decide the mask policy according by mask policy operand or passthru operand. If UsesMaskPolicy is false (ex. unmasked, store, and reduction operations), the mask policy could be either mask undisturbed or agnostic. Currently, RISCVInsertVSETVLI sets UsesMaskPolicy operations default to MA, otherwise to MU to keep the current mask policy would not be changed for unmasked operations. Add masked-tama, masked-tamu, masked-tuma and masked-tumu test cases. I didn't add all operations because most of implementations are using the same pseudo multiclass. Some tests maybe be duplicated in different tests. (ex. masked vmacc with tumu shows in vmacc-rv32.ll and masked-tumu) I think having different tests only for policy would make the testing clear. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120226	2022-03-22 01:19:16 -07:00
Yeting Kuo	ecd7a0132a	[RISCV] Add basic cost model for vector casting To perform the cost model of vector casting, the patch consider most vector casts as their scalar form and consider those vector form of free scalr castings as 1. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121771	2022-03-22 14:17:08 +08:00
Vasileios Porpodas	79613185d3	Recommit "[SLP] Fix lookahead operand reordering for splat loads." Original review: https://reviews.llvm.org/D121354 The original commit `9136145eb0` broke the build on several targets. Differential Revision: https://reviews.llvm.org/D121973	2022-03-21 15:57:32 -07:00
Craig Topper	cc5b0868ff	Revert "[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32." This reverts commit `8c4937b33f`. Committed by mistake.	2022-03-21 14:58:11 -07:00
Craig Topper	d4aeb5000f	[RISCV] Simplify some code. NFC	2022-03-21 14:50:56 -07:00
Craig Topper	19de2e8db6	[RISCV] Remove stray slash from comment. NFC	2022-03-21 14:50:56 -07:00
Craig Topper	8c4937b33f	[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32. On RV32, we need to type legalize i64 scalar arguments to intrinsics. We usually do this by splatting the value into a vector separately. If the scalar happens to be sign extended, we can continue using a .vx intrinsic. We already special cased sign extended constants, this extends it to any sign extended value. I've only added tests for one case of vadd. Most intrinsics go through the same check. I can add more tests if we're concerned. Differential Revision: https://reviews.llvm.org/D122186	2022-03-21 14:50:55 -07:00
Mohammed Nurul Hoque	7afa44f5f5	[RISCV] Add more sign-extending ops to MIR sext.w pass. This patch adds single-bit and bit-counting ops to list of sign-extending ops. A single-bit write propagates sign-extendedness if it's not in the sign-bits. Bit extraction and bit counting always outputs a small number, so sign-extended. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121152	2022-03-18 18:21:17 +08:00
Jessica Clarke	63ea7797dd	[RISCV] Fix buildbot breakage by explicitly instantiating templates RISCVISelDAGToDAG's selectImm uses RISCVTargetLowering::getAddr (specifically the ConstantPoolSDNode) as of `41454ab256` ("[RISCV] Use constant pool for large integers"), but nothing explicitly instantiates any of the templates, the only reason they exist is because of the various lowering methods in RISCVISelLowering.cpp that themselves use the methods. However, with inlining, those can end up not existing as real functions and thus not be exported, leading to link errors. Up until now this hasn't happened, but for whatever reason D121654 has triggered this on the sanitizer-ppc64be-linux buildbot, giving: ../../../../lib/libLLVMRISCVCodeGen.a(RISCVISelDAGToDAG.cpp.o): In function `selectImm(llvm::SelectionDAG, llvm::SDLoc const&, llvm::MVT, long, llvm::RISCVSubtarget const&)': RISCVISelDAGToDAG.cpp:(.text._ZL9selectImmPN4llvm12SelectionDAGERKNS_5SDLocENS_3MVTElRKNS_14RISCVSubtargetE+0x3d8): undefined reference to `llvm::SDValue llvm::RISCVTargetLowering::getAddr<llvm::ConstantPoolSDNode>(llvm::ConstantPoolSDNode, llvm::SelectionDAG&, bool) const' collect2: error: ld returned 1 exit status Fix this by explicitly instantiating getAddr in its four different forms so separate translation units can reliably use it. Fixes: `41454ab256` ("[RISCV] Use constant pool for large integers")	2022-03-18 02:22:17 +00:00
Craig Topper	bbd2ecf9f0	[RISCV] Add +experimental-zvfh extension to cover half types in vectors. Currently we allow half types in vectors if the scalar Zfh extension is enabled. This behavior is not inline with the vector spec. For f32 and f64 types, the Zve32f, Zve64f, Zve64d, and V explicitly control the availablity of floating point types in vectors. In order to make our compiler compliant, we either need to remove all support for half in vectors or we need an extension to control it. Draft spec here https://github.com/riscv/riscv-v-spec/pull/780 Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D121345	2022-03-17 10:04:02 -07:00
Craig Topper	7e15303062	[RISCV] Simplify scalable vector case in lowerVectorMaskExt. Since we have SPLAT_VECTOR_PARTS these days, I don't think we need to go through extra lengths to avoid introducing an illegal scalar type. We can just call getConstant using the scalable vector type and let it create either a SPLAT_VECTOR or a SPLAT_VECTOR_PARTS. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D121645	2022-03-17 09:43:13 -07:00
Lian Wang	214afc7116	[RISCV] Add patterns for vnsrl.wi and vnsra.wi instructions Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121675	2022-03-17 07:22:32 +00:00
Lian Wang	b26abcad81	[RISCV][NFC] Replace redundant code with VLOpFrag Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121783	2022-03-17 02:05:21 +00:00
Craig Topper	2e10671ec7	[RISCV] Improve detection of when to skip (and (srl x, c2) c1) -> (srli (slli x, c3-c2), c3) isel. We have a special case to skip this transform if c1 is 0xffffffff and x is sext_inreg in order to use sraiw+zext.w. But we were only checking that we have a sext_inreg opcode, not how many bits are being sign extended. This commit adds a check that it is a sext_inreg from i32 so we know for sure that an sraiw can be created.	2022-03-16 14:54:34 -07:00
Jessica Clarke	659363c0cc	[RISCV] Ensure PseudoLA* can be hoisted Since we mark the pseudos as mayLoad but do not provide any MMOs, isSafeToMove conservatively returns false, stopping MachineLICM from hoisting the instructions. PseudoLA_TLS_GD does not actually expand to a load, so stop marking that as mayLoad to allow it to be hoisted, and for the others make sure to add MMOs during lowering to indicate they're GOT loads and thus can be freely moved. Fixes https://github.com/llvm/llvm-project/issues/54372 Reviewed By: MaskRay, arichardson Differential Revision: https://reviews.llvm.org/D121654	2022-03-16 18:45:36 +00:00
Shengchen Kan	37b378386e	[NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments	2022-03-16 20:25:42 +08:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
Haocong.Lu	6a54776fe0	[RISCV] Select SRLI+SLLI for AND with leading ones mask Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask. It's useful in RV64 when the mask exceeds simm32 (cannot be generated by LUI). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121598	2022-03-16 02:10:57 +00:00
Craig Topper	06c5d74090	[RISCV] Remove lowerSPLAT_VECTOR This code handles fixed vector SPLAT_VECTOR, but is never called in any tests. We only form fixed vector splat vectors for vXi64 on RV32 as part of DAGCombine. This will be type legalized to SPLAT_VECTOR_PARTS. So the Custom handling for SPLAT_VECTOR is never needed. This patch makes SPLAT_VECTOR for vXi64 'Legal' on RV32 so that DAGCombine will create it, but there's no need for Custom handler. It will still be type legalized to SPLAT_VECTOR_PARTS. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D121673	2022-03-15 08:22:13 -07:00
Yeting Kuo	ae7c6647f3	[RISCV] Add basic code modeling for fixed length vector reduction. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121447	2022-03-14 11:04:31 +08:00
Craig Topper	eeb3bfd74a	[RISCV] Merge ReplaceNodeResults code for SHFL and GREV/GORC. NFC	2022-03-13 18:42:26 -07:00
Lehua Ding	1648852c98	[RISCV][RVV] Fix vslide1up/down intrinsics overflow bug for SEW=64 on RV32 Reviewed By: craig.topper, kito-cheng Differential Revision: https://reviews.llvm.org/D120899	2022-03-13 18:06:09 +08:00
Craig Topper	fd4d584d6b	[RISCV] Add DAGCombine to fold (bitreverse (bswap X)) to brev8 with Zbkb. If the type is less than XLenVT, type legalization will turn this into (srl (bitreverse (bswap (srl (bswap X), C))), C). We can't completely recover from these shifts. They introduce zeros into the upper bits of the result and we can't easily tell if they are needed. By doing a DAG combine early, we avoid introducing these shifts.	2022-03-12 16:39:39 -08:00
Craig Topper	43f668b98e	[RISCV] Move GORCIW/GREVIW formation to isel patterns. Type legalize narrow RISCVISD::GREV/GORC with constant to a larger type without switching to W. Detect sext_inreg+gorci/grevi with a uimm5 immediate during isel to emit GREVIW/GORCIW. This allows us to better propagate known bits information through extended bits after type legalization. It will also simplify a change I'm considering for BREV8 with Zbkb. A future patch will add computeKnownBits support for GORC. A further improvement here would be to use hasAllWUsers and doPeepholeSExtW like we do for SLLIW, but I don't think we have the test coverage for that yet.	2022-03-11 18:02:47 -08:00
Craig Topper	d0969e485c	[RISCV] Optimize vfmv.s.f intrinsic with scalar 0.0 to vmv.s.x with x0. We already do this for RISCVISD::VFMV_S_F_VL and the vfmv.v.f intrinsic. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D121429	2022-03-11 10:05:43 -08:00
Craig Topper	e9d4922543	[RISCV] Add tablegen helper classes to create PatFrag to check for one use. NFC Reduces code and the class can be instantiated in isel patterns to avoid creating more *_oneuse classes.	2022-03-10 23:14:21 -08:00
Craig Topper	337d49da84	[RISCV] Fix typo in comment. NFC	2022-03-10 22:00:18 -08:00
Eric Tang	336c92d5e8	[RISCV] Add alias for HFENCE.VVMA Signed-off-by: Eric Tang <eric.tang@starfivetech.com> Differential Revision: https://reviews.llvm.org/D120878	2022-03-11 13:32:52 +08:00
Craig Topper	1f3a8d58a6	[RISCV] Use ZERO_EXTEND instead of ANY_EXTEND when promoting i32 RISCVISD::SHFL. NFC We know the shift amount is a constant with bit 31 clear. anyext of constant will be either zext or sext which will produce the same result here. But we really shouldn't rely on that. It would be valid to put a random number in the upper bits. Our isel patterns expect the upper bits to be 0 so we should ask for it explicitly.	2022-03-10 20:57:04 -08:00
Craig Topper	9ce6b1ca86	[RISCV] Remove performANY_EXTENDCombine. This doesn't appear to be needed any more. I did some inspecting of the gcc torture suite and SPEC2006 with this removed and didn't find any meaningful changes. I think we're more aggressive about forming ADDIW now using sign_extend_inreg during type legalization and hasAllWUsers in isel. This probably helps catch the cases this helped with before.	2022-03-10 11:29:31 -08:00
Craig Topper	e0e8edf823	[RISCV] Add isel patterns for masked RISCVISD::FMA_VL with RISCVISD::FNEG_VL. This helps us form vfnmsub, vfnmadd, and vfmusb from masked VP intrinsics. I've used "srcvalue" for the mask parameter in the fneg nodes. We can't match "V0" because that doesn't ensure the mask the is the same. Instead it matches two different nodes and generates two copies to V0 of those separate values. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D120287	2022-03-10 10:05:42 -08:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit `7f230feeea`. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
Luke	0803dba7dd	[RISCV] Add fixed-length vector instrinsics for segment load Inspired by reviews.llvm.org/D107790. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119834	2022-03-10 16:23:40 +08:00
Craig Topper	d53707508a	[RISCV] Remove RISCVISD::VLE_VL/VSE_VL. Use intrinsics instead. Similar to what we do for other loads/stores, use the intrinsic version that we already have custom isel for. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D121166	2022-03-09 22:44:28 -08:00
Craig Topper	edd6632127	[RISCV] Support 'generic' as a valid CPU name. Most other targets support 'generic', but RISCV issues an error. This can require a special case in tools that use LLVM that aren't clang. This patch treats "generic" the same as an empty string and remaps it to generic-rv/rv64 based on the triple. Unfortunately, it has to be added to RISCV.td because MCSubtargetInfo is constructed and parses the CPU before RISCVSubtarget's constructor gets a chance to remap it. The CPU will then reparsed and the state in the MCSubtargetInfo subclass will be updated again. Fixes PR54146. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D121149	2022-03-09 16:43:22 -08:00
Shao-Ce SUN	365c858a5d	[RISCV] Share PatFprFpr classes for F, D, and Zfh Inspired by D115469 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121066	2022-03-08 13:02:04 +08:00
jacquesguan	e55b9b0d0a	[RISCV] Add patterns for vector widening floating-point reduction instructions. Add patterns for vector widening floating-point reduction instructions. Differential Revision: https://reviews.llvm.org/D120390	2022-03-08 10:53:49 +08:00
Craig Topper	845bfcede1	[RISCV] Rename 'SplatOperand' to 'ScalarOperand'. NFC vslide1up/down have this flag set, but the value isn't a splat. Rename for clarity. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D121037	2022-03-07 11:28:32 -08:00
Zakk Chen	3be907621f	[RISCV] Fix incorrect optimization for masked vmsgeu.vi with 0 immediate. vmsgeu.vi with 0 is always true, but in the masked with mask undisturbed policy, we still need to keep inactive elelemt which come from maskedoff. We could return mask directly if it's mask agnostic policy in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121080	2022-03-06 19:22:35 -08:00
Benjamin Kramer	fbce4a7803	Drop some more global std::maps. NFCI.	2022-03-06 13:28:29 +01:00
Craig Topper	bd5f124716	[RISCV] Add SimplifyDemandedBits support for FSR/FSL/FSRW/FSLW.	2022-03-05 21:26:51 -08:00
Zakk Chen	33b61c5678	[RISCV] Fix incorrect codegen introduced by D119688. We should not emit a tail agnostic vlse for a tail undisturbed vmv.s.x In D119688: - if (IsScalarMove && !Node->getOperand(0).isUndef()) + bool HasPassthruOperand = Node->getOpcode() != ISD::SPLAT_VECTOR; + if (HasPassthruOperand && !IsScalarMove && !Node->getOperand(0).isUndef()) break; The IsScalarMove check in the if statement had been changed. Differential Revision: https://reviews.llvm.org/D120963	2022-03-05 06:10:26 -08:00
Craig Topper	1e569e3b7b	[RISCV] Add CMOV isel pattern for (select (setgt X, -1), Y, Z) setgt X, -1 is the canonical form of setge X, 0. We can swap the select operands and use setlt X, X0 when selecting CMOV. This avoid materializing the -1 in a register.	2022-03-04 22:35:13 -08:00
Craig Topper	232f57319d	[RISCV] Move vslide1up/down intrinsics into lowerVectorIntrinsicSplats. NFC Rename to lowerVectorIntrinsicScalars. This allows us to share the code that checks if the scalar needs to be type legalized.	2022-03-04 18:21:53 -08:00
Craig Topper	3d4e83f17d	[RISCV] With Zbb, fold (sext_inreg (abs X)) -> (max X, (negw X)) With Zbb, abs is expanded to (max X, neg) by default. If X has 33 or more sign bits, we can expand it a little early using negw instead of neg to save a sext_inreg. If X started as a 32 bit value, type legalization would have inserted a sext before the abs so X having 33 sign bits should always be true. Note: I've used ISD::FREEZE here since we increase the number of uses. Our default expansion for ABS doesn't do that, but I think that's a bug. We can't do this with custom type legalization because ISD::FREEZE doesn't propagate sign bits so later DAG combine won't expand be able to see optmize it. Alives2 https://alive2.llvm.org/ce/z/Gx3RNe Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D120597	2022-03-03 15:42:29 -08:00
Alex Tsao	89f15fc687	[RISCV] Add cost modelling for masked memory op The patch adds very basic cost model for masked memory op on scalable vector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117884	2022-03-03 20:47:58 +08:00
jacquesguan	44a430354d	[RISCV] Fold store of vmv.f.s to a vse with VL=1. This patch support the FP part of D109482. Differential Revision: https://reviews.llvm.org/D120235	2022-03-03 16:35:19 +08:00
Craig Topper	6cb42cd666	[RISCV] More correctly ignore Zfinx register classes in getRegForInlineAsmConstraint. Until Zfinx is supported in CodeGen we need to convert all Zfinx register classes to GPR. Remove the zfinx-types.ll test which didn't test anything meaningful since -mattr=zfinx isn't implemented completely in llc. Follow up to D93298.	2022-03-02 11:22:46 -08:00
Craig Topper	a1f8349d77	[RISCV] Don't combine ROTR ((GREV x, 24), 16)->(GREV x, 8) on RV64. This miscompile was introduced in D119527. This was a special pattern for rotate+bswap on RV32. It doesn't work for RV64 since the rotate needs to be half the bitwidth. The equivalent pattern for RV64 is ROTR ((GREV x, 56), 32) so match that instead. This could be generalized further as noted in the new FIXME. Reviewed By: Chenbing.Zheng Differential Revision: https://reviews.llvm.org/D120686	2022-03-02 09:47:06 -08:00
Nikita Popov	98cfcae4e9	Revert "[RISCV] Add cost modelling for masked memory op" This reverts commit `76f243b53b`. The newly added test fails.	2022-03-02 17:32:10 +01:00
Alex Tsao	76f243b53b	[RISCV] Add cost modelling for masked memory op The patch adds very basic cost model for masked memory op on scalable vector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117884	2022-03-02 22:48:41 +08:00
Shao-Ce SUN	0e38b29543	[RISCV] add the MC layer support of Zfinx extension This patch added the MC layer support of Zfinx extension. Authored-by: StephenFan Co-Authored-by: Shao-Ce Sun Reviewed By: asb Differential Revision: https://reviews.llvm.org/D93298	2022-03-02 14:25:19 +08:00
Mircea Trofin	cb2160760e	[nfc][codegen] Move RegisterBank[Info].h under CodeGen This wraps up from D119053. The 2 headers are moved as described, fixed file headers and include guards, updated all files where the old paths were detected (simple grep through the repo), and `clang-format`-ed it all. Differential Revision: https://reviews.llvm.org/D119876	2022-03-01 21:53:25 -08:00
Craig Topper	b9d6e8c441	[RISCV] Lower VECTOR_SPLICE to RVV instructions. This lowers VECTOR_SPLICE of scalable vectors to a slidedown follow by a slideup. Fixed vectors are encouraged to use shufflevector instruction. The equivalent patch for fixed vectors is D119039. I've used a tail agnostic slidedown and limited the VL to only the elements that will not be overwritten by the slideup. The slideup uses VLMax for its VL. It unfortunately uses tail undisturbed policy but it isn't required as there is no tail. We just need the merge operand to carry the bits for the lower portion of the result. Care was taken to ensure that either the slideup or slidedown will be able to use a .vi instruction when the immediate is small. Which one uses the immediate depends on the sign of the immediate. Reviewed By: frasercrmck, ABataev Differential Revision: https://reviews.llvm.org/D119303	2022-03-01 10:10:13 -08:00
Lian Wang	db85cd729a	[RISCV] Add FMV_W_X and FMV_H_X instrutions to hasAllNBitUsers Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120699	2022-03-01 08:13:59 +00:00
lian wang	5d91a8a707	[RISCV] Add schedule class for Zbp extension and Zbr extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120012	2022-03-01 07:35:59 +00:00
Lian Wang	e2c150ab52	[RISCV][NFC] Move defined non_imm12 to proper place in RISCVInstrInfoZb.td Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120656	2022-03-01 01:45:30 +00:00
Craig Topper	e83db8c001	[RISCV] Only enable combineROTR_ROTL_RORW_ROLW with Zbp. I think the immediate values we check for on the GREV nodes already protect this, but better to be explicit.	2022-02-28 12:47:36 -08:00
Craig Topper	b083157b7b	[RISCV] Don't call combineROTR_ROTL_RORW_ROLW for SLLW/SRLW/SRAW nodes. NFC I think the function does the correct thing internally, but it's confusing to read.	2022-02-28 11:05:10 -08:00
Craig Topper	f46890711f	[RISCV] Custom type legalize i32 ISD::ABS on RV64 without Zbb. Default type legalization will create sext_inreg+abs, but we may not be able to remove the sext_inreg. Instead this patch expands abs during type legalization to Y = sraiw X, 31; subw(xor X, Y), Y) which doesn't require the input to be sign extended. This gives a big improvement for some neg-abs tests where the abs is used more than the the neg. Previously the abs was expanded a different way before and after type legalization. Now they are expanded in a similar way enabling more CSE. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D120636	2022-02-28 09:30:27 -08:00
eric.tang	b496a172e4	[RISCV] Support hypervisor extention instructions According to privileged spec version-20211203 Add the following hypervisor instructions: - HLV.B HLV.BU - HLV.H HLV.HU HLVX.HU - HLV.W HLV.WU HLVX.WU - HLV.D - HSV.B HSV.H HSV.W HSV.D Signed-off-by: eric.tang <eric.tang@starfivetech.com> Differential Revision: https://reviews.llvm.org/D117733	2022-02-28 14:02:43 +08:00
eric.tang	386c5be92a	[RISCV] Support Sinval extension and hypervisor memory management fence instructions According to Privileged spec version-20211203 Add Supervisor Memory-Management Instructions: - SINVAL.VMA, SFENCE.W.INVAL, SFENCE.INVAL.IR Add Hypervisor Memory-Management Instructions: - HFENCE.VVMA, HFENCE.GVMA, HINVAL.VVMA, HINVAL.GVMA Signed-off-by: eric.tang <eric.tang@starfivetech.com> Differential Revision: https://reviews.llvm.org/D117654	2022-02-28 14:02:43 +08:00
Eric Tang	cf80ef1393	[RISCV] Change GPRMemAtomic to GPRMemZeroOffset for general usage Not only some AMO instructions but also other instructions need to process (${gpr}) or 0(${gpr}), where the 0 is be silently ignored. This patch does some changes for general usage. Signed-off-by: Eric Tang <eric.tang@starfivetech.com> Differential Revision: https://reviews.llvm.org/D120017	2022-02-28 14:02:43 +08:00
Chenbing Zheng	7f811ce127	[RISCV] Optimize (sext.w, srli) to sraiw with Zba. In this patch, we add a more narrower exclusion for zeroext (srl x) -> srli (slli x), so that it provides an opportunity for the selection of sraiw. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120467	2022-02-28 10:34:35 +08:00
Jessica Clarke	6aa8521fdb	[RISCV] Fix parseBareSymbol to not double-parse top-level operators By failing to lex the token we end up both parsing it as a binary operator ourselves and parsing it as a unary operator when calling parseExpression on the RHS. For plus this is harmless but for minus this parses "foo - 4" as "foo - -4", effectively treating a top-level minus as a plus. Fixes https://github.com/llvm/llvm-project/issues/54105 Reviewed By: asb, MaskRay Differential Revision: https://reviews.llvm.org/D120635	2022-02-27 20:48:52 +00:00
Jameson Nash	c4b1a63a1b	mark getTargetTransformInfo and getTargetIRAnalysis as const Seems like this can be const, since Passes shouldn't modify it. Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D120518	2022-02-25 14:30:44 -05:00
Haocong.Lu	865fe131f8	[RISCV] Fix a mistake in PostprocessISelDAG With the condition N->use_empty(), the root node of DAG always misses peephole optimization. So a dummy node is needed. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119934	2022-02-25 12:38:31 +00:00
Chenbing Zheng	b20e80aa59	[RISCV] DAG Combine vcpop and vfirst with VL=0 to li imm vcpop and vfirst are still useful when VL=0. vcpop equivalents to li 0 and vfirst equivalents to li -1, since no mask elements are active. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120302	2022-02-25 14:44:25 +08:00
Zakk Chen	4e115b7d88	[RISCV] Update computeTargetABI from llc as well as clang Clang computes the default ABI if -mabi is empty and encode it in LLVM IR module flag since D105555. For correctness, llc need to give the same target-abi (Options.MCOptions.ABIName) with ABI encoded in IR. The getSubtargetImpl already has a check for them only if Options.MCOptions.ABIName is not empty. In order to get more robustness we could have a check for explicit ABI, but now we have two different logic to compute the default ABI. The front-end ABI is defautl to the ilp32/ilp32e/lp64, and ilp32d/lp64d when hardware support for extension D. The backend ABI is default to the ilp32/ilp32e/lp64. Reviewed by: asb, jrtc27 Differential Revision: https://reviews.llvm.org/D118333	2022-02-24 21:55:44 -08:00
lian wang	f37d21ed20	[RISCV] Add schedule class for Zbt extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119808	2022-02-25 01:57:20 +00:00
Qihan Cai	0d058ed3d6	[RISCV] Change rvv version to 1.0 and remove ratify notice This patch changes the version of V extension from 0.1 to 1.0 in RISCVInstrInfoVPseudos.td, RISCVInstrInfoVSDPatterns.td, RISCVInstrInfoVVLPatterns.td, RISCVInstrInfoV.td Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120525	2022-02-25 11:38:20 +11:00
Craig Topper	506ac29632	[RISCV] Add 'i64' to some isel so tablegen will remove them for RV32. NFC Saves a 100 bytes or so from the isel table.	2022-02-24 15:10:05 -08:00
Craig Topper	a975ca97c3	[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X). Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118974	2022-02-24 09:19:01 -08:00
Shao-Ce SUN	78b5f0fb05	[NFC][RISCV] Reuse ISD::NodeType in float extension Reviewed By: asb Differential Revision: https://reviews.llvm.org/D120412	2022-02-24 19:57:55 +08:00
Nikita Popov	c7fe6f9c92	Revert "[RISCV] add the MC layer support of Zfinx extension" This reverts commit `7798ecca9c`. As reported in https://reviews.llvm.org/D93298#3331641 and following, this causes assertion failures with inline assembly.	2022-02-24 12:14:31 +01:00
lian wang	e1d4d1c242	[RISCV] Add schedule class for Zbm and Zbe extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119805	2022-02-24 08:49:25 +00:00
Chenbing.Zheng	2ae92e19eb	[RISCV][NFC] Add helper function isVectorConfigInstr to reduce Repeated code. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119924	2022-02-24 05:59:12 +00:00
Craig Topper	5b7ac107b1	[RISCV] Use SelectionDAG::getFreeze to simplify some code. NFC	2022-02-23 21:13:01 -08:00
Craig Topper	c7d6448d03	[DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable. Internally to DAGCombiner the SDValues were passed by non-const reference despite not being modified. They were then passed by const reference to TLI. This patch passes them by value which is consistent with the vast majority of code. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120420	2022-02-23 12:40:45 -08:00
Alex Bradbury	c5bcfb983e	[RISCV] Avoid infinite loop between DAGCombiner::visitMUL and RISCVISelLowering::transformAddImmMulImm See https://github.com/llvm/llvm-project/issues/53831 for a full discussion. The basic issue is that DAGCombiner::visitMUL and RISCVISelLowering;:transformAddImmMullImm get stuck in a loop, as the current checks in transformAddImmMulImm aren't sufficient to avoid all cases where DAGCombiner::isMulAddWithConstProfitable might trigger a transformation. This patch makes transformAddImmMulImm bail out if C0 (the constant used for multiplication) has more than one use. Differential Revision: https://reviews.llvm.org/D120332	2022-02-23 11:05:46 +00:00
jacquesguan	5acd9c49a8	[RISCV] Add patterns for vector widening integer reduction instructions Add patterns for vector widening integer reduction instructions. Differential Revision: https://reviews.llvm.org/D117643	2022-02-22 14:14:05 +08:00
Zakk Chen	f7dfc5d1af	[RISCV] Optimize tail agnostic vmv.s.x which don't need to select tail value. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120250	2022-02-21 14:53:37 -08:00
Craig Topper	90d240553d	[RISCV] Teach shouldSinkOperands to sink splat operands of vp.fma intrinsics. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D120167	2022-02-21 11:52:59 -08:00
Lian Wang	4abe484525	[RISCV][NFC] Add sched for some instructions in Zb extension Add sched to brev8, zip and unzip instruction. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120009	2022-02-21 09:58:08 +08:00
Zakk Chen	17d5ba5bc7	[RISCV][NFC] Remove unused multiclass def.	2022-02-18 23:58:56 -08:00
Craig Topper	5489969550	[RISCV] Add IsRV32 to the isel pattern for ZIP_RV32/UNZIP_RV32. NFC I think the i32 in the pattern prevents this from matching on RV64, but using IsRV32 is safer. Add tests for RV64 to make sure we don't print zip or unzip because we incorrectly picked ZIP_RV32/UNZIP_RV32.	2022-02-18 22:38:14 -08:00
Zakk Chen	ca78312407	[RISCV] Add the policy operand for nomask vector Multiply-Add IR intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. The nomask vector Multiply-Add need a policy operand because merge value could not be undef. Reviewed By: monkchiang Differential Revision: https://reviews.llvm.org/D119727	2022-02-17 09:12:46 -08:00
Craig Topper	bbee9e77f3	[RISCV] Match shufflevector corresponding to slideup. This generalizes isElementRotate to work when there's only a single slide needed. I've removed matchShuffleAsSlideDown which is now redundant. Reviewed By: frasercrmck, khchen Differential Revision: https://reviews.llvm.org/D119759	2022-02-17 08:19:10 -08:00
Craig Topper	954fe404ab	[RISCV] Fix incorrect MemOperand copy converting splat+load to vlse. Due to an incorrect copy/paste from load intrinsic handling we checked if the splat node was a MemSDNode which of course it isn't. Instead get the MemOperand from the LoadSDNode for the source of the splat. This enables LICM to see the load is loop invariant and hoist it out of the loop. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D120014	2022-02-17 08:15:50 -08:00
Zakk Chen	eeb7754f68	[RISCV] Add the passthru operand for vmv.vv/vmv.vx/vfmv.vf IR intrinsics. Add the passthru operand for VMV_V_X_VL, VFMV_V_F_VL and SPLAT_VECTOR_SPLIT_I64_VL also. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D119688	2022-02-17 06:38:14 -08:00
Shao-Ce SUN	7798ecca9c	[RISCV] add the MC layer support of Zfinx extension This patch added the MC layer support of Zfinx extension. Authored-by: StephenFan Co-Authored-by: Shao-Ce Sun Reviewed By: asb Differential Revision: https://reviews.llvm.org/D93298	2022-02-17 21:54:13 +08:00
Zakk Chen	093ecccdab	[RISCV] Add the passthru operand for vadc/vsbc/vmerge/vfmerge IR intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D119686	2022-02-17 02:21:39 -08:00
Ben Shi	0b93e90971	Revert "[RISCV] LUI used for address computation should not isAsCheapAsAMove" This reverts commit `23a5073600`. Although this patch achieved better codegen in most cases, it is really important to accurately describe the cost of instructions. So I revert it.	2022-02-17 17:27:37 +08:00
Jessica Paquette	67ab4c010b	[MachineOutliner] NFC: Update LRU stuff for RISCV I missed it in my grep. Fixes broken buildbot.`	2022-02-16 12:01:59 -08:00
Jessica Paquette	6d58f4ab07	[MachineOutliner] NFC: Hide LRU-related stuff behind helper functions It's not particularly user-friendly to have to call `initLRU` everywhere. Also, it wasn't particularly great that the LRU for registers used in a sequence was also initialized by `initLRU`. This patch hides this stuff behind some helper functions: * `isAvailableAcrossAndOutOfSeq` * `isAnyUnavailableAcrossOrOutOfSeq` * `isAvailableInsideSeq` This allows the user to avoid calling `initLRU` explicitly. Also, it allows us to separate initializing the used-in-sequence LRU from the main LRU. Since both ARM and AArch64 check LR liveness in `insertOutlinedCall`, this refactor requires that we de-const the Candidate there. Some other quality-of-code improvements: * LRUs in outliner::Candidate now have more descriptive names * Use `Register` instead of `unsigned` in some places * Improve readability in some places by using ranges rather than `std::for_each` This is a preparatory commit for a larger compile time related change for the AArch64 outliner.	2022-02-16 11:39:07 -08:00
Craig Topper	cfbbcc544c	[RISCV] Improve lowering of SHL_PARTS/SRL_PARTS/SRA_PARTS. Part of the shift lowering creates a (sub XLEN-1, ShAmt). When this value is used we know that ShAmt is [0..XLEN-1]. Since XLEN is a power of 2 we can replace the sub with an xor. This allows us to use XORI instead of LI+SUB. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D119411	2022-02-16 09:22:11 -08:00
Zakk Chen	e8973dd389	[RISCV] Add the passthru operand for some RVV nomask unary and nullary intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. My plan is to handle more complex operations in follow-up patches. Reviewers: frasercrmck Differential Revision: https://reviews.llvm.org/D118253	2022-02-15 22:34:06 -08:00
Shao-Ce SUN	2aed07e96c	[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter` Reviewed By: skan Differential Revision: https://reviews.llvm.org/D119846	2022-02-16 13:10:09 +08:00
Shao-Ce SUN	9cc49c1951	Revert "[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`" This reverts commit `fe25c06cc5`.	2022-02-16 11:57:49 +08:00
Shao-Ce SUN	fe25c06cc5	[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter` For ten years, it seems that `MCRegisterInfo` is not used by any target. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D119846	2022-02-16 11:47:17 +08:00
Zakk Chen	b784719904	[RISCV] Add the passthru operand for RVV nomask binary intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Add passthru operand for VSLIDE1UP_VL and VSLIDE1DOWN_VL to support i64 scalar in rv32. The masked VSLIDE1 would only emit mask undisturbed policy regardless of giving mask agnostic policy until InsertVSETVLI supports mask agnostic. Reviewed by: craig.topper, rogfer01 Differential Revision: https://reviews.llvm.org/D117989	2022-02-15 18:36:18 -08:00
Craig Topper	ab6e02dded	[RISCV] Match vwmulsu_vx with scalar splat input. This is a more generic version of D119110 that uses MaskedValueIsZero to do the matching and SimplifyDemandedBits to remove any unneeded AND instructions. Tests were taken from D119110. Reviewed By: Chenbing.Zheng Differential Revision: https://reviews.llvm.org/D119622	2022-02-15 08:45:21 -08:00
Craig Topper	d132b47bb9	[RISCV] Replace llvm_unreachable with report_fatal_error. Parsing errors aren't handled earlier in all cases. A simple example is llc -mtriple=riscv64 -mattr=+zve32f. If F or Finx is not also specified, this will hit a parse error. Use a fatal_error so that the error is conveyed to the user.	2022-02-15 08:40:37 -08:00
jacquesguan	bfb4c0c370	[RISCV] Recover the implication between Zve* extensions and the V extension. This revision recover the implication between Zve* extensions and the V extension. Differential Revision: https://reviews.llvm.org/D119210	2022-02-14 15:52:07 +08:00
Craig Topper	478c237e21	[RISCV] Fix incorrect extend type in vwmulsu combine. While matching widening multiply, if we matched an extend from i8->i32, i16->i64 or i8->i64, we need to reintroduce a narrower extend. If we're matching a vwmulsu we need to use a sext for op0 and a zext for op1. This bug exists in LLVM 14 and will need to be backported. Differential Revision: https://reviews.llvm.org/D119618	2022-02-12 12:47:20 -08:00
Dimitry Andric	7af3d4ab3d	Revert "[RISCV] Enable shrink wrap by default" This reverts commit `5ebdb07e7e`. Enabling shrink wrap by default can cause assertions or crashes, and these should first be investigated and fixed. For now, reverting the change so it can be cherry-picked into 14.0.0 is the safest choice.	2022-02-12 19:04:12 +01:00
Haocong.Lu	23a5073600	[RISCV] LUI used for address computation should not isAsCheapAsAMove A LUI instruction with flag RISCVII::MO_HI is usually used in conjunction with ADDI, and jointly complete address computation. To bind the cost evaluation of address computation, the LUI should not be regarded as a cheap move separately, which is consistent with ADDI. In this test case, it improves the unroll-loop code that the rematerialization of array's base address miss MachineCSE with Heuristics #1 at isProfitableToCSE. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D118216	2022-02-12 07:14:38 +00:00
Chenbing.Zheng	9e975e558b	[RISCV][NFC] Move some combine patterns to DAG combine. Move some combine patterns to DAG combine，and it dealt with fixme left in RISCVInstrInfoZb.td. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119527	2022-02-12 02:52:21 +00:00
Craig Topper	541c9ba842	[RISCV] Insert VSETVLI at the end of a basic block if we didn't produce BlockInfo.Exit. This is an alternative to D118667 that instead of fixing the store to match phase 1, it tries to detect the mismatch with the expected value at the end of the block. This inserts a vsetvli after the vse to satisfy the requirement of the other basic block. We still have serious design issues in the pass, that is going to require some rethinking. Differential Revision: https://reviews.llvm.org/D119518	2022-02-11 09:34:16 -08:00
Craig Topper	f35ac872b8	Revert "[RISCV] Fix a vsetvli insertion bug involving loads/stores." and "[RISCC] Add missing words to comment. NFC" This reverts commit `f943c58cae`. and commit `7eb7810727`. This introduced a new bug that appears to be easier to hit. Differential Revision: https://reviews.llvm.org/D119517	2022-02-11 09:34:16 -08:00
Zakk Chen	d224be3b99	[RISCV] Add the policy operand for some masked RVV ternary IR intrinsics. Masked reduction intrinsics are specical cases which don't need to have policy operand. The mask only affects which elements are read. It doesn't effect the destination register. The reduction intrinsics have a dedicated destination operand. If it is undef, we use tail agnostic. If it not undef we use tail undisturbed. Co-Authored-by: Craig Topper <craig.topper@sifive.com> Differential Revision: https://reviews.llvm.org/D117681	2022-02-11 05:02:03 -08:00
serge-sans-paille	06943537d9	Cleanup MCParser headers As usual with that header cleanup series, some implicit dependencies now need to be explicit: llvm/MC/MCParser/MCAsmParser.h no longer includes llvm/MC/MCParser/MCAsmLexer.h Preprocessed lines to build llvm on my setup: after: 1068185081 before: 1068324320 So no compile time benefit to expect, but we still get the looser coupling between files which is great. Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D119359	2022-02-11 10:39:29 +01:00
Fangrui Song	8eb750189c	[RISCV] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds	2022-02-10 20:10:12 -08:00
Craig Topper	b0e77d5e48	[RISCV] Lower the shufflevector equivalent of vector.splice We can lower a vector splice to a vslidedown and a vslideup. The majority of the matching code here came from X86's code for matching PALIGNR and VPALIGND/Q. The slidedown and slideup lowering don't really require it to be concatenation, but it happened to be an interesting pattern with existing analysis code I could use. This helps with cases where the scalar loop optimizer forwarded a load result from a previous loop iteration. For example, this happens if the loop uses x[i] and x[i+1] on the same iteration. The scalar optimizer will forward x[i+1] load from the previous loop to satisfy x[i] on this loop. When this get vectorized it results in one element of a vector being forwarded from the previous loop to be concatenated with elements loaded on this iteration. Whether that's more efficient than doing a shifted loaded or reloading the single scalar and using vslide1up is an interesting question. But that's not something the backend can help with. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D119039	2022-02-10 09:39:35 -08:00
Craig Topper	b861ddf365	[RISCV] Move the creation of VLMaxSentinel to isel. Use X0 during lowering. The VLMaxSentinel is represented as TargetConstant, but that's included in isa<ConstantSDNode>. To keep constant VLs and VLMax separate as long as possible, use the X0 register during lowering and only convert to VLMaxSentinel during isel. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118845	2022-02-10 09:28:44 -08:00
Craig Topper	727cd5205f	[RISCV] Remove stale comment. NFC Now that we pre-process SPLAT_VECTOR to VFMV_V_F_VL, these patterns handled scalable vectors and vectors converted from fixed. These are also used by vp.fma lowering.	2022-02-10 09:04:32 -08:00
Fraser Cormack	fd43d99c93	[RISCV] Pre-process FP SPLAT_VECTOR to RISCVISD::VFMV_V_F_VL This patch builds on top of D119197 to canonicalize floating-point SPLAT_VECTOR as RISCVISD::VFMV_V_F_VL as a pre-process ISel step. This primarily benefits scalable-vector VP code, where our VP patterns only match VFMV_V_F_VL to reduce the burden on our ISel patterns, but where at the same time, scalable-vector code doesn't custom-legalize SPLAT_VECTOR. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117670	2022-02-10 09:56:00 +00:00
Chenbing.Zheng	c5d3b231e0	[RISCV] Add support for matching vwmaccsu/vwmaccus from fixed vectors Add pattern to match add and widening mul to vwmacc, and two multipliers are sext and zext. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119314	2022-02-10 01:59:31 +00:00
Craig Topper	c45c1b130b	[RISCV] Teach RISCVDAGToDAGISel::selectShiftMask to replace sub from constant with neg. If the shift amount is (sub C, X) where C is 0 modulo the size of the shift, we can replace it with neg or negw. Similar is is done for AArch64 and X86. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D119089	2022-02-09 12:33:01 -08:00
Craig Topper	09629215c2	[RISCV] Add a really basic cost model for SK_Splice. While testing scalable vectors I found that if we generate a vector splice intrinsic and run the code through the loop unroller, we'll crash due to an invalid cost. This adds a basic cost based on the 2 slide instructions used by the lowering in D119303. We probably need to factor LMUL into this, but that's true for arithmetic instructions too. So I've ignored for the moment. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D119316	2022-02-09 11:43:31 -08:00
Craig Topper	279b3b8179	[RISCV][VP] Lower VP_FMA to RVV instructions. We already had FMA_VL node, but we didn't have masked patterns. I have not added the fneg variations. I'll do those after I add llvm.vp.fneg. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119196	2022-02-09 11:33:12 -08:00
Craig Topper	63e711549c	[RISCV] Lower VP_FNEG to RVV instructions Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119269	2022-02-09 10:56:39 -08:00
Craig Topper	e305b1de7e	[RISCV] Pre-process integer ISD::SPLAT_VECTOR to RISCISD::VMV_V_X_VL before isel. This allows us to remove some isel patterns that exist for both operations. Saving nearly 3000 bytes from the isel table. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119197	2022-02-09 08:10:21 -08:00
Lian Wang	af2cd94555	[RISCV][NFC] Remove useless code Reviewed By: craig.topper, asb Differential Revision: https://reviews.llvm.org/D119317	2022-02-09 19:17:25 +08:00
serge-sans-paille	ef736a1c39	Cleanup LLVMMC headers There's a few relevant forward declarations in there that may require downstream adding explicit includes: llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h Counting preprocessed lines required to rebuild llvm-project on my setup: before: 1052436830 after: 1049293745 Which is significant and backs up the change in addition to the usual benefits of decreasing coupling between headers and compilation units. Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D119244	2022-02-09 11:09:17 +01:00
Fraser Cormack	6449bea508	[RISCV] Select unmasked RVV pseudos in a DAG post-process This patch drops TableGen patterns matching all-ones masked RVV pseudos in the case where there are fallback patterns matching the generic masked forms to "_MASK" pseudos. This optimization is now performed with a SelectionDAG post-processing step which peephole-optimizes these same pseudos with all-ones masks and swaps them out to their unmasked pseudos. This cuts our generated ISel table down by around ~5% (~110kB) in lieu of a far smaller auto-generated table to help with the peephole. This only targets our custom RISCVISD::*_VL binary operator nodes, which use the one form for both masked and unmasked variants. A similar approach could be used for our intrinsics but we'd need to do some work, e.g., to represent unmasked intrinsics as true-masked intrinsics at the IR or ISel level. At a rough estimate, this could save us a further 9% on the size of our ISel table for the binary intrinsic patterns alone. There is no observable impact on our tests. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118810	2022-02-09 07:50:15 +00:00
Zakk Chen	cfe7f69036	[RISCV][NFC] Refactor RISCVISAInfo. 1. Remove computeDefaultABIFromArch and add computeDefaultABI in RISCVISAInfo. 2. Add parseFeatureBits which may used in D118333. Differential Revision: https://reviews.llvm.org/D119250	2022-02-08 18:37:43 -08:00
jacquesguan	5e71bbfb6c	[RISCV] Add patterns for vector widening floating-point fused multiply-add instructions Add patterns for vector widening floating-point fused multiply-add instructions. Differential Revision: https://reviews.llvm.org/D117546	2022-02-09 10:34:39 +08:00
Fraser Cormack	62c4ac764b	[RISCV] Optimize splats of extracted vector elements This patch adds an optimization to splat-like operations where the splatted value is extracted from a identically-sized vector. On RVV we can splat that via vrgather.vx/vrgather.vi without dropping to scalar beforehand. We do have a similar VECTOR_SHUFFLE-specific optimization but that only works on fixed-length vector types and for those with a constant splat lane. This patch extends this optimization to make it work on scalable-vector types and on unknown extract indices. It is performed during fixed-vector BUILD_VECTOR lowering and during a new DAGCombine on SPLAT_VECTOR for scalable vectors. Reviewed By: craig.topper, khchen Differential Revision: https://reviews.llvm.org/D118456	2022-02-08 10:35:25 +00:00
wangpc	c53d99c37d	[RISCV] Split f64 undef into two i32 undefs So that no store instruction will be generated. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118222	2022-02-08 13:42:15 +08:00
Craig Topper	2c26cfdef7	[RISCV] Use splat_vector instead of SplatPat in widening FP instruction patterns. NFCI We use splat_vector for FP nodes without VL, not SplatPat which handles splat_vector and integer VMV_V_X_VL. Reduces isel table size by a few hundred bytes.	2022-02-07 15:53:27 -08:00
Kazu Hirata	3a3cb929ab	[llvm] Use = default (NFC)	2022-02-06 22:18:35 -08:00
Craig Topper	c1cef111a3	Revert "[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X)." This reverts commit `673d68cd92`. This hadn't been reviewed yet.	2022-02-05 12:51:01 -08:00
Craig Topper	673d68cd92	[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X). Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Differential Revision: https://reviews.llvm.org/D118974	2022-02-05 12:42:12 -08:00
Craig Topper	5f35009996	[RISCV] Remove a ComputeNumSignBits call from an isel special case. Only isel (and (srl (sexti32 Y), c2), c1) -> (srliw (sraiw Y, 31), c3 - 32) when there is a sext_inreg present. Don't both checking for Y having 32 sign bits.	2022-02-04 23:26:53 -08:00
Craig Topper	d752ea9a72	[RISCV] Remove exclusions for zext.h/zext.w from our (and (srl X, C1), C2) selection code. This code tries to replace the pattern with a pair of shifts, but we were excluding if the And could be a zext.h or zext.w. The SLLI/SRL pair is more compressible and doesn't come with much down side. We do regress one test case in rv64i-exhaustive-w-insts.ll but we can probably add a narrower exclusion for that case.	2022-02-04 17:10:48 -08:00
Craig Topper	1d8bbe3d25	[RISCV] Implement a basic version of AArch64RedundantCopyElimination pass. Using AArch64's original implementation for reference, this patch implements a pass to remove unneeded copies of X0. This pass runs after register allocation and looks to see if a register is implied to be 0 by a branch in the predecessor basic block. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118160	2022-02-04 10:43:46 -08:00
Craig Topper	234e54bdd8	[RISCV] Add more types of shuffles isShuffleMaskLegal. Add the vslidedown and interleave patterns that I recently implemented. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118952	2022-02-04 09:13:13 -08:00
Craig Topper	c83905a308	[RISCV] Add inline expansion for vector fround. This avoids a crash for scalable vectors and or scalarization for fixed vectors. The algorithm is different enough that I don't think it makes sense to merge with ceil/floor/trunc. Algorithm is adapted from gcc's X86 SSE2 output. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117247	2022-02-04 09:12:09 -08:00
serge-sans-paille	ffe8720aa0	Reduce dependencies on llvm/BinaryFormat/Dwarf.h This header is very large (3M Lines once expended) and was included in location where dwarf-specific information were not needed. More specifically, this commit suppresses the dependencies on llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used, this has a decent impact on number of preprocessed lines generated during compilation of LLVM, as showcased below. This is achieved by moving some definitions back to the .cpp file, no performance impact implied[0]. As a consequence of that patch, downstream user may need to manually some extra files: llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h In some situations, codes maybe relying on the fact that llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden dependency now needs to be explicit. $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions \| wc -l after: 10978519 before: 11245451 Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup [0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions Differential Revision: https://reviews.llvm.org/D118781	2022-02-04 11:44:03 +01:00
Craig Topper	237eb37260	[RISCV] Add FMV_X_W and FMV_X_H to RISCVSExtWRemoval. Add -target-abi to sextw-removal.ll RUN lines to show benefit on new test case.	2022-02-03 09:40:47 -08:00
Craig Topper	997a86b99c	[RISCV] Remove createVirtualRegister from RISCVInstrInfo::movImm. Based on the discussion in D61884, this was done to enable compressed instructions by giving freedom to pick a compressible register. Integer materializing can generate LUI, ADDI, ADDIW, SLLI and some Zb* instructions. C.LI, C.LUI, C.ADDI, C.ADDIW, and C.SLLI all have a 5-bit register encoding. The Zb* instructions aren't compressible. Based on that I don't think compressibility of the register is a concern. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118741	2022-02-03 08:34:26 -08:00
Craig Topper	2349fb0312	[RISCV] Remove RISCVISD::SPLAT_VECTOR_I64 in favor of RISCVISD::VMV_V_X_VL. SPLAT_VECTOR_I64 has the same semantics as RISCVISD::VMV_V_X_VL, it just assumed VLMax instead of carrying a VL operand. Include order of RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td has been swapped to avoid moving riscv_vmv_v_x_vl into RISCVInstrInfoVSDPatterns.td and to allow moving other "_vl" SDNodes back to RISCVInstrInfoVVLPatterns.td Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118841	2022-02-03 08:30:25 -08:00
Shao-Ce SUN	005fd8aa70	[RISCV] Add support for Zihintpause extention Add support for the 'pause' hint instruction as an alias for 'fence w, 0'. To do this allow the 'fence' operands pred and succ to be set to 0 (the empty set). This will also allow future hints to be encoded as 'fence 0, <x>' and 'fence <x>, 0'. This patch revised from @mundaym's D93019. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D117789	2022-02-03 20:55:47 +08:00
Craig Topper	abc6716038	[RISCV] Remove unused variables. NFC	2022-02-02 19:23:16 -08:00
Craig Topper	f1720abb54	[RISCV] Cleanup some places that assumed VLMaxSentinel and -1 constant mean the same thing. NFCI VLMaxSentintel happens to be represented as -1 TargetConstant. A user provided -1 would be an ISD::Constant. We shouldn't assume that they are the same thing. I'm still not entirely convinced that we should be treating -1 from the user as VLMAX. Also fix one place that failed to use XLenVT for the VLMaxSentinel, using MVT::i64 in code that only executes on RV32.	2022-02-02 12:23:12 -08:00
Craig Topper	b73d151a11	[RISCV] Add DAG combines to transform ADD_VL/SUB_VL into widening add/sub. This adds or reuses ISD opcodes for vadd.wv, vaddu.wv, vadd.vv, vaddu.vv and a similar set for sub. I've included support for narrowing scalar splats that have known sign/zero bits similar to what was done for MUL_VL. The conversion to vwadd.vv proceeds in two phases. First we'll form a vwadd.wv by narrowing one of the operands. Then we'll visit the vwadd.wv to try to narrow the other operand. This turned out to be simpler than catching all the cases in one step. The forming of of vwadd.wv can happen for either operand for add, but only the right hand side for sub since sub isn't commutable. An interesting quirk is that ADD_VL and VZEXT_VL/VSEXT_VL are formed during vector op legalization, but VMV_V_X_VL isn't usually formed until op legalization when BUILD_VECTORS are handled. This leads to VWADD_W_VL forming in one DAG combine round, and then a later DAG combine round sees the VMV_V_X_VL and needs to commute the operands to get the splat in position. This alone necessitated a VWADD_W_VL combine function which made forming vwadd.vv in two stages an easy choice. I've left out trying hard to form vwadd.wx instructions for now. It would only save an extend in the scalar domain which isn't as interesting. Might need to review the test coverage a bit. Most of the vwadd.wv instructions are coming from vXi64 tests on rv64. The tests were copy pasted from the existing multiply tests. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D117954	2022-02-02 10:03:08 -08:00
Craig Topper	5a5037c602	[RISCV] Fix some 80 column violations in ComputeNumSignBitsForTargetNode. NFC	2022-02-01 21:43:11 -08:00
Craig Topper	f943c58cae	[RISCC] Add missing words to comment. NFC	2022-02-01 07:39:51 -08:00
Craig Topper	7eb7810727	[RISCV] Fix a vsetvli insertion bug involving loads/stores. The first phase of the analysis can avoid a vsetvli if an earlier instruction in the block used an SEW and LMUL that when combined with the EEW of the load/store would produce the desired EMUL. If we avoided a vsetvli this will affect the global analysis we do in the second phase. The third phase where we really insert the vsetvlis needs to agree with the first phase. If it doesn't we can insert vsetvlis that invalidate the global analysis. In the test case there is a VSETVLI in the preheader that sets SEW=64 and LMUL=1. Inside the loop there is a VADD with SEW=64 and LMUL=1. This VADD is followed by a store that wants wants SEW=32 LMUL=1/2. Because it has EEW=32 as part of the opcode the SEW=64 LMUL=1 from the VADD can be become EMUL=1 for the store. So the first phase determines no vsetvli is needed. The third phase manages CurInfo differently than BBInfo.Change from the first phase. CurInfo is only updated when we see a vsetvli or insert a vsetvli. This was done to allow predecessor block information from the global analysis to be applied to multiple instructions. Since the loop body has no vsetvli we won't update CurInfo for either the VADD or the VSE. This prevented us from checking the store vsetvli elision for the VSE resulting in a vsetvli SEW=32 LMUL=1/2 being emitted which invalidated the global analysis. To mitigate this, I've added a BBLocalInfo variable that more closely matches the first phase propagation. This gets updated based on the VADD and prevents emitting a vsetvli for the store like we did in the first phase. I wonder if we should do an earlier phase to handle the load/store case by adding more pseudo opcodes and changing the SEW/LMUL for those instructions before the insertion analysis. That might be more robust than trying to guarantee two phases make the same decision. Fixes the test from D118629. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118667	2022-02-01 07:29:01 -08:00
Shao-Ce SUN	a2a7fc7ea5	[RISCV] Adjust some comments.	2022-02-01 22:53:54 +08:00
Craig Topper	2e45e8abb1	[RISCV] Add a fatal error if ISD::VSCALE is used with Zvl32b. We convert VLEN to vscale by dividing by RVVBitsPerBlock which is currently 64. This is only correct if VLEN is evenly divisible by 64. With only Zvl32b we can't assume that. This patch adds a fatal_error to prevent generating code that may be broken. We probably need to look at how we size stack frame objects too. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118583	2022-01-31 09:13:14 -08:00
Craig Topper	09606d6a63	[RISCV] Update the computeKnownBitsForTargetNode for RISCVISD::READ_VLENB to consider Zve/Zvl. We had previously hardcoded this to assume that vector registers are 128 bits. This was true when only V existed, but after Zve extensions were added this became incorrect. This patch adjusts it to support 128, 64, or 32 bit vectors depending on Zvl. The 128-bit limit is artificial, but we don't have any test coverage showing that we larger values so I was being conservative. None of our lit tests depend on this code today due to the custom lowering of ISD::VSCALE that inserts the appropriate left or right shift to convert from VLENB to VSCALE. That code was added after this code in computeKnownBitsForTargetNode. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118582	2022-01-31 09:13:14 -08:00
Craig Topper	aae947e860	[RISCV] Separate the Zfhmin and Zfh extensions. The spec doesn't seem to be written as if Zfh implies Zfhmin. They seem to be separate extensions. This patch moves the instructions from Zfhmin to be enabled with either the Zfh or Zfhmin extensions. Reviewed By: achieveartificialintelligence Differential Revision: https://reviews.llvm.org/D118581	2022-01-31 09:06:43 -08:00
Nikita Popov	0801940c17	[RISCV] Avoid pointer element type access for masked atomicrmw intrinsics masked.atomicrmw.*.i32 intrinsics access an i32 (and then possibly mask it), so hardcode MVT::i32 as the access type here, rather than determining it from the pointer element type. Differential Revision: https://reviews.llvm.org/D118336	2022-01-31 09:28:39 +01:00
Craig Topper	5fbc3cda9e	[RISCV] Use existing variable intead of calling getOperand again. NFCI This is a slight change because I'm using the ANY_EXTEND result instead of the original operand, but getNode should constant fold. While there, add a comment about why the code specifically checks for a ConstantSDNode.	2022-01-30 18:42:19 -08:00
Craig Topper	744be8c502	[RISCV] Lower riscv_zip/unzip intrinsic to RISCVISD::SHFL/UNSHFL. These are special versions of the more general shfli/unshfli instructions. We can use the general ISD opcodes with the correct immediates.	2022-01-30 13:27:41 -08:00
Craig Topper	e1075186a6	[RISCV] Custom lower brev8 intrinsic to RISCVISD::GREV. We can use the RISCVISD::GREV encoding that swaps the bits in each byte. This allows it to use the existing computeKnownBits support for RISCVISD::GREV.	2022-01-30 12:41:09 -08:00
Craig Topper	524545317c	[RISCV] Remove RISCVISD::BREV8 and use RISCVISD::GREV instead. We already have an ISD opcode for the more general GREV/GREVI instructon. We can just use it with the encoding that corresponds to the behavior of brev8. This is similar to what we do for orc.b where we use the GORC ISD opcode.	2022-01-29 22:45:43 -08:00
Craig Topper	0405ac0150	[RISCV] Rerrange RISCVInstrInfoZB.td to better group related wthings. NFC Especially placing W instructions/patterns near their non-W versions.	2022-01-29 21:16:15 -08:00
Craig Topper	815786eb67	[RISCV] Use RVBUnary to simplify ZEXT_H_RV32/ZEXT_H_RV64 definitions. NFC	2022-01-29 18:28:14 -08:00
Craig Topper	8faf2a0638	[RISCV] Correct predicate orc.b pattern to not include Zbkb. This was incorrectly lumped in when the predicate was changed for the rotate instructions.	2022-01-29 00:10:54 -08:00
Craig Topper	d8f929a567	[RISCV] Custom legalize BITREVERSE with Zbkb. With Zbkb, a bitreverse can be split into a rev8 and a brev8. Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D118430	2022-01-28 23:11:12 -08:00
jacquesguan	1276678982	[RISCV] Improve extract_vector_elt for fixed mask registers. Now the backend promotes mask vector to an i8 vector and extract element from that. We could bitcast to a widen element vector, and extract from it to GPR, then use I instruction to extract the certain bit. Differential Revision: https://reviews.llvm.org/D117389	2022-01-29 11:07:53 +08:00
Craig Topper	06bd56d47d	[RISCV] Update comments about getInstSizeInBytes hard-coding the number of bytes. After D118175, we get the information from the tablegen definition. Differential Revision: https://reviews.llvm.org/D118488	2022-01-28 09:51:49 -08:00
Craig Topper	ea05ee9059	[RISCV] Preserve VL when truncating i64 gather/scatter indices on RV32. We were creating a truncate with the default for the type, but for VP intrinsics we have a VL that we should use. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118406	2022-01-28 09:25:30 -08:00
Craig Topper	de0c2d75bf	[RISCV] Use tablegen size for getInstSizeInBytes. Fix the pseudos to have the correct size in the MCInstrDesc description. Inspired by D118009 and D117970. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118175	2022-01-28 09:21:28 -08:00
Kito Cheng	a9d5bb926d	[RISCV] Use __extendhfsf2/__truncsfhf2 for fp16 <-> fp32 `__gnu_h2f_ieee` and `__gnu_f2h_ieee` are introduce by ARM and set that as default name for fp16 and fp32 conversion in LLVM. However RISC-V GCC using default naming scheme for that, which is `__extendhfsf2` and `__truncsfhf2` for that, that cause runtime ABI incompatible issue. Although we didn't have formal runtime ABI spec to specify those naming convention yet, but I think it would be great to fix the incompatible issue first. And I've plan to create a runtime ABI spec undere psABI spec this year. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118207	2022-01-29 00:01:00 +08:00
Alex Bradbury	588f121ada	[RISCV][NFC] Make Zb* instruction naming match the convention used elsewhere in the RISC-V backend Where the instruction mnemonic contains a dot, we name the corresponding instruction in the .td file using a _ in the place of the dot. e.g. LR_W rather than LRW. This commit updates RISCVInstrInfoZb.td to follow that convention.	2022-01-28 15:20:37 +00:00
Chenbing.Zheng	6d6c44a3f3	[RISCV] Add support for matching vwmulsu from fixed vectors According to riscv-v-spec-1.0, widening signed(vs2)-unsigned integer multiply vwmulsu.vv vd, vs2, vs1, vm # vector-vector vwmulsu.vx vd, vs2, rs1, vm # vector-scalar It is worth noting that signed op is only for vs2. For vwmulsu.vv, we can swap two ops, and don't care which is sign extension, but for vwmulsu.vx signExt can not be a vector extended from scalar (rs1). I specifically added two functions ending with _swap in the test case. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118215	2022-01-28 02:33:30 +00:00
Craig Topper	70e1cc6792	[RISCV] Prefer vmslt.vx v0, v8, zero over vmsle.vi v0, v8, -1. At least when starting from a vmslt.vx intrinsic or ISD::SETLT. We don't handle the case where the user used vmsle.vx intrinsic with -1.	2022-01-27 11:48:27 -08:00
Fraser Cormack	84e85e025e	[SelectionDAG][VP] Provide expansion for VP_MERGE This patch adds support for expanding VP_MERGE through a sequence of vector operations producing a full-length mask setting up the elements past EVL/pivot to be false, combining this with the original mask, and culminating in a full-length vector select. This expansion should work for any data type, though the only use for RVV is for boolean vectors, which themselves rely on an expansion for the VSELECT. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118058	2022-01-27 09:00:41 +00:00
Wu Xinlong	6a4d3f37b5	[RISCV] fix dead code fix dead code mentioned on https://reviews.llvm.org/D98136 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118323	2022-01-27 16:00:01 +08:00
Wu Xinlong	615d71d9a3	[RISCV][CodeGen] Implement IR Intrinsic support for K extension This revision implements IR Intrinsic support for RISCV Scalar Crypto extension according to the specification of version [[ https://github.com/riscv/riscv-crypto/releases/tag/v1.0.0-scalar \| 1.0]] Co-author：@ksyx & @VincentWu & @lihongliang & @achieveartificialintelligence Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102310	2022-01-27 15:53:35 +08:00
Craig Topper	b3bec6e453	[RISCV] Use vnsrl.wx with x0 instead of vnsrl.vi for truncate. This matches what the spec uses for the vncvt.x.x.w assembly pseudoinstruction. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D118295	2022-01-26 18:38:13 -08:00
Craig Topper	f487a76430	[RISCV] Add hasStdExtZbp() to hasAndNotCompare.	2022-01-26 13:54:05 -08:00
Craig Topper	b3d94b199c	[RISCV] Remove references to 'B' extension from AssemblerPredicate and SubtargetFeature strings. For Zba/Zbb/Zbc/Zbs I've removed the 'B' completely and used the extension names as presented at the start of Chapter 1 of the 1.0.0 Bitmanipulation spec. For the unratified extensions, I've replaced 'B' with 'Zb' and otherwise left them unchanged. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D117822	2022-01-26 11:08:29 -08:00
Benjamin Kramer	f15014ff54	Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17" This reverts commit `ef82063207`. - It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat	2022-01-26 16:55:53 +01:00
serge-sans-paille	ef82063207	Rename llvm::array_lengthof into llvm::size to match std::size from C++17 As a conquence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).	2022-01-26 16:17:45 +01:00
jacquesguan	267711e38b	[RISCV] Fix support of vlen = 64. In the Zve* extensions, the vlen could be 64. This patch change the vlen constraint of low bound to 64. Differential Revision: https://reviews.llvm.org/D118217	2022-01-26 16:31:21 +08:00
Zakk Chen	510710d037	[RISCV][NFC] Add getVLOperand for RVV intrinsics. Use the VLOperand information to get the VL. Differential Revision: https://reviews.llvm.org/D118156	2022-01-25 17:37:58 -08:00
Zakk Chen	9273378b85	[RISCV] Add the passthru operand for RVV nomask load intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Co-Authored-by: Hsiangkai Wang <Hsiangkai@gmail.com> Reviewers: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D117647	2022-01-25 17:31:36 -08:00
eopXD	b089e4072a	[RISCV] Don't allow i64 vector div by constant to use mulh with Zve64x EEW=64 of mulh and its vairants requires V extension. Authored by: Craig Topper <craig.topper@sifive.com> @craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117947	2022-01-25 09:55:05 -08:00
Nikita Popov	aa97bc116d	[NFC] Remove uses of PointerType::getElementType() Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType(). This is part of D117885, in preparation for deprecating the API.	2022-01-25 09:44:52 +01:00
Craig Topper	fd0a4bc76b	[RISCV] Add missing space to 'clang-format on' directive. NFC Without a space after the comment characters it seems to be ignored.	2022-01-24 17:00:37 -08:00
Craig Topper	cd2a9ff397	[RISCV] Select int_riscv_vsll with shift of 1 to vadd.vv. Add might be faster than shift. We can't do this earlier without using a Freeze instruction. This is the intrinsic version of D106689. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118013	2022-01-24 08:04:53 -08:00
Fraser Cormack	d42678b453	[RISCV] Add side-effect-free vsetvli intrinsics This patch introduces new intrinsics that enable the use of vsetvli in contexts where only the returned vector length is of interest. The pre-existing intrinsics are marked with side-effects, which prevents even trivial optimizations on/across them. These intrinsics are intended to be used in situations where the vector length is fed in turn to RVV intrinsics or to vector-predication intrinsics during loop vectorization, for example. Those codegen paths ensure that instructions are generated with their own implicit vsetvli, so the vector length and vtype can be relied upon to be correct. No corresponding C builtins are planned at this stage, though that is a possibility for the future if the need arises. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117910	2022-01-24 13:52:08 +00:00
SForeKeeper	70f83f3084	[RISCV] add support for zbkx subextension in MC layer. This patch adds support for zbkx extension from K extension(v1.0.0) in MC layer. Instructions with same functionality and same encoding is defined in the bitmanip extension. It defines {Xperm8, Xperm4} as instruction aliases for xperm.* in Zbp extension. When Zbkx is enabled while Zbp is not, xperm.h will not be available. When Zbkx and Zbp are both enabled, the instructions will be decoded in Zbp format. [[ https://reviews.llvm.org/D94999 \| D94999 ]] this is the patch that introduces xperm.* instructions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117889	2022-01-24 20:38:46 +08:00
Fraser Cormack	af773a1818	[RISCV][VP] Lower VP_MERGE to RVV instructions This patch adds lowering of the llvm.vp.merge.* intrinsic (ISD::VP_MERGE) to RVV vmerge/vfmerge instructions. It introduces a special pseudo form of vmerge which allows a tied merge operand, allowing us to specify the tail elements as being equal to the "on false" operand, using a tied-def constraint and a "tail undisturbed" policy. While this strategy allows us to often lower the intrinsic to just one instruction, it may be less efficient in fixed-vector types as the number of tail elements may extend far beyond the length of the fixed vector. Another strategy could be to use a vmerge/vfmerge instruction with an AVL equal to the length of the vector type, and manipulate the condition operand such that mask elements greater than the operation's EVL are false. I've also observed inefficient codegen in which our 'VF' patterns don't match raw floating-point SPLAT_VECTORs, which occur in scalable-vector code. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117561	2022-01-24 11:05:05 +00:00
Fraser Cormack	e7926e8d97	[RISCV] Match VF variants for masked VFRDIV/VFRSUB This patch follows up on D117697 to help the simple binary operations behave similarly in the presence of masks. It also enables CGP sinking support for vp.fdiv and vp.fsub intrinsics, now that VFRDIV and VFRSUB are consistently matched with a LHS splat for masked and unmasked variants. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117783	2022-01-24 10:59:43 +00:00
Chenbing.Zheng	9aaa74aeef	[RISCV] Add patterns of SET[U]LT_VI for STECC forms This patch optmizes "li a0, 5 vmsgt[u].vx v10, v8, a0" -> "vmsgt[u].vi v10, v8, 5" Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118014	2022-01-24 08:50:49 +00:00
jacquesguan	ba16e3c31f	[RISCV] Decouple Zve* extensions and the V extension. According to the spec, there are some difference between V and Zve64d. For example, the vmulh integer multiply variants that return the high word of the product (vmulh.vv, vmulh.vx, vmulhu.vv, vmulhu.vx, vmulhsu.vv, vmulhsu.vx) are not included for EEW=64 in Zve64, but V extension does support these instructions. So we should decouple Zve extensions and the V extension. Differential Revision: https://reviews.llvm.org/D117854	2022-01-24 14:55:21 +08:00
Wu Xinlong	e29d8fb169	[RISCV] Initially support the K-extension instructions on the LLVM MC layer This commit is currently implementing supports for scalar cryptography extension for LLVM according to version v1.0.0 of [K Ext specification](https://github.com/riscv/riscv-crypto/releases)(scala crypto has been ratified already). Currently, we are implementing the MC (Machine Code) layer of his extension and the majority of work is done under `llvm/lib/Target/RISCV` directory. There are also some test files in `llvm/test/MC/RISCV` directory. Remove the subfeature of Zbk* which conflict with b extensions to reduce the size of the patch. (Zbk* will be resubmit after this patch has been merged) Co-author：@ksyx & @VincentWu & @lihongliang & @achieveartificialintelligence Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98136	2022-01-24 14:45:35 +08:00
Jim Lin	3f24cdec25	[RISCV][NFC] Remove tailing whitespaces in RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td	2022-01-24 10:49:43 +08:00
Craig Topper	413684313d	[RISCV] Adjust the header comment in RISCVInstrInfoZb.td to better integrate Zbk* extensions. The Zbk* extensions have some overlap with Zb so have been placed in this file. Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D117958	2022-01-23 11:42:52 -08:00
eopXD	3cf15af2da	[RISCV] Remove experimental prefix from rvv-related extensions. Extensions affected: +v, +zve, +zvl Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117860	2022-01-22 20:18:40 -08:00
Craig Topper	d44b6be6ea	[RISCV] Don't Custom legalize f16/f32/f64 bitcasts if those types aren't Legal.	2022-01-22 11:55:18 -08:00
Alex Fan	e796eaf2af	[RISCV][RFC] add MC support for zbkc subextension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117874	2022-01-22 10:23:01 +08:00
Craig Topper	0379459fc5	[RISCV] Strengthen a SDTypeProfile. Fix formatting.	2022-01-21 13:01:53 -08:00
Craig Topper	48132bb1e4	[RISCV] Simplify interface to combineMUL_VLToVWMUL. NFC Instead of passing the both the SDNode* and 2 of the operands in two different orders, just pass the SDNode * and a bool to indicate which operand order to test. While there rename to combineMUL_VLToVWMUL_VL.	2022-01-21 11:43:06 -08:00
Craig Topper	11754a4dbb	[RISCV] Use RVBUnary in more places to simplify some tablegen declarations. NFCI	2022-01-21 10:55:35 -08:00
Fraser Cormack	4d268dc94a	[RISCV] Enable CGP to sink splat operands of VP intrinsics This patch brings better splat-matching to our VP support, by sinking splat operands of VP intrinsics back into the same block as the VP operation. The list of VP intrinsics we are interested in matches that of the regular instructions. Some optimization is still lacking. For instance, our VL nodes aren't recognized as commutative, so splats must be on the RHS. Because of this, we limit our sinking of splats to just the RHS operand for now. Improvement in this regard can come in another patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117703	2022-01-21 11:30:37 +00:00
wangpc	8def89b5dc	[RISCV] Set CostPerUse to 1 iff RVC is enabled After D86836, we can define multiple cost values for different cost models. So here we set CostPerUse to 1 iff RVC is enabled to avoid potential impact on RA. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117741	2022-01-21 14:44:26 +08:00
Craig Topper	7b3d307288	[RISCV] Add isel patterns for grevi, shfli, and unshfli to brev8/zip/unzip instructions. Zbkb supports some encodings of the general grevi, shfli, and unshfli instructions legal, so we added separate instructions for those encodings to improve the diagnostics for assembler and disassembler. To be consistent we should always use these separate instructions whenever those specific encodings of grevi/shfli/unshfli occur. So this patch adds specific isel patterns to override the generic isel patterns for these cases. Similar was done for rev8 and zext.h for Zbb previously.	2022-01-20 20:43:52 -08:00
Wu Xinlong	7ee1c162cc	[RISCV][RFC] add inst support of zbkb This commit add instructions supports of `zbkb` which defined in scalar cryptography extension version v1.0.0 (has been ratified already). Most of the zbkb directives reuse parts of the zbp and zbb directives, so this patch just modified some of the inst aliases and predicates. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117640	2022-01-21 11:49:36 +08:00
Hsiangkai Wang	ad06e65dc4	[RISCV] Fix the bug in the register allocator caused by reserved BP. Originally, hasRVVFrameObject() will scan all the stack objects to check whether if there is any scalable vector object on the stack or not. However, it causes errors in the register allocator. In issue 53016, it returns false before RA because there is no RVV stack objects. After RA, it returns true because there are spilling slots for RVV values during RA. The compiler will not reserve BP during register allocation and generate BP access in the PEI pass due to the inconsistent behavior of the function. The function is changed to use hasStdExtV() as the return value. It is not precise, but it can make the register allocation correct. Refer to https://github.com/llvm/llvm-project/issues/53016. Differential Revision: https://reviews.llvm.org/D117663	2022-01-21 01:23:01 +00:00
Craig Topper	cfae2c65db	[RISCV] Factor Zve32 support into RISCVSubtarget::getMaxELENForFixedLengthVectors. This is needed to properly limit fractional LMULs for Zve32. Add new RUN Zve32 RUN lines to the existing tests for the -riscv-v-fixed-length-vector-elen-max command line option.	2022-01-20 16:31:12 -08:00
Craig Topper	5e88f527da	[RISCV] Remove RISCVSubtarget::hasStdExtV() and hasStdExtZve(). NFC All code should use one of the cleaner named hasVInstructions functions. Fix the two uses that weren't and delete the methods so no new uses can be created.	2022-01-20 15:05:09 -08:00
Craig Topper	fa8bb22466	[RISCV] Optimize vector_shuffles that are interleaving the lowest elements of two vectors. RISCV only has a unary shuffle that requires places indices in a register. For interleaving two vectors this means we need at least two vrgathers and a vmerge to do a shuffle of two vectors. This patch teaches shuffle lowering to use a widening addu followed by a widening vmaccu to implement the interleave. First we extract the low half of both V1 and V2. Then we implement (zext(V1) + zext(V2)) + (zext(V2) * zext(2^eltbits - 1)) which simplifies to (zext(V1) + zext(V2) * 2^eltbits). This further simplifies to (zext(V1) + zext(V2) << eltbits). Then we bitcast the result back to the original type splitting the wide elements in half. We can only do this if we have a type with wider elements available. Because we're using extends we also have to be careful with fractional lmuls. Floating point types are supported by bitcasting to/from integer. The tests test a varied combination of LMULs split across VLEN>=128 and VLEN>=512 tests. There a few tests with shuffle indices commuted as well as tests for undef indices. There's one test for a vXi64/vXf64 vector which we can't optimize, but verifies we don't crash. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D117743	2022-01-20 14:44:47 -08:00
Craig Topper	dd7b69a61f	[RISCV] Remove HadStdExtV and HasStdZve* Predicates from tablegen. No instructions should be using these. Everything should use HasVInstructions* Predicates. Remove them so that they can't be used by accident.	2022-01-20 12:54:20 -08:00
Craig Topper	7a275dc354	[RISCV] Remove Zvlsseg extension. This string no longer appears in the Vector Extension specification. The segment load/store instructions are just part of the vector instruction set. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D117724	2022-01-20 12:40:07 -08:00
Craig Topper	94e69fbb4f	[RISCV] Add DAG combine to fold (fp_to_int_sat (ffloor X)) -> (select X == nan, 0, (fcvt X, rdn)) Similar for ceil, trunc, round, and roundeven. This allows us to use static rounding modes to avoid a libcall. This is similar to D116771, but for the saturating conversions. This optimization is done for AArch64 as isel patterns. RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven so the operations don't stick around until isel to enable a pattern match. Thus I've implemented a DAG combine. I'm only handling saturating to i64 or i32. This could be extended to other sizes in the future. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116864	2022-01-20 11:35:37 -08:00
Fraser Cormack	ca36cc56ac	[RISCV] Match RVV VF variants also through masked operations This brings floating-point RVV vector/scalar support more in line with the integer vector patterns, which can already match '.vx' instructions with masked operations. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117697	2022-01-20 12:08:02 +00:00
Fraser Cormack	5a12024b95	[RISCV] Optimize lowering of floating-point -0.0 This idea has come up in several reviews -- D115978 and D105902 -- so I can't take any credit for the idea. Instead of using a constant pool to lower -0.0, we can emit a sequence of two instructions: fmv.[hwd].x freg, zero fsgnjn.[hsd] freg, freg, freg This is only done when the floating-point type is legal. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117687	2022-01-20 11:46:28 +00:00
Chenbing.Zheng	0be3da1fab	[RISCV] Add intrinsic for Zbt extension RV32: fsl, fsr, fsri RV64: fsl, fsr, fsri, fslw, fsrw, fsriw Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117468	2022-01-20 08:27:05 +00:00
eopXD	8eae99dfe5	[RISCV] Add the zve extension according to the v1.0 spec `zve` is the new standard vector extension to specify varying degrees of vector support for embedding processors. The `zve` extension is related to the `zvl` extension and other updates that are added in v1.0. According to https://github.com/riscv-non-isa/riscv-c-api-doc/pull/21, Clang defines macro `__riscv_v_max_elen`, `__riscv_v_max_elen_fp` for `zve` and it can be used by applications that uses the vector extension. Authored by: Zakk Chen <zakk.chen@sifive.com> @khchen Co-Authored by: Eop Chen <eop.chen@sifive.com> @eopXD Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D112408	2022-01-19 23:48:28 -08:00
Mohammed Nurul Hoque	21c79be5d7	[RISCV] Add patterns to MIR sign-extension removal pass. This patch adds a few instruction patterns that generate sign-extended values or propagate them, adding to the pass introduced in https://reviews.llvm.org/D116397 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117465	2022-01-19 17:33:58 -08:00
Luís Marques	a767ae2c5c	[RISCV] Fix incomplete asm statement parsing For instructions without operands, the final `AsmToken::EndOfStatement` wasn't being consumed. In the context of inline assembly, the resulting empty statements would cause extraneous empty lines to be emitted. Fix the issue by consuming the `EndOfStatement` token. Differential Revision: https://reviews.llvm.org/D117565	2022-01-19 21:56:21 +00:00
Craig Topper	4060b81e76	[RISCV] Obey -riscv-v-fixed-length-vector-elen-max when lowering mask BUILD_VECTORs. We may not be allowed to use vXiXLen vectors. Consult ELEN to determine what is allowed. This will become even more important when Zve32 is added. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D117518	2022-01-19 10:47:37 -08:00
Jim Lin	d6b0734837	[NFC] Use Register instead of unsigned	2022-01-19 20:17:04 +08:00
eopXD	9f27941c2f	[RISCV] Add patterns for vector narrowing integer right shift instructions Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117454	2022-01-18 22:30:13 -08:00
Craig Topper	5a6c622afd	[RISCV] Remove special case for constant shift amount in FSHL/FSHR lowering to FSL/FSR. Remove fshl/fshr with constant shift amount isel patterns. Replace with fsr/fsl with constant isel patterns. This hack was trying to preserve as much optimization opportunity for fshl/fshr by constant as possible, but the conversion to RISCVISD::FSR/FSL happens so late it probably isn't worth much. The new isel patterns are needed by D117468 anyway.	2022-01-18 11:47:50 -08:00
Craig Topper	aa7fc02feb	Recommit "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering." This reverts the revert commit `e328385739`. Accidental demanded bits change has been removed. The demanded bits code itself was remove in a pre-commit since it isn't tested. Original commit message: Previous we used the fshl/fshr operand ordering for simplicity. This made things confusing when D117468 proposed adding intrinsics for the instructions. We can't just use the generic funnel shifting intrinsics because fsl/fsr have different functionality that should be exposed to software. Now we use rs1, rs3, rs2/shamt order which matches the instruction printing order and the order used in this intrinsic header https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h	2022-01-18 10:52:43 -08:00
Craig Topper	b3a0ec7645	[RISCV] Remove DemandedBits handling for FSR/FSL until we have test cases for it. Testing may be easier after D117468. Right now we get demanded bits optimizations done on ISD::FSHL/FSHR before they become FSR/FSL. This makes it hard to test.	2022-01-18 10:52:43 -08:00
Craig Topper	e328385739	Revert "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering." This reverts commit `b634f8a663`. I broke the SimplifyDemandedBits code, but we don't have tests.	2022-01-18 10:36:03 -08:00
Craig Topper	b634f8a663	[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering. Previous we used the fshl/fshr operand ordering for simplicity. This made things confusing when D117468 proposed adding intrinsics for the instructions. We can't just use the generic funnel shifting intrinsics because fsl/fsr have different functionality that should be exposed to software. Now we use rs1, rs3, rs2/shamt order which matches the instruction printing order and the order used in this intrinsic header https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h	2022-01-18 09:47:28 -08:00
David Sherwood	f4515ab858	Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants" This reverts commit `197f3c0deb`. Reverting after miscompilation errors discovered with ffmpeg.	2022-01-18 08:40:20 +00:00
Lian Wang	5ceb4f5446	[RISCV] Add instruction schedule for Zbc extension and Zbs extension Zbc extension: CLMUL/CLMULR/CLMULH are grouped together, defined one schedule class. Zbs extension: BCLR/BSET/BINV/BEXT are grouped together, defined one schedule class. BCLRI/BSETI/BINVI/BEXTI are grouped together, defined one schedule class. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117538	2022-01-18 07:31:50 +00:00
jacquesguan	1090000b63	[RISCV] Add patterns for vector widening floating-point multiply Add patterns for vector widening floating-point multiply Differential Revision: https://reviews.llvm.org/D117530	2022-01-18 14:52:43 +08:00
Han-Kuan Chen	ec9cb3a79c	[RISCV] Provide VLOperand in td. Currently, users expected VL is the last operand. However, since some intrinsics has tail policy in the last operand, this rule cannot be used anymore. Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D117452	2022-01-17 20:25:47 -08:00
Han-Kuan Chen	3fc4b5896a	[RISCV] Make SplatOperand start from 0. Current SplatOperand starts from 1 because operand 0 (or 1) is intrinsic id in SelectionDAG. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117453	2022-01-17 20:14:59 -08:00
jacquesguan	c29d6c410e	[RISCV] Add patterns for vector widening floating-point add/subtract instructions Add patterns for Vector Widening Floating-Point Add/Subtract Instructions Differential Revision: https://reviews.llvm.org/D117466	2022-01-18 10:33:56 +08:00
Craig Topper	116af698e2	[RISCV] When expanding CONCAT_VECTORS, don't create INSERT_SUBVECTORS for undef subvectors. For fixed vectors, the undef will get expanded to an all zeros build_vector. We don't want that so suppress creating the insert_subvector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117379	2022-01-17 14:40:59 -08:00
Craig Topper	9c410838d2	[RISCV] Legalize fixed length (insert_subvector undef, X, 0) to a scalable insert. We were considering this legal, but later the undef would become an all zeros vector. This would cause us to need to re-legalize the insert later into a vslideup with zero vector. This patch catches the case and directly legalizes it to a scalable insert. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117377	2022-01-17 14:31:30 -08:00
David Sherwood	197f3c0deb	[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction. New tests added here: CodeGen/AArch64/sve-vector-splat.ll I believe we see some code quality improvements in these existing tests too: CodeGen/AArch64/reduce-and.ll CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding. Differential Revision: https://reviews.llvm.org/D114357	2022-01-17 11:08:57 +00:00
Lian Wang	85def34f5e	[RISCV] Add scheduler for bfp instruction in Zbf extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117290	2022-01-17 09:17:18 +00:00
Kito Cheng	cc35161dc7	[RISCV] Add initial support for getRegUsageForType and getNumberOfRegisters Those two TTI hooks are used during vectorization for calculating register pressure, the default implementation isn't consider for LMUL, and that's also definitly wrong value for register number (all register class are 8 registers). So in this patch we tried to: 1. Calculate right register usage for vector type and scalar type. 2. Return right number of register for general purpose register and vector register. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116890	2022-01-17 15:27:54 +08:00
eopXD	5a457782a2	[RISCV] Add patterns for vector widening integer multiply-add instructions Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117404	2022-01-16 18:37:13 -08:00
Craig Topper	4c1e1e05cb	[RISCV] Add RISCVISD::BFPW to ComputeNumSignBitsForTargetNode.	2022-01-15 15:23:49 -08:00
Fraser Cormack	877d1b3d07	[SelectionDAG][VP] Add splitting/widening for VP_LOAD and VP_STORE Original patch by @hussainjk. This patch was split off from D109377 to keep vector legalization (widening/splitting) separate from vector element legalization (promoting). While the original patch added a third overload of SelectionDAG::getVPStore, this patch takes the liberty of collapsing those all down to 1, as three overloads seems excessive for a little-used node. The original patch also used ModifyToType in places, but that method still crashes on scalable vector types. Seeing as the other VP legalization methods only work when all operands need identical widening, this patch follows in that vein. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117235	2022-01-15 11:41:29 +00:00
Alex Bradbury	0ee679e22c	[RISCV] Add CSRs defined in the recently ratified Sstc extension The 'RISC-V "stimecmp / vstimecmp" Extension' was ratified at the end of last year though hasn't yet been integrated in the main specification documents (see <https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>). RISC-V "stimecmp / vstimecmp" Extension <https://github.com/riscv/riscv-time-compare/releases/download/v0.5.4/Sstc.pdf>. Differential Revision: https://reviews.llvm.org/D117311	2022-01-15 08:36:04 +00:00
Alex Bradbury	1ca79823e0	[RISCV] Add CSRs defined in the recently ratified Smstateen extension The "RISC-V State Enable Extension" was ratified at the end of at the end of last year though hasn't yet been integrated in the main specification documents (see <https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>). This commit adds the CSRs defined by this extension in <https://github.com/riscv/riscv-state-enable/releases/download/v0.6.3/Smstateen.pdf>. Differential Revision: https://reviews.llvm.org/D117310	2022-01-15 08:35:47 +00:00
Alex Bradbury	f00a98a0a9	[RISCV] Add CSRs defined in the recently ratified Sscofpmf extension The "RISC-V Count Overflow and Mode-Based Filtering Extension" was ratified at the end of last year though hasn't yet been integrated in the main specification documents (see <https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>). This commit adds the CSRs defined by this extension in <https://github.com/riscv/riscv-count-overflow/releases/download/v0.5.2/Sscofpmf.pdf>. Differential Revision: https://reviews.llvm.org/D117308	2022-01-15 08:35:13 +00:00
Chenbing.Zheng	fdd33a0c75	[RISCV][NFC] Add a function to customLegalizeToWOp by Intrinsic These cases follow the same pattern, so they can be combined to a unqiue function. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117378	2022-01-15 08:28:08 +00:00
eopXD	26bb1b1dab	[RISCV] Add the zvl extension according to the v1.0 spec `zvl` is the new standard vector extension that specifies the minimum vector length of the vector extension. The `zvl` extension is related to the `zve` extension and other updates that are added in v1.0. According to https://github.com/riscv-non-isa/riscv-c-api-doc/pull/21, Clang defines macro `__riscv_v_min_vlen` for `zvl` and it can be used for applications that uses the vector extension. LLVM checks whether the option `riscv-v-vector-bits-min` (if specified) matches the `zvl*` extension specified. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D108694	2022-01-14 23:01:48 -08:00
Lian Wang	21dad9a522	[RISCV][NFC] Add IsRV64 predicate in xperm.w pattern Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117191	2022-01-15 04:22:16 +00:00
jacquesguan	b148348ad4	[RISCV] Add patterns for vector widening integer add/subtract Add patterns for vector widening integer add/subtract instructions Differential Revision: https://reviews.llvm.org/D117188	2022-01-15 09:41:07 +08:00
Shao-Ce SUN	a0a76fee0c	[RISCV] update zfh and zfhmin extention to v1.0 `zfh` and `zfhmin` have been ratified, with version 1.0. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117098	2022-01-15 09:21:24 +08:00
Craig Topper	2baa1dffd1	[RISCV] Add basic support for matching shuffles to vslidedown.vi. Specifically the unary shuffle case where the elements being shifted in are undef. This handles the shuffles produce by expanding llvm.reduce.mul. I did not reduce the VL which would increase the number of vsetvlis, but may improve the execution speed. We'd also want to narrow the multiplies so we could share vsetvlis between the vslidedown.vi and the next multiply. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117239	2022-01-14 09:04:54 -08:00
Craig Topper	ac6b4896ea	[RISCV] Honor the VT when converting float point register names to register class for inline assembly. It appears the code here was written for the inline asm clobbering a specific register, but it also gets used for named input and output registers. For the input and output case, we should honor the VT so we don't insert conversion instructions around the inline assembly. For the clobber, case we need to pick the largest register class. Reviewed By: asb, jrtc27 Differential Revision: https://reviews.llvm.org/D117279	2022-01-14 09:04:00 -08:00
Alex Bradbury	4a4a652f34	[RISCV][NFC] Use TableGen 'foreach' to simplify repetitive CSR definitions Make the definitions of hpmcounter3-hpmcounter31, hpmcounter3h-hpmcounter31h, mhpmcounter3-mhpmcounter31, mhpmcounter3h-mhpmcounter31h, pmpaddr0-pmpaddr63, mhpmevent3-31, and pmpcfg0-15 substantially less repetitive using a foreach loop. Differential Revision: https://reviews.llvm.org/D117227	2022-01-14 11:59:39 +00:00
jacquesguan	88c0e0806b	[RISCV] Improve i64 splat vector lowering in RV32. We could use vmv.v.i/vmv.v.x whose eew is 32 to lower the i64 splat vector if the i64 constant scalar could be splitted into two same i32 scalar. Differential Revision: https://reviews.llvm.org/D117079	2022-01-14 14:06:01 +08:00
David Sherwood	ba471ba8d2	Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants" This reverts commit `31009f0b5a`. It seems to be causing SVE VLA buildbot failures and has introduced a genuine regression. Reverting for now.	2022-01-13 15:59:43 +00:00
David Sherwood	31009f0b5a	[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction. New tests added here: CodeGen/AArch64/sve-vector-splat.ll I believe we see some code quality improvements in these existing tests too: CodeGen/AArch64/dag-numsignbits.ll CodeGen/AArch64/reduce-and.ll CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding. Differential Revision: https://reviews.llvm.org/D114357	2022-01-13 09:43:07 +00:00
Lian Wang	16877c5d2c	[RISCV] Add bfp and bfpw intrinsic in zbf extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116994	2022-01-13 02:53:00 +00:00
Alex Bradbury	33d008b169	[RISCV] Update recently ratified Zb{a,b,c,s} extensions to no longer be experimental Agreed policy is that RISC-V extensions that have not yet been ratified should be marked as experimental, and enabling them requires the use of the -menable-experimental-extensions flag when using clang alongside the version number. These extensions have now been ratified, so this is no longer necessary, and the target feature names can be renamed to no longer be prefixed with "experimental-". Differential Revision: https://reviews.llvm.org/D117131	2022-01-12 19:33:44 +00:00
Craig Topper	632c263eb3	[RISCV] Add RISCVProcFamilyEnum and add SiFive7. Use it to remove explicit string compares from unrolling preferences. I'm of two minds on this. Ideally, we would define things in terms of architectural or microarchitectural features, but it's hard to do that with things like unrolling preferences without just ending up with FeatureSiFive7UnrollingPreferences. Having a proc enum is consistent with ARM and AArch64. X86 only has a few and is trying to move away from it. Reviewed By: asb, mcberg2021 Differential Revision: https://reviews.llvm.org/D117060	2022-01-12 09:34:02 -08:00
Shao-Ce SUN	edb9175de6	[RISCV][llvm] Update CSRs According the newest RISC-V Privileged Spec, updated CSRs. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116645	2022-01-12 20:14:04 +08:00
Craig Topper	63b17eb9ec	[RISCV] Add strictfp support for compares. This adds support for STRICT_FSETCC(quiet) and STRICT_FSETCCS(signaling). FEQ matches well to STRICT_FSETCC oeq. FLT/FLE matches well to STRICT_FSETCCS olt/ole. Others require commuting operands or multiple instructions. STRICT_FSETCC olt/ole/ogt/oge/ult/ule/ugt/uge uses FLT/FLE, but we need to save/restore FFLAGS around them to avoid spurious exceptions. I've implemented pseudo instructions with a CustomInserter to insert the save/restore CSR instructions. Unfortunately, this doesn't honor exceptions for signaling NANs but I'm not sure if signaling nans are really supported by the constrained intrinsics. STRICT_FSETCC one and ueq expand to a pair of FLT instructions with a save/restore of fflags around each. This could be improved in the future. There may be some opportunities to generate better code for strict comparisons mixed with nonans fast math flags. I've left FIXMEs in the .td files for that. Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com> Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D116694	2022-01-11 20:01:41 -08:00
Craig Topper	be1cc64cc1	[RISCV] Add DAG combine to fold (fp_to_int (ffloor X)) -> (fcvt X, rdn) Similar for ceil, trunc, round, and roundeven. This allows us to use static rounding modes to avoid a libcall. This optimization is done for AArch64 as isel patterns. RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven so the operations don't stick around until isel to enable a pattern match. Thus I've implemented a DAG combine. We only handle XLen types except i32 on RV64. i32 will be type legalized to a RISCVISD node. All other types will be type legalized to XLen and maintain the FP_TO_SINT/UINT ISD opcode. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116771	2022-01-11 09:05:57 -08:00
wangpc	c6430fade3	[RISCV] Generate 32 bits jumptable entries when code model is small The code can only address the whole RV32 address space or the lower 2 GiB of the RV64 address space in small code model, so 32 bits entry is enough. Cache hit ratio and code size have some improvements. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116435	2022-01-11 18:20:37 +08:00
wangpc	98d51c2542	[RISCV] Override TargetLowering::BuildSDIVPow2 to generate SELECT When `Zbt` is enabled, we can generate SELECT for division by power of 2, so that there is no data dependency. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D114856	2022-01-11 15:54:35 +08:00
Chenbing.Zheng	9ea772ff81	[RISCV] Block vmsgeu.vi with 0 immediate in Isel For vmsgeu.vi with 0, we know this is always true. So we can replace it with vmset.m (unmasked) or vmset.m+vmand.mm (masked). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116584	2022-01-11 03:04:44 +00:00
jacquesguan	d0554ae4cf	[RISCV] Select vl op to X0 when it is equal to ~0. Now the backend will select ~0 vl to a register and load instruction, we could use X0 to replace it. Differential Revision: https://reviews.llvm.org/D116798	2022-01-11 10:56:25 +08:00
Haocong.Lu	bd653f6406	[RISCV] Use shift for zero extension when Zbb and Zbp are not enabled Now AND is used for zero extension when both Zbb and Zbp are not enabled. It may be better to use shift operation if the trailing ones mask exceeds simm12. This patch optimzes LUI+ADDI+AND to SLLI+SRLI. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116720	2022-01-11 02:37:03 +00:00
jacquesguan	b607cd3928	[RISCV] Use vmv.s.x to build one element splat vector. When we want to create an splat vector that only the first element is initialized, we could use vmv.s.x or vfmv.s.f to build it. Differential Revision: https://reviews.llvm.org/D116277	2022-01-11 10:21:18 +08:00
Craig Topper	b645bcd98a	[RISCV] Generalize (srl (and X, 0xffff), C) -> (srli (slli X, (XLen-16), (XLen-16) + C) optimization. This can be generalized to (srl (and X, C2), C) -> (srli (slli X, (XLen-C3), (XLen-C3) + C). Where C2 is a mask with C3 trailing ones. This can avoid constant materialization for C2. This is beneficial even when C2 can be selected to ANDI because the SLLI can become C.SLLI, but C.ANDI cannot cover all the immediates of ANDI. This also enables CSE in some cases of i8 sdiv by constant codegen.	2022-01-09 23:37:10 -08:00
Craig Topper	296e8cae5c	[RISCV] Isel (sra (sext_inreg X, i16), C) -> (srai (slli X, (XLen-16), (XLen-16) + C). Similar for (sra (sext_inreg X, i8), C). With Zbb, sext_inreg of i8 and i16 are legal for sext.b and sext.h. This transform makes the Zbb codegen the same as without Zbb. The shifts are more compressible. This also exposes an opportunity for CSE with another slli in the i16 sdiv by constant codegen.	2022-01-09 21:23:43 -08:00
jacquesguan	6b8362eb8d	[RISCV] Disable EEW=64 for index values when XLEN=32. Disable EEW=64 for vector index load/store when XLEN=32. Differential Revision: https://reviews.llvm.org/D106518	2022-01-10 10:51:27 +08:00
Craig Topper	2dd52f840b	[RISCV] Fold (srl (and X, 0xffff), C)->(srli (slli X, (XLen-16), (XLen-16) + C) even with Zbb/Zbp. We can use zext.h with Zbb, but srli/slli may offer more opportunities for compression.	2022-01-09 18:42:03 -08:00
Kazu Hirata	435a5a3652	[llvm] Fix bugprone argument comments (NFC) Identified with bugprone-argument-comment.	2022-01-08 11:56:38 -08:00
Craig Topper	042394b69e	[RISCV] Add a command line option to control the LMUL used by TTI's getRegisterBitWidth. By default we return the width of an LMUL=1 register. We can enable testing with larger LMUL values by returning a larger bit width. This patch adds a RISCV specific option to provide a LMUL which will be multiplied by the LMUL=1 bit width. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D116339	2022-01-07 20:02:10 -08:00
Kito Cheng	f142c45f1e	[RISCV] Set getMinVectorRegisterBitWidth to 16 if enable fixed length vector code gen for RVV getMinVectorRegisterBitWidth means what vector types is supported in this target, and actually RISC-V support all fixed length vector types with vector length less than `getMinRVVVectorSizeInBits`, so set it to 16, means 2 x i8, that is minimal fixed length vector size in theory. That also fixed one issue, some testcase migth become non-vectorizable when `-riscv-v-vector-bits-min` set to larger value, because the vector size is smaller than `-riscv-v-vector-bits-min`. For example, following code can vectorize by SLP with `-riscv-v-vector-bits-min=128` or `-riscv-v-vector-bits-min=256`, but can't vectorize `-riscv-v-vector-bits-min=512` or larger: ``` void foo(double *da) { da[0] = 0; da[1] = 1; da[2] = 2; da[3] = 3; } ``` Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116534	2022-01-08 11:16:21 +08:00
Baoshan Pang	af931a51b9	[RISCV] Materializing constants with 'rori' Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116574	2022-01-07 15:39:22 -08:00
Lian Wang	e8f1dfe923	[RISCV] Supplement PACKH instruction pattern Optimize (rs1 & 255) \| ((rs2 & 255) << 8) -> (PACKH rs1, rs2). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116791	2022-01-07 17:59:19 +08:00
Kazu Hirata	f3a344d212	[Target] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-06 22:01:44 -08:00
wangpc	91cf2a9b6c	[RISCV][NFC] Use sub operator to generate register list There are several duplicated lines for generating GPRXXX's register list that can be eliminated by using `sub` operator. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116729	2022-01-07 12:29:58 +08:00
Shao-Ce SUN	808c0987c3	[NFC][RISCV] Make the macro names more uniform Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116719	2022-01-07 11:09:41 +08:00
Liqin Weng	92153a9aa7	[RISCV] Support immediate vtype of VSETVLI/VSETIVLI in asm parser Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D115133	2022-01-07 02:26:41 +00:00
Craig Topper	ec4dd862bf	[RISCV] Use simm5_plus1_nonzero in isel patterns for vmsgeu.vi/vmsltu.vi intrinsics. The 0 immediate can't be selected to vmsgtu.vi/vmsleu.vi by decrementing the immediate. To prevent his we had special patterns that provided alternate lowering for the 0 cases. This relied on tablegen prioritizing the 0 pattern over the sim5_plus1 range. This patch introduces simm5_plus1_nonzero that excludes 0. It also excludes the special case for vmsltu.vi since we can just use vmsltu.vx and let the 0 be selected to X0. This is an alternative to some of the changes in D116584. Reviewed By: Chenbing.Zheng, asb Differential Revision: https://reviews.llvm.org/D116723	2022-01-06 08:27:27 -08:00

... 6 7 8 9 10 ...

2266 Commits