llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	45e85feba6	[RISCV] Pull APInt/computeKnonwbits specifics out of computeGREVOrGORC. NFC This function now takes a uint64_t instead of an APInt. The caller is responsible for masking the shift amount, extracting and inserting into the KnownBits APInts, and inverting to compute zeros. This is less code and cleaner division of responsibilities.	2022-03-28 20:53:54 -07:00
Shao-Ce SUN	662b9fa02c	[NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D122557	2022-03-29 09:53:24 +08:00
Craig Topper	01203918d1	[RISCV] Add computeKnownBits support for RISCVISD::GORC. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D121575	2022-03-28 16:56:33 -07:00
Craig Topper	e68257fcee	[RISCV][SelectionDAG] Enable TargetLowering::hasBitTest for masks that fit in ANDI. Modified DAGCombiner to pass the shift the bittest input and the shift amount to hasBitTest. This matches the other call to hasBitTest in TargetLowering.h This is an alternative to D122454. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D122458	2022-03-28 12:46:36 -07:00
Craig Topper	cfe533da05	[RISCV] Add lowering for vp.fptosi and vp.sitofp. This as an alternative version of D120641. Starting from the code here https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/-/raw/EPI/llvm/lib/Target/RISCV/RISCVISelLowering.cpp but with some modifications to how the interim types are calculated, and adding support for f16. Still need to add fptosi for mask vectors. Lots of masked isel patterns added so we can pass the mask through the type changes. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D122512	2022-03-28 11:06:41 -07:00
Craig Topper	6c90a654bb	[RISCV] Simplify some code in lowering vector int<->fp conversions. NFC Don't call EltVT.getSizeInBits() or SrcEltVT.getSizeInBits() a second time. They are already in EltSize or SrcEltSize variables. Refactor some comparisons to use multiply instead of division.	2022-03-23 12:09:05 -07:00
Craig Topper	51940d69cb	[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32. On RV32, we need to type legalize i64 scalar arguments to intrinsics. We usually do this by splatting the value into a vector separately. If the scalar happens to be sign extended, we can continue using a .vx intrinsic. We already special cased sign extended constants, this extends it to any sign extended value. I've only added tests for one case of vadd. Most intrinsics go through the same check. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D122186	2022-03-22 10:29:06 -07:00
Craig Topper	cc5b0868ff	Revert "[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32." This reverts commit `8c4937b33f`. Committed by mistake.	2022-03-21 14:58:11 -07:00
Craig Topper	d4aeb5000f	[RISCV] Simplify some code. NFC	2022-03-21 14:50:56 -07:00
Craig Topper	19de2e8db6	[RISCV] Remove stray slash from comment. NFC	2022-03-21 14:50:56 -07:00
Craig Topper	8c4937b33f	[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32. On RV32, we need to type legalize i64 scalar arguments to intrinsics. We usually do this by splatting the value into a vector separately. If the scalar happens to be sign extended, we can continue using a .vx intrinsic. We already special cased sign extended constants, this extends it to any sign extended value. I've only added tests for one case of vadd. Most intrinsics go through the same check. I can add more tests if we're concerned. Differential Revision: https://reviews.llvm.org/D122186	2022-03-21 14:50:55 -07:00
Jessica Clarke	63ea7797dd	[RISCV] Fix buildbot breakage by explicitly instantiating templates RISCVISelDAGToDAG's selectImm uses RISCVTargetLowering::getAddr (specifically the ConstantPoolSDNode) as of `41454ab256` ("[RISCV] Use constant pool for large integers"), but nothing explicitly instantiates any of the templates, the only reason they exist is because of the various lowering methods in RISCVISelLowering.cpp that themselves use the methods. However, with inlining, those can end up not existing as real functions and thus not be exported, leading to link errors. Up until now this hasn't happened, but for whatever reason D121654 has triggered this on the sanitizer-ppc64be-linux buildbot, giving: ../../../../lib/libLLVMRISCVCodeGen.a(RISCVISelDAGToDAG.cpp.o): In function `selectImm(llvm::SelectionDAG, llvm::SDLoc const&, llvm::MVT, long, llvm::RISCVSubtarget const&)': RISCVISelDAGToDAG.cpp:(.text._ZL9selectImmPN4llvm12SelectionDAGERKNS_5SDLocENS_3MVTElRKNS_14RISCVSubtargetE+0x3d8): undefined reference to `llvm::SDValue llvm::RISCVTargetLowering::getAddr<llvm::ConstantPoolSDNode>(llvm::ConstantPoolSDNode, llvm::SelectionDAG&, bool) const' collect2: error: ld returned 1 exit status Fix this by explicitly instantiating getAddr in its four different forms so separate translation units can reliably use it. Fixes: `41454ab256` ("[RISCV] Use constant pool for large integers")	2022-03-18 02:22:17 +00:00
Craig Topper	7e15303062	[RISCV] Simplify scalable vector case in lowerVectorMaskExt. Since we have SPLAT_VECTOR_PARTS these days, I don't think we need to go through extra lengths to avoid introducing an illegal scalar type. We can just call getConstant using the scalable vector type and let it create either a SPLAT_VECTOR or a SPLAT_VECTOR_PARTS. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D121645	2022-03-17 09:43:13 -07:00
Jessica Clarke	659363c0cc	[RISCV] Ensure PseudoLA* can be hoisted Since we mark the pseudos as mayLoad but do not provide any MMOs, isSafeToMove conservatively returns false, stopping MachineLICM from hoisting the instructions. PseudoLA_TLS_GD does not actually expand to a load, so stop marking that as mayLoad to allow it to be hoisted, and for the others make sure to add MMOs during lowering to indicate they're GOT loads and thus can be freely moved. Fixes https://github.com/llvm/llvm-project/issues/54372 Reviewed By: MaskRay, arichardson Differential Revision: https://reviews.llvm.org/D121654	2022-03-16 18:45:36 +00:00
Craig Topper	06c5d74090	[RISCV] Remove lowerSPLAT_VECTOR This code handles fixed vector SPLAT_VECTOR, but is never called in any tests. We only form fixed vector splat vectors for vXi64 on RV32 as part of DAGCombine. This will be type legalized to SPLAT_VECTOR_PARTS. So the Custom handling for SPLAT_VECTOR is never needed. This patch makes SPLAT_VECTOR for vXi64 'Legal' on RV32 so that DAGCombine will create it, but there's no need for Custom handler. It will still be type legalized to SPLAT_VECTOR_PARTS. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D121673	2022-03-15 08:22:13 -07:00
Craig Topper	eeb3bfd74a	[RISCV] Merge ReplaceNodeResults code for SHFL and GREV/GORC. NFC	2022-03-13 18:42:26 -07:00
Lehua Ding	1648852c98	[RISCV][RVV] Fix vslide1up/down intrinsics overflow bug for SEW=64 on RV32 Reviewed By: craig.topper, kito-cheng Differential Revision: https://reviews.llvm.org/D120899	2022-03-13 18:06:09 +08:00
Craig Topper	fd4d584d6b	[RISCV] Add DAGCombine to fold (bitreverse (bswap X)) to brev8 with Zbkb. If the type is less than XLenVT, type legalization will turn this into (srl (bitreverse (bswap (srl (bswap X), C))), C). We can't completely recover from these shifts. They introduce zeros into the upper bits of the result and we can't easily tell if they are needed. By doing a DAG combine early, we avoid introducing these shifts.	2022-03-12 16:39:39 -08:00
Craig Topper	43f668b98e	[RISCV] Move GORCIW/GREVIW formation to isel patterns. Type legalize narrow RISCVISD::GREV/GORC with constant to a larger type without switching to W. Detect sext_inreg+gorci/grevi with a uimm5 immediate during isel to emit GREVIW/GORCIW. This allows us to better propagate known bits information through extended bits after type legalization. It will also simplify a change I'm considering for BREV8 with Zbkb. A future patch will add computeKnownBits support for GORC. A further improvement here would be to use hasAllWUsers and doPeepholeSExtW like we do for SLLIW, but I don't think we have the test coverage for that yet.	2022-03-11 18:02:47 -08:00
Craig Topper	337d49da84	[RISCV] Fix typo in comment. NFC	2022-03-10 22:00:18 -08:00
Craig Topper	1f3a8d58a6	[RISCV] Use ZERO_EXTEND instead of ANY_EXTEND when promoting i32 RISCVISD::SHFL. NFC We know the shift amount is a constant with bit 31 clear. anyext of constant will be either zext or sext which will produce the same result here. But we really shouldn't rely on that. It would be valid to put a random number in the upper bits. Our isel patterns expect the upper bits to be 0 so we should ask for it explicitly.	2022-03-10 20:57:04 -08:00
Craig Topper	9ce6b1ca86	[RISCV] Remove performANY_EXTENDCombine. This doesn't appear to be needed any more. I did some inspecting of the gcc torture suite and SPEC2006 with this removed and didn't find any meaningful changes. I think we're more aggressive about forming ADDIW now using sign_extend_inreg during type legalization and hasAllWUsers in isel. This probably helps catch the cases this helped with before.	2022-03-10 11:29:31 -08:00
Luke	0803dba7dd	[RISCV] Add fixed-length vector instrinsics for segment load Inspired by reviews.llvm.org/D107790. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119834	2022-03-10 16:23:40 +08:00
Craig Topper	d53707508a	[RISCV] Remove RISCVISD::VLE_VL/VSE_VL. Use intrinsics instead. Similar to what we do for other loads/stores, use the intrinsic version that we already have custom isel for. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D121166	2022-03-09 22:44:28 -08:00
Craig Topper	845bfcede1	[RISCV] Rename 'SplatOperand' to 'ScalarOperand'. NFC vslide1up/down have this flag set, but the value isn't a splat. Rename for clarity. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D121037	2022-03-07 11:28:32 -08:00
Craig Topper	bd5f124716	[RISCV] Add SimplifyDemandedBits support for FSR/FSL/FSRW/FSLW.	2022-03-05 21:26:51 -08:00
Craig Topper	232f57319d	[RISCV] Move vslide1up/down intrinsics into lowerVectorIntrinsicSplats. NFC Rename to lowerVectorIntrinsicScalars. This allows us to share the code that checks if the scalar needs to be type legalized.	2022-03-04 18:21:53 -08:00
Craig Topper	3d4e83f17d	[RISCV] With Zbb, fold (sext_inreg (abs X)) -> (max X, (negw X)) With Zbb, abs is expanded to (max X, neg) by default. If X has 33 or more sign bits, we can expand it a little early using negw instead of neg to save a sext_inreg. If X started as a 32 bit value, type legalization would have inserted a sext before the abs so X having 33 sign bits should always be true. Note: I've used ISD::FREEZE here since we increase the number of uses. Our default expansion for ABS doesn't do that, but I think that's a bug. We can't do this with custom type legalization because ISD::FREEZE doesn't propagate sign bits so later DAG combine won't expand be able to see optmize it. Alives2 https://alive2.llvm.org/ce/z/Gx3RNe Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D120597	2022-03-03 15:42:29 -08:00
Craig Topper	6cb42cd666	[RISCV] More correctly ignore Zfinx register classes in getRegForInlineAsmConstraint. Until Zfinx is supported in CodeGen we need to convert all Zfinx register classes to GPR. Remove the zfinx-types.ll test which didn't test anything meaningful since -mattr=zfinx isn't implemented completely in llc. Follow up to D93298.	2022-03-02 11:22:46 -08:00
Craig Topper	a1f8349d77	[RISCV] Don't combine ROTR ((GREV x, 24), 16)->(GREV x, 8) on RV64. This miscompile was introduced in D119527. This was a special pattern for rotate+bswap on RV32. It doesn't work for RV64 since the rotate needs to be half the bitwidth. The equivalent pattern for RV64 is ROTR ((GREV x, 56), 32) so match that instead. This could be generalized further as noted in the new FIXME. Reviewed By: Chenbing.Zheng Differential Revision: https://reviews.llvm.org/D120686	2022-03-02 09:47:06 -08:00
Shao-Ce SUN	0e38b29543	[RISCV] add the MC layer support of Zfinx extension This patch added the MC layer support of Zfinx extension. Authored-by: StephenFan Co-Authored-by: Shao-Ce Sun Reviewed By: asb Differential Revision: https://reviews.llvm.org/D93298	2022-03-02 14:25:19 +08:00
Craig Topper	b9d6e8c441	[RISCV] Lower VECTOR_SPLICE to RVV instructions. This lowers VECTOR_SPLICE of scalable vectors to a slidedown follow by a slideup. Fixed vectors are encouraged to use shufflevector instruction. The equivalent patch for fixed vectors is D119039. I've used a tail agnostic slidedown and limited the VL to only the elements that will not be overwritten by the slideup. The slideup uses VLMax for its VL. It unfortunately uses tail undisturbed policy but it isn't required as there is no tail. We just need the merge operand to carry the bits for the lower portion of the result. Care was taken to ensure that either the slideup or slidedown will be able to use a .vi instruction when the immediate is small. Which one uses the immediate depends on the sign of the immediate. Reviewed By: frasercrmck, ABataev Differential Revision: https://reviews.llvm.org/D119303	2022-03-01 10:10:13 -08:00
Craig Topper	e83db8c001	[RISCV] Only enable combineROTR_ROTL_RORW_ROLW with Zbp. I think the immediate values we check for on the GREV nodes already protect this, but better to be explicit.	2022-02-28 12:47:36 -08:00
Craig Topper	b083157b7b	[RISCV] Don't call combineROTR_ROTL_RORW_ROLW for SLLW/SRLW/SRAW nodes. NFC I think the function does the correct thing internally, but it's confusing to read.	2022-02-28 11:05:10 -08:00
Craig Topper	f46890711f	[RISCV] Custom type legalize i32 ISD::ABS on RV64 without Zbb. Default type legalization will create sext_inreg+abs, but we may not be able to remove the sext_inreg. Instead this patch expands abs during type legalization to Y = sraiw X, 31; subw(xor X, Y), Y) which doesn't require the input to be sign extended. This gives a big improvement for some neg-abs tests where the abs is used more than the the neg. Previously the abs was expanded a different way before and after type legalization. Now they are expanded in a similar way enabling more CSE. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D120636	2022-02-28 09:30:27 -08:00
Chenbing Zheng	b20e80aa59	[RISCV] DAG Combine vcpop and vfirst with VL=0 to li imm vcpop and vfirst are still useful when VL=0. vcpop equivalents to li 0 and vfirst equivalents to li -1, since no mask elements are active. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120302	2022-02-25 14:44:25 +08:00
Craig Topper	a975ca97c3	[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X). Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118974	2022-02-24 09:19:01 -08:00
Shao-Ce SUN	78b5f0fb05	[NFC][RISCV] Reuse ISD::NodeType in float extension Reviewed By: asb Differential Revision: https://reviews.llvm.org/D120412	2022-02-24 19:57:55 +08:00
Craig Topper	5b7ac107b1	[RISCV] Use SelectionDAG::getFreeze to simplify some code. NFC	2022-02-23 21:13:01 -08:00
Craig Topper	c7d6448d03	[DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable. Internally to DAGCombiner the SDValues were passed by non-const reference despite not being modified. They were then passed by const reference to TLI. This patch passes them by value which is consistent with the vast majority of code. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120420	2022-02-23 12:40:45 -08:00
Alex Bradbury	c5bcfb983e	[RISCV] Avoid infinite loop between DAGCombiner::visitMUL and RISCVISelLowering::transformAddImmMulImm See https://github.com/llvm/llvm-project/issues/53831 for a full discussion. The basic issue is that DAGCombiner::visitMUL and RISCVISelLowering;:transformAddImmMullImm get stuck in a loop, as the current checks in transformAddImmMulImm aren't sufficient to avoid all cases where DAGCombiner::isMulAddWithConstProfitable might trigger a transformation. This patch makes transformAddImmMulImm bail out if C0 (the constant used for multiplication) has more than one use. Differential Revision: https://reviews.llvm.org/D120332	2022-02-23 11:05:46 +00:00
Zakk Chen	f7dfc5d1af	[RISCV] Optimize tail agnostic vmv.s.x which don't need to select tail value. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120250	2022-02-21 14:53:37 -08:00
Craig Topper	90d240553d	[RISCV] Teach shouldSinkOperands to sink splat operands of vp.fma intrinsics. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D120167	2022-02-21 11:52:59 -08:00
Craig Topper	bbee9e77f3	[RISCV] Match shufflevector corresponding to slideup. This generalizes isElementRotate to work when there's only a single slide needed. I've removed matchShuffleAsSlideDown which is now redundant. Reviewed By: frasercrmck, khchen Differential Revision: https://reviews.llvm.org/D119759	2022-02-17 08:19:10 -08:00
Zakk Chen	eeb7754f68	[RISCV] Add the passthru operand for vmv.vv/vmv.vx/vfmv.vf IR intrinsics. Add the passthru operand for VMV_V_X_VL, VFMV_V_F_VL and SPLAT_VECTOR_SPLIT_I64_VL also. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D119688	2022-02-17 06:38:14 -08:00
Craig Topper	cfbbcc544c	[RISCV] Improve lowering of SHL_PARTS/SRL_PARTS/SRA_PARTS. Part of the shift lowering creates a (sub XLEN-1, ShAmt). When this value is used we know that ShAmt is [0..XLEN-1]. Since XLEN is a power of 2 we can replace the sub with an xor. This allows us to use XORI instead of LI+SUB. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D119411	2022-02-16 09:22:11 -08:00
Zakk Chen	b784719904	[RISCV] Add the passthru operand for RVV nomask binary intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Add passthru operand for VSLIDE1UP_VL and VSLIDE1DOWN_VL to support i64 scalar in rv32. The masked VSLIDE1 would only emit mask undisturbed policy regardless of giving mask agnostic policy until InsertVSETVLI supports mask agnostic. Reviewed by: craig.topper, rogfer01 Differential Revision: https://reviews.llvm.org/D117989	2022-02-15 18:36:18 -08:00
Craig Topper	ab6e02dded	[RISCV] Match vwmulsu_vx with scalar splat input. This is a more generic version of D119110 that uses MaskedValueIsZero to do the matching and SimplifyDemandedBits to remove any unneeded AND instructions. Tests were taken from D119110. Reviewed By: Chenbing.Zheng Differential Revision: https://reviews.llvm.org/D119622	2022-02-15 08:45:21 -08:00
Craig Topper	478c237e21	[RISCV] Fix incorrect extend type in vwmulsu combine. While matching widening multiply, if we matched an extend from i8->i32, i16->i64 or i8->i64, we need to reintroduce a narrower extend. If we're matching a vwmulsu we need to use a sext for op0 and a zext for op1. This bug exists in LLVM 14 and will need to be backported. Differential Revision: https://reviews.llvm.org/D119618	2022-02-12 12:47:20 -08:00
Chenbing.Zheng	9e975e558b	[RISCV][NFC] Move some combine patterns to DAG combine. Move some combine patterns to DAG combine，and it dealt with fixme left in RISCVInstrInfoZb.td. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119527	2022-02-12 02:52:21 +00:00
Craig Topper	b0e77d5e48	[RISCV] Lower the shufflevector equivalent of vector.splice We can lower a vector splice to a vslidedown and a vslideup. The majority of the matching code here came from X86's code for matching PALIGNR and VPALIGND/Q. The slidedown and slideup lowering don't really require it to be concatenation, but it happened to be an interesting pattern with existing analysis code I could use. This helps with cases where the scalar loop optimizer forwarded a load result from a previous loop iteration. For example, this happens if the loop uses x[i] and x[i+1] on the same iteration. The scalar optimizer will forward x[i+1] load from the previous loop to satisfy x[i] on this loop. When this get vectorized it results in one element of a vector being forwarded from the previous loop to be concatenated with elements loaded on this iteration. Whether that's more efficient than doing a shifted loaded or reloading the single scalar and using vslide1up is an interesting question. But that's not something the backend can help with. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D119039	2022-02-10 09:39:35 -08:00
Craig Topper	b861ddf365	[RISCV] Move the creation of VLMaxSentinel to isel. Use X0 during lowering. The VLMaxSentinel is represented as TargetConstant, but that's included in isa<ConstantSDNode>. To keep constant VLs and VLMax separate as long as possible, use the X0 register during lowering and only convert to VLMaxSentinel during isel. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118845	2022-02-10 09:28:44 -08:00
Craig Topper	279b3b8179	[RISCV][VP] Lower VP_FMA to RVV instructions. We already had FMA_VL node, but we didn't have masked patterns. I have not added the fneg variations. I'll do those after I add llvm.vp.fneg. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119196	2022-02-09 11:33:12 -08:00
Craig Topper	63e711549c	[RISCV] Lower VP_FNEG to RVV instructions Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119269	2022-02-09 10:56:39 -08:00
Fraser Cormack	62c4ac764b	[RISCV] Optimize splats of extracted vector elements This patch adds an optimization to splat-like operations where the splatted value is extracted from a identically-sized vector. On RVV we can splat that via vrgather.vx/vrgather.vi without dropping to scalar beforehand. We do have a similar VECTOR_SHUFFLE-specific optimization but that only works on fixed-length vector types and for those with a constant splat lane. This patch extends this optimization to make it work on scalable-vector types and on unknown extract indices. It is performed during fixed-vector BUILD_VECTOR lowering and during a new DAGCombine on SPLAT_VECTOR for scalable vectors. Reviewed By: craig.topper, khchen Differential Revision: https://reviews.llvm.org/D118456	2022-02-08 10:35:25 +00:00
wangpc	c53d99c37d	[RISCV] Split f64 undef into two i32 undefs So that no store instruction will be generated. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118222	2022-02-08 13:42:15 +08:00
Craig Topper	c1cef111a3	Revert "[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X)." This reverts commit `673d68cd92`. This hadn't been reviewed yet.	2022-02-05 12:51:01 -08:00
Craig Topper	673d68cd92	[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X). Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Differential Revision: https://reviews.llvm.org/D118974	2022-02-05 12:42:12 -08:00
Craig Topper	234e54bdd8	[RISCV] Add more types of shuffles isShuffleMaskLegal. Add the vslidedown and interleave patterns that I recently implemented. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118952	2022-02-04 09:13:13 -08:00
Craig Topper	c83905a308	[RISCV] Add inline expansion for vector fround. This avoids a crash for scalable vectors and or scalarization for fixed vectors. The algorithm is different enough that I don't think it makes sense to merge with ceil/floor/trunc. Algorithm is adapted from gcc's X86 SSE2 output. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117247	2022-02-04 09:12:09 -08:00
Craig Topper	2349fb0312	[RISCV] Remove RISCVISD::SPLAT_VECTOR_I64 in favor of RISCVISD::VMV_V_X_VL. SPLAT_VECTOR_I64 has the same semantics as RISCVISD::VMV_V_X_VL, it just assumed VLMax instead of carrying a VL operand. Include order of RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td has been swapped to avoid moving riscv_vmv_v_x_vl into RISCVInstrInfoVSDPatterns.td and to allow moving other "_vl" SDNodes back to RISCVInstrInfoVVLPatterns.td Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118841	2022-02-03 08:30:25 -08:00
Craig Topper	abc6716038	[RISCV] Remove unused variables. NFC	2022-02-02 19:23:16 -08:00
Craig Topper	f1720abb54	[RISCV] Cleanup some places that assumed VLMaxSentinel and -1 constant mean the same thing. NFCI VLMaxSentintel happens to be represented as -1 TargetConstant. A user provided -1 would be an ISD::Constant. We shouldn't assume that they are the same thing. I'm still not entirely convinced that we should be treating -1 from the user as VLMAX. Also fix one place that failed to use XLenVT for the VLMaxSentinel, using MVT::i64 in code that only executes on RV32.	2022-02-02 12:23:12 -08:00
Craig Topper	b73d151a11	[RISCV] Add DAG combines to transform ADD_VL/SUB_VL into widening add/sub. This adds or reuses ISD opcodes for vadd.wv, vaddu.wv, vadd.vv, vaddu.vv and a similar set for sub. I've included support for narrowing scalar splats that have known sign/zero bits similar to what was done for MUL_VL. The conversion to vwadd.vv proceeds in two phases. First we'll form a vwadd.wv by narrowing one of the operands. Then we'll visit the vwadd.wv to try to narrow the other operand. This turned out to be simpler than catching all the cases in one step. The forming of of vwadd.wv can happen for either operand for add, but only the right hand side for sub since sub isn't commutable. An interesting quirk is that ADD_VL and VZEXT_VL/VSEXT_VL are formed during vector op legalization, but VMV_V_X_VL isn't usually formed until op legalization when BUILD_VECTORS are handled. This leads to VWADD_W_VL forming in one DAG combine round, and then a later DAG combine round sees the VMV_V_X_VL and needs to commute the operands to get the splat in position. This alone necessitated a VWADD_W_VL combine function which made forming vwadd.vv in two stages an easy choice. I've left out trying hard to form vwadd.wx instructions for now. It would only save an extend in the scalar domain which isn't as interesting. Might need to review the test coverage a bit. Most of the vwadd.wv instructions are coming from vXi64 tests on rv64. The tests were copy pasted from the existing multiply tests. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D117954	2022-02-02 10:03:08 -08:00
Craig Topper	5a5037c602	[RISCV] Fix some 80 column violations in ComputeNumSignBitsForTargetNode. NFC	2022-02-01 21:43:11 -08:00
Craig Topper	2e45e8abb1	[RISCV] Add a fatal error if ISD::VSCALE is used with Zvl32b. We convert VLEN to vscale by dividing by RVVBitsPerBlock which is currently 64. This is only correct if VLEN is evenly divisible by 64. With only Zvl32b we can't assume that. This patch adds a fatal_error to prevent generating code that may be broken. We probably need to look at how we size stack frame objects too. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118583	2022-01-31 09:13:14 -08:00
Craig Topper	09606d6a63	[RISCV] Update the computeKnownBitsForTargetNode for RISCVISD::READ_VLENB to consider Zve/Zvl. We had previously hardcoded this to assume that vector registers are 128 bits. This was true when only V existed, but after Zve extensions were added this became incorrect. This patch adjusts it to support 128, 64, or 32 bit vectors depending on Zvl. The 128-bit limit is artificial, but we don't have any test coverage showing that we larger values so I was being conservative. None of our lit tests depend on this code today due to the custom lowering of ISD::VSCALE that inserts the appropriate left or right shift to convert from VLENB to VSCALE. That code was added after this code in computeKnownBitsForTargetNode. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118582	2022-01-31 09:13:14 -08:00
Nikita Popov	0801940c17	[RISCV] Avoid pointer element type access for masked atomicrmw intrinsics masked.atomicrmw.*.i32 intrinsics access an i32 (and then possibly mask it), so hardcode MVT::i32 as the access type here, rather than determining it from the pointer element type. Differential Revision: https://reviews.llvm.org/D118336	2022-01-31 09:28:39 +01:00
Craig Topper	5fbc3cda9e	[RISCV] Use existing variable intead of calling getOperand again. NFCI This is a slight change because I'm using the ANY_EXTEND result instead of the original operand, but getNode should constant fold. While there, add a comment about why the code specifically checks for a ConstantSDNode.	2022-01-30 18:42:19 -08:00
Craig Topper	744be8c502	[RISCV] Lower riscv_zip/unzip intrinsic to RISCVISD::SHFL/UNSHFL. These are special versions of the more general shfli/unshfli instructions. We can use the general ISD opcodes with the correct immediates.	2022-01-30 13:27:41 -08:00
Craig Topper	e1075186a6	[RISCV] Custom lower brev8 intrinsic to RISCVISD::GREV. We can use the RISCVISD::GREV encoding that swaps the bits in each byte. This allows it to use the existing computeKnownBits support for RISCVISD::GREV.	2022-01-30 12:41:09 -08:00
Craig Topper	524545317c	[RISCV] Remove RISCVISD::BREV8 and use RISCVISD::GREV instead. We already have an ISD opcode for the more general GREV/GREVI instructon. We can just use it with the encoding that corresponds to the behavior of brev8. This is similar to what we do for orc.b where we use the GORC ISD opcode.	2022-01-29 22:45:43 -08:00
Craig Topper	d8f929a567	[RISCV] Custom legalize BITREVERSE with Zbkb. With Zbkb, a bitreverse can be split into a rev8 and a brev8. Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D118430	2022-01-28 23:11:12 -08:00
jacquesguan	1276678982	[RISCV] Improve extract_vector_elt for fixed mask registers. Now the backend promotes mask vector to an i8 vector and extract element from that. We could bitcast to a widen element vector, and extract from it to GPR, then use I instruction to extract the certain bit. Differential Revision: https://reviews.llvm.org/D117389	2022-01-29 11:07:53 +08:00
Craig Topper	ea05ee9059	[RISCV] Preserve VL when truncating i64 gather/scatter indices on RV32. We were creating a truncate with the default for the type, but for VP intrinsics we have a VL that we should use. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118406	2022-01-28 09:25:30 -08:00
Kito Cheng	a9d5bb926d	[RISCV] Use __extendhfsf2/__truncsfhf2 for fp16 <-> fp32 `__gnu_h2f_ieee` and `__gnu_f2h_ieee` are introduce by ARM and set that as default name for fp16 and fp32 conversion in LLVM. However RISC-V GCC using default naming scheme for that, which is `__extendhfsf2` and `__truncsfhf2` for that, that cause runtime ABI incompatible issue. Although we didn't have formal runtime ABI spec to specify those naming convention yet, but I think it would be great to fix the incompatible issue first. And I've plan to create a runtime ABI spec undere psABI spec this year. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118207	2022-01-29 00:01:00 +08:00
Chenbing.Zheng	6d6c44a3f3	[RISCV] Add support for matching vwmulsu from fixed vectors According to riscv-v-spec-1.0, widening signed(vs2)-unsigned integer multiply vwmulsu.vv vd, vs2, vs1, vm # vector-vector vwmulsu.vx vd, vs2, rs1, vm # vector-scalar It is worth noting that signed op is only for vs2. For vwmulsu.vv, we can swap two ops, and don't care which is sign extension, but for vwmulsu.vx signExt can not be a vector extended from scalar (rs1). I specifically added two functions ending with _swap in the test case. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118215	2022-01-28 02:33:30 +00:00
Fraser Cormack	84e85e025e	[SelectionDAG][VP] Provide expansion for VP_MERGE This patch adds support for expanding VP_MERGE through a sequence of vector operations producing a full-length mask setting up the elements past EVL/pivot to be false, combining this with the original mask, and culminating in a full-length vector select. This expansion should work for any data type, though the only use for RVV is for boolean vectors, which themselves rely on an expansion for the VSELECT. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118058	2022-01-27 09:00:41 +00:00
Wu Xinlong	615d71d9a3	[RISCV][CodeGen] Implement IR Intrinsic support for K extension This revision implements IR Intrinsic support for RISCV Scalar Crypto extension according to the specification of version [[ https://github.com/riscv/riscv-crypto/releases/tag/v1.0.0-scalar \| 1.0]] Co-author：@ksyx & @VincentWu & @lihongliang & @achieveartificialintelligence Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102310	2022-01-27 15:53:35 +08:00
Craig Topper	f487a76430	[RISCV] Add hasStdExtZbp() to hasAndNotCompare.	2022-01-26 13:54:05 -08:00
Benjamin Kramer	f15014ff54	Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17" This reverts commit `ef82063207`. - It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat	2022-01-26 16:55:53 +01:00
serge-sans-paille	ef82063207	Rename llvm::array_lengthof into llvm::size to match std::size from C++17 As a conquence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).	2022-01-26 16:17:45 +01:00
Zakk Chen	510710d037	[RISCV][NFC] Add getVLOperand for RVV intrinsics. Use the VLOperand information to get the VL. Differential Revision: https://reviews.llvm.org/D118156	2022-01-25 17:37:58 -08:00
Zakk Chen	9273378b85	[RISCV] Add the passthru operand for RVV nomask load intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Co-Authored-by: Hsiangkai Wang <Hsiangkai@gmail.com> Reviewers: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D117647	2022-01-25 17:31:36 -08:00
eopXD	b089e4072a	[RISCV] Don't allow i64 vector div by constant to use mulh with Zve64x EEW=64 of mulh and its vairants requires V extension. Authored by: Craig Topper <craig.topper@sifive.com> @craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117947	2022-01-25 09:55:05 -08:00
Nikita Popov	aa97bc116d	[NFC] Remove uses of PointerType::getElementType() Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType(). This is part of D117885, in preparation for deprecating the API.	2022-01-25 09:44:52 +01:00
Fraser Cormack	d42678b453	[RISCV] Add side-effect-free vsetvli intrinsics This patch introduces new intrinsics that enable the use of vsetvli in contexts where only the returned vector length is of interest. The pre-existing intrinsics are marked with side-effects, which prevents even trivial optimizations on/across them. These intrinsics are intended to be used in situations where the vector length is fed in turn to RVV intrinsics or to vector-predication intrinsics during loop vectorization, for example. Those codegen paths ensure that instructions are generated with their own implicit vsetvli, so the vector length and vtype can be relied upon to be correct. No corresponding C builtins are planned at this stage, though that is a possibility for the future if the need arises. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117910	2022-01-24 13:52:08 +00:00
Fraser Cormack	af773a1818	[RISCV][VP] Lower VP_MERGE to RVV instructions This patch adds lowering of the llvm.vp.merge.* intrinsic (ISD::VP_MERGE) to RVV vmerge/vfmerge instructions. It introduces a special pseudo form of vmerge which allows a tied merge operand, allowing us to specify the tail elements as being equal to the "on false" operand, using a tied-def constraint and a "tail undisturbed" policy. While this strategy allows us to often lower the intrinsic to just one instruction, it may be less efficient in fixed-vector types as the number of tail elements may extend far beyond the length of the fixed vector. Another strategy could be to use a vmerge/vfmerge instruction with an AVL equal to the length of the vector type, and manipulate the condition operand such that mask elements greater than the operation's EVL are false. I've also observed inefficient codegen in which our 'VF' patterns don't match raw floating-point SPLAT_VECTORs, which occur in scalable-vector code. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117561	2022-01-24 11:05:05 +00:00
Fraser Cormack	e7926e8d97	[RISCV] Match VF variants for masked VFRDIV/VFRSUB This patch follows up on D117697 to help the simple binary operations behave similarly in the presence of masks. It also enables CGP sinking support for vp.fdiv and vp.fsub intrinsics, now that VFRDIV and VFRSUB are consistently matched with a LHS splat for masked and unmasked variants. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117783	2022-01-24 10:59:43 +00:00
Craig Topper	d44b6be6ea	[RISCV] Don't Custom legalize f16/f32/f64 bitcasts if those types aren't Legal.	2022-01-22 11:55:18 -08:00
Craig Topper	48132bb1e4	[RISCV] Simplify interface to combineMUL_VLToVWMUL. NFC Instead of passing the both the SDNode* and 2 of the operands in two different orders, just pass the SDNode * and a bool to indicate which operand order to test. While there rename to combineMUL_VLToVWMUL_VL.	2022-01-21 11:43:06 -08:00
Fraser Cormack	4d268dc94a	[RISCV] Enable CGP to sink splat operands of VP intrinsics This patch brings better splat-matching to our VP support, by sinking splat operands of VP intrinsics back into the same block as the VP operation. The list of VP intrinsics we are interested in matches that of the regular instructions. Some optimization is still lacking. For instance, our VL nodes aren't recognized as commutative, so splats must be on the RHS. Because of this, we limit our sinking of splats to just the RHS operand for now. Improvement in this regard can come in another patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117703	2022-01-21 11:30:37 +00:00
Craig Topper	fa8bb22466	[RISCV] Optimize vector_shuffles that are interleaving the lowest elements of two vectors. RISCV only has a unary shuffle that requires places indices in a register. For interleaving two vectors this means we need at least two vrgathers and a vmerge to do a shuffle of two vectors. This patch teaches shuffle lowering to use a widening addu followed by a widening vmaccu to implement the interleave. First we extract the low half of both V1 and V2. Then we implement (zext(V1) + zext(V2)) + (zext(V2) * zext(2^eltbits - 1)) which simplifies to (zext(V1) + zext(V2) * 2^eltbits). This further simplifies to (zext(V1) + zext(V2) << eltbits). Then we bitcast the result back to the original type splitting the wide elements in half. We can only do this if we have a type with wider elements available. Because we're using extends we also have to be careful with fractional lmuls. Floating point types are supported by bitcasting to/from integer. The tests test a varied combination of LMULs split across VLEN>=128 and VLEN>=512 tests. There a few tests with shuffle indices commuted as well as tests for undef indices. There's one test for a vXi64/vXf64 vector which we can't optimize, but verifies we don't crash. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D117743	2022-01-20 14:44:47 -08:00
Craig Topper	94e69fbb4f	[RISCV] Add DAG combine to fold (fp_to_int_sat (ffloor X)) -> (select X == nan, 0, (fcvt X, rdn)) Similar for ceil, trunc, round, and roundeven. This allows us to use static rounding modes to avoid a libcall. This is similar to D116771, but for the saturating conversions. This optimization is done for AArch64 as isel patterns. RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven so the operations don't stick around until isel to enable a pattern match. Thus I've implemented a DAG combine. I'm only handling saturating to i64 or i32. This could be extended to other sizes in the future. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116864	2022-01-20 11:35:37 -08:00
Fraser Cormack	5a12024b95	[RISCV] Optimize lowering of floating-point -0.0 This idea has come up in several reviews -- D115978 and D105902 -- so I can't take any credit for the idea. Instead of using a constant pool to lower -0.0, we can emit a sequence of two instructions: fmv.[hwd].x freg, zero fsgnjn.[hsd] freg, freg, freg This is only done when the floating-point type is legal. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117687	2022-01-20 11:46:28 +00:00
Chenbing.Zheng	0be3da1fab	[RISCV] Add intrinsic for Zbt extension RV32: fsl, fsr, fsri RV64: fsl, fsr, fsri, fslw, fsrw, fsriw Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117468	2022-01-20 08:27:05 +00:00
Craig Topper	4060b81e76	[RISCV] Obey -riscv-v-fixed-length-vector-elen-max when lowering mask BUILD_VECTORs. We may not be allowed to use vXiXLen vectors. Consult ELEN to determine what is allowed. This will become even more important when Zve32 is added. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D117518	2022-01-19 10:47:37 -08:00
Craig Topper	5a6c622afd	[RISCV] Remove special case for constant shift amount in FSHL/FSHR lowering to FSL/FSR. Remove fshl/fshr with constant shift amount isel patterns. Replace with fsr/fsl with constant isel patterns. This hack was trying to preserve as much optimization opportunity for fshl/fshr by constant as possible, but the conversion to RISCVISD::FSR/FSL happens so late it probably isn't worth much. The new isel patterns are needed by D117468 anyway.	2022-01-18 11:47:50 -08:00
Craig Topper	aa7fc02feb	Recommit "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering." This reverts the revert commit `e328385739`. Accidental demanded bits change has been removed. The demanded bits code itself was remove in a pre-commit since it isn't tested. Original commit message: Previous we used the fshl/fshr operand ordering for simplicity. This made things confusing when D117468 proposed adding intrinsics for the instructions. We can't just use the generic funnel shifting intrinsics because fsl/fsr have different functionality that should be exposed to software. Now we use rs1, rs3, rs2/shamt order which matches the instruction printing order and the order used in this intrinsic header https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h	2022-01-18 10:52:43 -08:00
Craig Topper	b3a0ec7645	[RISCV] Remove DemandedBits handling for FSR/FSL until we have test cases for it. Testing may be easier after D117468. Right now we get demanded bits optimizations done on ISD::FSHL/FSHR before they become FSR/FSL. This makes it hard to test.	2022-01-18 10:52:43 -08:00

1 2 3 4 5 ...

682 Commits