When lowering GlobalAddress nodes, we were removing a non-zero offset and
creating a separate ADD.
It already comes out of SelectionDAGBuilder with a separate ADD. The
ADD was being removed by DAGCombiner.
This patch disables the DAG combine so we don't have to reverse it.
Test changes all look to be instruction order changes, probably due
to different DAG node ordering.
Differential Revision: https://reviews.llvm.org/D126558
A RISCV implementation can choose to implement unaligned load/store support. We currently don't have a way for such a processor to indicate a preference for unaligned load/stores, so add a subtarget feature.
There doesn't appear to be a formal extension for unaligned support. The RISCV Profiles (https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#rva20u64-profile) docs use the name Zicclsm, but a) that doesn't appear to have actually been standardized, and b) it isn't quite what we want here anyway due to the perf comment.
Instead, we can follow precedent from other backends and have a feature flag for the existence of misaligned load/stores with sufficient performance that user code should actually use them.
Differential Revision: https://reviews.llvm.org/D126085
This patch tries to solve the inconsistency between the direct and intermediate casts caused by D123975.
This patch replaces ISD::FP_EXTEND and ISD::FP_ROUND with RVV VL op in the lowering of FP scalable vector direct cast to unify with the intermediate cast.
It also changes the FP widening patterns to use the VL op.
Differential Revision: https://reviews.llvm.org/D125364
Update test to check MIR after finalize-isel instead of debug output.
This is of course not the only place we should preserve FMF, but
it's the most obvious one.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D126306
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.
The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.
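A minimal sketch of the no-op case that the plain methods now accept
(assuming the current APInt API):

  #include "llvm/ADT/APInt.h"
  using namespace llvm;
  // Extending (or truncating) to the same bit width is now a no-op with the
  // plain methods, so zextOrSelf/sextOrSelf/truncOrSelf are unnecessary.
  APInt sameWidth(const APInt &V) {
    return V.zext(V.getBitWidth()); // previously required zextOrSelf
  }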
Differential Revision: https://reviews.llvm.org/D125557
During early gather/scatter enablement two different approaches
were taken to represent scaled indices:
* A Scale operand whereby byte_offsets = Index * Scale
* An IndexType whereby byte_offsets = Index * sizeof(MemVT.ElementType)
Having multiple representations is bad as shown by this patch which
fixes instances where the two are out of sync. The dedicated scale
operand is more flexible and pervasive so this patch removes the
UNSCALED values from IndexType. This means all indices are scaled
but the scale can be one, hence unscaled. SDNodes now use the scale
operand to answer the "isScaledIndex" question.
I toyed with the idea of keeping the UNSCALED enums and helper
functions but because they will have no uses and force SDNodes to
validate the set of supported values I figured it's best to remove
them. We can re-add them if there's a real need. For similar
reasons I've kept the IndexType enum when a bool could be used, as I
think being explicit looks better.
Depends On D123347
Differential Revision: https://reviews.llvm.org/D123381
This patch replaces some for-each set calls with the new ArrayRef argument API. Since it already used an array in the definition, I think this change won't cause any ambiguity.
Differential Revision: https://reviews.llvm.org/D125455
When building the final merged node, we were using the original chain
rather than the output chain of the new operation. After some collapsing
of the chain this could cause the loads to be incorrectly scheduled with
respect to later stores.
This was uncovered by SingleSource/Regression/C/gcc-c-torture/execute/pr36038.c
from the LLVM test suite.
https://reviews.llvm.org/D125560
The goal is to support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125323
This hook determines if SimplifySetcc transforms (X & (C l>>/<< Y))
==/!= 0 into ((X <</l>> Y) & C) ==/!= 0. Where C is a constant and
X might be a constant.
The default implementation favors doing the transform if X is not
a constant. Otherwise the code is left alone. There is a provision
that if the target supports a bit test instruction then the transform
will favor ((1 << Y) & X) ==/!= 0. RISCV does not say it has a variable
bit test operation.
RISCV with Zbs does have a BEXT instruction that performs (X >> Y) & 1.
Without Zbs, (X >> Y) & 1 still looks preferable to ((1 << Y) & X) since
we can use an ANDI instead of putting a 1 in a register for the SLL.
This patch overrides this hook to favor bit extract patterns and
otherwise falls back to the "do the transform if X is not a constant"
heuristic.
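As a small illustration of the two equivalent forms the hook chooses
between (a sketch with plain 64-bit unsigned values, assuming Y < 64):

  #include <cstdint>
  // Original setcc form: test X against a shifted mask.
  bool maskShiftedForm(uint64_t X, uint64_t C, unsigned Y) {
    return (X & (C << Y)) != 0;
  }
  // Transformed form: shift the value instead. When C == 1, the AND here is
  // the (X >> Y) & 1 that BEXT computes.
  bool valueShiftedForm(uint64_t X, uint64_t C, unsigned Y) {
    return ((X >> Y) & C) != 0;
  }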
I've added tests where both C and X are constants with both the shl form
and lshr form. I've also added a test for a switch statement that lowers
to a bit test. That was my original motivation for looking at this.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124639
Type legalization will want to turn (srl X, Y) into RISCVISD::SRLW,
which will prevent us from using a BEXT instruction.
I don't think there is any precedent for type promotion checking
users to decide how to promote. Instead, I've added this DAG combine to
do it before type legalization.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124109
Rather than VP_SEXT/VP_ZEXT/VP_TRUNC, having
VP_SIGN_EXTEND/VP_ZERO_EXTEND/VP_TRUNCATE better matches their non-VP
counterparts.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125298
This patch adds rvv codegen support for vp.fpext. The lowering of fp_round, vp.fptrunc, fp_extend and vp.fpext share most code so use a common lowering function to handle these four.
And this patch changes the intermediate cast from ISD::FP_EXTEND/ISD::FP_ROUND to the RVV VL version op RISCVISD::FP_EXTEND_VL and RISCVISD::FP_ROUND_VL for scalable vectors.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D123975
This improves opportunities to use bset/bclr/binv. Unfortunately,
there are no W versions of these instructions, so this isn't always
a clear win. If we use SLLW we get free sign extend and shift masking,
but we need to put a 1 in a register and can't remove an or/xor. If
we use bset/bclr/binv we remove the immediate materialization and
logic op, but might need a mask on the shift amount and a sext.w.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D124096
We can't shift-right negative numbers to divide them, so avoid emitting
such sequences. Use negative numerators as a proxy for this situation, since
the indices are always non-negative.
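A minimal illustration of why the sequence is avoided (assuming the usual
arithmetic right shift for negative signed values):

  #include <cassert>
  int main() {
    // The shift rounds toward negative infinity, division toward zero,
    // so a right shift is not a correct divide for negative numerators.
    assert((-3 >> 1) == -2);
    assert((-3 / 2) == -1);
  }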
An alternative strategy could be to add a compiler flag to emit division
instructions, which would at least allow us to test the VID sequence
matching itself.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123796
There's an existing generic combine that does this for legal types.
This patch adds a RISCV specific combine for W instructions.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D123983
This patch fixes a bug when lowering BUILD_VECTOR via VID sequences.
After adding support for fractional steps in D106533, elements with zero
steps may be skipped if no step has yet been computed. This allowed
certain sequences to slip through the cracks, being identified as VID
sequences when in fact they are not.
The fix for this is to perform a second loop over the BUILD_VECTOR to
validate the entire sequence once the step has been computed. This isn't
the most efficient, but on balance the code is more readable and
maintainable than doing back-validation during the first loop.
Fixes the tests introduced in D123785.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123786
This patch adds rvv codegen support for vp.fptrunc. The lowering of fp_round and vp.fptrunc share most code so use a common lowering function to handle those two, similar to vp.trunc.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123841
Materializing constants on RISCV is simpler if the constant is sign
extended from i32. By default i32 constant operands of phis are
zero extended.
This patch adds a hook to allow RISCV to override this for i32. We
have an existing isSExtCheaperThanZExt, but it operates on EVT which
we don't have at these places in the code.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D122951
This was added before Zve extensions were defined. I think users
should use Zve32x or Zve32f now. We will lose support for limiting
ELEN to 16 or 8, but I hope no one was using that.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D123418
This patch adds the minimum required to successfully lower vp.icmp via
the new ISD::VP_SETCC node to RVV instructions.
Regular ISD::SETCC goes through a lot of canonicalization, which targets
may rely on and which has not yet been ported to VP_SETCC. It also
supports expansion of individual condition codes and a non-boolean
return type. Support for all of that will follow in later patches.
In the case of RVV this largely isn't a problem as the vector integer
comparison instructions are plentiful enough that it can lower all
VP_SETCC nodes on legal integer vectors except for boolean vectors,
which regular SETCC folds away immediately into logical operations.
Floating-point VP_SETCC operations aren't as well supported in RVV and
the backend relies on condition code expansion, so support for those
operations will come in later patches.
Portions of this code were taken from the VP reference patches.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122743
We can do this conversion by converting to the same-sized integer type, then comparing the result with 0. The conversion is undefined if the converted FP value doesn't fit in an i1.
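A rough scalar model of the idea (hypothetical helper; assumes the value
fits in an i1, since the conversion is undefined otherwise):

  #include <cstdint>
  bool fpToI1(double X) {
    int64_t I = static_cast<int64_t>(X); // convert via the same-sized integer type
    return I != 0;                       // then compare the result with 0
  }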
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D122678
When expanding (uaddo X, 1), we previously expanded the overflow calculation
as (X + 1) <u X. This potentially increases the live range of X and
can prevent X+1 from reusing the register that previously held X.
Since we're adding 1, overflow only occurs if X was UINT_MAX in which
case (X+1) would be 0. So this patch adds a special case to expand
the overflow calculation to (X+1) == 0.
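The two overflow checks are equivalent when the addend is 1, as a quick
sketch shows:

  #include <cstdint>
  // The sum wraps to 0 exactly when X was UINT_MAX; the second form does
  // not keep X live past the add.
  bool overflowOld(uint32_t X) { return X + 1 < X; }  // (X + 1) <u X
  bool overflowNew(uint32_t X) { return X + 1 == 0; } // (X + 1) == 0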
This seems to help with uaddo intrinsics that get introduced by
CodeGenPrepare after LSR. Alternatively, we could block the uaddo
transform in CodeGenPrepare for this case.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122933
The splat_vector will be legalized to build_vector eventually
anyway. This patch makes it take fewer steps.
Unfortunately, this results in some codegen changes. It looks
like it comes down to how the nodes were ordered in the topological
sort for isel. Because the build_vector is created earlier we end up
with a different ordering of nodes.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D122185
This function now takes a uint64_t instead of an APInt. The caller
is responsible for masking the shift amount, extracting and inserting
into the KnownBits APInts, and inverting to compute zeros.
This is less code and a cleaner division of responsibilities.
Modified DAGCombiner to pass the bit test input and the shift amount
to hasBitTest. This matches the other call to hasBitTest in TargetLowering.h.
This is an alternative to D122454.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D122458
Don't call EltVT.getSizeInBits() or SrcEltVT.getSizeInBits() a second
time. They are already stored in the EltSize and SrcEltSize variables.
Refactor some comparisons to use multiply instead of division.
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.
We already special cased sign extended constants; this extends it
to any sign extended value.
I've only added tests for one case of vadd. Most intrinsics go
through the same check.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D122186
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.
We already special cased sign extended constants; this extends it
to any sign extended value.
I've only added tests for one case of vadd. Most intrinsics go
through the same check. I can add more tests if we're concerned.
Differential Revision: https://reviews.llvm.org/D122186
RISCVISelDAGToDAG's selectImm uses RISCVTargetLowering::getAddr
(specifically the ConstantPoolSDNode) as of 41454ab256 ("[RISCV] Use
constant pool for large integers"), but nothing explicitly instantiates
any of the templates, the only reason they exist is because of the
various lowering methods in RISCVISelLowering.cpp that themselves use
the methods. However, with inlining, those can end up not existing as
real functions and thus not be exported, leading to link errors. Up
until now this hasn't happened, but for whatever reason D121654 has
triggered this on the sanitizer-ppc64be-linux buildbot, giving:
../../../../lib/libLLVMRISCVCodeGen.a(RISCVISelDAGToDAG.cpp.o): In function `selectImm(llvm::SelectionDAG*, llvm::SDLoc const&, llvm::MVT, long, llvm::RISCVSubtarget const&)':
RISCVISelDAGToDAG.cpp:(.text._ZL9selectImmPN4llvm12SelectionDAGERKNS_5SDLocENS_3MVTElRKNS_14RISCVSubtargetE+0x3d8): undefined reference to `llvm::SDValue llvm::RISCVTargetLowering::getAddr<llvm::ConstantPoolSDNode>(llvm::ConstantPoolSDNode*, llvm::SelectionDAG&, bool) const'
collect2: error: ld returned 1 exit status
Fix this by explicitly instantiating getAddr in its four different forms
so separate translation units can reliably use it.
Fixes: 41454ab256 ("[RISCV] Use constant pool for large integers")
Since we have SPLAT_VECTOR_PARTS these days, I don't think we need
to go through extra lengths to avoid introducing an illegal scalar type.
We can just call getConstant using the scalable vector type and let
it create either a SPLAT_VECTOR or a SPLAT_VECTOR_PARTS.
Reviewed By: frasercrmck, rogfer01
Differential Revision: https://reviews.llvm.org/D121645
Since we mark the pseudos as mayLoad but do not provide any MMOs,
isSafeToMove conservatively returns false, stopping MachineLICM from
hoisting the instructions. PseudoLA_TLS_GD does not actually expand to a
load, so stop marking that as mayLoad to allow it to be hoisted, and for
the others make sure to add MMOs during lowering to indicate they're GOT
loads and thus can be freely moved.
Fixes https://github.com/llvm/llvm-project/issues/54372
Reviewed By: MaskRay, arichardson
Differential Revision: https://reviews.llvm.org/D121654
This code handles fixed vector SPLAT_VECTOR, but is never called in
any tests.
We only form fixed vector splat vectors for vXi64 on RV32 as part
of DAGCombine. This will be type legalized to SPLAT_VECTOR_PARTS.
So the Custom handling for SPLAT_VECTOR is never needed.
This patch makes SPLAT_VECTOR for vXi64 'Legal' on RV32 so that
DAGCombine will create it, but there's no need for a Custom handler.
It will still be type legalized to SPLAT_VECTOR_PARTS.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D121673
If the type is less than XLenVT, type legalization will turn this
into (srl (bitreverse (bswap (srl (bswap X), C))), C). We can't
completely recover from these shifts. They introduce zeros into
the upper bits of the result and we can't easily tell if they are
needed. By doing a DAG combine early, we avoid introducing these
shifts.
Type legalize narrow RISCVISD::GREV/GORC with constant to a larger
type without switching to W. Detect sext_inreg+gorci/grevi with a
uimm5 immediate during isel to emit GREVIW/GORCIW.
This allows us to better propagate known bits information through
extended bits after type legalization. It will also simplify a
change I'm considering for BREV8 with Zbkb.
A future patch will add computeKnownBits support for GORC.
A further improvement here would be to use hasAllWUsers and
doPeepholeSExtW like we do for SLLIW, but I don't think we have
the test coverage for that yet.
We know the shift amount is a constant with bit 31 clear. An anyext
of a constant will be either zext or sext, which will produce the
same result here, but we really shouldn't rely on that. It would
be valid to put a random number in the upper bits. Our isel patterns
expect the upper bits to be 0 so we should ask for it explicitly.
This doesn't appear to be needed any more. I did some inspection
of the gcc torture suite and SPEC2006 with this removed and didn't
find any meaningful changes.
I think we're more aggressive about forming ADDIW now, using
sign_extend_inreg during type legalization and hasAllWUsers in isel,
which probably catches the cases this code helped with before.
Similar to what we do for other loads/stores, use the intrinsic
version that we already have custom isel for.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D121166
vslide1up/down have this flag set, but the value isn't a splat.
Rename for clarity.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D121037
With Zbb, abs is expanded to (max X, neg) by default. If X has 33 or
more sign bits, we can expand it a little early using negw instead of
neg to save a sext_inreg. If X started as a 32 bit value, type
legalization would have inserted a sext before the abs so X having
33 sign bits should always be true.
Note: I've used ISD::FREEZE here since we increase the number of uses.
Our default expansion for ABS doesn't do that, but I think that's a bug.
We can't do this with custom type legalization because ISD::FREEZE
doesn't propagate sign bits, so a later DAG combine won't be able to
see and optimize it.
Alive2: https://alive2.llvm.org/ce/z/Gx3RNe
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D120597
Until Zfinx is supported in CodeGen we need to convert all Zfinx
register classes to GPR.
Remove the zfinx-types.ll test which didn't test anything meaningful
since -mattr=zfinx isn't implemented completely in llc.
Follow up to D93298.
This miscompile was introduced in D119527.
This was a special pattern for rotate+bswap on RV32. It doesn't
work for RV64 since the rotate needs to be half the bitwidth. The
equivalent pattern for RV64 is ROTR ((GREV x, 56), 32) so match
that instead.
This could be generalized further as noted in the new FIXME.
Reviewed By: Chenbing.Zheng
Differential Revision: https://reviews.llvm.org/D120686
This patch added the MC layer support of Zfinx extension.
Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D93298
This lowers VECTOR_SPLICE of scalable vectors to a slidedown followed by a slideup.
Fixed vectors are encouraged to use the shufflevector instruction. The equivalent patch
for fixed vectors is D119039.
I've used a tail agnostic slidedown and limited the VL to only the
elements that will not be overwritten by the slideup. The slideup
uses VLMax for its VL. It unfortunately uses tail undisturbed policy
but it isn't required as there is no tail. We just need the merge
operand to carry the bits for the lower portion of the result.
Care was taken to ensure that either the slideup or slidedown will
be able to use a .vi instruction when the immediate is small; which
one uses the immediate depends on the sign of the immediate.
Reviewed By: frasercrmck, ABataev
Differential Revision: https://reviews.llvm.org/D119303
Default type legalization will create sext_inreg+abs, but we may
not be able to remove the sext_inreg.
Instead this patch expands abs during type legalization to
Y = sraiw X, 31; (subw (xor X, Y), Y), which doesn't require the input
to be sign extended.
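A scalar model of the expansion (ignoring the usual INT_MIN caveat that
applies to any abs):

  #include <cstdint>
  int32_t expandedAbs(int32_t X) {
    int32_t Y = X >> 31;   // sraiw X, 31
    return (X ^ Y) - Y;    // subw (xor X, Y), Y
  }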
This gives a big improvement for some neg-abs tests where the
abs is used more than the neg. Previously the abs was expanded
a different way before and after type legalization. Now they are
expanded in a similar way enabling more CSE.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D120636
vcpop and vfirst are still useful when VL=0.
vcpop is equivalent to li 0 and vfirst is equivalent to li -1,
since no mask elements are active.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120302
Add a new ISD opcode to represent the sign extending behavior of
vmv.x.h. Keep the previous anyext opcode to allow the existing
(fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing
to generate a sign extend.
For fmv.x.w we are able to match the sext_inreg in an isel pattern,
but a 16-bit sext_inreg is lowered to a shift pair before isel. This
seemed like a larger match than we should do in isel.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D118974
Internally to DAGCombiner the SDValues were passed by non-const
reference despite not being modified. They were then passed by
const reference to TLI.
This patch passes them by value which is consistent with the vast
majority of code.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D120420
See https://github.com/llvm/llvm-project/issues/53831 for a full discussion.
The basic issue is that DAGCombiner::visitMUL and
RISCVISelLowering::transformAddImmMulImm get stuck in a loop, as the
current checks in transformAddImmMulImm aren't sufficient to avoid all
cases where DAGCombiner::isMulAddWithConstProfitable might trigger a
transformation. This patch makes transformAddImmMulImm bail out if C0
(the constant used for multiplication) has more than one use.
Differential Revision: https://reviews.llvm.org/D120332
This generalizes isElementRotate to work when there's only a single
slide needed. I've removed matchShuffleAsSlideDown which is now
redundant.
Reviewed By: frasercrmck, khchen
Differential Revision: https://reviews.llvm.org/D119759
Add the passthru operand for
VMV_V_X_VL, VFMV_V_F_VL and SPLAT_VECTOR_SPLIT_I64_VL also.
The goal is to support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D119688
Part of the shift lowering creates a (sub XLEN-1, ShAmt). When this
value is used we know that ShAmt is [0..XLEN-1]. Since XLEN is a power
of 2 we can replace the sub with an xor. This allows us to use XORI
instead of LI+SUB.
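A quick check of the identity this relies on (using XLEN = 64 here):

  #include <cassert>
  int main() {
    // XLEN-1 is all ones in the low bits, so subtracting a shift amount in
    // [0, XLEN-1] never borrows and is the same as an XOR.
    const unsigned XLEN = 64;
    for (unsigned ShAmt = 0; ShAmt < XLEN; ++ShAmt)
      assert((XLEN - 1) - ShAmt == ((XLEN - 1) ^ ShAmt));
  }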
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D119411
The goal is to support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
Add passthru operand for VSLIDE1UP_VL and VSLIDE1DOWN_VL to support
i64 scalar in rv32.
The masked VSLIDE1 will only emit the mask-undisturbed policy, even if a
mask-agnostic policy is given, until InsertVSETVLI supports mask agnostic.
Reviewed By: craig.topper, rogfer01
Differential Revision: https://reviews.llvm.org/D117989
This is a more generic version of D119110 that uses MaskedValueIsZero
to do the matching and SimplifyDemandedBits to remove any unneeded
AND instructions.
Tests were taken from D119110.
Reviewed By: Chenbing.Zheng
Differential Revision: https://reviews.llvm.org/D119622
While matching widening multiply, if we matched an extend from i8->i32,
i16->i64 or i8->i64, we need to reintroduce a narrower extend. If we're
matching a vwmulsu we need to use a sext for op0 and a zext for op1.
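A hypothetical per-element model of the vwmulsu case for an i32->i64
widening, showing the sext/zext pairing the reintroduced narrower extends
must preserve:

  #include <cstdint>
  int64_t vwmulsuElement(int32_t Op0, uint32_t Op1) {
    return static_cast<int64_t>(Op0) *                       // sext op0
           static_cast<int64_t>(static_cast<uint64_t>(Op1)); // zext op1
  }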
This bug exists in LLVM 14 and will need to be backported.
Differential Revision: https://reviews.llvm.org/D119618
Move some combine patterns to DAG combine, and deal with the FIXME
left in RISCVInstrInfoZb.td.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D119527
We can lower a vector splice to a vslidedown and a vslideup.
The majority of the matching code here came from X86's code for matching
PALIGNR and VPALIGND/Q.
The slidedown and slideup lowering don't really require it to be concatenation,
but it happened to be an interesting pattern with existing analysis code I
could use.
This helps with cases where the scalar loop optimizer forwarded a load
result from a previous loop iteration. For example, this happens if the
loop uses x[i] and x[i+1] on the same iteration. The scalar optimizer
will forward the x[i+1] load from the previous iteration to satisfy x[i] on this
iteration. When this gets vectorized, it results in one element of a vector
being forwarded from the previous loop to be concatenated with elements
loaded on this iteration.
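A hypothetical loop of the kind described above:

  // x[i] and x[i+1] are both used on the same iteration, so the scalar
  // optimizer can reuse last iteration's x[i+1] load to satisfy x[i].
  void sumAdjacent(const int *x, int *y, int n) {
    for (int i = 0; i < n; ++i)
      y[i] = x[i] + x[i + 1];
  }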
Whether that's more efficient than doing a shifted load or reloading
the single scalar and using vslide1up is an interesting question.
But that's not something the backend can help with.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D119039
The VLMaxSentinel is represented as TargetConstant, but that's included
in isa<ConstantSDNode>. To keep constant VLs and VLMax separate as long
as possible, use the X0 register during lowering and only convert to
VLMaxSentinel during isel.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D118845
We already had an FMA_VL node, but we didn't have masked patterns.
I have not added the fneg variations. I'll do those after I add
llvm.vp.fneg.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D119196
This patch adds an optimization to splat-like operations where the
splatted value is extracted from an identically-sized vector. On RVV we
can splat that via vrgather.vx/vrgather.vi without dropping to scalar
beforehand.
We do have a similar VECTOR_SHUFFLE-specific optimization but that only
works on fixed-length vector types and for those with a constant splat
lane. This patch extends this optimization to make it work on
scalable-vector types and on unknown extract indices.
It is performed during fixed-vector BUILD_VECTOR lowering and during a
new DAGCombine on SPLAT_VECTOR for scalable vectors.
Reviewed By: craig.topper, khchen
Differential Revision: https://reviews.llvm.org/D118456
Add a new ISD opcode to represent the sign extending behavior of
vmv.x.h. Keep the previous anyext opcode to allow the existing
(fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing
to generate a sign extend.
For fmv.x.w we are able to match the sext_inreg in an isel pattern,
but a 16-bit sext_inreg is lowered to a shift pair before isel. This
seemed like a larger match than we should do in isel.
Differential Revision: https://reviews.llvm.org/D118974
Add the vslidedown and interleave patterns that I recently implemented.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D118952
This avoids a crash for scalable vectors and/or scalarization for
fixed vectors.
The algorithm is different enough that I don't think it makes sense
to merge with ceil/floor/trunc. The algorithm is adapted from gcc's X86
SSE2 output.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D117247
SPLAT_VECTOR_I64 has the same semantics as RISCVISD::VMV_V_X_VL, it
just assumed VLMax instead of carrying a VL operand.
The include order of RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td
has been swapped to avoid moving riscv_vmv_v_x_vl into
RISCVInstrInfoVSDPatterns.td and to allow moving other "_vl" SDNodes back to
RISCVInstrInfoVVLPatterns.td.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D118841