llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazu Hirata	435a5a3652	[llvm] Fix bugprone argument comments (NFC) Identified with bugprone-argument-comment.	2022-01-08 11:56:38 -08:00
Craig Topper	75117fb340	[RISCV] Don't advertise i32->i64 zextload as free for RV64. The zextload hook is only used to determine whether to insert a zero_extend or any_extend for narrow types leaving a basic block. Returning true from this hook tends to cause any load whose output leaves the basic block to become an LWU instead of an LW. Since we tend to prefer sexts for i32 compares on RV64, this can cause extra sext.w instructions to be created in other basic blocks. If we use LW instead of LWU this gives the MIR pass from D116397 a better chance of removing them. Another option might be to teach getPreferredExtendForValue in FunctionLoweringInfo.cpp about our preference for sign_extend of i32 compares. That would cause SIGN_EXTEND to be chosen for any value used by a compare instead of using the isZExtFree heuristic. That will require code to convert from the llvm::Type* to EVT/MVT as well as querying the type legalization actions to get the promoted type in order to call TargetLowering::isSExtCheaperThanZExt. That seemed like many extra steps when no other target wants it. Though it would avoid us needing to lean on the MIR pass in some cases. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116567	2022-01-06 08:13:42 -08:00
Craig Topper	808c662665	[RISCV] Change RISCVISD::FCVT*RTZ opcodes to take rounding mode as an operand. Pre-work for a future change that will use these opcodes with other rounding modes. Differential Revision: https://reviews.llvm.org/D116724	2022-01-06 08:12:12 -08:00
Victor Perez	5527139302	[RISCV][VP] Add RVV codegen for [nX]vXi1 vp.select Expand [nX]vXi1 vp.select the same way as [nX]vXi1 vselect. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D115546	2022-01-02 23:12:32 -08:00
Craig Topper	15787ccd45	[RISCV] Add support for STRICT_LRINT/LLRINT/LROUND/LLROUND. Tests for other strict intrinsics. This patch adds isel support for STRICT_LRINT/LLRINT/LROUND/LLROUND. It also adds test cases for f32 and f64 constrained intrinsics that correspond to the intrinsics in float-intrinsics.ll and double-intrinsics.ll. Support for promoting the integer argument of STRICT_FPOWI was added. I've skipped adding tests for f16 intrinsics, since we don't have libcalls for them and we have inconsistent support for promoting them in LegalizeDAG. This will need to be examined more closely. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116323	2021-12-30 11:54:32 -08:00
Hsiangkai Wang	a1c7ddf926	[RISCV] Support passing scalable vector values through the stack. After consuming all vector registers, the scalable vector values will be passed indirectly. The pointer values will be saved in general registers. If all general registers are used up, we will report an error to notify users the compiler does not support passing scalable vector values through the stack. In this patch, we remove the restriction. After all general registers are used up, we use the stack to save the pointers which point to the indirectly passed scalable vector values. Differential Revision: https://reviews.llvm.org/D116310	2021-12-28 09:26:36 +08:00
Kazu Hirata	e7774f499b	Use static_assert instead of assert (NFC) Identified with misc-static-assert.	2021-12-26 14:26:44 -08:00
Jim Lin	02478a26f2	[RISCV] Use DAG variable directly instead of DCI.DAG Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116087	2021-12-24 13:06:55 +08:00
Craig Topper	0a35211b34	[RISCV] Don't allow vector types to be used with inline asm 'r' constraint The 'r' constraint uses the GPR class. There is generic support for bitcasting and extending/truncating non-integer VTs to the required integer VT. This doesn't work for scalable vectors and instead crashes. To prevent this, explicitly reject vectors. Fixed vectors might work without crashing, but it doesn't seem worthwhile to allow. While there remove an unnecessary level of indentation in the "vr" and "vm" constraint handling. Differential Revision: https://reviews.llvm.org/D115810	2021-12-23 20:32:36 -06:00
Victor Perez	10b3675aa9	[RISCV][VP] Lower mask vector VP AND/OR/XOR to RVV instructions For fixed and scalable vectors, each intrinsic x is lowered to vmx.mm, dropping the mask, which is safe to do as masked-off elements are undef anyway. Differential Revision: https://reviews.llvm.org/D115339	2021-12-23 15:02:32 -06:00
Craig Topper	7704c503ec	[RISCV] Use positive 0.0 for the neutral element in fadd reductions if nsz is present. -0.0 requires a constant pool. +0.0 can be made with vmv.v.x x0. Not doing this in getNeutralElement for fear of changing other targets. Differential Revision: https://reviews.llvm.org/D115978	2021-12-23 10:38:00 -06:00
Craig Topper	b7b260e19a	[RISCV] Support strict FP conversion operations. This adds support for strict conversions between fp types and between integer and fp. NOTE: RISCV has static rounding mode instructions, but the constrained intrinsic metadata is not used to select static rounding modes. Dynamic rounding mode is always used. Differential Revision: https://reviews.llvm.org/D115997	2021-12-23 09:40:58 -06:00
jacquesguan	28a3e7dea2	[RISCV] Override hasAndNotCompare to use more andn when the Zbb extension is available. Enable the transforms (X & Y) == Y ---> (~X & Y) == 0 and (X & Y) != Y ---> (~X & Y) != 0 when the Zbb extension is available, so that more andn instructions are used. Differential Revision: https://reviews.llvm.org/D115922	2021-12-23 10:42:20 +08:00
Craig Topper	66bbefeb13	[RISCV] Revert Zfhmin related changes that aren't tested and depend on f16 being a legal type. Our Zfhmin support is only MC layer, but these are CodeGen layer interfaces. If f16 isn't a Legal type for CodeGen with Zfhmin, then these interfaces should keep their non-Zfh behavior. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D115822	2021-12-16 08:55:28 -08:00
Craig Topper	3926893439	[RISCV] Add isel support for scalar STRICT_FADD/FSUB/FMUL/FDIV/FSQRT. Test that STRICT_FMINNUM/FMAXNUM are lowered to libcalls for f32/f64. The RISC-V instructions don't match the behavior of fmin/fmax libcalls with respect to SNaN. Promoting FMINNUM/FMAXNUM for f16 needs more work outside of the RISC-V backend. Reviewed By: asb, arcbbb Differential Revision: https://reviews.llvm.org/D115680	2021-12-14 10:50:55 -08:00
Craig Topper	3f1c403a2b	[RISCV] Use AdjustInstrPostInstrSelection to insert a FRM dependency for scalar FP instructions with dynamic rounding mode. In order to support constrained FP intrinsics we need to model FRM dependency. Whether or not an instruction uses FRM is based on a 3 bit field in the instruction. Because of this we can't add 'Uses = [FRM]' to the tablegen descriptions. This patch examines the immediate after isel and adds an implicit use of FRM. This idea came from Roger Ferrer Ibanez. Other ideas: We could be overly conservative and just pretend all instructions with frm field read the FRM register. Or we could have pseudoinstructions for CodeGen with rounding mode. Reviewed By: asb, frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D115555	2021-12-14 10:17:57 -08:00
Craig Topper	b18b2a01ef	[RISCV] Don't use VLMAX for start value splat in reduction lowering. The reduction instructions only read the first element. The execution time for a splat may take longer with a larger VL. We should use the smallest VL we can. Reviewed By: frasercrmck, HsiangKai Differential Revision: https://reviews.llvm.org/D115536	2021-12-13 09:06:42 -08:00
Kito Cheng	39c861719b	[RISCV] Fix vm operand constraint to fit GCC's behavior - `vm` constraint is used for the masking operand, which is always v0. - Update testcase, only masking operand should use `vm`, vector mask operations should just use `vr` for any vector register. - Revise the description of `vm` constraint. - This patch also fixes issues in RISCVRegisterInfo.td and RISCVISelLowering.cpp. RISCVRegisterInfo.td: - The first VT in the list must be the largest total size since the SelectionDAGBuilder uses the first register in the list as the canonical type for the register. RISCVISelLowering.cpp: - Fix RISCVTargetLowering::splitValueIntoRegisterParts and RISCVTargetLowering::joinRegisterPartsIntoValue for handling vectors with different total size, which will happen with fractional LMUL since a fractional LMUL always occupies one vector register. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D112599	2021-12-09 14:46:49 +08:00
Craig Topper	acdbd34cfb	[RISCV] Loosen some restrictions on lowering constant BUILD_VECTORs using vid.v. The immediate size check on StepNumerator did not take into account that vmul.vi does not exist. It also did not account for power of 2 constants that can be done with vshl.vi. This patch fixes this by moving the conversion from mul to shift further up. Then we can consider the immediates separately for MUL vs SHL. For MUL I've allowed simm12 which requires a single addi before a vmul.vx. For SHL I've allowed any uimm5 which works with vshl.vi. We could relax these further in the future. This is a starting point that allows us to emit the same number of instructions we were already using for smaller numerators. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D115081	2021-12-06 09:34:40 -08:00
Victor Perez	9eb7322748	[RISCV][VP] Add RVV codegen for vp.select Lower vp.select intrinsic to VSELECT_VL. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D114629	2021-12-03 11:02:20 +00:00
Craig Topper	2f6beb7b0e	[RISCV] Add inline expansion for vector ftrunc/fceil/ffloor. This prevents scalarization of fixed vector operations or crashes on scalable vectors. We don't have direct support for these operations. To emulate ftrunc we can convert to the same sized integer and back to fp using round to zero. We don't need to do a convert if the value is large enough to have no fractional bits or is a nan. The ceil and floor lowering would be better if we changed FRM, but we don't model FRM correctly yet. So I've used the trunc lowering with a conditional add or subtract with 1.0 if the truncate rounded in the wrong direction. There are also missed opportunities to use masked instructions. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D113543	2021-12-01 11:25:28 -08:00
Craig Topper	d8f9eaad89	[RISCV] Teach RISCVTargetLowering::shouldSinkOperands to handle udiv/sdiv/urem/srem. The V extension supports .vx instructions for integer division and remainder so we should sink splats for that operand.	2021-11-30 18:47:51 -08:00
David Green	9e8a71caf0	[DAG] Create fptosi.sat from clamped fptosi This adds a fold in DAGCombine to create fptosi_sat from sequences for smin(smax(fptosi(x))) nodes, where the min/max saturate the output of the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN, ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need to be handled similarly. A shouldConvertFpToSat method was added to control when converting may be profitable. The original fptosi will have a less strict semantics than the fptosisat, with less values that need to produce defined behaviour. This especially helps on ARM/AArch64 where the vcvt instructions naturally saturate the result. Differential Revision: https://reviews.llvm.org/D111976	2021-11-30 15:29:14 +00:00
Hans Wennborg	a87782c34d	Revert "[DAG] Create fptosi.sat from clamped fptosi" It causes builds to fail with this assert: llvm/include/llvm/ADT/APInt.h:990: bool llvm::APInt::operator==(const llvm::APInt &) const: Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"' failed. See comment on the code review. > This adds a fold in DAGCombine to create fptosi_sat from sequences for > smin(smax(fptosi(x))) nodes, where the min/max saturate the output of > the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because > it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN, > ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need > to be handled similarly. > > A shouldConvertFpToSat method was added to control when converting may > be profitable. The original fptosi will have a less strict semantics > than the fptosisat, with less values that need to produce defined > behaviour. > > This especially helps on ARM/AArch64 where the vcvt instructions > naturally saturate the result. > > Differential Revision: https://reviews.llvm.org/D111976 This reverts commit `52ff3b0093`.	2021-11-30 15:36:56 +01:00
David Green	52ff3b0093	[DAG] Create fptosi.sat from clamped fptosi This adds a fold in DAGCombine to create fptosi_sat from sequences for smin(smax(fptosi(x))) nodes, where the min/max saturate the output of the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN, ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need to be handled similarly. A shouldConvertFpToSat method was added to control when converting may be profitable. The original fptosi will have a less strict semantics than the fptosisat, with less values that need to produce defined behaviour. This especially helps on ARM/AArch64 where the vcvt instructions naturally saturate the result. Differential Revision: https://reviews.llvm.org/D111976	2021-11-30 11:05:32 +00:00
Craig Topper	b121d23a9c	[RISCV] Promote f16 log/pow/exp/sin/cos/etc. to f32 libcalls. Prevents crashes or cannot select errors. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D113822	2021-11-29 18:49:11 -08:00
Philipp Tomsich	af57a71d18	[RISCV] Don't call setHasMultipleConditionRegisters(), so icmp is sunk On RISC-V, icmp is not sunk (as the following snippet shows) which generates the following suboptimal branch pattern: ``` core_list_find: lh a2, 2(a1) seqz a3, a0 << bltz a2, .LBB0_5 bnez a3, .LBB0_9 << should sink the seqz [...] j .LBB0_9 .LBB0_5: bnez a3, .LBB0_9 << should sink the seqz lh a1, 0(a1) [...] ``` due to an icmp not being sunk. The blocks after `codegenprepare` look as follows: ``` define dso_local %struct.list_head_s* @core_list_find(%struct.list_head_s* readonly %list, %struct.list_data_s* nocapture readonly %info) local_unnamed_addr #0 { entry: %idx = getelementptr inbounds %struct.list_data_s, %struct.list_data_s* %info, i64 0, i32 1 %0 = load i16, i16* %idx, align 2, !tbaa !4 %cmp = icmp sgt i16 %0, -1 %tobool.not37 = icmp eq %struct.list_head_s* %list, null br i1 %cmp, label %while.cond.preheader, label %while.cond9.preheader while.cond9.preheader: ; preds = %entry br i1 %tobool.not37, label %return, label %land.rhs11.lr.ph ``` where the `%tobool.not37` is the result of the icmp that is not sunk. Note that it is computed in the basic-block up until what becomes the `bltz` instruction and the `bnez` is a basic-block of its own. Compare this to what happens on AArch64 (where the icmp is correctly sunk): ``` define dso_local %struct.list_head_s* @core_list_find(%struct.list_head_s* readonly %list, %struct.list_data_s* nocapture readonly %info) local_unnamed_addr #0 { entry: %idx = getelementptr inbounds %struct.list_data_s, %struct.list_data_s* %info, i64 0, i32 1 %0 = load i16, i16* %idx, align 2, !tbaa !6 %cmp = icmp sgt i16 %0, -1 br i1 %cmp, label %while.cond.preheader, label %while.cond9.preheader while.cond9.preheader: ; preds = %entry %1 = icmp eq %struct.list_head_s* %list, null br i1 %1, label %return, label %land.rhs11.lr.ph ``` This is caused by sinkCmpExpression() being skipped, if multiple condition registers are supported. 
Given that the check for multiple condition registers affects only sinkCmpExpression() and shouldNormalizeToSelectSequence(), this change adjusts the RISC-V target as follows: * we no longer signal multiple condition registers (thus changing the behaviour of sinkCmpExpression() back to sinking the icmp) * we override shouldNormalizeToSelectSequence() to always select the preferred normalisation strategy for our backend With both changes, the test results remain unchanged. Note that without the target-specific override to shouldNormalizeToSelectSequence(), there is worse code (more branches) generated for select-and.ll and select-or.ll. The original test case changes as expected: ``` core_list_find: lh a2, 2(a1) bltz a2, .LBB0_5 beqz a0, .LBB0_9 << [...] j .LBB0_9 .LBB0_5: beqz a0, .LBB0_9 << lh a1, 0(a1) [...] ``` Differential Revision: https://reviews.llvm.org/D98932	2021-11-19 08:32:59 -08:00
Zarko Todorovski	5b8bbbecfa	[NFC][llvm] Inclusive language: reword and remove uses of sanity in llvm/lib/Target Reworded and removed code comments that contain `sanity check` and `sanity test`.	2021-11-17 21:59:00 -05:00
Craig Topper	0274be28d7	[RISCV] Lower vector CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF by converting to FP and extracting the exponent. If we have a large enough floating point type that can exactly represent the integer value, we can convert the value to FP and use the exponent to calculate the leading/trailing zeros. The exponent will contain log2 of the value plus the exponent bias. We can then remove the bias and convert from log2 to leading/trailing zeros. This doesn't work for zero since the exponent of zero is zero so we can only do this for CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF. If we need a value for zero we can use a vmseq and a vmerge to handle it. We need to be careful to make sure the floating point type is legal. If it isn't we'll continue using the integer expansion. We could split the vector and concatenate the results but that needs some additional work and evaluation. Differential Revision: https://reviews.llvm.org/D111904	2021-11-17 10:29:41 -08:00
Craig Topper	391b0ba603	[RISCV] Override TargetLowering::hasAndNot for Zbb. Differential Revision: https://reviews.llvm.org/D113937	2021-11-15 18:44:07 -08:00
Craig Topper	ee7a006ce4	[RISCV] Promote f16 ceil/floor/round/roundeven/nearbyint/rint/trunc intrinsics to f32 libcalls. Previously these would crash. I don't think these can be generated directly from C. Not sure if any optimizations can introduce them. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D113527	2021-11-11 08:28:41 -08:00
Craig Topper	4183522e80	[RISCV] Promote f16 frem with Zfh. Add riscv64 coverage for f32 and f64 frem. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D113531	2021-11-10 17:35:07 -08:00
Craig Topper	9ee5cec688	[RISCV] Prevent bad legalizer behavior when bitcasting fixed vectors to i64 on RV32 with Zve32. Similar to D113219, we need to make sure we don't create a vXi64 vector when it isn't legal. This fixes an error found by an expensive checks build.	2021-11-10 11:58:49 -08:00
Craig Topper	57bc7b1089	[RISCV] Prevent crashes when bitcasting between fixed vectors and scalars. Not all scalar element types are allowed in vectors so we may not be able to bitcast to a 1 element vector to use insert/extract. This will become a bigger issue when the Zve extensions are committed. For now, I'm using the ELEN limit to limit the element types. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D113219	2021-11-10 09:21:52 -08:00
Craig Topper	376233113e	[RISCV] Use TargetConstant for CSR number for READ_CSR/WRITE_CSR. This is consistent with what we do for other operands that are required to be constants. I don't think this results in any real changes. The pattern match code for isel treats ConstantSDNode and TargetConstantSDNode the same.	2021-11-08 15:10:24 -08:00
Craig Topper	304edbb553	[RISCV] SMUL_LOHI/UMUL_LOHI should expand for RVV. These and MULHS/MULHU both default to Legal. Targets need to set the ones they don't support to Expand. I think MULHS/MULHU likely has priority in most places so this change probably isn't directly testable. I found it while looking at disabling MULHS/MULHU for nxvXi64 as required for Zve64x. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D113325	2021-11-08 09:38:36 -08:00
Ben Shi	e32cf690df	[RISCV] Optimize (add (mul r, c0), c1) Optimize (add (mul x, c0), c1) -> (add (mul (add x, c1/c0+1), c0), c1%c0-c0), if c1/c0+1 and c1%c0-c0 are simm12, while c1 is not. Optimize (add (mul x, c0), c1) -> (add (mul (add x, c1/c0-1), c0), c1%c0+c0), if c1/c0-1 and c1%c0+c0 are simm12, while c1 is not. Reviewed By: craig.topper, asb Differential Revision: https://reviews.llvm.org/D111141	2021-11-08 02:58:25 +00:00
Shao-Ce SUN	5c3d7184b4	[RISCV] Support Zfhmin extension According to RISC-V Unprivileged ISA 15.6. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D111866	2021-11-06 01:41:02 +08:00
Zakk Chen	0649dfebba	[RISCV] Rename some assembler mnemonics and intrinsic functions for RVV 1.0. Rename vpopc/vmandnot/vmornot to vcpop/vmandn/vmorn assembler mnemonics. Reviewed By: frasercrmck, jrtc27, craig.topper Differential Revision: https://reviews.llvm.org/D111062	2021-11-04 10:08:01 -07:00
Fraser Cormack	d065b03801	[RISCV] Optimize vp.load with an all-ones mask Similar to D110206, this patch optimizes unmasked vp.load intrinsics to avoid the need of a vmset instruction to set the mask. It does so by selecting a riscv_vle intrinsic rather than a riscv_vle_mask intrinsic. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D113022	2021-11-02 17:23:39 +00:00
Craig Topper	ada5458521	[RISCV] Expand scalable vector bswap. Fix crash for bitreverse. Fix LegalizeVectorOps to not try shuffle or unrolling expansions for scalable vectors. Differential Revision: https://reviews.llvm.org/D112236	2021-10-31 10:01:27 -07:00
Craig Topper	1387483e72	[RISCV] Replace most uses of RISCVSubtarget::hasStdExtV. NFCI Add new hasVInstructions() which is currently equivalent. Replace vector uses of hasStdExtZfh/F/D with new vector specific versions. The vector spec no longer requires that the vectors implement the same types as scalar. It only requires that the scalar type is the maximum size the vectors can support. This is currently implemented using the scalar rule we were using before. Add new hasVInstructionsI64() and begin using it to qualify code that requires i64 vector elements. This is all NFC for now, but we can start using this to better implement D112408 which introduces the Zve extensions. Reviewed By: frasercrmck, eopXD Differential Revision: https://reviews.llvm.org/D112496	2021-10-27 19:33:48 -07:00
Craig Topper	2783a5cfaf	[RISCV] Add ICmp and FCmp to shouldSinkOperands.	2021-10-26 22:23:54 -07:00
Craig Topper	d55be79d75	[RISCV] Expand scalable vector CTTZ/CTLZ/CTPOP. Differential Revision: https://reviews.llvm.org/D112233	2021-10-21 10:50:04 -07:00
Craig Topper	c4803bd416	[RISCV] Handle vector of pointer in getTgtMemIntrinsic for strided load/store. getScalarSizeInBits() doesn't work if the scalar type is a pointer. For that we need to go through DataLayout.	2021-10-07 10:11:56 -07:00
Craig Topper	a2a07e8db3	[RISCV] Fold store of vmv.x.s to a vse with VL=1. This can avoid a loss of decoupling with the scalar unit on cores with decoupled scalar and vector units. We should support FP too, but those use extract_element and not a custom ISD node so it is a little different. I also left a FIXME in the test for i64 extract and store on RV32. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D109482	2021-09-27 09:54:46 -07:00
Craig Topper	933182e948	[RISCV] Improve support for forming widening multiplies when one input is a scalar splat. If one input of a fixed vector multiply is a sign/zero extend and the other operand is a splat of a scalar, we can use a widening multiply if the scalar value has sufficient sign/zero bits. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D110028	2021-09-27 09:37:07 -07:00
Fraser Cormack	d48f6df1f8	[RISCV] Create the correct mask type when lowering EXTRACT_VECTOR_ELT This particular case was creating a `VMSET_VL` using the old fixed-length type in order to pass a mask to other custom nodes operating on the scalable container type. This kind of thing wasn't caught for us; I only noticed when experimenting with odd-length vectors, where it was trying to generate an invalid `v3i1` MVT. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D110420	2021-09-27 09:43:40 +01:00
Hsiangkai Wang	7d39a8a921	[RISCV] (1/2) Add the tail policy argument to builtins/intrinsics. Add the tail policy argument to LLVM IR intrinsics. There are two policies for tail elements. Tail agnostic means users do not care about the values in the tail elements and tail undisturbed means the values in the tail elements need to be kept after the operation. In order to let users control the tail policy, we add an additional argument at the end of the argument list. For unmasked operations, we have no maskedoff and the tail policy is always tail agnostic. If users want to keep tail elements under unmasked operations, they could use all one mask in the masked operations to do it. So, we only add the additional argument for masked operations for most cases. There are exceptions listed below. In this patch, we do not handle the following cases to reduce the complexity of the patch. There could be two separate patches for them. * Use dest argument to control tail policy vmerge.vvm/vmerge.vxm/vmerge.vim (add _t builtins with additional dest argument) vfmerge.vfm (add _t builtins with additional dest argument) vmv.v.v (add _t builtins with additional dest argument) vmv.v.x (add _t builtins with additional dest argument) vmv.v.i (add _t builtins with additional dest argument) vfmv.v.f (add _t builtins with additional dest argument) vadc.vvm/vadc.vxm/vadc.vim (add _t builtins with additional dest argument) vsbc.vvm/vsbc.vxm (add _t builtins with additional dest argument) * Always has tail argument for masked/unmasked intrinsics Vector Single-Width Integer Multiply-Add Instructions (add _t and _mt builtins) Vector Widening Integer Multiply-Add Instructions (add _t and _mt builtins) Vector Single-Width Floating-Point Fused Multiply-Add Instructions (add _t and _mt builtins) Vector Widening Floating-Point Fused Multiply-Add Instructions (add _t and _mt builtins) Vector Reduction Operations (add _t and _mt builtins) Vector Slideup Instructions (add _t and _mt builtins) Vector 
Slidedown Instructions (add _t and _mt builtins) Discussion: https://github.com/riscv/rvv-intrinsic-doc/pull/101 Differential Revision: https://reviews.llvm.org/D105092	2021-09-24 17:09:50 +08:00
Craig Topper	40b230f685	[RISCV] Limit transformAddImmMulImm to prevent an infinite loop. This fixes an issue reported in D108607.	2021-09-23 15:53:11 -07:00
Fraser Cormack	e7c879a69d	[RISCV][VP] Add support for VP_REDUCE_* operations This patch adds codegen support for lowering the vector-predicated reduction intrinsics to RVV instructions. The process is similar to that of the other reduction intrinsics, save for the fact that every VP reduction has a start value. We reuse the existing custom "VL" nodes, adding extra patterns where required to handle non-true masks. To support these nodes, the `RISCVISD::VECREDUCE_*_VL` nodes have been given an explicit "merge" operand. This is to facilitate the VP reductions, where we must be careful to ensure that even if no operation is performed (when VL=0) we still produce the start value. The RVV reductions don't update the destination register under these conditions, so we tie the splatted start value to the output register. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D107657	2021-09-23 11:11:05 +01:00
Craig Topper	b33a1cc05b	[RISCV] Optimize vp.store with an all ones mask to avoid a vmset. We can use riscv_vse intrinsic instead of riscv_vse_mask. The code here is based on similar code for handling masked.scatter and vp.scatter. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D110206	2021-09-22 09:12:47 -07:00
Craig Topper	7c975665b4	[RISCV] Make some arrays of constants 'static const'. NFC This helps the compiler generate better code.	2021-09-21 10:52:47 -07:00
Craig Topper	aeb63d464f	[RISCV] Teach RISCVTargetLowering::shouldSinkOperands to sink splats for and/or/xor. This requires a minor change to CodeGenPrepare to ensure that shouldSinkOperands will be called for And. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D110106	2021-09-21 10:07:29 -07:00
Ben Shi	b3052013b4	[RISCV] Optimize (add (mul x, c0), c1) Optimize (add (mul x, c0), c1) -> (ADDI (MUL (ADDI, c1/c0), c0), c1%c0), if c1/c0 and c1%c0 are simm12, while c1 is not. Optimize (add (mul x, c0), c1) -> (MUL (ADDI, c1/c0), c0), if c1%c0 is zero, and c1/c0 is simm12 while c1 is not. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D108607	2021-09-21 14:13:14 +00:00
Craig Topper	a95ba81073	[RISCV] Teach RISCVTargetLowering::shouldSinkOperands to sink splats for FMA. If either of the multiplicands is a splat, we can sink it to use vfmacc.vf or similar.	2021-09-20 11:49:50 -07:00
Craig Topper	04ab6c85ef	[RISCV] Teach RISCVTargetLowering::shouldSinkOperands to sink splats for FAdd/FSub/FMul/FDiv.	2021-09-20 10:25:46 -07:00
Craig Topper	d85e347a28	[RISCV] Add a pass to recognize VLS strided loads/store from gather/scatter. For strided accesses the loop vectorizer seems to prefer creating a vector induction variable with a start value of the form <i32 0, i32 1, i32 2, ...>. This value will be incremented each loop iteration by a splat constant equal to the length of the vector. Within the loop, arithmetic using splat values will be done on this vector induction variable to produce indices for a vector GEP. This pass attempts to dig through the arithmetic back to the phi to create a new scalar induction variable and a stride. We push all of the arithmetic out of the loop by folding it into the start, step, and stride values. Then we create a scalar GEP to use as the base pointer for a strided load or store using the computed stride. Loop strength reduce will run after this pass and can do some cleanups to the scalar GEP and induction variable. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D107790	2021-09-20 09:39:44 -07:00
Ben Shi	dee5a8ca32	[RISCV] Optimize (add (shl x, c0), (shl y, c1)) with SHADD Optimize (add (shl x, c0), (shl y, c1)) -> (SLLI (SHADD x, y), c1), if c0-c1 == 1/2/3. Reviewed By: craig.topper, luismarques Differential Revision: https://reviews.llvm.org/D108916	2021-09-19 16:35:12 +08:00
Craig Topper	1b736bda3b	[RISCV] Enable CGP to sink splat operands of Add/Sub/Mul/Shl/LShr/AShr LICM may have pulled out a splat, but with .vx instructions we can fold it into an operation. This patch enables CGP to reverse the LICM transform and move the splat back into the loop. I've started with the commutable integer operations and shifts, but we can extend this with more operations in future patches. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D109394	2021-09-10 09:04:01 -07:00
Craig Topper	a574f0e0c3	[RISCV] Disable use of i128 shift libcalls on RV32. Since i128 isn't a legal C type on RV32, I don't believe libgcc implements these functions for RV32. compiler-rt does implement them because i128 support is enabled in order to handle long double. This is consistent with 32-bit X86 and ARM. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D109383	2021-09-08 14:26:07 -07:00
Kazu Hirata	5c6338de16	[RISCV] Fix "set but not used" warnings	2021-09-07 09:19:31 -07:00
Fraser Cormack	a823bdf3ab	[RISCV][VP] Custom lower VP_STORE and VP_LOAD This patch adds support for the vector-predicated `VP_STORE` and `VP_LOAD` nodes. We do this in the same way we lower `MSTORE` and `MLOAD`: to regular load/store instructions via intrinsics. One necessary change was made to `SelectionDAGLegalize` so that `VP_STORE` nodes' operation actions are taken from the stored "value" operands, in the same vein as `STORE` or `MSTORE`. Reviewed By: craig.topper, rogfer01 Differential Revision: https://reviews.llvm.org/D108999	2021-09-07 10:53:25 +01:00
Fraser Cormack	f4dee8cb82	[RISCV][VP] Custom lower VP_SCATTER and VP_GATHER This patch adds support for the `VP_SCATTER` and `VP_GATHER` nodes by lowering them to RVV's `vsox`/`vlux` instructions, respectively. This process is almost identical to the existing `MSCATTER`/`MGATHER` support. One extra change was made to `SelectionDAGLegalize` so that `VP_SCATTER`'s operation action is derived from its stored "value" operand rather than its return type (which is always the chain). Reviewed By: craig.topper, rogfer01 Differential Revision: https://reviews.llvm.org/D108987	2021-09-07 10:43:07 +01:00
Craig Topper	75620fadf5	[RISCV] Change how we encode AVL operands in vector pseudoinstructions to use GPRNoX0. This patch changes the register class to avoid accidentally setting the AVL operand to X0 through MachineIR optimizations. There are cases where we really want to use X0, but we can't get that past the MachineVerifier with the register class as GPRNoX0. So I've used a 64-bit -1 as a sentinel for X0. All other immediate values should be uimm5. I convert it to X0 at the earliest possible point in the VSETVLI insertion pass to avoid touching the rest of the algorithm. In SelectionDAG lowering I'm using a -1 TargetConstant to hide it from instruction selection and treat it differently than if the user used -1. A user -1 should be selected to a register since it doesn't fit in uimm5. This is the rest of the changes started in D109110. As mentioned there, I don't have a failing test from MachineIR optimizations anymore. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D109116	2021-09-03 09:19:25 -07:00
Craig Topper	ccbb4c8b4f	[RISCV] Fold (RISCVISD::SELECT_CC X, Y, CC, Z, Z) -> Z. If the true and false values are the same, we don't need a SELECT_CC. This would normally be folded before a select is legalized to select_cc. The test case exploits the late legalization of vscale to trigger a case where they become identical after legalization. This works around an issue found on a test case in D107957. In that case the true/false values were both eventually 0 and the select was used by a vector AVL operand. The select_cc got expanded to control flow and a phi, but the phi inputs were both copies from X0. MachineIR optimizations simplified this to a single copy from X0 going into the vector instruction. This became the input of a vsetvli after vsetvli insertion. Then register coalescing folded the copy into the vsetvli. X0 as the source of a vsetvli is a special encoding and should not be created by coalescing. We need to fix our vsetvli handling to make sure this can never happen any other way, but removing the unneeded select is still a worthwhile optimization.	2021-09-01 12:37:52 -07:00
Nick Desaulniers	e9b3f25730	[RISCVISelLowering] avoid emitting libcalls to __mulodi4() and __multi3() Similar to D108842, D108844, D108926, D108928, and D108936. __has_builtin(builtin_mul_overflow) returns true for 32b RISCV targets, but Clang is deferring to compiler RT when encountering long long types. If the semantics of __has_builtin mean "the compiler resolves these, always" then we shouldn't conditionally emit a libcall. Link: https://bugs.llvm.org/show_bug.cgi?id=28629 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D108939	2021-08-31 11:23:56 -07:00
Craig Topper	0560a4adb3	[RISCV] Enable CONCAT_VECTORS for fixed FP vectors. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D108487	2021-08-30 08:47:45 -07:00
Craig Topper	0eeab8b282	[RISCV] Add -riscv-v-fixed-length-vector-elen-max to limit the ELEN used for fixed length vectorization. This adds an ELEN limit for fixed length vectors. This will scalarize any elements larger than this. It will also disable some fractional LMULs. For example, if ELEN=32 then mf8 becomes illegal, i32/f32 vectors can't use any fractional LMULs, i16/f16 can only use mf2, and i8 can use mf2 and mf4. We may also need something for the scalable vectors, but that has interactions with the intrinsics and we can't scalarize a scalable vector. Longer term this should come from one of the Zve* features	2021-08-27 10:17:35 -07:00
Craig Topper	1b9417454e	[RISCV] Insert a sext_inreg when type legalizing i32 shl by constant on RV64. Similar to what we do for add/sub/mul. This can help remove some sext.w. There are some regressions on some bswap tests, but I have an idea how to fix that for a follow up. A new PACKW pattern is added to handle the new sext_inreg placement. Differential Revision: https://reviews.llvm.org/D108663	2021-08-26 10:20:19 -07:00
Ben Shi	f69fb7ac72	[DAGCombiner] Add target hook function to decide folding (mul (add x, c1), c2) Reviewed by: lebedev.ri, spatel, craig.topper, luismarques, jrtc27 Differential Revision: https://reviews.llvm.org/D107711	2021-08-22 16:53:32 +08:00
Craig Topper	36d8316cc8	[RISCV] Reduce duplicate code for calling SimplifyDemandedBits. This encapsulates the APInt creation and worklist management into a helper function. To keep one common interface I've used Log2_32 in places that previously created a mask by subtracting 1 from a power of 2. Differential Revision: https://reviews.llvm.org/D108324	2021-08-19 07:09:38 -07:00
Craig Topper	6d7ea597ef	[RISCV] Insert sext_inreg when type legalizing add/sub/mul with constant LHS. We already do this for non-constants RHS. This just removes the special case. I believe the special case may have been needed because the ANY_EXTEND of a constant used to create zero extended constants, but we recently changed that to produce sign extended constants. D107658 is needed to prevent some regressions. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D107697	2021-08-18 10:44:25 -07:00
Craig Topper	d63f117210	[RISCV] Support RISCVISD::SELECT_CC in ComputeNumSignBitsForTargetNode.	2021-08-13 18:00:09 -07:00
Craig Topper	6f5edc3487	[RISCV] Fold (add (select lhs, rhs, cc, 0, y), x) -> (select lhs, rhs, cc, x, (add x, y)) Similar for sub except sub isn't commutative. Modify the existing and/or/xor folds to also work on ISD::SELECT and not just RISCVISD::SELECT_CC. This is needed to make sure we do this transform before type legalization turns i32 add/sub into add/sub+sign_extend_inreg on RV64. If we don't do this before that, the sign_extend_inreg will still be after the select. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D107603	2021-08-10 09:02:56 -07:00
Fraser Cormack	2b4a1d4b86	[RISCV] Improve codegen for shuffles with LHS/RHS splats Shuffles which are broken into separate halves reveal splats in which a half is accessed via one index; such operations can be optimized to use "vrgather.vi". This optimization could be achieved by adding extra patterns to match `vrgather_vv_vl` which uses a splat as an index operand, but this patch instead identifies splat earlier. This way, future optimizations can build on top of the data gathered here, e.g., to splat-gather dominant indices and insert any leftovers. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D107449	2021-08-09 10:31:40 +01:00
Craig Topper	2f3b738960	[RISCV] Add optimizations for FMV_X_ANYEXTH similar to FMV_X_ANYEXTW_RV64. This enables the fneg and fabs combines we have for FMV_X_ANYEXTW_RV64.	2021-08-08 18:30:48 -07:00
Craig Topper	88bc29f5f2	[RISCV] Introduce a RISCV CondCode enum instead of using ISD:SET* in MIR. NFC Previously we converted ISD condition codes to integers and stored them directly in our MIR instructions. The ISD enum kind of belongs to SelectionDAG so that seems like incorrect layering. This patch instead uses a CondCode node on RISCV::SELECT_CC until isel and then converts it from ISD encoding to a RISCV specific value. This value can be converted to/from the RISCV branch opcodes in the RISCV namespace. My larger motivation is to possibly support a microarchitectural feature of some CPUs where a short forward branch over a single instruction can be predicated internally. This will require a new pseudo instruction for select that needs to carry a branch condition and live probably until RISCVExpandPseudos. At that point it can be expanded to control flow without other instructions ending up in the predicated basic block. Using an ISD encoding in RISCVExpandPseudos doesn't seem like correct layering. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D107400	2021-08-08 17:25:37 -07:00
Craig Topper	d4ee84ceee	[RISCV] Support FP_TO_S/UINT_SAT for i32 and i64. The fcvt fp to integer instructions saturate if their input is infinity or out of range, but the instructions produce a maximum integer for nan instead of 0 required for the ISD opcodes. This means we can use the instructions to do the saturating conversion, but we'll need to fix up the nan case at the end. We can probably improve the i8 and i16 default codegen as well, but I'll leave that for a follow up. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D107230	2021-08-07 16:06:00 -07:00
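The NaN fix-up described in d4ee84ceee can be sketched in Python for the i32 signed case. The function names are hypothetical, and `fcvt_w_rtz` is only a model of the behavior stated in the commit message (saturate on infinity or out-of-range input, maximum integer for NaN), not the actual lowering code.

```python
import math

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)

def fcvt_w_rtz(x: float) -> int:
    """Model of the hardware conversion: saturating, but NaN yields INT32_MAX."""
    if math.isnan(x):
        return INT32_MAX
    if x >= 2**31:       # also covers +inf
        return INT32_MAX
    if x <= -(2**31):    # also covers -inf
        return INT32_MIN
    return math.trunc(x)  # round toward zero

def fp_to_sint_sat_i32(x: float) -> int:
    """FP_TO_SINT_SAT semantics: same as the instruction, except NaN maps to 0."""
    return 0 if math.isnan(x) else fcvt_w_rtz(x)
```

Only the NaN input differs between the two functions, which is why a single fix-up after the conversion is enough.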
Fraser Cormack	cba6aab971	[RISCV] Support simple fractional steps in matching VID sequences This patch extends the optimization of VID-sequence BUILD_VECTORs introduced in D104921 to include simple fractional steps composed of a separated integer numerator and denominator. A notable limitation in this sequence detection is that only sequences with steps N/1 or 1/D are found, meaning that the step between elements and the frequency with which it changes is consistent across the whole sequence. Fractional steps such as 2/3 won't be matched as those would involve more complex tracking of state or some level of backtracking. As it stands, however, this patch is sufficient to match common interleave-type shuffle indices, for example matching `<0,0,1,1>` (or commonly `<0,u,1,u>` or `<u,0,u,1>`) to an index sequence divided by 2. While the optimization is relatively `undef`-tolerant, due to greedy pattern-matching there are even some simple patterns which confuse the sequence detection into identifying either a suboptimal sequence or no sequence at all. Currently only fractional-step sequences identified as having a power-of-two denominator are actually lowered to RVV instructions. This is to avoid introducing divisions into the generated code. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D106533	2021-08-03 10:38:24 +01:00
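To illustrate what "an index sequence divided by 2" means in cba6aab971, the Python sketch below checks a candidate vector against `vid >> log2(D)`. It is a plain checker with hypothetical names, not the patch's greedy detection algorithm; `None` stands in for `undef`, and only power-of-two denominators are modelled.

```python
def vid_div(n, denom):
    """Index sequence 0..n-1 divided by a power-of-two denominator (a right shift)."""
    shift = denom.bit_length() - 1
    assert denom == 1 << shift, "only power-of-two denominators are modelled"
    return [i >> shift for i in range(n)]

def matches(seq, denom):
    """True if every defined element agrees with vid/denom; None means undef."""
    expected = vid_div(len(seq), denom)
    return all(s is None or s == e for s, e in zip(seq, expected))
```

Under this model `<0,0,1,1>`, `<0,u,1,u>` and `<u,0,u,1>` all match a denominator of 2, because the undef lanes are free to take whatever value the shifted index sequence produces.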
Hsiangkai Wang	8b33839f01	[RISCV] Rename vector inline constraint from 'v' to 'vr' and 'vm' in IR. Differential Revision: https://reviews.llvm.org/D107139	2021-08-01 05:58:17 +08:00
Craig Topper	593059b328	[RISCV] Rename RISCVISD::FCVT_W_RV64 to FCVT_W_RTZ_RV64. NFC fcvt.w(u) supports multiple rounding modes, but the ISD node doesn't encode that. So name it to match the rounding mode it uses.	2021-07-31 11:14:59 -07:00
Fraser Cormack	02dd4b59bc	[RISCV] Optimize floating-point "dominant value" BUILD_VECTORs This patch aims to improve the performance of BUILD_VECTORs which are identified as containing a dominant element. Given that most floating-point constants themselves require a load from the constant pool, it was possible for the optimization to actually increase the number of individual loads on small vectors. The exception is the zero constant -- +0.0 -- which can be materialized efficiently. While this optimization could do with a proper cost model to weigh the benefits of a single vector load vs. the manipulation of individual elements -- even for integer vectors which often require several instructions to materialize -- without a concrete RVV implementation to work with any heuristic is likely to be both more obtuse and inaccurate. Until then, this patch fixes at least one known obvious deficiency. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D106963	2021-07-29 09:22:34 +01:00
Ben Shi	264b8e2a20	[RISCV] Optimize mul in the zba extension with SH*ADD This patch makes the following optimization, if the immediate multiplier is not a simm12. (mul x, (power_of_2 + 2)) => (SH1ADD x, (SLLI x, bits)) (mul x, (power_of_2 + 4)) => (SH2ADD x, (SLLI x, bits)) (mul x, (power_of_2 + 8)) => (SH3ADD x, (SLLI x, bits)) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D106648	2021-07-29 09:46:41 +08:00
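The arithmetic behind the SH*ADD multiply rewrite in 264b8e2a20 can be checked with a short Python model. This is a sketch under assumptions: the helper names are made up, registers are taken to be 64 bits wide, and `shNadd rs1, rs2` is modelled as `rs2 + (rs1 << N)`.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit registers (RV64)

def slli(x, shamt):
    """Shift left logical by an immediate, wrapped to 64 bits."""
    return (x << shamt) & MASK64

def shadd(n, rs1, rs2):
    """Model of Zba shNadd: rs2 + (rs1 << n), for n in 1..3."""
    assert n in (1, 2, 3)
    return (rs2 + (rs1 << n)) & MASK64

def mul_by_pow2_plus(x, bits, n):
    """x * (2**bits + 2**n), computed as (SHnADD x, (SLLI x, bits))."""
    return shadd(n, x, slli(x, bits))
```

For example, 4098 = 4096 + 2 does not fit in a simm12, and `mul_by_pow2_plus(x, 12, 1)` yields `x * 4098` modulo 2^64 with two instructions in this model.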
Craig Topper	3106f85945	[RISCV] Fix grammar in a comment. NFC	2021-07-28 09:09:26 -07:00
Craig Topper	54588bcc05	[RISCV] Restrict performANY_EXTENDCombine to prevent an infinite loop. The sign_extend we insert here can get turned into a zero_extend if the sign bit is known zero. This can enable a setcc combine that shrinks compares with zero_extend. This reduces the use count of the zero_extend allowing other combines to turn it back into an any_extend. This restricts the combine to only cases where the result is used by a CopyToReg. This works for my original motivating case. I hope the CopyToReg use will prevent any converted extends from turning back into an any_extend. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D106754	2021-07-28 09:05:45 -07:00
Fraser Cormack	172487fe4c	[RISCV] Add support for vector saturating add/sub operations This patch adds support for lowering the saturating vector add/sub intrinsics to RVV instructions, for both fixed-length and scalable-vector forms alike. Note that some of the DAG combines are still not triggering for the scalable-vector tests. These require a bit more work in the DAGCombiner itself. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D106651	2021-07-27 10:04:14 +01:00
Craig Topper	c63dbd8501	[RISCV] Custom lower (i32 (fptoui/fptosi X)). I stumbled onto a case where our (sext_inreg (assertzexti32 (fptoui X)), i32) isel pattern can cause an fcvt.wu and fcvt.lu to be emitted if the assertzexti32 has an additional user. If we add a one use check it would just cause a fcvt.lu followed by a sext.w when we only need a fcvt.wu to satisfy both users. To mitigate this I've added custom isel and new ISD opcodes for fcvt.wu. This allows us to know it started life as a conversion to i32 without needing to match multiple nodes. ComputeNumSignBits has been taught that this new node produces 33 sign bits. To prevent regressions when we need to zero extend the result of an (i32 (fptoui X)), I've added a DAG combine to convert it to an (i64 (fptoui X)) before type legalization. In most cases this would happen in InstCombine, but a zero_extend can be created for function returns or arguments. To keep everything consistent I've added new nodes for fptosi as well. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D106346	2021-07-24 10:50:43 -07:00
Fraser Cormack	b115c038d2	[RISCV] Fix a crash when lowering split float arguments Lowering certain float vectors without legal vector types could cause a crash due to a bad interaction between passing floats via GPRs and argument splitting. Split vector floats appear just like scalar floats. Under certain situations we choose to pass these float arguments via GPRs and use an XLenVT location and set the 'BCvt' info to track how they must be converted back to floating-point values. However, later logic for handling split arguments may take over, in which case we lose the previous information and set the 'Indirect' info, thus incorrectly lowering to integer types. I don't believe that we would have come across the notion of split floating-point arguments before. This patch addresses the issue by updating the lowering so that split arguments are only passed indirectly when they are scalar integer types. This has some change to how we lower some larger illegal float vectors, as can be seen in 'fastcc-float.ll' where the vector is now passed partly in registers and partly on the stack. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D102852	2021-07-22 09:55:26 +01:00
Fraser Cormack	7b3a69bc16	[RISCV] Lower more BUILD_VECTOR sequences to RVV's VID This relands `a6ca88e908` which was originally reverted due to overflow bugs in `e3fa2b1eab`. This patch teaches the compiler to identify a wider variety of `BUILD_VECTOR`s which form integer arithmetic sequences, and to lower them to `vid.v` with modifications for non-unit steps and non-zero addends. The sequences handled by this optimization must either be monotonically increasing or decreasing. Consecutive elements holding the same value indicate a fractional step which, while simple mathematically, becomes more complex to handle both in the realm of lossy integer division and in the presence of `undef`s. For example, a common "interleaving" shuffle index will be lowered by LLVM to both `<0,u,1,u,2,...>` and `<u,0,u,1,u,...>` `BUILD_VECTOR` nodes. Either of these would ideally be lowered to `vid.v` shifted right by 1. Detection of this sequence in presence of general `undef` values is more complicated, however: `<0,u,u,1,>` could match either `<0,0,0,1,>` or `<0,0,1,1,>` depending on later values in the sequence. Both are possible, so backtracking or multiple passes is inevitable. Sticking to monotonic sequences keeps the logic simpler as it can be done in one pass. Fractional steps will likely be a separate optimization in a future patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104921	2021-07-22 09:36:12 +01:00
Eli Friedman	0ca46a1757	[SelectionDAG] Fix the representation of ISD::STEP_VECTOR. The existing rule about the operand type is strange. Instead, just say the operand is a TargetConstant with the right width. (Legalization ignores TargetConstants, so it doesn't matter if that width is legal.) Highlights: 1. I had to substantially rewrite the AArch64 isel patterns to expect a TargetConstant. Nothing too exotic, but maybe a little hairy. Maybe worth considering a target-specific node with some dagcombines instead of this complicated nest of isel patterns. 2. Our behavior on RV32 for vectors of i64 has changed slightly. In particular, we correctly preserve the width of the arithmetic through legalization. This changes the DAG a bit. Maybe room for improvement here. 3. I explicitly defined the behavior around overflow. This is necessary to make the DAGCombine transforms legal, and I don't think it causes any practical issues. Differential Revision: https://reviews.llvm.org/D105673	2021-07-21 10:58:40 -07:00
Craig Topper	81efb82570	[RISCV] Teach RISCVMatInt about cases where it can use LUI+SLLI to replace LUI+ADDI+SLLI for large constants. If we need to shift left anyway we might be able to take advantage of LUI implicitly shifting its immediate left by 12 to cover part of the shift. This allows us to use more bits of the LUI immediate to avoid an ADDI. isDesirableToCommuteWithShift now considers compressed instruction opportunities when deciding if commuting should be allowed. I believe this is the same or similar to one of the optimizations from D79492. Reviewed By: luismarques, arcbbb Differential Revision: https://reviews.llvm.org/D105417	2021-07-20 09:22:06 -07:00
Craig Topper	84877a098a	[RISCV] Use unordered indexed loads for MGATHER. I don't think the semantics of the llvm masked gather intrinsic care about the order the elements are loaded. For example, type legalization by splitting will chain them in parallel. This is different than scatter which we do chain in order. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D106025	2021-07-20 08:46:02 -07:00
Craig Topper	50302feb1d	[SelectionDAG][RISCV] Use isSExtCheaperThanZExt to control whether sext or zext is used for constant folding any_extend. RISCV would prefer a sign extended constant since that works better with our constant materialization. We have an existing TLI hook we use to control sign extension of setcc operands in type legalization. That hook happens to do the right check we need here, but might be straying from its original purpose. With only RISCV defining this hook in tree, I wasn't sure if it was worth adding another hook with identical behavior. This is an alternative to D105785 where I tried to handle this in the RISCV backend by not creating ANY_EXTENDs in some places. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D105918	2021-07-19 09:25:28 -07:00
Craig Topper	d0f8047d37	[RISCV] Teach computeKnownBitsForTargetNode that VLENB will never be more than 65536/8.	2021-07-17 11:24:20 -07:00
Craig Topper	173332d175	[RISCV] Manually emit the best shift for VSCALE lowering to improve codegen. We assume VLENB is a multiple of 8 and previously relied on shift pairs being optimized to an AND+SHL/SHR and computeKnownBits removing the AND. This doesn't happen if (vlenb >> 3) gets CSEd to have multiple uses. This patch manually emits the best shift to workaround this.	2021-07-17 00:52:07 -07:00
Craig Topper	4dbb788068	[RISCV] Teach constant materialization that it can use zext.w at the end with Zba to reduce number of instructions. If the upper 32 bits are zero and bit 31 is set, we might be able to use zext.w to fill in the zeros after using an lui and/or addi. Most of this patch is plumbing the subtarget features into the constant materialization. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D105509	2021-07-16 09:35:56 -07:00
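The final step of the zext.w trick in 4dbb788068 can be modelled in a few lines of Python. This is a sketch with hypothetical helper names: it does not model the lui/addi sequence itself, only the assumption that a sign-extended 32-bit value is what that sequence leaves in a 64-bit register.

```python
def sext32(v):
    """Sign-extend the low 32 bits of v to a 64-bit register value."""
    v &= 0xFFFFFFFF
    return (v | 0xFFFFFFFF00000000) if (v & 0x80000000) else v

def zext_w(v):
    """Model of zext.w: keep the low 32 bits, clear the upper 32."""
    return v & 0xFFFFFFFF

def can_use_zext_w(c):
    """Condition from the commit message: upper 32 bits zero and bit 31 set."""
    return (c >> 32) == 0 and ((c >> 31) & 1) == 1
```

For a constant such as `0x80000000`, the sign-extended form `0xFFFFFFFF80000000` is what a 32-bit materialization would produce, and `zext_w` recovers the wanted value by clearing the upper half.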
Craig Topper	0ce13f92b7	[RISCV] Add curly braces around a case body that declares variables. NFC This is at the end of the switch so doesn't cause any issues now, but if a new case is added it will break.	2021-07-16 09:35:56 -07:00
Fraser Cormack	e3fa2b1eab	Revert "[RISCV] Lower more BUILD_VECTOR sequences to RVV's VID" This reverts commit `a6ca88e908`. More caution is required to avoid overflow/underflow. Thanks to the sanitizers for catching this.	2021-07-16 15:00:20 +01:00
Fraser Cormack	a6ca88e908	[RISCV] Lower more BUILD_VECTOR sequences to RVV's VID This patch teaches the compiler to identify a wider variety of `BUILD_VECTOR`s which form integer arithmetic sequences, and to lower them to `vid.v` with modifications for non-unit steps and non-zero addends. The sequences handled by this optimization must either be monotonically increasing or decreasing. Consecutive elements holding the same value indicate a fractional step which, while simple mathematically, becomes more complex to handle both in the realm of lossy integer division and in the presence of `undef`s. For example, a common "interleaving" shuffle index will be lowered by LLVM to both `<0,u,1,u,2,...>` and `<u,0,u,1,u,...>` `BUILD_VECTOR` nodes. Either of these would ideally be lowered to `vid.v` shifted right by 1. Detection of this sequence in presence of general `undef` values is more complicated, however: `<0,u,u,1,>` could match either `<0,0,0,1,>` or `<0,0,1,1,>` depending on later values in the sequence. Both are possible, so backtracking or multiple passes is inevitable. Sticking to monotonic sequences keeps the logic simpler as it can be done in one pass. Fractional steps will likely be a separate optimization in a future patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104921	2021-07-16 10:35:13 +01:00
Fraser Cormack	03a4702c88	[RISCV] Fix the neutral element in vector 'fadd' reductions Using positive zero as the neutral element in 'fadd' reductions, while it generates better code, is incorrect. The correct neutral element is negative zero: 0.0 + -0.0 = 0.0, whereas -0.0 + -0.0 = -0.0. There are perhaps more optimal lowerings of negative zero avoiding constant-pool loads which could be left as future work. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D105902	2021-07-14 10:18:38 +01:00
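The signed-zero arithmetic quoted in 03a4702c88 can be reproduced with ordinary IEEE 754 doubles in Python. The reduction helper below is a hypothetical stand-in for a vector `fadd` reduction, used only to show why the start value matters.

```python
import math
from functools import reduce

def fadd_reduce(values, start):
    """Sequential floating-point add reduction seeded with a start value."""
    return reduce(lambda a, b: a + b, values, start)

def is_negative_zero(x):
    """True only for -0.0 (x == 0.0 cannot distinguish the sign of zero)."""
    return x == 0.0 and math.copysign(1.0, x) < 0
```

Seeding the reduction of `[-0.0]` with `+0.0` gives `+0.0`, which loses the sign of the only element, while seeding it with `-0.0` gives `-0.0` and leaves every other input unchanged.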
Craig Topper	1e670dc7d7	[RISCV] Use DIVUW/REMUW/DIVW instructions for i8/i16/i32 udiv/urem/sdiv when LHS is constant. We don't really have optimizations for division with a constant LHS. If we don't use a W instruction we end up needing to sign or zero extend the RHS to use the 64-bit instruction. I had to sign_extend i32 constants on the LHS instead of using any_extend which becomes zero_extend. If we don't do this, constants that were originally negative become harder to materialize. I think this problem exists for more of our W instruction cases. For example (i32 (shl -1, X)), but we don't have lit tests. I'll work on that as a follow up. I also left a FIXME for enabling W instruction for RHS constants under -Oz. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D105769	2021-07-13 10:33:57 -07:00
Fangrui Song	3d89fb4d13	[RISCV] Support machine constraint "S" Similar to D46745, "S" represents an absolute symbolic operand, which can be used to specify the access models, e.g. extern int var; void *addr_via_asm() { void *ret; asm("lui %0, %%hi(%1)\naddi %0,%0,%%lo(%1)" : "=r"(ret) : "S"(&var)); return ret; } 'S' is documented in trunk GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101275 Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D105254	2021-07-13 09:30:09 -07:00
Fraser Cormack	d991b7212b	[RISCV] Pass undef VECTOR_SHUFFLE indices on to BUILD_VECTOR Often when lowering vector shuffles, we split the shuffle into two LHS/RHS shuffles which are then blended together. To do so we split the original indices into two, indexed into each respective vector. These two index vectors are then separately lowered as BUILD_VECTORs. This patch forwards on any undef indices to the BUILD_VECTOR, rather than having the VECTOR_SHUFFLE lowering decide on an optimal concrete index. The motivation for this change is so that we don't duplicate optimization logic between the two lowering methods and let BUILD_VECTOR do what it does best. Propagating undef in this way allows us, for example, to generate `vid.v` to produce the LHS indices of commonly-used interleave-type shuffles. I have designs on further optimizing interleave-type and other common shuffle patterns in the near future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104789	2021-07-13 10:41:54 +01:00
Craig Topper	12d51f95fe	[RISCV] Implement lround/llround/lrint/llrint with fcvt instruction with -fno-math-errno These are fp->int conversions using either RMM or dynamic rounding modes. The lround and lrint opcodes have a return type of either i32 or i64 depending on sizeof(long) in the frontend which should follow xlen. llround/llrint should always return i64 so we'll need a libcall for those on rv32. The frontend will only emit the intrinsics if -fno-math-errno is in effect otherwise a libcall will be emitted which will not use these ISD opcodes. gcc also does this optimization. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D105206	2021-07-06 11:43:22 -07:00
Craig Topper	2b5e53111a	[RISCV] Add support for matching vwmul(u) and vwmacc(u) from fixed vectors. This adds a DAG combine to detect sext/zext inputs and emit a new ISD opcode. The extends will either be removed or replaced with narrower extends. Isel patterns are used to match add and widening mul to vwmacc similar to the recently added vmacc patterns. There's still some work to be done to match vmulsu. We should also rewrite splats that were extended as scalars and then splatted. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D104802	2021-07-06 10:24:31 -07:00
Craig Topper	3b6dfa381e	[RISCV] Protect the SHL/SRA/SRL handlers in LowerOperation against being called for an illegal i32 shift amount. It seems it is possible for DAG combine to create a shl with an i64 result type and an i32 shift amount. This is ok before type legalization since the types don't need to match in SelectionDAG. This results in type legalization calling LowerOperation to legalize just the amount. We weren't expecting this so we asserted for not finding a fixed vector shift. To fix this, I've added a check for the fixed vector case and returned SDValue() to get the default type legalizer. I've factored all shifts together and added a fixed vector specific handler to avoid repeating similar code for each in LowerOperation. The particular case I found was exposed by D104581, but the bad shift is created after that patch triggers.	2021-06-29 09:45:13 -07:00
Jim Lin	779d2b0a42	[RISCV][NFC] Combine the control flow for different RetOp of interrupt function Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D104838	2021-06-26 17:28:03 +08:00
Craig Topper	d4f4a1ba62	[RISCV] Add DAG combine to detect opportunities to replace (i64 (any_extend (i32 X)) with sign_extend. If type legalization is going to insert a sign_extend for other users of X and we can fold the sign_extend into ADDW/MULW/SUBW, it is better to replace the ANY_EXTEND so we don't end up with a separate ADD/MUL/SUB instruction for the users of the ANY_EXTEND. I'm only handling setcc uses right now, but there are other instructions that force sign_extends like ashr. There are probably other *W instructions we could use in addition to ADDW/SUBW/MULW. My motivating case was a loop terminating compare and a phi use as seen in the new test file. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D104581	2021-06-25 23:16:37 -07:00
Fraser Cormack	a4729f7f88	[RISCV] Lower RVV vector SELECTs to VSELECTs This patch optimizes the code generation of vector-type SELECTs (LLVM select instructions with scalar conditions) by custom-lowering to VSELECTs (LLVM select instructions with vector conditions) by splatting the condition to a vector. This avoids the default expansion path which would either introduce control flow or fully scalarize. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104772	2021-06-24 10:12:51 +01:00
Fraser Cormack	fed1503e85	[RISCV][VP] Lower FP VP ISD nodes to RVV instructions With the exception of `frem`, this patch supports the current set of VP floating-point binary intrinsics by lowering them to RVV instructions. It does so by using the existing `RISCVISD *_VL` custom nodes as an intermediate layer. Both scalable and fixed-length vectors are supported by using this method. The `frem` node is unsupported due to a lack of available instructions. For fixed-length vectors we could scalarize but that option is not (currently) available for scalable-vector types. The support is intentionally left out so it is equivalent for both vector types. The matching of vector/scalar forms is currently lacking, as scalable vector types do not lower to the custom `VFMV_V_F_VL` node. We could either make floating-point scalable vector splats lower to this node, or support the matching of multiple kinds of splat via a `ComplexPattern`, much like we do for integer types. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D104237	2021-06-17 10:04:00 +01:00
Fraser Cormack	c75e454cb9	[RISCV] Transform unaligned RVV vector loads/stores to aligned ones This patch adds support for loading and storing unaligned vectors via an equivalently-sized i8 vector type, which has support in the RVV specification for byte-aligned access. This offers a more optimal path for handling of unaligned fixed-length vector accesses, which are currently scalarized. It also prevents crashing when `LegalizeDAG` sees an unaligned scalable-vector load/store operation. Future work could be to investigate loading/storing via the largest vector element type for the given alignment, in case that would be more optimal on hardware. For instance, a 4-byte-aligned nxv2i64 vector load could be loaded as nxv4i32 instead of as nxv16i8. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104032	2021-06-14 18:12:18 +01:00
Fraser Cormack	502edebd9d	[ValueTypes][RISCV] Cap RVV fixed-length vectors by size This patch changes RVV's policy for its supported list of fixed-length vector types by capping by vector size rather than element count. Now all 1024-byte vectors (of supported element types) are supported, rather than all 256-element vectors. This is a more natural fit for the architecture, and allows us to, for example, improve the support for vector bitcasts. This change necessitated the adding of some new simple types to avoid "regressing" on the number of currently-supported vectors. We round out the 1024-byte types by adding `v512i8`, `v1024i8`, `v512i16` and `v512f16`. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103884	2021-06-09 12:15:37 +01:00
Fraser Cormack	e8f1f89103	[RISCV] Support CONCAT_VECTORS on scalable masks This patch is a simple fix which registers CONCAT_VECTORS as custom-lowered for scalable mask vectors. This follows the pattern of all other scalable-vector types, as the default expansion of CONCAT_VECTORS cannot handle scalable types, and even if it did it'd go through the stack and generate worse code. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103896	2021-06-09 09:07:44 +01:00
Craig Topper	f30f8b4f12	[RISCV] Lower i8/i16 bswap/bitreverse to grevi/greviw with Zbp. Include known bits support so we know we don't need to zext the output if the input was already zero extended. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D103757	2021-06-07 10:31:51 -07:00
Craig Topper	8bde5f06a1	[RISCV] Replace && with \|\|. Spotted by coverity. We should be exiting when the shift amount is greater than the bit width regardless of whether it is a power of 2. Reported by Simon Pilgrim here https://reviews.llvm.org/D96661 This requires getting a shift amount that is out of bounds that wasn't already optimized by SelectionDAG. This would be pretty tricky to construct a test for. Or it would require a non-power of 2 shift amount and a mask that has runs of ones and zeros of the next lowest power of 2 from that shift amount. I tried a little to produce a test for this, but didn't get it to work.	2021-06-06 13:09:51 -07:00
Nikita Popov	1ffa6499ea	[TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC) Don't require a specific kind of IRBuilder for TargetLowering hooks. This allows us to drop the IRBuilder.h include from TargetLowering.h. Differential Revision: https://reviews.llvm.org/D103759	2021-06-06 16:29:50 +02:00
Nikita Popov	9914200393	[CodeGen] Add missing includes (NFC) These currently rely on the IRBuilder.h include in TargetLowering.h. Make them explicit.	2021-06-06 15:48:27 +02:00
Fraser Cormack	3b0a33d0ad	[RISCV] Expand unaligned fixed-length vector memory accesses RVV vectors must be aligned to their element types, so anything less is unaligned. For regular loads and stores, our custom-lowering of fixed-length vectors meant that we opted out of LegalizeDAG's built-in unaligned expansion. This patch adds that logic in to our custom lower function. For masked intrinsics, we declare that anything unaligned is not legal, leaving the ScalarizeMaskedMemIntrin pass to do the expansion for us. Note that neither of these methods can handle the expansion of scalable-vector memory ops, so those cases are left alone by this patch. Scalable loads and stores already go through expansion by default but hit an assertion, and scalable masked intrinsics will silently generate incorrect code. It may be prudent to return an error in both of these cases. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102493	2021-06-02 09:27:44 +01:00
Fraser Cormack	4f500c402b	[RISCV] Support vector types in combination with fastcc This patch extends the RISC-V lowering of the 'fastcc' calling convention to vector types, both fixed-length and scalable. Without this patch, any function passing or returning vector types by value would throw a compiler error. Vectors are handled in 'fastcc' much as they are in the default calling convention, the noticeable difference being the extended set of scalar GPR registers that can be used to pass vectors indirectly. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D102505	2021-06-01 10:31:18 +01:00
Fraser Cormack	2b37c405cc	[RISCV] Scale scalably-typed split argument offsets by VSCALE This patch fixes a bug in lowering scalable-vector types in RISC-V's main calling convention. When scalable-vector types are split and passed indirectly, the target is responsible for scaling the offset -- initially set to the known-minimum store size -- by the scalable factor. Before this we were issuing overlapping loads or stores to the different parts, leading to incorrect codegen. Credit to @HsiangKai for spotting this. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D103262	2021-05-31 10:43:13 +01:00
Fraser Cormack	eb23936591	[RISCV] Support vector conversions between fp and i1 This patch custom lowers FP_TO_[US]INT and [US]INT_TO_FP conversions between floating-point and boolean vectors. As the default action is scalarization, this patch both supports scalable-vector conversions and improves the code generation for fixed-length vectors. The lowering for these conversions can piggy-back on the existing lowering, which lowers the operations to a supported narrowing/widening conversion and then either an extension or truncation. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103312	2021-05-31 09:55:39 +01:00
Fraser Cormack	5a80dc4988	[VP][SelectionDAG] Add a target-configurable EVL operand type This patch adds a way for the target to configure the type it uses for the explicit vector length operands of VP SDNodes. The type must be a legal integer type (there is still no target-independent legalization of this operand) and must currently be at least as big as i32, the type used by the IR intrinsics. An implicit zero-extension takes place on targets which choose a larger type. All VP nodes should be created with this type used for the EVL operand. This allows 64-bit RISC-V to avoid custom legalization of all VP nodes, keeping them in their target-independent form for that bit longer. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D103027	2021-05-27 15:27:36 +01:00
Fraser Cormack	b7101e218c	[DAGCombine][RISCV] Don't try to trunc-store combined vector stores DAGCombine's `mergeStoresOfConstantsOrVecElts` optimization is told whether it's to use vector types and also whether it's to issue a truncating store. However, the truncating store code path assumes a scalar integer `ConstantSDNode`, and when using vector types it creates either a `BUILD_VECTOR` or `CONCAT_VECTORS` to store: neither of which is a constant. The `riscv64` target is able to expose a crash here because it switches on both code paths at the same time. The `f32` is stored as `i32` which must be promoted to `i64`, necessitating a truncating store. It also decides later that it prefers a vector store of `v2f32`. While vector truncating stores are legal, this combine is not able to emit them. We also don't have a test case. This patch adds an assert to catch this case more gracefully, and updates one of the callers of the function to turn off the use of truncating stores when preferring vectors. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103173	2021-05-27 14:16:32 +01:00
Fraser Cormack	8c73a31c11	[RISCV] Allow passing fixed-length vectors via the stack The vector calling convention dictates that when the vector argument registers are exhausted, GPRs are used to pass the address via the stack. When the GPRs themselves are exhausted, at best we would previously crash with an assertion, and at worst we'd generate incorrect code. This patch addresses this issue by passing fixed-length vectors via the stack with their full fixed-length size and aligned to their element type size. Since the calling convention lowering can't yet handle scalable vector types, this patch adds a fatal error to make it clear that we are lacking in this regard. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D102422	2021-05-27 14:14:07 +01:00
Fraser Cormack	772b58a641	[SelectionDAG][RISCV] Don't unroll 0/1-type bool VSELECTs This patch extends the cases in which the legalizer is able to express VSELECT in terms of XOR/AND/OR. When dealing with a VSELECT between boolean vector types, the mask itself is an all-ones or all-zeros value of the operand type, so a 0/1 boolean type behaves identically to a 0/-1 type. This greatly helps RISC-V which relies on expansion for these nodes. It also allows scalable-vector bool VSELECTs to use the default expansion, where before it would crash in SelectionDAG::UnrollVectorOp. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103147	2021-05-27 10:08:57 +01:00
Craig Topper	9065118b64	[RISCV] Optimize SEW=64 shifts by splat on RV32. SEW=64 shifts only use the log2(64) bits of shift amount. If we're splatting a 64 bit value in 2 parts, we can avoid splatting the upper bits and just let the low bits be sign extended. They won't be read anyway. For the purposes of SelectionDAG semantics of the generic ISD opcodes, if hi was non-zero or bit 31 of the low is 1, the shift was already undefined so it should be ok to replace high with sign extend of low. In order to be able to find the split i64 value before it becomes a stack operation, I added a new ISD opcode that will be expanded to the stack spill in PreprocessISelDAG. This new node is conceptually similar to BuildPairF64, but I expanded earlier so that we could go through regular isel to get the right VLSE opcode for the LMUL. BuildPairF64 is expanded in a CustomInserter. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D102521	2021-05-26 10:23:32 -07:00
Craig Topper	b510e4cf1b	[RISCV] Add a vsetvli insert pass that can be extended to be aware of incoming VL/VTYPE from other basic blocks. This is a replacement for D101938 for inserting vsetvli instructions where needed. This new version changes how we track the information in such a way that we can extend it to be aware of VL/VTYPE changes in other blocks. Given how much it changes the previous patch, I've decided to abandon the previous patch and post this from scratch. For now the pass consists of a single phase that assumes the incoming state from other basic blocks is unknown. A follow up patch will extend this with a phase to collect information about how VL/VTYPE change in each block and a second phase to propagate this information to the entire function. This will be used by a third phase to do the vsetvli insertion. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D102737	2021-05-24 11:47:27 -07:00
Fraser Cormack	7a211ed110	[RISCV] Prevent store combining from infinitely looping RVV code generation does not successfully custom-lower BUILD_VECTOR in all cases. When it resorts to default expansion it may, on occasion, be expanded to scalar stores through the stack. Unfortunately these stores may then be picked up by the post-legalization DAGCombiner which merges them again. The merged store uses a BUILD_VECTOR which is then expanded, and so on. This patch addresses the issue by overriding the `mergeStoresAfterLegalization` hook. A lack of granularity in this method (being passed the scalar type) means we opt out in almost all cases when RVV fixed-length vector support is enabled. The only exception to this rule are mask vectors, which are always either custom-lowered or are expanded to a load from a constant pool. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D102913	2021-05-24 10:19:32 +01:00
Fraser Cormack	c74ab891fc	[RISCV] Ensure small mask BUILD_VECTORs aren't expanded The default expansion for BUILD_VECTORs -- save for going through shuffles -- is to go through the stack. This method only works when the type is at least byte-sized, so for v2i1 and v4i1 we would crash. This patch ensures that small mask-type BUILD_VECTORs are always handled without crashing. We lower to a SETCC of the equivalent i8 type. This also exposes some pre-existing issues where the lowering when optimizing for size results in larger code than without. Those will be tackled in future patches. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102767	2021-05-20 19:12:29 +01:00
Fraser Cormack	26bd2250c1	[RISCV] Ensure shuffle splat operands are type-legal The use of `SelectionDAG::getSplatValue` isn't guaranteed to return a type-legal splat value as it may implicitly extract a vector element from another shuffle. It is not permitted to introduce an illegal type when lowering shuffles. This patch addresses the crash by adding a boolean flag to `getSplatValue`, defaulting to false, which when set will ensure a type-legal return value. If it is unable to do that it will fail to return a splat value. I've been through the existing uses of `getSplatValue` in other targets and was unable to find a need or test cases showing a need to update their uses. In some cases, the call is made during `LegalizeVectorOps` which may still produce illegal scalar types. In other situations, the illegally-typed splat value may be quickly patched up to a legal type (such as any-extending the returned `extract_vector_elt` up to a legal type) before `LegalizeDAG` notices. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102687	2021-05-20 18:00:03 +01:00
Fraser Cormack	ca2c245ba4	[RISCV] Support INSERT_VECTOR_ELT into i1 vectors Like the element extraction of these vectors, we choose to promote up to an i8 vector type and perform the insertion there. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102697	2021-05-19 09:41:50 +01:00
Craig Topper	9c345407b4	[RISCV] Remove RISCVII:VSEW enum. Make encodeVTYPE operate directly on SEW. The VSEW encoding isn't a useful value to pass around. It's better to use SEW or log2(SEW) directly. The only real ugliness is that the vsetvli IR intrinsics use the VSEW encoding, but it's easy enough to decode that when the intrinsic is processed.	2021-05-12 13:19:08 -07:00
Evandro Menezes	3a64b7080d	[RISCV] Move instruction information into the RISCVII namespace (NFC) Move instruction attributes into the `RISCVII` namespace and add associated helper functions. Differential Revision: https://reviews.llvm.org/D102268	2021-05-11 16:32:42 -05:00
Craig Topper	ce6e4f27dd	[RISCV] Use fractional LMULs for fixed length types smaller than riscv-v-vector-bits-min. My thought process is that if v2i64 is an LMUL=1 type then v2i32 should be an LMUL=1/2 type. We limit the fractional LMUL so that SEW=64 clips to LMUL=1, SEW=32 clips to LMUL=1/2, etc. This ensures there's always a fractional LMUL available to truncate a type. This does reduce the number of vsetvlis in some cases. Some tests increase vsetvlis because the best container type for a mask type is dependent on the LMUL+SEW that the mask was produced from, but you can't tell that from the type. I think this is something we need to solve in the machine IR when optimizing vsetvlis. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D101215	2021-05-11 09:42:48 -07:00
Craig Topper	80b9510806	[RISCV] Correct VL for fixed length masked scatter. We were incorrectly calling getVectorNumElements on a scalable vector type. This shouldn't be allowed. This gives a warning on EVT, but not MVT.	2021-05-10 09:50:08 -07:00
Fraser Cormack	6db0cedd23	[LegalizeVectorOps][RISCV] Add scalable-vector SELECT expansion This patch extends VectorLegalizer::ExpandSELECT to permit expansion also for scalable vector types. The only real change is conditionally checking for BUILD_VECTOR or SPLAT_VECTOR legality depending on the vector type. We can use this to fix "cannot select" errors for scalable vector selects on the RISCV target. Note that in future patches RISCV will possibly custom-lower vector SELECTs to VSELECTs for branchless codegen. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102063	2021-05-10 08:22:35 +01:00
Craig Topper	6660319cef	[RISCV] Remove unused RISCV::VLEFF and VLEFF_MASK. NFC Looks like these got left behind when vleff isel was moved to RISCVISelDAGToDAG.cpp	2021-05-06 09:41:29 -07:00
Fraser Cormack	6f17613bfb	[RISCV][VP] Lower VP ISD nodes to RVV instructions This patch supports all of the current set of VP integer binary intrinsics by lowering them to RVV instructions. It does so by using the existing RISCVISD *_VL custom nodes as an intermediate layer. Both scalable and fixed-length vectors are supported by using this method. One notable change to the existing vector codegen strategy is that scalable all-ones and all-zeros mask SPLAT_VECTORs are now lowered to RISCVISD VMSET_VL and VMCLR_VL nodes to match their fixed-length BUILD_VECTOR counterparts. This allows them to reuse the existing "all-ones" VL patterns. To reduce the size of the phabricator diff, some tests are intentionally left out and will be added later if the patch is accepted. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101826	2021-05-05 12:32:24 +01:00
Fraser Cormack	cd6a52fede	[RISCV] Cap legal fixed-length vectors to 256-element types Previously, RISC-V would make legal all fixed-length vector types whose sizes are less than or equal to some function of the minimum value of VLEN and the maximum-permissible LMUL grouping. Due to vector legalization issues, this patch instead caps the legal fixed-length vector types to those with 256 elements. This value was chosen because it is the longest vector length which has corresponding MVTs across all supported element types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101839	2021-05-05 09:51:08 +01:00
Fraser Cormack	46fa214a6f	[RISCV] Lower splats of non-constant i1s as SETCCs This patch adds support for splatting i1 types to fixed-length or scalable vector types. It does so by lowering the operation to a SETCC of the equivalent i8 type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101465	2021-05-04 09:14:05 +01:00
Fraser Cormack	d23e4f6872	[RISCV] Add support for fmin/fmax vector reductions Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101518	2021-05-03 10:33:51 +01:00
Craig Topper	ba63cdb8f2	[RISCV] Store SEW in RISCV vector pseudo instructions in log2 form. This shrinks the immediate that isel table needs to emit for these instructions. Hoping this allows me to change OPC_EmitInteger to use a better variable length encoding for representing negative numbers. Similar to what was done a few months ago for OPC_CheckInteger. The alternative encoding uses fewer bytes for negative numbers, but increases the number of bytes needed to encode 64 which was a very common number in the RISCV table due to SEW=64. By using Log2 this becomes 6 and is no longer a problem.	2021-05-02 12:09:20 -07:00
Fraser Cormack	791766e6d2	[RISCV] Support STEP_VECTOR with a step greater than one DAGCombiner was recently taught how to combine STEP_VECTOR nodes, meaning the step value is no longer guaranteed to be one by the time it reaches the backend for lowering. This patch supports such cases on RISC-V by lowering other step values to a multiply following the vid.v instruction. It includes a small optimization for common cases where the multiply can be expressed as a shift left. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D100856	2021-04-30 09:36:18 +01:00
Craig Topper	dcdda2bdf2	[RISCV] Teach DAG combine to fold (and (select_cc lhs, rhs, cc, -1, c), x) -> (select_cc lhs, rhs, cc, x, (and, x, c)) Similar for or/xor with 0 in place of -1. This is the canonical form produced by InstCombine for something like `c ? x & y : x;` Since we have to use control flow to expand select we'll usually end up with a mv in a basic block. By folding this we may be able to pull the and/or/xor into the block instead and avoid a mv instruction. The code here is based on code from ARM that uses this to create predicated instructions. I'm doing it on SELECT_CC so it happens late, but we could do it on select earlier which is what ARM does. I'm not sure if we lose any combine opportunities if we do it earlier. I left out add and sub because this can separate sext.w from the add/sub. It also made a conditional i64 addition/subtraction on RV32 worse. I guess both of those would be fixed by doing this earlier on select. The select-binop-identity.ll test has not been committed yet, but I made the diff show the changes to it. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D101485	2021-04-29 09:43:51 -07:00
Craig Topper	0c330afdfa	[RISCV] Enable SPLAT_VECTOR for fixed vXi64 types on RV32. This replaces D98479. This allows type legalization to form SPLAT_VECTOR_PARTS so we don't lose the splattedness when the scalar type is split. I'm handling SPLAT_VECTOR_PARTS for fixed vectors separately so we can continue using non-VL nodes for scalable vectors. I limited to RV32+vXi64 because DAGCombiner::visitBUILD_VECTOR likes to form SPLAT_VECTOR before seeing if it can replace the BUILD_VECTOR with other operations. Especially interesting is a splat BUILD_VECTOR of the extract_vector_elt which can become a splat shuffle, but won't if we form SPLAT_VECTOR first. We either need to reorder visitBUILD_VECTOR or add visitSPLAT_VECTOR. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D100803	2021-04-29 08:20:09 -07:00
Craig Topper	25391cec3a	[RISCV] Teach computeKnownBits that vsetvli returns number less than 2^31. This seems like a reasonable upper bound on VL. WG discussions for the V spec would probably allow us to use 2^16 as an upper bound on VLEN, but this is good enough for now. This allows us to remove sext and zext if user happens to assign the size_t result into an int and then uses it as a VL intrinsic argument which is size_t. Reviewed By: frasercrmck, rogfer01, arcbbb Differential Revision: https://reviews.llvm.org/D101472	2021-04-29 08:07:59 -07:00
Fraser Cormack	43ad058a01	[RISCV] Fix stack slot for argument types (Bug 49500) This is a complementary/alternative fix for D99068. It takes a slightly different approach by explicitly summing up all of the required split part type sizes and ensuring we allocate enough space for them. It also takes the maximum alignment of each part. Compared with D99068 there are fewer changes to the stack objects in existing tests. However, @luismarques has shown in that patch that there are opportunities to reduce our stack usage in the future. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D99087	2021-04-29 09:10:48 +01:00
Craig Topper	ce09dd54e6	[RISCV] Select 5 bit immediate for VSETIVLI during isel rather than peepholing in the custom inserter. This adds a special operand type that is allowed to be either an immediate or register. By giving it a unique operand type the machine verifier will ignore it. This perturbs a lot of tests but mostly it is just slightly different instruction orders. Something bad did happen to some min/max reduction tests. We're spilling vector registers when we weren't before. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D101246	2021-04-27 14:38:16 -07:00
Craig Topper	262a72f50f	[RISCV] Use stack slot to handle SPLAT_VECTOR_PARTS on RV32. Reduces the amount of vector ALU operations and reduces vector register pressure.	2021-04-26 15:43:02 -07:00
Craig Topper	e2cd92cb9b	[RISCV] Match splatted load to scalar load + splat. Form strided load during isel. This modifies my previous patch to push the strided load formation to isel. This gives us opportunity to fold the splat into a .vx operation first. Using a scalar register and a .vx operation reduces vector register pressure which can be important for larger LMULs. If we can't fold the splat into a .vx operation, then it can make sense to use a strided load to free up the vector arithmetic ALU to do actual arithmetic rather than tying it up with vmv.v.x. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D101138	2021-04-26 13:32:03 -07:00
Craig Topper	837442de9c	[RISCV] Cleanup setOperationAction calls for INTRINSIC_WO_CHAIN/INTRINSIC_W_CHAIN We have several extensions that need i32 to be Custom for INTRINSIC_WO_CHAIN with RV64 so enable it for all RV64. For V extension, make i32 Custom for RV64 and i64 Custom for RV32. When the i32 or i64 is legal, the operation action doesn't matter. LegalizeDAG checks MVT::Other rather than the real type.	2021-04-25 23:44:28 -07:00
Craig Topper	8f5cd49405	[RISCV] Teach DAG combine what bits Zbp instructions demanded from their inputs. This teaches DAG combine that shift amount operands for grev, gorc, shfl, unshfl only read a few bits. This also teaches DAG combine that grevw, gorcw, shflw, unshflw, bcompressw, bdecompressw only consume the lower 32 bits of their inputs. In the future we can teach SimplifyDemandedBits to also propagate demanded bits of the output to the inputs in some cases.	2021-04-25 21:54:06 -07:00
Levy Hsu	8cf54c7ff5	[RISCV] [1/2] Add IR intrinsic for Zbe extension RV32/64: bcompress bdecompress RV64 ONLY: bcompressw bdecompressw Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101143	2021-04-25 19:14:34 -07:00
Craig Topper	bd28d86119	[RISCV] Removed getLMULForFixedLengthVector. Use getContainerForFixedLengthVector and getRegClassIDForVecVT to get the register class to use when making a fixed vector type legal. Inline it into the other two call sites. I'm looking into using fractional lmul for fixed length vectors and getLMULForFixedLengthVector returned an integer making it unable to express this. I considered returning the LMUL enum, but that seemed like it would introduce more complexity to convert it for use.	2021-04-23 16:56:46 -07:00
Craig Topper	bcf321015b	[RISCV] Move getLMULForFixedLengthVector out of RISCVSubtarget. Make it a static function RISCVISelLowering, the only place it is used. I think I'm going to make this return fractional LMULs in some cases so I'm sorting out where it should live before I start making changes.	2021-04-23 15:06:20 -07:00
Craig Topper	baa107f018	[RISCV] Only expose one interface for getContainerForFixedLengthVector in the RISCVTargetLowering class We can have RISCVISelDAGToDAG.cpp call the VT only version by finding the RISCVTargetLowering object via the Subtarget. Make the static versions just global static functions in RISCVISelLowering that can be called by static functions in that file.	2021-04-23 15:06:10 -07:00
Fraser Cormack	83b8f8da82	[RISCV] Custom lower vector F(MIN\|MAX)NUM to vf(min\|max) This patch adds support for both scalable- and fixed-length vector code lowering of the llvm.minnum and llvm.maxnum intrinsics to the equivalent RVV instructions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101035	2021-04-23 12:22:15 +01:00
Levy Hsu	b49337bbb9	[RISCV] [1/2] Add IR intrinsic for Zbp extension RV32/64: grev grevi gorc gorci shfl shfli unshfl unshfli RV64 ONLY: grevw greviw gorcw gorciw shflw shfli (For non-existing shfliw) unshfli (For non-existing unshfliw) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100830	2021-04-22 16:34:51 -07:00
Craig Topper	70254ccb69	[RISCV] Turn splat shuffles of vector loads into strided load with stride of x0. Implementations are allowed to optimize an x0 stride to perform fewer memory accesses. This is the case in SiFive cores. No idea if this is the case in other implementations. We might need a tuning flag for this. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D100815	2021-04-22 10:02:57 -07:00
Craig Topper	77f14c96e5	[RISCV] Use stack temporary to splat two GPRs into SEW=64 vector on RV32. Rather than splatting each separately and doing bit manipulation to merge them in the vector domain, copy the data to the stack and splat it using a strided load with x0 stride. At least on some implementations this vector load is optimized to not do a load for each element. This is equivalent to how we move i64 to f64 on RV32. I've only implemented this for the intrinsic fallbacks in this patch. I think we do similar splatting/shifting/oring in other places. If this is approved, I'll refactor the others to share the code. Differential Revision: https://reviews.llvm.org/D101002	2021-04-22 09:50:07 -07:00
Serge Pavlov	740962e5d0	[RISCV] Custom lowering of SET_ROUNDING Differential Revision: https://reviews.llvm.org/D91242	2021-04-22 15:04:55 +07:00
Craig Topper	58c5b4c2c3	[RISCV] Use TargetConstant for condition code of RISCVISD::SELECT_CC. The value is always an immediate and can never be in a register. This is the kind of thing TargetConstant is for. Saves a step in GenDAGISel to convert a Constant to a TargetConstant.	2021-04-21 23:08:52 -07:00
Serge Pavlov	6e63dfdae2	[RISCV] Custom lowering of FLT_ROUNDS_ Differential Revision: https://reviews.llvm.org/D90854	2021-04-22 11:39:15 +07:00
Craig Topper	f6d8cf7798	[RISCV] Teach lowerSPLAT_VECTOR_PARTS to detect cases where Hi is sign extended from Lo. This recognizes the case when Hi is (sra Lo, 31). We can use SPLAT_VECTOR_I64 rather than splatting the high bits and combining them in the vector register.	2021-04-21 20:24:23 -07:00
Craig Topper	87afefcd22	[RISCV] Fix mistake in comment. NFC	2021-04-19 11:15:32 -07:00
Craig Topper	7ed01a420a	[RISCV] Pad v4i1/v2i1/v1i1 stores with 0s to make a full byte. As noted in the FIXME there's a sort of agreement that any extra bits stored will be 0. The generated code is pretty terrible. I was really hoping we could use a tail undisturbed trick, but tail undisturbed no longer applies to masked destinations in the current draft spec. Fingers crossed that it isn't common to do this. I doubt IR from clang or the vectorizer would ever create this kind of store. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D100618	2021-04-19 11:05:18 -07:00
Fraser Cormack	c9a93c3e01	[RISCV] Lower vector shuffles to vrgather operations This patch extends the lowering of RVV fixed-length vector shuffles to avoid the default stack expansion and instead lower to vrgather instructions. For "permute"-style shuffles where one vector is swizzled, we can lower to one vrgather. For shuffles involving two vector operands, we lower to one unmasked vrgather (or splat, where appropriate) followed by a masked vrgather which blends in the second half. On occasion, when it's not possible to create a legal BUILD_VECTOR for the indices, we use vrgatherei16 instructions with 16-bit index types. For 8-bit element vectors where we may have indices over 255, we have a fairly blunt fallback to the stack expansion to avoid custom-splitting of the vector types. To enable the selection of masked vrgather instructions, this patch extends the various RISCVISD::VRGATHER nodes to take a passthru operand. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100549	2021-04-19 11:13:13 +01:00
Craig Topper	1afdfc6169	[RISCV] Rename RISCVISD::GREVI(W)/GORCI(W) to RISCVISD::GREV(W)/GORC(W). Don't require second operand to be a constant. Prep work for adding intrinsics for these instructions in the future.	2021-04-13 11:04:28 -07:00
Craig Topper	7c9bbbf735	[RISCV] Rename RISCVISD::SHFLI to RISCVISD::SHFL and don't require the second operand to be an immediate. Prep work for adding intrinsics in the future. Left an assert that the input is constant in ReplaceNodeResults, as the intrinsic shouldn't go through that path.	2021-04-12 23:46:50 -07:00
Craig Topper	cb4c793e46	[RISCV] Update computeKnownBitsForTargetNode to treat READ_VLENB as being 16 byte aligned. According to the 0.10 spec, VLEN is at least 128 bits and is a power of 2.	2021-04-11 17:54:23 -07:00
Craig Topper	bc0e052730	[RISCV] Teach targetShrinkDemandedConstant to preserve (and X, 0xffff) when zext.h is supported. Similar to what we do for zext.w. Disable the (srl (and X, 0xffff), C) custom isel when zext.h is available.	2021-04-11 10:03:35 -07:00
Fraser Cormack	a5693445ca	[RISCV] Support OR/XOR/AND reductions on vector masks This patch adds RVV codegen support for OR/XOR/AND reductions for both scalable- and fixed-length vector types. There are a few possible codegen strategies for each -- vmfirst.m, vmsbf.m, and vmsif.m could be used to some extent -- but the vpopc.m instruction was chosen since it produces the scalar result in one instruction, after which scalar instructions can finish off the computation. The reductions are lowered identically for both scalable- and fixed-length vectors, although some alternate strategies may be more optimal on fixed-length vectors since it's cheaper to get the length of those types. Other reduction types were not deemed to be relevant for mask vectors. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100030	2021-04-08 09:46:38 +01:00
Serge Pavlov	65b1103798	[RISCV] DAG nodes and pseudo instructions for CSR access New custom DAG nodes were added to represent operations on CSRs. These nodes are lowered to the corresponding pseudo instructions. Using the pseudo instructions makes it possible to specify different scheduling information for operations on different system registers. It also makes it possible to specify dependencies of instructions on specific system registers. Differential Revision: https://reviews.llvm.org/D98936	2021-04-08 10:36:36 +07:00
Craig Topper	56ea2e2fdd	[RISCV] Add a special case to lowerSELECT for select of 2 constants with a SETLT condition. If the constants have a difference of 1 we can convert one to the other by adding or subtracting the condition. We have a DAG combine for this, but it only runs before type legalization. If the select is introduced later during type legalization or op legalization we will miss it. We don't need a specific condition, but some conditions are harder to materialize than others on RISCV. I know that SETLT will be a single instruction and it is what is used by the motivating pattern from signed saturating add/sub. Differential Revision: https://reviews.llvm.org/D99021	2021-04-07 13:47:17 -07:00
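The select-of-constants special case in 56ea2e2fdd can be modelled in a few lines. This is a rough Python sketch of the arithmetic only, not the lowerSELECT code; the function names are hypothetical and the condition is modelled as a plain 0/1 integer, as a single slt would produce.

```python
def select_lt(x, y, true_val, false_val):
    """Reference semantics: select (setlt x, y), true_val, false_val."""
    return true_val if x < y else false_val


def select_lt_folded(x, y, true_val, false_val):
    """Branch-free form, valid only when the two constants differ by exactly 1."""
    cond = int(x < y)  # models the 0/1 result of a single slt
    if true_val == false_val + 1:
        return false_val + cond  # add the condition
    if true_val == false_val - 1:
        return false_val - cond  # subtract the condition
    raise ValueError("constants must differ by exactly 1")
```

For example, select_lt_folded(x, y, 8, 7) computes 7 + (x < y), which matches the select for every x and y.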
Craig Topper	f087d7544a	[RISCV] Support vslide1up/down intrinsics for SEW=64 on RV32. This can't use our normal strategy of splatting the scalar and using a .vv operation instead of .vx. Instead this patch bitcasts the vector to the equivalent SEW=32 vector and inserts the scalar parts using two vslide1up/down. We do that unmasked and apply the mask separately at the end with a vmerge. For vslide1up there may be some other options here like getting i64 into element 0 and using vslideup.vi with this vector as vd and the original source as vs1. Masking would still need to be done afterwards. That idea doesn't work for vslide1down. We need to slidedown and then insert a single scalar at vl-1 which we could do with a vslideup, but that assumes vl > 0 which I don't think we can assume. The i32 double slide1down implemented here is the best I could come up with and I just made vslide1up consistent. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D99910	2021-04-07 10:44:53 -07:00
Craig Topper	01a23dccb1	[RISCV] Add an assertion to the ReplaceNodeResults handling of bitcasts to make sure the VT is always a scalar integer.	2021-04-06 16:48:40 -07:00
Craig Topper	2641c1f15e	[RISCV] Don't custom type legalize fixed vector to scalar integer bitcasts if the fixed vector type isn't legal. We encountered a hang in our internal code base. I'm having trouble creating a test case because the test that hit it was testing some code that is not upstream.	2021-04-06 15:00:33 -07:00
Craig Topper	af2837675a	[RISCV] Split RISCVISD::VMV_S_XF_VL into separate integer and FP. It's a bit silly, but it allows us to write stricter type constraints for isel. There are still some extra type checks in the generated table due to some type inference limitations around HWMode.	2021-04-05 12:57:35 -07:00
Fraser Cormack	af3a839c70	[RISCV] Add support for bitcasts between scalars and fixed-length vectors This patch supports bitcasts from scalar types to fixed-length vectors and vice versa. It custom-lowers and custom-legalizes them to EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT operations, using single-element vectors to hold the scalar where appropriate. Previously, some of these would fail to select, others would be expanded through stack loads and stores. Effort was made to ensure the codegen avoids the stack for both legal and illegal scalar types. Some of the codegen could be improved, but on first glance it looks like a general optimization of EXTRACT_VECTOR_ELT when extracting an i64 element on RV32. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99667	2021-04-05 17:21:55 +01:00
Fraser Cormack	3f0df4d7b0	[RISCV] Expand scalable-vector truncstores and extloads Caught in internal testing, these operations are assumed legal by default, even for scalable vector types. Expand them back into separate truncations and stores, or loads and extensions. Also add explicit fixed-length vector tests for these operations, even though they should have been correct already. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99654	2021-04-05 17:03:45 +01:00
Craig Topper	4708a05da0	[RISCV] Use gorciw for i32 orc.b intrinsic when Zbp is enabled. The W version of orc.b does not exist in Zbp so we need to use gorci encoding. If we have Zbp, we can use gorciw which can avoid a sext.w in some cases.	2021-04-04 17:14:28 -07:00
Craig Topper	98d5db3e3a	[RISCV] Lower orc.b intrinsic to RISCVISD::GORCI. This will allow us to share any future known bits, demanded bits, or sign bits improvements.	2021-04-04 12:31:41 -07:00
Craig Topper	a2ea003fcb	[RISCV] Don't convert fshr/fshl to target specific FSL/FSR node if shift amount is a constant. As long as it's a constant we can directly pattern match it without any problems. It's only when it isn't a constant that we need to add an AND. In theory this should allow more target independent optimizations to remain active.	2021-04-03 23:13:30 -07:00
Levy Hsu	944adbf285	Recommit "[RISCV] Add IR intrinsic for Zbb extension" Forgot to amend the Author. Original commit message: Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b Differential Revision: https://reviews.llvm.org/D99320	2021-04-02 11:50:19 -07:00
Craig Topper	1f0b309f24	Revert "[RISCV] Add IR intrinsic for Zbb extension" This reverts commit `1808194590`. I forgot to change the author.	2021-04-02 11:47:02 -07:00
Craig Topper	1808194590	[RISCV] Add IR intrinsic for Zbb extension Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b	2021-04-02 11:23:57 -07:00
Craig Topper	dbbc95e3e5	[RISCV] Use softPromoteHalf legalization for fp16 without Zfh rather than PromoteFloat. The default legalization strategy is PromoteFloat which keeps half in single precision format through multiple floating point operations. Conversion to/from float is done at loads, stores, bitcasts, and other places that care about the exact size being 16 bits. This patch switches to the alternative method softPromoteHalf. This aims to keep the type in 16-bit format between every operation. So we promote to float and immediately round for any arithmetic operation. This should be closer to the IR semantics since we are rounding after each operation and not accumulating extra precision across multiple operations. X86 is the only other target that enables this today. See https://reviews.llvm.org/D73749 I had to update getRegisterTypeForCallingConv to force f16 to use f32 when the F extension is enabled. This way we can still pass it in the lower bits of an FPR for ilp32f and lp64f ABIs. The softPromoteHalf would otherwise always give i16 as the argument type. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D99148	2021-04-01 12:41:57 -07:00
Craig Topper	d157e3f387	[RISCV] Fix handling of nxvXi64 vmsgt(u).vx intrinsics on RV32. We need to splat the scalar separately and use .vv, but there is no vmsgt(u).vv. So add isel patterns to select vmslt(u).vv with swapped operands. We also need to get VT to use for the splat from an operand rather than the result since the result VT is nxvXi1. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D99704	2021-04-01 10:38:05 -07:00
Craig Topper	b7c2e577cc	[RISCV] Add custom type legalization to form MULHSU when possible. There's no target independent ISD opcode for MULHSU, so custom legalize 2*XLen multiplies ourselves. We have to be a little careful to prefer MULHU or MULHSU. I thought about doing this in isel by pattern matching the (add (mul X, (srai Y, XLen-1)), (mulhu X, Y)) pattern. I decided against this because the add might become part of a chain of adds. I don't trust DAG combine not to reassociate with other adds making it difficult to find both pieces again. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D99479	2021-04-01 10:15:55 -07:00
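The (add (mul X, (srai Y, XLen-1)), (mulhu X, Y)) pattern quoted in b7c2e577cc is the generic expansion of a signed-by-unsigned high multiply, with Y as the signed operand. The Python sketch below checks that identity exhaustively at a deliberately small width; XLEN = 8 and all function names are assumptions made for illustration, not LLVM code.

```python
XLEN = 8  # small width so the check can be exhaustive; real targets use 32 or 64
MASK = (1 << XLEN) - 1


def to_signed(v):
    """Interpret an XLEN-bit pattern as a two's-complement integer."""
    return v - (1 << XLEN) if v & (1 << (XLEN - 1)) else v


def mulhu(x, y):
    """High XLEN bits of the unsigned-by-unsigned product."""
    return ((x * y) >> XLEN) & MASK


def mulhsu(y, x):
    """High XLEN bits of signed(y) * unsigned(x), modelling MULHSU."""
    return ((to_signed(y) * x) >> XLEN) & MASK


def mulhsu_expanded(y, x):
    """The add/mul/srai/mulhu form quoted in the commit message."""
    srai = to_signed(y) >> (XLEN - 1)  # 0 when y >= 0, -1 when y < 0
    return (x * srai + mulhu(x, y)) & MASK
```

The identity holds because the signed value of Y is its unsigned value minus 2^XLEN times its sign bit, so the high half of the product differs from mulhu(X, Y) by exactly X when Y is negative.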
Craig Topper	2a8b7cab6a	[RISCV] Add RISCVISD opcodes for CLZW and CTZW. Our CLZW isel pattern is quite easily broken by surrounding code preventing it from matching sometimes. This usually results in failing to remove the and X, 0xffffffff inserted by type legalization. The add with -32 that type legalization also inserts will often get combined into other add/sub nodes. That doesn't usually result in extra code when we don't use clzw. CTTZ seems to be less fragile, but I wanted to keep it consistent with CTLZ. Reviewed By: asb, HsiangKai Differential Revision: https://reviews.llvm.org/D99317	2021-03-31 09:40:07 -07:00
Fraser Cormack	10fc6e4358	[RISCV] Add support for the stepvector intrinsic This adds almost everything required for supporting the new stepvector intrinsic on RVV. It is lowered to the existing VID_VL SDNode. The only exception is a limitation that RV32 cannot yet lower the intrinsic on i64 vectors. This is because the step operand is (currently) required to be at least as large as the vector element type. I will look into patching that out and loosening the requirement to only an integer pointer type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99594	2021-03-31 11:41:17 +01:00
Craig Topper	a33fcafaf0	[RISCV] Pass 'half' in the lower 16 bits of an f32 value when F extension is enabled, but Zfh is not. Without Zfh the half type isn't legal, but it could still be used as an argument/return in IR. Clang will not generate this today. Previously we promoted the half value to float for arguments and returns if the F extension is enabled but Zfh isn't. Then depending on which ABI is enabled we would pass it in either an FPR or a GPR in float format. If the F extension isn't enabled, it would get passed in the lower 16 bits of a GPR in half format. With this patch the value will always be in half format and will be in the lower bits of a GPR or FPR. This should be consistent with where the bits are located when Zfh is enabled. I've based this implementation off of how this is done on ARM. I've manually nan-boxed the value to 32 bits using integer ops. It looks like flw, fsw, fmv.s, fmv.w.x, fmv.x.w won't canonicalize nans so should leave the value alone. I think those are the instructions that could get used on this value. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D98670	2021-03-30 09:47:54 -07:00
Craig Topper	f069000b43	[RISCV] Remove floating point condition code legalization from lowerFixedLengthVectorSetccToRVV. After D98939, this is done by LegalizeVectorOps making this code dead. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D99519	2021-03-30 09:11:56 -07:00
Craig Topper	c40cea6f08	[RISCV] Teach targetShrinkDemandedConstant to preserve (and X, 0xffffffff). We look for this pattern frequently in isel patterns so it's a good idea to try to preserve it. This also lets us remove our special isel handling for srliw and use a direct pattern match of (srl (and X, 0xffffffff), C) since no bits will be removed from the and mask. Differential Revision: https://reviews.llvm.org/D99042	2021-03-25 09:03:25 -07:00
Fraser Cormack	99211352c1	[RISCV] Optimize select-like vector shuffles This patch adds a small optimization for vector shuffle lowering, detecting shuffles which can be re-expressed as vector selects. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99270	2021-03-25 11:39:57 +00:00
Fraser Cormack	321a71a772	[RISCV] Optimize BUILD_VECTOR sequences that reveal hidden splats This patch adds further optimization techniques to RVV BUILD_VECTOR lowering. It teaches the compiler to find splats of larger vector element types "hidden" in smaller ones. For example, a v4i8 build_vector (0x1, 0x2, 0x1, 0x2) could be splat as v2i16 0x0201. This is generally more optimal than the dominant-element BUILD_VECTORs and so takes priority. This optimization is currently limited to all-constant-or-undef BUILD_VECTORs as those were found to be the most common. There's no reason this couldn't be extended to other BUILD_VECTORs, but the additional bit-manipulation instructions may require more sophisticated heuristics. There are some cases where the materialization of the larger constant takes more scalar instructions than it does to build the vector with vector instructions. We could add heuristics to try and catch this. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99195	2021-03-25 10:35:31 +00:00
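The "hidden splat" search described in 321a71a772 can be sketched as follows. This Python model is only an illustration under stated assumptions: find_hidden_splat is a hypothetical name, elements are packed little-endian (element 0 in the low bits, which reproduces the 0x0201 example in the message), undef elements are not modelled, and requiring at least two wide elements is a choice of this sketch rather than something the commit states.

```python
def find_hidden_splat(elements, elt_bits, max_bits=64):
    """Return (wide_bits, value) for the narrowest wider element type whose
    splat reproduces `elements`, or None if no such type exists."""
    mask = (1 << elt_bits) - 1
    wide_bits = elt_bits * 2
    while wide_bits <= max_bits:
        group = wide_bits // elt_bits  # narrow elements per wide element
        if len(elements) % group == 0 and len(elements) // group >= 2:
            packed = []
            for i in range(0, len(elements), group):
                value = 0
                for j, elt in enumerate(elements[i:i + group]):
                    value |= (elt & mask) << (j * elt_bits)  # little-endian packing
                packed.append(value)
            if all(v == packed[0] for v in packed):
                return wide_bits, packed[0]
        wide_bits *= 2
    return None
```

With the message's own example, find_hidden_splat([0x1, 0x2, 0x1, 0x2], 8) returns (16, 0x0201), i.e. a v2i16 splat of 0x0201.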
Craig Topper	0f99c6c56e	[RISCV] Remove duplicate DebugLoc variables from cases in ReplaceNodeResults. NFC We already created a DebugLoc at the top of the function. We can just use that one.	2021-03-24 20:23:03 -07:00
Fraser Cormack	feff66a082	[RISCV] Further optimize BUILD_VECTORs with repeated elements This patch builds upon the initial BUILD_VECTOR work introduced in D98700. It further optimizes the lowering of BUILD_VECTOR by using VSELECT operations to effectively insert repeated elements into the vector with relatively few instructions. This allows us to optimize more BUILD_VECTORs without significantly increasing the size of the generated code. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98969	2021-03-23 14:14:48 +00:00
Fraser Cormack	5bfbd9d938	[RISCV] Optimize all-constant mask BUILD_VECTORs This patch adds an optimization for mask-vector BUILD_VECTOR nodes whose elements are all constants or undef. It lowers such operations by building up the vector via a series of integer operations, in which multiple mask elements are inserted into a vector at a time via i8/i16/i32/i64 element types. The final result is then bitcast from that integer vector. We restrict this optimization in certain circumstances when optimizing for size. If we are required to use more than one integer insert operation, then it will likely increase code size compared with using a load from a constant pool. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98860	2021-03-23 10:11:19 +00:00
Craig Topper	294efcd6f7	[RISCV] Add support for fixed vector masked gather/scatter. I've split the gather/scatter custom handler to avoid complicating it with even more differences between gather/scatter. Tests are the scalable vector tests with the vscale removed and dropped the tests that used vector.insert. We're probably not as thorough on the splitting cases since we use 128 for VLEN here but scalable vectors use a known min size of 64. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98991	2021-03-22 10:17:30 -07:00
Craig Topper	5d315691c4	[RISCV] Add missing bitcasts to the results of lowerINSERT_SUBVECTOR and lowerEXTRACT_SUBVECTOR when handling mask vectors. Found by adding asserts to LegalizeDAG to catch incorrect result types being returned. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98964	2021-03-19 10:54:33 -07:00
Craig Topper	85f3f6b3cc	[RISCV] Lower scalable vector masked loads to intrinsics to match fixed vectors and reduce isel patterns. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98840	2021-03-19 10:39:35 -07:00
Fraser Cormack	d399b82e2a	[RISCV] Maintain fixed-length info when optimizing BUILD_VECTORs I'm not sure how I failed to notice this before, but when optimizing dominant-element BUILD_VECTORs we would lower via the scalable container type, which lost us the information about the fixed length of the vector types. By lowering via the fixed-length type we can preserve that information and eliminate redundant vsetvli instructions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98938	2021-03-19 17:21:06 +00:00
Fraser Cormack	550292ecb1	[RISCV] Fix missing scalable->fixed-length vector conversion Returning the scalable-vector container type would present problems when the fixed-length INSERT_VECTOR_ELT was used by later operations. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98776	2021-03-19 16:49:47 +00:00
Craig Topper	c9861f722e	[RISCV] Correct the output chain in lowerFixedLengthVectorMaskedLoadToRVV We returned the input chain instead of the output chain from the new load. This bypasses the load in the chain. I haven't found a good way to test this yet. IR order prevents my initial attempts at causing reordering.	2021-03-18 16:34:35 -07:00
Fraser Cormack	3495031a39	[RISCV] Support scalable-vector masked scatter operations This patch adds support for masked scatter intrinsics on scalable vector types. It is mostly an extension of the earlier masked gather support introduced in D96263, since the addressing mode legalization is the same. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96486	2021-03-18 10:17:50 +00:00
Fraser Cormack	0331399dc9	[RISCV] Support scalable-vector masked gather operations This patch supports the masked gather intrinsics in RVV. The RVV indexed load/store instructions only support the "unsigned unscaled" addressing mode; indices are implicitly zero-extended or truncated to XLEN and are treated as byte offsets. This ISA supports the intrinsics directly, but not the majority of various forms of the MGATHER SDNode that LLVM combines to. Any signed or scaled indexing is extended to the XLEN value type and scaled accordingly. This is done during DAG combining as widening the index types to XLEN may produce illegal vectors that require splitting, e.g. nxv16i8->nxv16i64. Support for scalable-vector CONCAT_VECTORS was added to avoid spilling via the stack when lowering split legalized index operands. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96263	2021-03-18 09:26:18 +00:00
Fraser Cormack	c2b4600ec8	[RISCV] Support bitcasts of fixed-length mask vectors Without this patch, bitcasts of fixed-length mask vectors would go through the stack. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98779	2021-03-18 08:52:42 +00:00
Craig Topper	696ddef569	[RISCV] Support masked load/store for fixed vectors. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98561	2021-03-17 10:26:15 -07:00
Fraser Cormack	70251759a2	[RISCV] Optimize "dominant element" BUILD_VECTORs This patch adds an optimization path for BUILD_VECTOR nodes where the majority of the elements are identical. These can be splatted, with the remaining elements patched up with INSERT_VECTOR_ELTs. The threshold can be tweaked as required - it is currently conservative. Undef elements are disregarded when judging the dominance of a particular element. This allows them to be covered by the splat value. In addition, vectors of 2 elements are always optimized to a splat (for the upper element) and an insert at element zero. This optimization is disabled when optimizing for size. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98700	2021-03-17 10:09:04 +00:00
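A rough model of the "dominant element" strategy from 70251759a2: pick the most frequent defined element as the splat value, then patch up the remaining positions with inserts. The Python sketch below is illustrative only; the function name and the UNDEF marker are hypothetical, and the dominance threshold, the two-element special case and the optimize-for-size check mentioned in the message are deliberately left out.

```python
from collections import Counter

UNDEF = None  # stand-in for an undef element


def lower_dominant_build_vector(elements):
    """Return (splat_value, inserts), where inserts is a list of
    (index, value) pairs to apply on top of the splat."""
    defined = [e for e in elements if e is not UNDEF]
    if not defined:
        return UNDEF, []
    # Undef elements are ignored when judging dominance, so the splat covers them.
    splat_value, _ = Counter(defined).most_common(1)[0]
    inserts = [
        (i, e)
        for i, e in enumerate(elements)
        if e is not UNDEF and e != splat_value
    ]
    return splat_value, inserts
```

For instance, the vector (7, 7, 3, 7) becomes a splat of 7 followed by a single insert of 3 at index 2.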
Craig Topper	229eeb187d	[RISCV] Look through copies when trying to find an implicit def in addVSetVL. The InstrEmitter can sometimes insert a copy after an IMPLICIT_DEF before connecting it to the vector instruction. This occurs when constrainRegClass reduces to a class with less than 4 registers. I believe LMUL8 on masked instructions triggers this since the result can only use the v8, v16, or v24 register group as the mask is using v0. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98567	2021-03-16 07:59:09 -07:00
Craig Topper	a33ce06cf5	[RISCV] Improve i32 UADDSAT/USUBSAT on RV64. The default promotion uses zero extends that become shifts. We can use sign extend instead which is better for RISCV. I've used two different implementations based on whether we have minu/maxu instructions. Differential Revision: https://reviews.llvm.org/D98683	2021-03-16 07:44:06 -07:00
Craig Topper	41759c3d92	[RISCV] Add RISCVISD::BR_CC similar to RISCVISD::SELECT_CC. This allows me to introduce similar combines for branches as we have recently added for SELECT_CC. Some of them are less useful for standalone setccs and only help branch instructions. By having a BR_CC node it's easier to only affect branches. I'm using CondCodeSDNode to make isel patterns easier to write so we can refer to the codes by name. SELECT_CC uses a constant instead. I've translated the condition code just like SELECT_CC so we need fewer patterns for the swapped conditions. This includes special cases for X < 1 and X > -1 that get translated to blez and bgez by using a 0 constant. computeKnownBitsForTargetNode support for SELECT_CC is added to allow MaskedValueIsZero to work for cases where the true and false values of the SELECT_CC are setccs and the result of the SELECT_CC is used by a BR_CC. This was needed to avoid regressions in some of the overflow tests. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98159	2021-03-15 11:54:01 -07:00
Craig Topper	3dc5b533e0	[RISCV] Improve legalization of i32 UADDO/USUBO on RV64. The default legalization uses zero extends that require pair of shifts on RISCV. Instead we can take advantage of the fact that unsigned compares work equally well on sign extended inputs. This allows us to use addw/subw and sext.w. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98233	2021-03-15 09:30:23 -07:00
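The claim in 3dc5b533e0 that unsigned compares work equally well on sign-extended inputs can be checked with a small model. The Python sketch below is an illustration, not the legalizer code: the function names are hypothetical, 32-bit values are represented as non-negative integers, and sext32 models the 64-bit register contents produced by sext.w or addw.

```python
MASK32 = 0xFFFFFFFF
HIGH32 = 0xFFFFFFFF00000000


def sext32(v):
    """Sign-extend a 32-bit pattern to a 64-bit pattern."""
    v &= MASK32
    return (v | HIGH32) if (v & 0x80000000) else v


def uaddo_reference(a, b):
    """32-bit unsigned add with overflow flag, computed directly."""
    total = a + b
    return total & MASK32, total > MASK32


def uaddo_sext(a, b):
    """Same result using only sign-extended values and one unsigned compare."""
    a64 = sext32(a)        # sext.w of the first input
    sum64 = sext32(a + b)  # addw: 32-bit add, result sign-extended
    return sum64 & MASK32, sum64 < a64  # sltu on the 64-bit registers
```

Sign extension maps the 32-bit unsigned range into the 64-bit unsigned range without changing the relative order of any two values, which is why the overflow test "sum < a" gives the same answer on the extended values.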
Fraser Cormack	0c5b789c73	[RISCV] Support fixed-length vectors in the calling convention This patch adds fixed-length vector support to the calling convention when RVV is used to lower fixed-length vectors. The scheme follows the regular vector calling convention for the argument/return registers, but uses scalable vector container types as the LocVTs, and converts to/from the fixed-length vector value types as required. Fixed-length vector types may be split when the combination of minimum VLEN and the maximum allowable LMUL is not large enough to fully contain the vector. In this case the behaviour differs between fixed-length vectors passed as parameters and as return values: 1. For return values, vectors must be passed entirely via registers or via the stack. 2. For parameters, unlike scalar values, split vectors continue to be passed by value, and are split across multiple registers until there are no remaining registers. Thus vector parameters may be found partly in registers and partly on the stack. As with scalable vectors, the first fixed-length mask vector is passed via v0. Split mask fixed-length vectors are passed first via v0 and then via the next available vector register: v8,v9,etc. The handling of vector return values uses all available argument registers v8-v23 which does not adhere to the calling convention we're supposedly implementing, but since this issue affects both fixed-length and scalable-vector values, it was left as-is. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97954	2021-03-15 10:43:51 +00:00
Hsiangkai Wang	a81dff1e58	[RISCV] Support inline asm for vector instructions. Types of fractional LMUL and LMUL=1 are all using VR register class. When using inline asm, it will use the first type in the register class as the type for the register. It is not necessarily the same as the value type. We need to use INSERT_SUBVECTOR/EXTRACT_SUBVECTOR/BITCAST to make it legal to put the value in the corresponding register class. Differential Revision: https://reviews.llvm.org/D97480	2021-03-15 11:02:18 +08:00
Craig Topper	51151828ac	[RISCV] Teach normaliseSetCC to canonicalize X > -1 to X >= 0 and X < 1 to 0 >= X. This allows the use of BGE with X0 instead of putting -1/1 in a register. Reviewed By: jrtc27 Differential Revision: https://reviews.llvm.org/D98542	2021-03-12 11:50:10 -08:00
Fraser Cormack	641f5700f9	[RISCV] Optimize INSERT_VECTOR_ELT sequences This patch optimizes the codegen for INSERT_VECTOR_ELT in various ways. Primarily, it removes the use of vslidedown during lowering, and the vector element is inserted entirely using vslideup with a custom VL and slide index. Additionally, lowering of i64-element vectors on RV32 has been optimized in several ways. When the 64-bit value to insert is the same as the sign-extension of the lower 32-bits, the codegen can follow the regular path. When this is not possible, a new sequence of two i32 vslide1up instructions is used to get the vector element into a vector. This sequence was suggested by @craig.topper. From there, the value is slid into the final position for more consistent lowering across RV32 and RV64. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98250	2021-03-12 09:13:38 +00:00
Fraser Cormack	4d2d5855c7	[RISCV] Fix up stale VECREDUCE comments. NFC. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98399	2021-03-12 08:49:46 +00:00
Craig Topper	1d26bbcf9b	[RISCV] Return false from isShuffleMaskLegal except for splats. We don't support any other shuffles currently. This changes the bswap/bitreverse tests that check for this in their expansion code. Previously we expanded a byte swapping shuffle through memory. Now we're scalarizing and doing bit operations on scalars to swap bytes. In the future we can probably use vrgather.vx to do a byte swap shuffle.	2021-03-11 20:02:49 -08:00
Craig Topper	c82f442954	[RISCV] Support fixed vector copysign. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98394	2021-03-11 09:57:24 -08:00
Craig Topper	0dff8a9627	[RISCV] Handle vmv.x.s intrinsic for i64 vectors on RV32. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98372	2021-03-11 09:39:50 -08:00
Craig Topper	9c841cb8e8	[RISCV] Support extract_vector_elt for fixed and scalable masked registers. This uses a really simple approach of converting to an i8 vector and extracting. This is probably not the best approach especially if you know the index is constant. Other ideas: -Store to stack temporary using vse1, load as scalar and shift. -Sort of bitcast the vector to a vector of i8, slide down the appropriate 8 bit element, copy to scalar, shift down the correct bit within the 8 bits we extracted. Not exactly sure how to describe such a bitcast from i1 vector to i8 vector within the type system for elements less than 8. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98310	2021-03-11 09:26:44 -08:00
Craig Topper	9106d04554	[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32. On riscv32, i64 isn't a legal scalar type but we would like to support scalable vectors of i64. This patch introduces a new node that can represent a splat made of multiple scalar values. I've used this new node to solve the current crashes we experience when getConstant is used after type legalization. For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS when needed and then handling the SPLAT_VECTOR_PARTS later during LegalizeOps. I've removed the special case I previously put in for ABS for D97991 as the default expansion is now able to successfully use getConstant. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98004	2021-03-10 09:46:18 -08:00
Craig Topper	0c73a506e8	[RISCV] Start fixing issues that prevent us from testing vXi64 intrinsics on RV32. Currently we crash in type legalization any time an intrinsic uses a scalar i64 on RV32. This patch adds support for type legalizing this to prevent crashing. I don't promise that it uses the best possible codegen just that it is functional. This first version handles 3 cases. vmv.v.x intrinsic, vmv.s.x intrinsic and intrinsics that take a scalar input, splat it and then do some operation. For vmv.v.x we'll either rely on hardware sign extension for constants or we'll convert it to multiple splats and bit manipulation. For vmv.s.x we use a really unoptimal sequence inspired by what we do for an INSERT_VECTOR_ELT. For the third case we'll either try to use the .vi form for constants or convert to a complicated splat and bitmanip and use the .vv form of the operation. I've renamed the ExtendOperand field to SplatOperand and now use it specifically for the third case. The first two cases are handled by custom lowering specifically for those intrinsics. I haven't updated all tests yet, but I tried to cover a subset that includes single-width, widening, and narrowing. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97895	2021-03-10 09:45:38 -08:00
Craig Topper	1e39118638	[RISCV] Manually split vector operands to VECREDUCE when handling vXi64 vectors on RV32. The type legalizer will visit the result before the operands. To avoid creating an illegal target specific node or falling back to scalarization, we need to manually split vector operands. This still doesn't handle the case of non-power of 2 operands which need to be widened. I'm not sure the type legalizer is ready for it. I think we would need to insert an INSERT_SUBVECTOR with the power of 2 type we want, with an undef first operand, and the non-power of 2 original operand as the vector to insert. Then fill the neutral elements into the padded elements. Alternatively we INSERT_SUBVECTOR into a neutral vector. From there we carry on splitting if needed to get to a legal type then do the target specific code. The problem with this is the type legalizer doesn't know how to widen an insert_subvector yet. We would need to add that including the handling for a non-undef first vector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98292	2021-03-10 09:27:38 -08:00
Craig Topper	351844edf1	[RISCV] Add support for VECTOR_REVERSE for scalable vector types. I've left mask registers to a future patch as we'll need to convert them to full vectors, shuffle, and then truncate. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97609	2021-03-09 10:03:45 -08:00
Craig Topper	77ac3166e5	[RISCV] Add support for fixed vector reductions. I've included tests that require type legalization to split the vector. The i64 version of these scalarizes on RV32 due to type legalization visiting the result before the vector type. So we have to abort our custom expansion to avoid creating target specific nodes with an illegal type. Then type legalization ends up scalarizing. We might be able to fix this by doing custom splitting for large vectors in our handler to get down to a legal type. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98102	2021-03-09 09:39:59 -08:00
Craig Topper	1c7ad4dd88	[RISCV] Don't modify the SEW immediate on the V extension pseudo instructions after inserting VSETVLI. Previously we set the value to -1, but the SEW information could be useful for scheduling. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D98062	2021-03-09 09:02:19 -08:00
Craig Topper	72ecf2f43f	[RISCV] Optimize fixed vector ABS. Fix crash on scalable vector ABS for SEW=64 with RV32. The default fixed vector expansion uses sra+xor+add since it can't see that smax is legal due to our custom handling. So we select smax(X, sub(0, X)) manually. Scalable vectors are able to use the smax expansion automatically for most cases. It crashes in one case because getConstant can't build a SPLAT_VECTOR for nxvXi64 when i64 scalars aren't legal. So we manually emit a SPLAT_VECTOR_I64 for that case. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97991	2021-03-09 08:51:03 -08:00
Craig Topper	7a64cc4a76	[RISCV] Make use of DAG.getNeutralElement in lowerVECREDUCE to avoid repeating the same list of constants. NFC Reviewed By: frasercrmck, khchen Differential Revision: https://reviews.llvm.org/D98091	2021-03-08 09:11:10 -08:00
Fraser Cormack	18173c57bd	[RISCV] Add new entry points to getContainerForFixedLengthVector While working on adding fixed-length vectors to the calling convention, it was necessary to be able to query for a fixed-length vector container type without access to an instance of SelectionDAG. This patch modifies the "main" getContainerForFixedLengthVector function to use an instance of TargetLowering rather than SelectionDAG, and preserves the SelectionDAG overload as a wrapper. An additional non-static version of the function was also added to simplify the common case in RISCVTargetLowering. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97925	2021-03-08 09:26:19 +00:00
Craig Topper	c91b3c9e63	[RISCV] Fold (select_cc (setlt X, Y), 0, ne, trueV, falseV) -> (select_cc X, Y, lt, trueV, falseV) A setcc can be created during LegalizeDAG after select_cc has been created. This combine will enable us to fold these late setccs. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98132	2021-03-07 09:44:56 -08:00
Craig Topper	fdbd5d3206	[RISCV] Fold (select_cc (xor X, Y), 0, eq/ne, trueV, falseV) -> (select_cc X, Y, eq/ne, trueV, falseV) This pattern occurs when lowering for overflow operations introduces an xor after select_cc has already been formed. I had to rework another combine that looked for select_cc of an xor with 1. That xor will now get combined away so we just need to look for the RHS of the select_cc being 1. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98130	2021-03-07 09:29:55 -08:00
Fraser Cormack	8e7ceffd0b	[RISCV] Fix crash when inserting large fixed-length subvectors This patch addresses a compiler crash resulting from passing a fixed-length type to one that expects scalable vector types. An assertion was added to prevent this regressing in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97868	2021-03-04 09:27:16 +00:00
Fraser Cormack	d8e1d2ebf4	[RISCV] Preserve fixed-length VL on insert_vector_elt in more cases This patch fixes up one case where the fixed-length-vector VL was dropped (falling back to VLMAX) when inserting vector elements, as the code would lower via ISD::INSERT_VECTOR_ELT (at index 0) which loses the fixed-length vector information. To this end, a custom node, VMV_S_XF_VL, was introduced to carry the VL operand through to the final instruction. This node wraps the RVV vmv.s.x and vmv.s.f instructions, which were being selected by insert_vector_elt anyway. There should be no observable difference in scalable-vector codegen. There is still one outstanding drop from fixed-length VL to VLMAX, when an i64 element is inserted into a vector on RV32; the splat (which is custom legalized) has no notion of the original fixed-length vector type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97842	2021-03-04 09:21:10 +00:00
Fraser Cormack	c1695ddf7d	[RISCV] Support fixed-length INSERT_VECTOR_ELT This patch enables support for lowering INSERT_VECTOR_ELT on fixed-length vector types. The strategy follows that for scalable vector types. This patch also includes a quick fix to prevent the compiler infinitely looping between lowering BUILD_VECTOR as VECTOR_SHUFFLE and back again. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97698	2021-03-02 16:48:38 +00:00
Fraser Cormack	de2b70010a	[RISCV] Lower CONCAT_VECTORS to INSERT_SUBVECTOR nodes The default expansion of CONCAT_VECTORS goes through the stack. This patch avoids that penalty by custom-lowering CONCAT_VECTORS to a series of INSERT_SUBVECTOR nodes. Further optimizations are possible, but this is a good start. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97692	2021-03-02 11:13:59 +00:00
Fraser Cormack	3fea9226ee	[RISCV] Support INSERT_SUBVECTOR on vector masks Like with EXTRACT_SUBVECTOR, INSERT_SUBVECTOR poses a problem for vector masks as RVV isn't able to slide mask types around. We choose instead to bitcast to equivalently-sized i8 types where we can, else we zero-extend, perform the operation, and truncate back down. One test was left disabled due to a crash in the legalizer. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97559	2021-03-01 12:04:11 +00:00
Fraser Cormack	e80ca3af82	[RISCV] Fix INSERT/EXTRACT_SUBVECTOR on fractional LMUL types This patch fixes a bug where the lowering for INSERT_SUBVECTOR and EXTRACT_SUBVECTOR would insist on first extracting a register-aligned LMUL1 vector type before performing the slide up/down. This was even if the vector was a fractional LMUL type, in which case the aligned EXTRACT_SUBVECTOR was invalid. This issue only occurred for scalable vector types, but a variety of tests for both scalable and fixed-length vectors have been added to ensure this does not regress in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97556	2021-03-01 11:51:05 +00:00
Fraser Cormack	4ea734e6ec	[RISCV] Unify scalable- and fixed-vector INSERT_SUBVECTOR lowering This patch unifies the two disparate paths for lowering INSERT_SUBVECTOR operations under one roof. Consequently, with this patch it is possible to support any fixed-length subvector insertion, not just "cast-like" ones. As before, support for the insertion of mask vectors will come in a separate patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97543	2021-03-01 11:38:47 +00:00
Fraser Cormack	bd4d421688	[RISCV] Support EXTRACT_SUBVECTOR on vector masks This patch adds support for extracting subvectors from vector masks. This can be either extracting a scalable vector from another, or a fixed-length vector from a fixed-length or scalable vector. Since RVV lacks a way to slide vector masks down on an element-wise basis and we don't know the true length of the vector registers, in many cases we must resort to using equivalently-sized i8 vectors to perform the operation. When this is not possible we fall back and extend to a suitable i8 vector. Support was also added for fixed-length truncation to mask types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97475	2021-03-01 11:20:09 +00:00
Fraser Cormack	37014db013	[RISCV] Use existing method for the LMUL1 type. NFCI. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97467	2021-02-26 09:44:05 +00:00
Craig Topper	d7fca3f0bf	[RISCV] Support fixed vector extract_element for FP types.	2021-02-25 16:30:28 -08:00
Fraser Cormack	02f435db0b	[RISCV] Support fixed-length vector i2fp/fp2i conversions This patch extends the support for scalable-vector int->fp and fp->int conversions by additionally handling fixed-length vectors. The existing scalable-vector lowering re-expresses widening/narrowing by x4+ conversions as standard nodes. The fixed-length vector support slots in at "the end" of this process by lowering the now equally-sized and widening/narrowing by x2 nodes to our custom VL versions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97374	2021-02-25 13:47:58 +00:00
Fraser Cormack	9620ce90d7	[RISCV] Support fixed-length vector FP_ROUND & FP_EXTEND This patch extends the support for vector FP_ROUND and FP_EXTEND by including support for fixed-length vector types. Since fixed-length vectors use "VL" nodes and scalable vectors can use the standard nodes, there is slightly more to do in the fixed-length case. A helper function was introduced to try and reduce the divergent paths. It is expected that this function will similarly come in useful for lowering the int-to-fp and fp-to-int operations for fixed-length vectors. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97301	2021-02-25 12:16:06 +00:00
Fraser Cormack	84413e1947	[RISCV] Support fixed-length vector truncates This patch extends support for our custom-lowering of scalable-vector truncates to include those of fixed-length vectors. It does this by co-opting the custom RISCVISD::TRUNCATE_VECTOR node and adding mask and VL operands. This avoids unnecessary duplication of patterns and inflation of the ISel table. Some truncates go through CONCAT_VECTORS which currently isn't efficiently handled, as it goes through the stack. This can be improved upon in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97202	2021-02-25 12:11:34 +00:00
Fraser Cormack	3bc5ed3875	[RISCV] Support fixed-length vector sign/zero extension This patch adds support for the custom lowering of sign- and zero-extension of fixed-length vector types. It does so through custom nodes. Since the source and destination types are (necessarily) of different sizes, it is possible that the source type is legal whilst the larger destination type isn't. In this case the legalization makes heavy use of EXTRACT_SUBVECTOR. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97194	2021-02-25 12:05:17 +00:00
Fraser Cormack	821f8bb29a	[RISCV] Unify scalable- and fixed-vector EXTRACT_SUBVECTOR lowering This patch unifies the two disparate paths for lowering EXTRACT_SUBVECTOR operations under one roof. Consequently, with this patch it is possible to support any fixed-length subvector extraction, not just "cast-like" ones. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97192	2021-02-25 11:46:57 +00:00
Craig Topper	efcdd598b7	[RISCV] Teach VSETVLI inserter to use VSETIVLI when possible. We always create the VL operand using a register, but if we can determine that it came from an ADDI X0, imm with a sufficiently small immediate, we can use VSETIVLI. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97332	2021-02-24 16:07:33 -08:00
Craig Topper	086670d367	[RISCV] Support fixed vector extract element. Use VL=1 for scalable vector extract element. I've changed to use VL=1 for slidedown and shifts to avoid extra element processing that we don't need. The i64 fixed vector handling on i32 isn't great if the vector type isn't legal due to an ordering issue in type legalization. If the vector type isn't legal, we fall back to default legalization which will bitcast the vector to vXi32 and use two independent extracts. Doing better will require handling several different cases by manually inserting insert_subvector/extract_subvector to adjust the type to a legal vector before emitting custom nodes. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97319	2021-02-24 10:17:00 -08:00
Fraser Cormack	dd68f3cf28	[RISCV] Support insertion of misaligned subvectors This patch extends the support for RVV INSERT_SUBVECTOR to cover those which don't align to a vector register boundary. Like the support for EXTRACT_SUBVECTOR in D96959, it accomplishes this by extracting the nearest register-sized subvector (a subregister operation), then sliding the vector down with VSLIDEDOWN, inserting the subvector to the first position, and sliding the vector back up again afterwards. Unlike subvector extraction, for vectors that occupy less than a full vector register we must preserve the untouched elements. We do this by lowering to an LMUL=1 INSERT_SUBVECTOR using the above method and lowering that to a VSLIDEUP with a zero offset. This uses a tail-undisturbed policy and so has the effect of "sliding in" the subvector elements while preserving the surrounding ones. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96972	2021-02-23 10:31:06 +00:00
Craig Topper	1cd2a5a7da	[RISCV] Add isel support for bitcasts between fixed vector types. This should fix the issue reported in D96972. I don't have a good test case for this without those changes. Differential Revision: https://reviews.llvm.org/D97082	2021-02-22 12:05:46 -08:00
Craig Topper	1aeb927fed	[RISCV] Custom isel the rest of the vector load/store intrinsics. A previous patch moved the index versions. This moves the rest. I also removed the custom lowering for VLEFF since we can now do everything directly in the isel handling. I had to update getLMUL to handle mask registers to index the pseudo table correctly for VLE1/VSE1. This is good for another 15K reduction in llc size. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97097	2021-02-22 09:53:46 -08:00
Fraser Cormack	3e1317fd32	[RISCV] Support extraction of misaligned subvectors This patch extends the support for RVV EXTRACT_SUBVECTOR to cover those which don't align to a vector register boundary. It accomplishes this by extracting the nearest register-sized subvector (a subregister operation), then sliding the vector down with VSLIDEDOWN and extracting the subvector from the first position (a COPY operation). Since this procedure involves the use of VSCALE and multiplication, the handling of such operations is done during lowering to simplify the implementation and make use of DAG combining. This necessitated moving some helper functions from RISCVISelDAGToDAG to RISCVTargetLowering. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96959	2021-02-20 15:43:54 +00:00
Craig Topper	98dff5e804	[RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64 We previously used isel patterns for this, but that used quite a bit of space in the isel table due to OR being associative and commutative. It also wouldn't handle shifts/ands being in reversed order. This generalizes the shift/and matching from GREVI to take the expected mask table as input so we can reuse it for SHFLI. There is no SHFLIW instruction, but we can promote a 32-bit SHFLI to i64 on RV64. As long as bit 4 of the control bit isn't set, a 64-bit SHFLI will preserve 33 sign bits if the input had at least 33 sign bits. ComputeNumSignBits has been updated to account for that to avoid sext.w in the tests. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96661	2021-02-19 10:07:12 -08:00
Craig Topper	156fc07e19	[RISCV] Add support for fixed vector MULHU/MULHS. This uses the division by constant optimization to use MULHU/MULHS. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D96934	2021-02-18 09:15:08 -08:00
Craig Topper	792627be35	[RISCV] Add support for fixed vector sign/zero extend from mask types. Due to vXi64 on RV32, I've directly emitted this using _VL ISD opcodes. If it wasn't for that we could just use fixed vector BUILD_VECTOR and VSELECT and let those each be legalized. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96910	2021-02-18 09:08:10 -08:00
Craig Topper	016eca8f90	[RISCV] Guard LowerINSERT_VECTOR_ELT against fixed vectors. The type legalizer can call this code based on the scalar type so we need to verify the vector type is a scalable vector. I think due to how type legalization visits nodes, the vector type will have already been legalized so we don't have an issue with using MVT here like we did for EXTRACT_VECTOR_ELT. I've added a test just in case.	2021-02-17 19:27:08 -08:00
Craig Topper	00c4e0a8f6	[RISCV] Guard the ISD::EXTRACT_VECTOR_ELT handling in ReplaceNodeResults against fixed vectors and non-MVT types. The type legalizer is calling this code based on the scalar type so we need to verify the input type is a scalable vector. The vector type has also not been legalized yet when this is called so we need to use EVT for it.	2021-02-17 18:25:38 -08:00
Craig Topper	3bdd02735b	[RISCV] Localize RISCVZvlssegTable to RISCVISelDAGToDAG.cpp, the only place it is used.	2021-02-17 11:37:28 -08:00
Fraser Cormack	d81161646a	[RISCV] Add support for fixed vector vselect This patch adds support for fixed-length vector vselect. It does so by lowering them to a custom unmasked VSELECT_VL node with a vector length operand. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96768	2021-02-17 10:59:00 +00:00
Craig Topper	07ca13fe07	[RISCV] Add support for fixed vector mask logic operations. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96741	2021-02-16 09:34:00 -08:00
Fraser Cormack	04977ce5ce	[RISCV] Fix a crash in fixed-length build_vector lowering Non-splatted non-integer build_vector nodes were mistakenly being lowered as VID expressions, which should not happen. VID can only be used to select integer build_vector nodes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96718	2021-02-16 10:25:15 +00:00
Fraser Cormack	b870199020	[RISCV] Add patterns for scalable-vector fabs & fcopysign The patterns mostly follow the scalar counterparts, save for some extra optimizations to match the vector/scalar forms. The patch adds a DAGCombine for ISD::FCOPYSIGN to try and reorder ISD::FNEG around any ISD::FP_EXTEND or ISD::FP_TRUNC of the second operand. This helps us achieve better codegen to match vfsgnjn. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96028	2021-02-16 10:21:09 +00:00
Craig Topper	7ba2e1c601	[RISCV] Add support for fixed vector floating point setcc. This is annoying because the condition code legalization belongs to LegalizeDAG, but our custom handler runs in Legalize vector ops which occurs earlier. This adds some of the mask binary operations so that we can combine multiple compares that we need for expansion. I've also fixed up RISCVISelDAGToDAG.cpp to handle copies of masks. This patch contains a subset of the integer setcc patch as well. That patch is dependent on the integer binary ops patch. I'll rebase based on what order the patches go in. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96567	2021-02-15 12:52:25 -08:00
Fraser Cormack	4bd5bd4009	[RISCV] Convert VSLIDE(UP\|DOWN) nodes to "VL" versions (NFC) This patch prepares the RISCV VSLIDEUP and VSLIDEDOWN custom nodes to ones carrying additional mask and vector-length operands. This is primarily so they can be used by both systems. This also takes the opportunity to create some helper functions to deal with the common task of getting the default (unmasked) VL operands. Reviewed By: craig.topper, arcbbb Differential Revision: https://reviews.llvm.org/D96505	2021-02-15 10:32:56 +00:00
Craig Topper	4220a81c84	[RISCV] Add support for fixed vector fabs	2021-02-12 15:33:36 -08:00
Craig Topper	36658376d5	[RISCV] Add support for fixed vector sqrt.	2021-02-12 15:33:29 -08:00
Craig Topper	1697cc78b1	[RISCV] Add support for integer fixed vector setcc I believe I've covered all orderings of splat operands here. Better canonicalization in lowering might help reduce this. I did not handle the immediate adjustments needed for set(u)gt/set(u)lt. Testing here is limited to byte types because the scalable vector type used for masks for the store is calculated assuming 8 byte elements. But for the setcc it's based on the element count of the container type for the setcc input. So they don't agree. We'll need to enhance D96352 to handle this I think. Differential Revision: https://reviews.llvm.org/D96443	2021-02-12 09:29:41 -08:00
Fraser Cormack	e88da1d677	[RISCV] Add support for integer fixed min/max This patch extends the initial fixed-length vector support to include smin, smax, umin, and umax. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96491	2021-02-12 09:19:45 +00:00
Craig Topper	033b1bd185	[RISCV] Add support for loads, stores, and splats of vXi1 fixed vectors. This refines how we determine which mask types are legal and adds support for loads, stores, and all ones/zeros splats. I left a fixme in store handling where I think we need to zero extra bits if the type isn't a multiple of a byte. If I remember right from X86 there was some case we could have a store of a 1, 2, or 4 bit mask and have a scalar zextload that then expected the bits to be 0. It's tricky to zero the bits with RVV. We need to do something like round VL up, zero a register, lower the VL back down, then do a tail undisturbed move into the zero register. Another option might be to generate a mask of 1/2/4 bits set with a VL of 8 and use that to mask off the bits. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96468	2021-02-11 09:13:16 -08:00
Craig Topper	0c254b4a69	[RISCV] Add support for selecting vrgather.vx/vi for fixed vector splat shuffles. The test cases extract a fixed element from a vector and splat it into a vector. This gets DAG combined into a splat shuffle. I've used some very wide vectors in the test to make sure we have at least a couple tests where the element doesn't fit into the uimm5 immediate of vrgather.vi so we fall back to vrgather.vx. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96186	2021-02-10 10:01:56 -08:00
Fraser Cormack	a3c74d6d53	[RISCV] Add support for selecting vid.v from build_vector This patch optimizes a build_vector "index sequence" and lowers it to the existing custom RISCVISD::VID node. This pattern is common in autovectorized code. The custom node was updated to allow it to be used by both scalable and fixed-length vectors, thus avoiding pattern duplication. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96332	2021-02-10 10:58:40 +00:00
Hsiangkai Wang	a5b07a221a	[RISCV] Initial support of LoopVectorizer for RISC-V Vector. Define an option -riscv-vector-bits-max to specify the maximum vector bits for the vectorizer. The loop vectorizer will use the value to check if it is safe to use the whole vector registers to vectorize the loop. It is not the optimum solution for loop vectorizing for scalable vectors. It assumes the whole vector registers will be used to vectorize the code. If it is possible, we should configure vl to vectorize instead of using whole vector registers. We only consider LMUL = 1 in this patch. This patch is just initial work on the loop vectorizer for RISC-V Vector. Differential Revision: https://reviews.llvm.org/D95659	2021-02-09 06:32:18 +08:00
Craig Topper	8d8cafa32e	[RISCV] Add support for splat fixed length build_vectors using RVV. Building on the fixed vector support from D95705 I've added ISD nodes for vmv.v.x and vfmv.v.f and switched to lowering the intrinsics to it. This allows us to share the same isel patterns for both. This doesn't handle splats of i64 on RV32 yet. The build_vector gets converted to a vXi32 build_vector+bitcast during type legalization. Not sure the best way to handle this at the moment. Differential Revision: https://reviews.llvm.org/D96108	2021-02-08 11:12:56 -08:00
Craig Topper	b8d719fbe8	[RISCV] Add support for fixed vector FMA. Follow up to D95705. Does not include the commuting support from D95800. Differential Revision: https://reviews.llvm.org/D96103	2021-02-08 11:12:56 -08:00
Craig Topper	a719b667a9	[RISCV] Add initial support for converting fixed vectors to scalable vectors during lowering to use RVV instructions. This is an alternative to D95563. This is modeled after a similar feature for AArch64's SVE that uses predicated scalable vector instructions. Rather than use predication, this patch uses an explicit VL operand. I've limited it to always use LMUL=1 for now, but we can improve this in the future. This requires a bunch of new ISD opcodes to carry the VL operand. I think we can probably lower intrinsics to these ISD opcodes to cut down on the size of the isel table. Which is why I've added patterns for all integer/float types and not just LMUL=1. I'm only testing one vector width right now, but the width is programmable via the command line. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95705	2021-02-08 10:41:30 -08:00
Craig Topper	b7b4f4cbc3	[RISCV] Make scalable vector FMA commutable for register allocation. This adds support for commuting operands and converting between vfmadd and vfmacc to avoid register copies. To avoid messing up intrinsic behavior, I've added new pseudo instructions that have the isCommutable flag set. These pseudos also force a tail agnostic policy. The intrinsic versions still use the tail undisturbed policy. For best results it looks like we need to start with fmadd and only pick fmacc if it's beneficial. MachineCSE commutes without constraining the operands and then commutes back if it didn't help with CSE. So I've made sure that when the operand choice isn't constrained, we will keep fmadd for MachineCSE and when it does the second commute, we get back the original instruction. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95800	2021-02-08 10:05:33 -08:00
Mikael Holmen	eb8c27c60c	[RISCV] Use std::make_tuple to make some toolchains happy again My toolchain (LLVM 8.0, libstdc++ 5.4.0) complained with: 12:38:19 ../lib/Target/RISCV/RISCVISelLowering.cpp:1717:12: error: chosen constructor is explicit in copy-initialization 12:38:19 return {RISCVISD::VECREDUCE_FADD, Op.getOperand(0), 12:38:19 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 12:38:19 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here 12:38:19 constexpr tuple(_UElements&&... __elements) 12:38:19 ^ 12:38:19 ../lib/Target/RISCV/RISCVISelLowering.cpp:1720:12: error: chosen constructor is explicit in copy-initialization 12:38:19 return {RISCVISD::VECREDUCE_SEQ_FADD, Op.getOperand(1), Op.getOperand(0)}; 12:38:19 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 12:38:19 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here 12:38:19 constexpr tuple(_UElements&&... __elements) 12:38:19 ^ 12:38:19 2 errors generated. This commit adds explicit calls to std::make_tuple to work around the problem.	2021-02-08 14:37:25 +01:00
Fraser Cormack	b46aac125d	[RISCV] Support the scalable-vector fadd reduction intrinsic This patch adds support for the fadd reduction intrinsic, in both the ordered and unordered modes. The fmin and fmax intrinsics are not currently supported due to a discrepancy between the LLVM semantics and the RVV ISA behaviour with regards to signaling NaNs. This behaviour is likely fixed in version 2.3 of the RISC-V F/D/Q extension, but until then the intrinsics can be left unsupported. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95870	2021-02-08 09:52:27 +00:00
Fraser Cormack	e046c0c28b	[RISCV] Support scalable-vector integer reduction intrinsics This patch adds support for the integer reduction intrinsics supported by RVV. This excludes "mul" which has no corresponding instruction. The reduction instructions in RVV have slightly complicated type constraints given they always produce a single "M1" vector register. They are lowered to custom nodes including the second "scalar" reduction operand to simplify the patterns and in the hope that they can be useful for future DAG combines. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95620	2021-02-05 10:10:08 +00:00
Fraser Cormack	c3eb2da6c4	[RISCV] Optimize sign-extended EXTRACT_VECTOR_ELT nodes This patch custom-legalizes all integer EXTRACT_VECTOR_ELT nodes where SEW < XLEN to VMV_S_X nodes to help the compiler infer sign bits from the result. This allows us to eliminate redundant sign extensions. For parity, all integer EXTRACT_VECTOR_ELT nodes are legalized this way so that we don't need TableGen patterns for some and not others. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95741	2021-02-05 10:05:22 +00:00
Craig Topper	44cc5abbf9	[RISCV] Custom lower fshl/fshr with Zbt extension. We need to add a mask to the shift amount for these operations to use the FSR/FSL instructions. We were previously doing this in isel patterns, but custom lowering will make the mask visible to optimizations earlier.	2021-01-31 17:49:15 -08:00
Fraser Cormack	fc2f27ccf3	[RISCV] Add support for RVV int<->fp & fp<->fp conversions This patch adds support for the full range of vector int-to-float, float-to-int, and float-to-float conversions on legal types. Many conversions are supported natively in RVV so are lowered with patterns. These include conversions between (element) types of the same size, and those that are half/double the size of the input. When conversions take place between types that are less than half or more than double the size we must lower them using sequences of instructions which go via intermediate types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95447	2021-01-28 09:50:32 +00:00
Craig Topper	a40e01e442	[RISCV] Rework fault first only load isel. -Remove the ISD opcode for READ_VL. Just emit the MachineSDNode directly. -Move segmented fault first only load intrinsic handling completely to RISCVISelDAGToDAG.cpp and emit the ReadVL MachineSDNode there instead of lowering to ISD opcodes first.	2021-01-27 11:51:41 -08:00
Craig Topper	04570e98c8	[RISCV] Group the legal vector types into lists we can iterate over in the RISCVISelLowering constructor Remove the RISCVVMVTs namespace because I don't think it provides a lot of value. If we change the mappings we'd likely have to add or remove things from the list anyway. Add a wrapper around addRegisterClass that can determine the register class from the fixed size of the type. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D95491	2021-01-27 10:20:12 -08:00
Fraser Cormack	9a75a808c2	[RISCV] Fix a codegen crash in getSetCCResultType This patch fixes some crashes coming from `RISCVISelLowering::getSetCCResultType`, which would occasionally return an EVT constructed from an invalid MVT, which has a null Type pointer. The attached test shows this happening currently for some fixed-length vectors, which hit this issue when the V extension was enabled, even though they're not legal types under the V extension. The fix was also pre-emptively extended to scalable vectors which can't be represented as an MVT, even though a test case couldn't be found for them. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95434	2021-01-27 10:22:54 +00:00
Craig Topper	f9d7f77267	[RISCV] Have customLegalizeToWOp truncate to the original type instead of i32 now that we use it for i8/i16 as well. `239cfbccb0` added support for legalizing i8/i16 UDIV/UREM/SDIV to use *W instructions. So we need to truncate to i8/i16 if we're legalizing one of those.	2021-01-26 10:50:03 -08:00
Hsiangkai Wang	b69932b550	[RISCV] Implement vlsegff intrinsics. Differential Revision: https://reviews.llvm.org/D95303	2021-01-26 12:02:43 +08:00
Fraser Cormack	15141cd115	[RISCV] Add RVV insertelt/extractelt scalable-vector patterns Original patch by @rogfer01. This patch adds support for insertelt and extractelt operations on scalable vectors. Special care must be taken on RV32 when dealing with i64 vectors as there are no straightforward ways to insert a 64-bit element without a register of that size. To that end, both are custom-lowered to different sequences. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94615	2021-01-25 22:03:52 +00:00
Craig Topper	239cfbccb0	[RISCV] Custom type legalize i8/i16 UDIV/UREM/SDIV on RV64 so we can use divuw/remuw/divw. This makes our i8/i16 codegen more similar to the i32 codegen. I've also added computeKnownBits support for DIVUW/REMUW so that we can remove zero extending ANDs from the output. Without this we end up turning DIVUW/REMUW back into DIVU/REMU via some isel patterns. Reviewed By: frasercrmck, luismarques Differential Revision: https://reviews.llvm.org/D95322	2021-01-25 10:47:22 -08:00
Craig Topper	4eb4f8963f	[RISCV] Use sign extend for i32 arguments and returns in makeLibCall on RV64. As far as I know 32 bits arguments and returns on RV64 are always sign extended to i64. So I think we should be taking this into account around libcalls. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95285	2021-01-25 09:33:48 -08:00
Fraser Cormack	fde2466171	[SelectionDAG] Support scalable-vector splats in more cases This patch adds support for scalable-vector splats in DAGCombiner's `isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions, which enable the SelectionDAG div/rem-by-constant optimizations for scalable vector types. It also fixes up one case where the UDIV optimization was generating a SETCC without first consulting the target for its preferred SETCC result type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94501	2021-01-25 10:58:15 +00:00
Craig Topper	4d5aa760a7	[RISCV] Add support for rev8 and orc.b to Zbb. These instructions use a portion of the encodings for grevi and gorci. The full encodings are only supported with Zbp. Note, rev8 has a different encoding between rv32 and rv64. Zbb is closer to being finalized than Zbp which has motivated some decisions in this patch. I'm treating rev8 and orc.b as separate instructions when either Zbb or Zbp is enabled. This allows us to print to suggest that either feature needs to be enabled to support these mnemonics. I had tried to put HasStdExtZbbAndNotZbp on the Zbb instructions, but that caused a diagnostic that said Zbp is required if neither feature is enabled. We should really mention Zbb since it's closer to final. This does require extra isel patterns for the different cases so that bswap will always print as rev8 in assembly listing since we can't use an InstAlias. llvm-objdump disassembling should always pick the rev8 or orc.b instructions. llvm-mc parsing and printing text will not convert the grevi/gorci spellings to rev8/orc.b. We could probably fix this with a special case in processInstruction in the assembly parser if it is important. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94944	2021-01-22 12:49:10 -08:00
Craig Topper	3b5430eb0d	[RISCV] Add a VL output to vleff intrinsics. The fault-only-first-load instructions can reduce VL if an element other than element 0 triggers a memory fault. This can be used to vectorize loops with data dependent exit conditions like strcmp or strlen. This patch adds a VL output to these intrinsics so that the new VL value can be captured by software. This will be expanded to 'csrr gpr, vl' after the vleff instruction during SelectionDAG. By doing this with one intrinsic we are able to guarantee that the csrr reads the VL value produced by the vleff instruction. Having it as a separate intrinsic would make it impossible to guarantee ordering without making every other vector intrinsic have side effects. The intrinsics are expanded during lowering into two ISD nodes that are glued together. These ISD nodes will go through isel separately, but should maintain the glue so that they get emitted adjacently by InstrEmitter. I've only run the chain through the vleff instruction, allowing the READ_VL to be deleted if it is unused. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D94286	2021-01-21 17:19:58 -08:00
Hsiangkai Wang	6e360460f1	[RISCV] Use v8-v23 as argument registers to conform to the proposal. The maximum LMUL is 8. We need 16 vector registers for two LMUL-8 arguments. The modification follows the proposal of psABI in https://github.com/riscv/riscv-elf-psabi-doc/pull/171 Differential Revision: https://reviews.llvm.org/D95134	2021-01-22 07:55:24 +08:00
Michael Munday	4ab0f51a75	Recommit "[RISCV] Legalize select when Zbt extension available" This recommits `71ed4b6ce5` with the polarity of some of the pattern corrected. Original commit message: The custom expansion of select operations in the RISC-V backend interferes with the matching of cmov instructions. Legalizing select when the Zbt extension is available solves that problem. Reviewed By: luismarques, craig.topper Differential Revision: https://reviews.llvm.org/D93767	2021-01-21 12:07:44 -08:00
Craig Topper	9d792fef57	[RISCV] Remove unnecessary APInt copy. NFC getAPIntValue returns a const APInt& so keep it as a reference.	2021-01-20 10:33:09 -08:00
Hsiangkai Wang	8ca4b174d7	[RISCV] Implement vlseg intrinsics. For Zvlsseg, we need continuous vector registers for the values. We need to define new register classes for the different combinations of (number of fields and LMUL). For example, when the number of fields(NF) = 3, LMUL = 2, the values will be assigned to (V0M2, V2M2, V4M2), (V2M2, V4M2, V6M2), (V4M2, V6M2, V8M2), ... We define the vlseg intrinsics with multiple outputs. There is no way to describe the codegen patterns with multiple outputs in the tablegen files. We do the codegen in RISCVISelDAGToDAG and use EXTRACT_SUBREG to extract the values of output. The multiple scalable vector values will be put into a struct. This patch depends on the support for scalable vector struct. Differential Revision: https://reviews.llvm.org/D94229	2021-01-20 14:26:04 +08:00
Craig Topper	ce8b3937dd	[RISCV] Add DAG combine to turn (setcc X, 1, setne) -> (setcc X, 0, seteq) if we can prove X is 0/1. If we are able to compare with 0 instead of 1, we might be able to fold the setcc into a beqz/bnez. Often these setccs start life as an xor that gets converted to a setcc by DAG combiner's rebuildSetcc. I looked into detecting (xor X, 1) and converting to (seteq X, 0) based on boolean contents being 0/1 in rebuildSetcc instead of using computeKnownBits. It was very perturbing to AMDGPU tests which I didn't look closely at. It had a few changes on a couple other targets, but didn't seem to be much if any improvement. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D94730	2021-01-19 11:21:48 -08:00
Fraser Cormack	9c6a00fe99	[RISCV] Add ISel patterns for scalable mask exts & truncs Original patch by @rogfer01. This patch adds support for sign-, zero-, and any-extension from scalable mask vector types to integer vector types, as well as truncation in the opposite direction. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94590	2021-01-19 18:13:15 +00:00
Fraser Cormack	ac603c8d38	[RISCV] Add scalable vector truncate patterns Original patch by @rogfer01. This patch supports vector truncates, which on RVV must be done in a series of instructions truncating by one power-of-two at a time. This is done through custom-lowering and a custom node to avoid LLVM re-combining the split TRUNCATE nodes. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94796	2021-01-18 10:18:43 +00:00
Craig Topper	383b6501ff	[RISCV] Use tail agnostic policy for instructions with tied defs if the use operand is IMPLICIT_DEF. The vcompress intrinsic is defined such that it requires a tail undisturbed policy. This patch makes it so we can use the tail agnostic policy if the user has passed vundefined to the dest operand. We need to do something similar for masked policy, but we need annotation of which instructions use the mask policy first. Not sure if this is sufficient for scheduling or if we'll need to select different pseudos that don't have a tied def. Reviewed By: evandro Differential Revision: https://reviews.llvm.org/D94566	2021-01-17 23:47:58 -08:00
Craig Topper	86e604c4d6	[RISCV] Add implementation of targetShrinkDemandedConstant to optimize AND immediates. SimplifyDemandedBits can remove set bits from immediates from instructions like AND/OR/XOR. This can prevent them from being efficiently codegened on RISCV. This adds an initial version that tries to keep or form 12 bit sign extended immediates for AND operations to enable use of ANDI. If that doesn't work we'll try to create a 32 bit sign extended immediate to use LUI+ADDIW. More optimizations are possible for different size immediates or different operations. But this is a good starting point that already has test coverage. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D94628	2021-01-15 11:14:14 -08:00
Craig Topper	b894a9fb23	[RISCV] Optimize select_cc after fp compare expansion Some FP compares expand to a sequence ending with (xor X, 1) to invert the result. If the consumer is a select_cc we can likely get rid of this xor by fixing up the select_cc condition. This patch combines (select_cc (xor X, 1), 0, setne, trueV, falseV) -> (select_cc X, 0, seteq, trueV, falseV) if we can prove X is 0/1. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D94546	2021-01-14 13:41:40 -08:00
Craig Topper	387d3c2479	[RISCV] Merge Utils library into MCTargetDesc MCTargetDesc includes headers from Utils and Utils includes headers from MCTargetDesc. So from a library layering perspective it makes sense for them to be in the same library. I guess the other option might be to move the tablegen includes from RISCVMCTargetDesc.h to RISCVBaseInfo.h so that RISCVBaseInfo.h didn't need to include RISCVMCTargetDesc.h. Everything else that depends on Utils also depends on MCTargetDesc so having one library seemed simpler. Differential Revision: https://reviews.llvm.org/D93168	2021-01-14 11:47:30 -08:00
Sam Elliott	7c9c2a2ea5	Revert "[RISCV] Legalize select when Zbt extension available" We found issues with this patch in additional testing. Backing out while we work on a fix. This reverts commit `71ed4b6ce5`.	2021-01-14 16:44:34 +00:00
Craig Topper	dfc1901d51	[RISCV] Custom lower ISD::VSCALE. This patch custom lowers ISD::VSCALE into a csrr vlenb followed by a shift right by 3 followed by a multiply by the scale amount. I've added computeKnownBits support to indicate that the csrr vlenb always produces 3 trailing bits of 0s so the shift right is "exact". This allows the shift and multiply sequence to be nicely optimized into a single shift or removed completely when the scale amount is a power of 2. The non power of 2 case multiplying by 24 is still producing suboptimal code. We could remove the right shift and use a multiply by 3. Hopefully we can improve DAG combine to fix that since it's not unique to this sequence. This replaces D94144. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D94249	2021-01-13 17:14:49 -08:00
Michael Munday	71ed4b6ce5	[RISCV] Legalize select when Zbt extension available The custom expansion of select operations in the RISC-V backend interferes with the matching of cmov instructions. Legalizing select when the Zbt extension is available solves that problem. Reviewed By: lenary, craig.topper Differential Revision: https://reviews.llvm.org/D93767	2021-01-12 21:24:38 +00:00
Fraser Cormack	37b41bd087	[RISCV] Add scalable vector fcmp ISel patterns Original patch by @rogfer01. All ordered comparisons except ONE are supported natively, and all unordered comparisons except UNE are expanded into sequences involving explicit NaN checks and mask arithmetic. Additionally, we expand GT,OGT,GE,OGE to their swapped-operand versions, and pattern-match those back to the "original", swapping operands once more. This way we catch both operations and both "vf" and "fv" forms with fewer patterns. Also add support for floating-point splat_vector, with an optimization for splatting fpimm0. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94242	2021-01-11 19:38:56 +00:00
Craig Topper	5cf73dca77	[RISCV] Convert most of the information about RVV Pseudos into bits in TSFlags. This patch moves all but the BaseInstr to bits in TSFlags. For the index fields, we can just use a bit to indicate their presence. The locations of the operands are well defined. This reduces the llc binary by about 32K on my build. It also removes the binary search of the table from the custom inserter. Instead we just check that the SEW op is present. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D94375	2021-01-10 19:15:45 -08:00
Ben Shi	55f0a1b066	[RISCV] Optimize multiplication with constant 1. Break MUL with specific constant to a SLLI and an ADD/SUB on riscv32 with the M extension. 2. Break MUL with specific constant to two SLLI and an ADD/SUB, if the constant needs a pair of LUI/ADDI to construct. Reviewed by: craig.topper Differential Revision: https://reviews.llvm.org/D93619	2021-01-09 10:37:21 +08:00
Craig Topper	c68faed041	[RISCV] Return a vXi1 vector type from getSetCCResultType if V extension is enabled. nxvXi1 types are legal with V extension and that's the result vmseq/vmsne/vmslt/etc instructions return. No test cases yet because the setcc isel patterns aren't in and we'll need more than basic tests to observe this. I locally tested that this plus D947078, D94168, D94142, and D94149 was enough to be able to handle the overflow result from llvm.sadd.overflow.	2021-01-06 11:50:15 -08:00
Fraser Cormack	1d4411e9ea	[RISCV] Add vector integer min/max ISel patterns Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94012	2021-01-05 09:15:50 +00:00
Craig Topper	79cbb003c5	[RISCV] Don't use tail agnostic policy on instructions where destination is tied to source If the destination is tied, then user has some control of the register used for input. They would have the ability to control the value of any tail elements. By using tail agnostic we take this option away from them. It's not clear that the intrinsics are defined such that this isn't supposed to work. And undisturbed is a valid implementation for agnostic so code wouldn't even fail to work on all systems if we always used agnostic. The vcompress intrinsic is defined to require tail undisturbed so at minimum we need this for that instruction or need to redefine the intrinsic. I've made an exception here for vmv.s.x/fmv.s.f and reduction instructions which only write to element 0 regardless of the tail policy. This allows us to keep the agnostic policy on those which should allow better redundant vsetvli removal. An enhancement would be to check for undef input and keep the agnostic policy, but we don't have good test coverage for that yet. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D93878	2020-12-29 10:37:58 -08:00
Fraser Cormack	aebb4a6052	[RISCV] Rewrite and simplify helper function. NFC. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D93851	2020-12-29 11:29:44 +00:00
Kazu Hirata	d6ff5cf995	[Target] Use llvm::any_of (NFC)	2020-12-24 19:43:26 -08:00
Fraser Cormack	1a7ac29a89	[RISCV] Add ISel support for RVV vector/scalar forms This patch extends the SDNode ISel support for RVV from only the vector/vector instructions to include the vector/scalar and vector/immediate forms. It uses splat_vector to carry the scalar in each case, except when XLEN<SEW (RV32 SEW=64) when a custom node `SPLAT_VECTOR_I64` is used for type-legalization and to encode the fact that the value is sign-extended to SEW. When the scalar is a full 64-bit value we use a sequence to materialize the constant into the vector register. The non-intrinsic ISel patterns have also been split into their own file. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D93312	2020-12-23 20:16:18 +00:00
Nandor Licker	0586f048d7	[RISCV] Basic jump table lowering This patch enables jump table lowering in the RISC-V backend. In addition to the test case included, the new lowering was tested by compiling the OCaml runtime and running it under qemu. Differential Revision: https://reviews.llvm.org/D92097	2020-12-22 15:05:54 +00:00
Craig Topper	09468a9148	[RISCV] Sign extend constant arguments to V intrinsics when promoting to XLen. The default behavior for any_extend of a constant is to zero extend. This occurs inside of getNode rather than allowing type legalization to promote the constant which would sign extend. By using sign extend with getNode the constant will be sign extended. This gives a better chance for isel to find a simm5 immediate since all xlen bits are examined there. For instructions that use a uimm5 immediate, this change only affects constants >= 128 for i8 or >= 32768 for i16. Constants that large already wouldn't have been eligible for uimm5 and would need to use a scalar register. If the instruction isn't able to use simm5 or the immediate is too large, we'll need to materialize the immediate in a register. As far as I know constants with all 1s in the upper bits should materialize as well or better than all 0s. Longer term we should probably have a SEW aware PatFrag to ignore the bits above SEW before checking simm5. I updated about half the test cases in some tests to use a negative constant to get coverage for this. Reviewed By: evandro Differential Revision: https://reviews.llvm.org/D93487	2020-12-18 11:43:38 -08:00
Craig Topper	86d282baed	[RISCV] Add intrinsics for vmv.x.s and vmv.s.x This adds intrinsics for vmv.x.s and vmv.s.x. I've used stricter type constraints on these intrinsics than what we've been doing on the arithmetic intrinsics so far. This will allow us to not need to pass the scalar type to the Intrinsic::getDeclaration call when creating these intrinsics. A custom ISD is used for vmv.x.s in order to implement the change in computeNumSignBitsForTargetNode which can remove sign extends on the result. I also modified the MC layer description of these instructions to show the tied source/dest operand. This is different than what we do for masked instructions where we drop the tied source operand when converting to MC. But it is a more accurate description of the instruction. We can't do this for masked instructions since we use the same MC instruction for masked and unmasked. Tools like llvm-mca operate in the MC layer and rely on ins/outs and Uses/Defs for analysis so I don't know if we'll be able to maintain the current behavior for masked instructions. So I went with the accurate description here since it was easy. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D93365	2020-12-18 10:30:48 -08:00
Monk Chiang	ee2cb90e3b	[RISCV] Define vsadd/vsaddu/vssub/vssubu intrinsics. We worked with @rogfer01 from BSC to come up with this patch. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com> Co-Authored-by: Monk Chiang <monk.chiang@sifive.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D93366	2020-12-18 10:24:24 +08:00
Hsiangkai Wang	f03609b5c7	[RISCV] V does not imply F. If users want to use vector floating point instructions, they need to specify 'F' extension additionally. Differential Revision: https://reviews.llvm.org/D93282	2020-12-17 10:57:36 +08:00
Craig Topper	028efac2d7	[RISCV] Only custom legalize i32 arguments to vector intrinsics on RV64.	2020-12-15 13:54:41 -08:00
Hsiangkai Wang	14a91d676b	[RISCV][NFC] Define scalable vectors for half types. This is preparation work for vfadd intrinsics. Differential Revision: https://reviews.llvm.org/D93275	2020-12-15 16:23:22 +08:00
Hsiangkai Wang	a6805a0e02	[RISCV] Define vadd/vsub/vrsub intrinsics and lower to V instructions. This patch is based on the proposal from Roger Ferrer Ibanez. http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html Differential Revision: https://reviews.llvm.org/D93013	2020-12-15 12:56:49 +08:00
Craig Topper	b90e2d850e	[RISCV] Use tail agnostic policy for vsetvli instruction emitted in the custom inserter The compiler is making no effort to preserve upper elements. To do so would require another source operand tied with the destination and a different intrinsic interface to give control of this source to the programmer. This patch changes the tail policy to agnostic so that the CPU doesn't need to make an effort to preserve them. This is consistent with the RVV intrinsic spec here https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#configuration-setting Differential Revision: https://reviews.llvm.org/D93080	2020-12-10 19:48:03 -08:00
Craig Topper	a1ae3c6ac9	[RISCV][LegalizeDAG] Expand SETO and SETUO comparisons. Teach LegalizeDAG to expand SETUO expansion when UNE isn't legal. If SETUNE isn't legal, UO can use the NOT of the SETO expansion. Removes some complex isel patterns. Most of the test changes are from using XORI instead of SEQZ. Differential Revision: https://reviews.llvm.org/D92008	2020-12-10 09:15:52 -08:00
Fraser Cormack	af5fd65895	[RISCV] Fix missing def operand when creating VSETVLI pseudos The register operand was not being marked as a def when it should be. No tests for this in the main branch as there are not yet any pseudos without a non-negative VLIndex. Also change the type of a virtual register operand from unsigned to Register and adjust formatting. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D92823	2020-12-09 09:35:28 +00:00
Craig Topper	846f576bea	[RISCV] Add a table showing the layout of the fields in VTYPE. Rename MaskedOffAgnostic->MaskAgnostic. NFC	2020-12-08 20:41:57 -08:00
Craig Topper	a64998be99	[RISCV] Share VTYPE encoding code between the assembler and the CustomInserter for adding VSETVLI before vector instructions This merges the SEW and LMUL enums that each used into singles enums in RISCVBaseInfo.h. The patch also adds a new encoding helper to take SEW, LMUL, tail agnostic, mask agnostic and turn it into a vtype immediate. I also stopped storing the Encoding in the VTYPE operand in the assembler. It is easy to calculate when adding the operand which should only happen once per instruction. Differential Revision: https://reviews.llvm.org/D92813	2020-12-08 16:04:20 -08:00
Craig Topper	5c819eb389	[RISCV] Form GORCI from (or (rotl/rotr X, Bitwidth/2), X). A rotate by half the bitwidth swaps the bottom and top half which is the same as one of the MSB GREVI stage. We have to do this as a special combine because we prefer to keep (rotl/rotr X, BitWidth/2) as a rotate rather than a single stage GREVI. Differential Revision: https://reviews.llvm.org/D92286	2020-12-07 10:28:04 -08:00
Craig Topper	5baef6353e	[RISCV] Initial infrastructure for code generation of the RISC-V V-extension The companion RFC (http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html) gives lots of details on the overall strategy, but we summarize it here: LLVM IR involving vector types is going to be selected using pseudo instructions (only MachineInstr). These pseudo instructions contain dummy operands to represent the vector type being operated and the vector length for the operation. These two dummy operands, as set by instruction selection, will be used by the custom inserter to prepend every operation with an appropriate vsetvli instruction that ensures the vector architecture is properly configured for the operation. Not in this patch: later passes will remove the redundant vsetvli instructions. Register classes of tuples of vector registers are used to represent vector register groups (LMUL > 1). Those pseudos are eventually lowered into the actual instructions when emitting the MCInsts. About the patch: Because there is a bit of initial infrastructure required, this is the minimal patch that allows us to select instructions for 3 LLVM IR instructions: load, add and store vectors of integers. LLVM IR operations have "whole-vector" semantics (as in they generate values for all the elements). Later patches will extend the information represented in TableGen. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com> Co-Authored-by: Craig Topper <craig.topper@sifive.com> Differential Revision: https://reviews.llvm.org/D89449	2020-12-04 11:39:30 -08:00
Craig Topper	3fcdf9ca78	[RISCV] Rename FPCCToExtend->FPCCToExpand and FPOpToExtend->FPOpToExpand. NFC These are used to call setOperationAction/setCondCodeAction with the Expand action so it seems that Expand is a better name than Extend.	2020-12-03 16:00:49 -08:00
Craig Topper	a18d5e3e9f	[RISCV] Merge FMV_H_X_RV32/FMV_H_X_RV64 into a single opcode. Same with FMV_X_ANYEXTH_RV32/RV64 Rather than having a different opcode for RV32 and RV64. Let's just say the integer type is XLenVT and use a single opcode for both modes. Differential Revision: https://reviews.llvm.org/D92538	2020-12-03 11:12:40 -08:00
Craig Topper	e52a91e156	[RISCV] Add f16 to isFMAFasterThanFMulAndFAdd now that the Zfh extension is supported	2020-12-02 20:31:43 -08:00
Hsiangkai Wang	f7bc7c2981	[RISCV] Support Zfh half-precision floating-point extension. Support "Zfh" extension according to https://github.com/riscv/riscv-isa-manual/blob/zfh/src/zfh.tex Differential Revision: https://reviews.llvm.org/D90738	2020-12-03 09:16:33 +08:00
Craig Topper	bfc4f29f46	[RISCV] Combine (GORCI (GORCI x, C2), C1) -> (GORCI x, C1\|C2). Unlike GREVI, GORCI stages can't be undone, but they are redundant if done more than once. Differential Revision: https://reviews.llvm.org/D92295	2020-11-30 08:42:46 -08:00
Craig Topper	76d1026b59	[RISCV] Custom legalize bswap/bitreverse to GREVI with Zbp extension to enable them to combine with other GREVI instructions This enables bswap/bitreverse to combine with other GREVI patterns or each other without needing to add more special cases to the DAG combine or new DAG combines. I've also enabled the existing GREVI combine for GREVIW so that it can pick up the i32 bswap/bitreverse on RV64 after they've been type legalized to GREVIW. Differential Revision: https://reviews.llvm.org/D92253	2020-11-30 08:30:40 -08:00
Craig Topper	cbbd7021f1	[RISCV] Only combine (or (GREVI x, shamt), x) -> GORCI if shamt is a power of 2. GORCI performs an OR between each stage. So we need to ensure only one stage is active before doing this combine. Initial attempts at finding a test case for this failed due to the order things get combined. It's most likely that we'll form one stage of GREVI then combine to GORCI before the two stages of GREVI are able to be formed and combined with each other to form a multi stage GREVI. Differential Revision: https://reviews.llvm.org/D92289	2020-11-30 08:10:39 -08:00
Craig Topper	8709d9d872	[RISCV] Replace getSimpleValueType() with getValueType() in DAG combines to prevent asserts with weird types.	2020-11-27 12:49:12 -08:00
Craig Topper	ed95cafbc5	[RISCV] Add an implementation of isFMAFasterThanFMulAndFAdd Start with an assumption that FMA is faster than Fmul+FAdd. If that's not true on some particular implementation we can add a tuning parameter in the future. I've updated the fmuladd test cases and added new test cases for fast math flag based contraction. Differential Revision: https://reviews.llvm.org/D91987	2020-11-25 15:07:34 -08:00
Craig Topper	751b0d970e	[RISCV] Make SMIN/SMAX/UMIN/UMAX legal with Zbb extension. This is the logically correct thing to do. But it generates worse code for i32 umin/umax on the rv64 due to type legalize requesting zext even though the arguments are sext. Maybe we can teach type legalizer to use sext for umin/umax for RISCV. It's also producing possibly worse code on i64 on RV32 since we still end up with selects that become branches. But this seems like something we could improve in type legalization or DAG combine. Hopefully this makes D92095 work for RISCV with Zbb.	2020-11-25 12:48:43 -08:00
Craig Topper	c26e8697d7	[RISCV] Custom type legalize i32 fshl/fshr on RV64 with Zbt. This adds custom opcodes for FSLW/FSRW so we can type legalize fshl/fshr without needing to match a sign_extend_inreg. I've used the operand order from fshl/fshr to make the isel pattern similar to the non-W form. It was also hard to decide another order since the register instruction has the shift amount as the second operand, but the immediate instruction has it as the third operand. Differential Revision: https://reviews.llvm.org/D91479	2020-11-25 10:01:47 -08:00
Luís Marques	a8dc2110cd	[RISCV] Add GHC calling convention This is a special calling convention to be used by the GHC compiler. Patch by Andreas Schwab (schwab) Differential Revision: https://reviews.llvm.org/D89788	2020-11-24 22:35:23 +00:00
Luís Marques	e4d9380245	Revert "[RISCV] Add GHC calling convention" This reverts commit `f8317bb256` due to lack of proper attribution.	2020-11-24 22:34:20 +00:00
Luís Marques	f8317bb256	[RISCV] Add GHC calling convention This is a special calling convention to be used by the GHC compiler. Differential Revision: https://reviews.llvm.org/D89788	2020-11-24 21:56:28 +00:00
Fraser Cormack	ca1f2f2716	[RISCV] Combine GREVI sequences This combine step performs the following type of transformation: rev.p a0, a0 # grevi a0, a0, 0b01 rev2.n a0, a0 # grevi a0, a0, 0b10 --> rev.n a0, a0 # grevi a0, a0, 0b11 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D91877	2020-11-24 12:07:13 +00:00
Craig Topper	84b8222705	[RISCV] Use separate Lo and Hi MemOperands when expanding BuildPairF64Pseudo and SplitF64Pseudo. We generate two 4 byte loads or two stores as part of the expansion. Previously the MemOperand was set the same for both to cover the full 8 bytes. Now we set a separate 4 byte mem operand for each with a 4 byte offset for the high part.	2020-11-22 00:46:12 -08:00
Craig Topper	6a1d8b91ed	[RISCV] Custom type legalize i32 bswap/bitreverse to GREVIW on RV64 with Zbp extension Previously we required a sra to pattern match these properly in isel. If the consumer didn't need the result sign extended we'll have an srl instead of sra and fail to match. This patch switches to custom legalizing to GREVIW using portions of D91259. Differential Revision: https://reviews.llvm.org/D91457	2020-11-20 10:41:01 -08:00
Craig Topper	78767b7f8e	[RISCV] Add RISCVISD::ROLW/RORW use those for custom legalizing i32 rotl/rotr on RV64IZbb. This should result in better utilization of RORIW since we don't need to look for a SIGN_EXTEND_INREG that may not exist. Also remove rotl/rotr isel matching to GREVI and just prefer RORI. This is to keep consistency so we don't have to match ROLW/RORW to GREVIW as well. I imagine RORI/RORIW performance will be the same or better than GREVI. Differential Revision: https://reviews.llvm.org/D91449	2020-11-20 10:25:47 -08:00
Fraser Cormack	1ac9b54831	[RISCV] Lower GREVI and GORCI as custom nodes This moves the recognition of GREVI and GORCI from TableGen patterns into a DAGCombine. This is done primarily to match "deeper" patterns in the future, like (grevi (grevi x, 1) 2) -> (grevi x, 3). TableGen is not best suited to matching patterns such as these as the compile time of the DAG matchers quickly gets out of hand due to the expansion of commutative permutations. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D91259	2020-11-19 18:11:42 +00:00
Fraser Cormack	fe9dc2e54a	[RISCV] Use a macro to simplify getTargetNodeName Similar to the X86 and AMDGPU targets, this uses a macro to cut down on repetitive and error-prone code when converting RISCVISD node names to strings in getTargetNodeName. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D91414	2020-11-16 09:33:47 +00:00
Craig Topper	637f19c36b	[RISCV] Remove traces of Glue from RISCVISD::SELECT_CC We were creating RISCVISD::SELECT_CC nodes with Glue output that was never being used, and the tablegen SDNode had the SDNPInGlue flag instead of the SDNPOutGlue flag. Since we don't seem to need the Glue just get rid of it from both places. Differential Revision: https://reviews.llvm.org/D91199	2020-11-11 09:30:48 -08:00
Craig Topper	5d3fd3df94	[RISCV] Make ctlz/cttz cheap to speculatively execute so CodeGenPrepare won't insert a zero check. Add additional isel patterns for ctzw/clzw instructions. Differential Revision: https://reviews.llvm.org/D91040	2020-11-09 10:13:45 -08:00
Craig Topper	4265cbaa34	[RISCV] Make SIGN_EXTEND_INREG from i8/i16 legal when Zbb extension is enabled. This produces better code for sign extend to i64 on RV32 target. Differential Revision: https://reviews.llvm.org/D91023	2020-11-09 10:13:45 -08:00
Craig Topper	ce5f4f22e9	[RISCV] Use the 'si' lib call for (double (fp_to_sint/uint i32 X)) when F extension is enabled. D80526 added custom lowering to pick the si lib call on RV64, but this custom handling is only enabled when the F and D extension are both disabled. This prevents the si library call from being used for double when F is enabled but D is not. This patch changes the behavior so we always enable the Custom hook on RV64 and decide in ReplaceNodeResults if we should emit a libcall based on whether the FP type should be softened or not. Differential Revision: https://reviews.llvm.org/D90817	2020-11-05 10:46:45 -08:00
Craig Topper	ce1270fc7e	[RISCV] Remove shadow register list passed to AllocateReg when allocating FP registers for calling convention The _F and _D registers are already sub/super registers. When one gets allocated all its aliases are already marked as allocated. We don't need to explicitly shadow it too. I believe shadow is for calling conventions like 64-bit Windows on X86 where we have rules like this CCIfType<[i32], CCAssignToRegWithShadow<[ECX , EDX , R8D , R9D ], [XMM0, XMM1, XMM2, XMM3]>> For that calling convention the argument number determines which register is used regardless of how many scalars or vectors came before it. Removing this removes a question I had in D90738. Differential Revision: https://reviews.llvm.org/D90801	2020-11-05 09:49:42 -08:00
Simon Pilgrim	36920d5f9d	[RISCV] Avoid std::pair<> in FPReg StringSwitch to avoid MSVC compile failures. NFCI. As discussed on D90322, some MSVC builds are failing with is_trivially_copyable static asserts (see D86126) - we can avoid this by not using the std::pair<unsigned,unsigned> which held both the FP+DP Registers, just handle the FP register and convert to DP on the fly.	2020-11-02 11:30:57 +00:00
Craig Topper	a76cd10fcd	[RISCV] Use 'unsigned' instead of Register in getRegForInlineAsmConstraint. NFC The return value of this interface still uses an 'unsigned' on all targets. So we convert Register back to unsigned at the end. I'm hoping this will prevent the issue that caused the revert of D90322.	2020-11-01 10:16:52 -08:00
Craig Topper	6915c76e10	[RISCV] Don't use DCI.CombineTo to replace a single result. NFCI Just return the new node, which is the standard practice. I also noticed what appeared to be an unnecessary attempt at creating an ANY_EXTEND where the type should already be correct. I replace with an assert to verify the type. Differential Revision: https://reviews.llvm.org/D90444	2020-10-30 10:46:32 -07:00
Craig Topper	74b078294f	[RISCV] Improve worklist management in the DAG combine for SLLW/SRLW/SRAW This combine makes two calls to SimplifyDemandedBits, one for the LHS and one for the RHS. If the LHS call returns true, we don't make the RHS call. When SimplifyDemandedBits makes a change, it will add the nodes around the change to the DAG combiner worklist. If the simplification happens on the first recursion step, the N will get added to the worklist. But if the simplification happens deeper in the recursion, then N will not be revisited until the next time the DAG combiner runs. This patch explicitly adds N to the worklist anytime a Simplification is made. Without this we might miss additional simplifications on the LHS or never simplify the RHS. Special care also needs to be taken to not add N if it has been CSEd by the simplification. There are similar examples in DAGCombiner and the X86 target, but I don't have a test for it for RISC-V. I've also returned SDValue(N, 0) instead of SDValue() so DAGCombiner knows a change was made and will update its Statistic variable. The test here was constructed so that 2 simplifications happen to the LHS. Without this fix one happens in the post type legalization DAG combine and the other happens after LegalizeDAG. This prevents the RHS from ever being simplified causing the left and right shift to clear the upper 32 bits of the RHS to be left behind. Differential Revision: https://reviews.llvm.org/D90339	2020-10-29 14:52:53 -07:00
StephenFan	a96921afa7	[RISCV] Eliminate the repeated declaration of SDLoc DL Differential revision: https://reviews.llvm.org/D85002	2020-08-03 10:24:30 +08:00
Yuanfang Chen	ca1e69a675	[NFC] remove unused includes of SelectionDAGISel.h	2020-07-20 10:43:29 -07:00
lewis-revill	c9c955ada8	[RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbt asm instructions This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the ternary subset (zbt subextension) of the experimental B extension of RISC-V. It also adds the corresponding codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79875	2020-07-15 12:19:34 +01:00
lewis-revill	6144f0a1e5	[RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbbp asm instructions This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions belonging to both the permutation and the base subsets of the experimental B extension of RISC-V. It also adds the corresponding codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79873	2020-07-15 12:19:34 +01:00
lewis-revill	31b52b4345	[RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbp asm instructions This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the permutation subset (zbp subextension) of the experimental B extension of RISC-V. It also adds the corresponding codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79871	2020-07-15 12:19:34 +01:00
lewis-revill	e2692f0ee7	[RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbb asm instructions This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the base subset (zbb subextension) of the experimental B extension of RISC-V. It also adds the corresponding codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79870	2020-07-15 12:19:34 +01:00
Ben Shi	cb82de2960	[RISCV] Optimize multiplication by constant ... to shift/add or shift/sub. Do not enable it on riscv32 with the M extension where decomposeMulByConstant may not be an optimization. Reviewed By: luismarques, MaskRay Differential Revision: https://reviews.llvm.org/D82660	2020-07-07 18:50:24 -07:00
Sam Elliott	7dc892661e	[RISCV] Implement Hooks to avoid chaining SELECT Summary: This implements two hooks that attempt to avoid control flow for RISC-V. RISC-V will lower SELECTs into control flow, which is not a great idea. The hook `hasMultipleConditionRegisters()` turns off the following DAGCombiner folds: select(C0|C1, x, y) <=> select(C0, x, select(C1, x, y)) select(C0&C1, x, y) <=> select(C0, select(C1, x, y), y) The second hook `setJumpIsExpensive` controls a flag that has a similar purpose and is used in CodeGenPrepare and the SelectionDAGBuilder. Both of these have the effect of ensuring more logic is done before fewer jumps. Note: with the `B` extension, we may be able to lower select into a conditional move instruction, so at some point these hooks will need to be guarded based on enabled extensions. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D79268	2020-07-01 11:56:31 +01:00
Matt Arsenault	08649f0a9d	RISCV: Don't store function in RISCVMachineFunctionInfo Targets should not depend on the MachineFunction state during the MachineFunctionInfo construction.	2020-06-30 16:08:51 -04:00
Guillaume Chatelet	2e7bba693e	[Alignment][NFC] Use Align for TargetCallingConv::OrigAlign This patch replaces D69249. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82307	2020-06-25 13:21:22 +00:00
Kamlesh Kumar	7622ea5835	[RISCV64] Emit correct lib call for fp(float/double) to ui/si Since i32 is not legal on riscv64, it is always promoted to i64 before emitting the lib call, so for conversions such as float/double to int and float/double to unsigned int the wrong lib call was emitted. This commit fixes that using custom lowering. Differential Revision: https://reviews.llvm.org/D80526	2020-06-18 19:34:16 +05:30
Guillaume Chatelet	1778564f91	[Alignment][NFC] Migrate the rest of backends Summary: This is a followup on D81196 Reviewers: courbet Subscribers: arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81278	2020-06-08 07:17:20 +00:00
Ben Shi	4b6f0ea66c	[RISCV] Fix a typo in RISCVISelLowering.cpp The 9th parameter of "static bool CC_RISCV(...)" is isFixed, not isRet. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D81333	2020-06-06 18:41:00 -07:00
Zequan Wu	80e107ccd0	Add NoMerge MIFlag to avoid MIR branch folding Let the codegen recognize the nomerge attribute and disable branch folding when the attribute is given Differential Revision: https://reviews.llvm.org/D79537	2020-05-29 12:31:06 -07:00
Craig Topper	d1119980e5	[SelectionDAG] Use Align/MaybeAlign for ConstantPoolSDNode. This patch stores the alignment for ConstantPoolSDNode as an Align and updates the getConstantPool interface to take a MaybeAlign. Removing getAlignment() will be done as a follow up. Differential Revision: https://reviews.llvm.org/D79436	2020-05-08 16:04:11 -07:00
Craig Topper	113f37a1f9	[CallSite removal][TargetLowering] Replace ImmutableCallSite with CallBase Differential Revision: https://reviews.llvm.org/D77995	2020-04-13 13:50:15 -07:00
Matt Arsenault	84aa58cbe2	CodeGen: Use Register in TargetLowering	2020-04-08 12:10:58 -04:00
Guillaume Chatelet	bdf77209b9	[Alignment][NFC] Use Align version of getMachineMemOperand Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jyknight, sdardis, nemanjai, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, jfb, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77059	2020-03-30 15:46:27 +00:00
Kamlesh Kumar	aabc24acf0	[RISCV] Support llvm.thread.pointer Fixes https://bugs.llvm.org/show_bug.cgi?id=45303 (clang crashed on __builtin_thread_pointer) Reviewed By: lenary, MaskRay, luismarques Differential Revision: https://reviews.llvm.org/D76828	2020-03-27 17:30:12 -07:00
Roger Ferrer Ibanez	3c24aee7ee	[RISCV] Select +0.0 immediate using fmv.{w,d}.x / fcvt.d.w Floating point positive zero can be selected using fmv.w.x / fmv.d.x / fcvt.d.w and the zero source register. Differential Revision: https://reviews.llvm.org/D75729	2020-03-20 09:42:24 +00:00
Andrew Wei	4ca753f4e3	[RISCV] Implement mayBeEmittedAsTailCall for tail call optimization Implement TargetLowering callback mayBeEmittedAsTailCall for riscv in CodeGenPrepare, which will duplicate return instructions to enable tailcall optimization. Differential Revision: https://reviews.llvm.org/D73699	2020-02-18 23:56:42 +08:00
Craig Topper	eeb63944e4	[LegalizeTypes][ARM][AArch64][PowerPC][RISCV][X86] Use BUILD_PAIR to return expanded integer results from ReplaceNodeResults instead of just returning two results. Remove code from LegalizeTypes that allowed this to work. We were already using BUILD_PAIR for this in some places so this standardizes on a single way to do this.	2020-02-08 09:52:31 -08:00
Guillaume Chatelet	333f2ad8b8	[Alignment][NFC] Use Align for getMemcpy/Memmove/Memset Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73885	2020-02-03 17:13:19 +01:00
Zakk Chen	0cb274de39	[RISCV] Support ABI checking with per function target-features 1. If users don't specify -mattr, the default target features come from the IR attribute. 2. Fixed a bug and re-landed this patch. Reviewers: lenary, asb Reviewed By: lenary Tags: #llvm Differential Revision: https://reviews.llvm.org/D70837	2020-01-22 08:12:28 -08:00
Zakk Chen	cef838e65f	Revert "[RISCV] Support ABI checking with per function target-features" This reverts commit `7bc58a779a`. It breaks EXPENSIVE_CHECKS on Windows	2020-01-16 18:01:07 -08:00
Zakk Chen	7bc58a779a	[RISCV] Support ABI checking with per function target-features If users don't specify -mattr, the default target features come from the IR attribute. Reviewers: lenary, asb Reviewed By: lenary, asb Tags: #llvm Differential Revision: https://reviews.llvm.org/D70837	2020-01-15 04:35:01 -08:00
Zakk Chen	3bc2860e92	Revert "[RISCV] Support ABI checking with per function target-features" This reverts commit `109e4d12ed`.	2020-01-15 04:32:57 -08:00
Zakk Chen	109e4d12ed	[RISCV] Support ABI checking with per function target-features If users don't specify -mattr, the default target features come from the IR attribute.	2020-01-15 02:30:43 -08:00
Matt Arsenault	255cc5a760	CodeGen: Use LLT instead of EVT in getRegisterByName Only PPC seems to be using it, and only checks some simple cases and doesn't distinguish between FP. Just switch to using LLT to simplify use from GlobalISel.	2020-01-09 17:37:52 -05:00
Reid Kleckner	9c2b72821b	Move tail call disabling code to target independent code When the "disable-tail-calls" attribute was added, checks were added for it in various backends. Now this code has proliferated, and it is something the target is responsible for checking. Move that responsibility back to the ISels (fast, global, and SD). There's no major functionality change, except for targets that never implemented this check. This LLVM attribute was originally added in `d9699bc7bd` (2015). Reviewers: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D72118	2020-01-03 11:27:41 -08:00
Reid Kleckner	5d986953c8	[IR] Split out target specific intrinsic enums into separate headers This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320	2019-12-11 18:02:14 -08:00
James Clarke	da7b129b1b	[RISCV] Don't force Local Exec TLS for non-PIC Summary: Forcing Local Exec TLS requires the use of copy relocations. Copy relocations need special handling in the runtime linker when being used against TLS symbols, which is present in glibc, but not in FreeBSD nor musl, and so cannot be relied upon. Moreover, copy relocations are a hack that embed the size of an object in the ABI when it otherwise wouldn't be, and break protected symbols (which are expected to be DSO local), whilst also wasting space, thus they should be avoided whenever possible. As discussed in D70398, RISC-V should move away from forcing Local Exec, and instead use Initial Exec like other targets, with possible linker relaxation to follow. The RISC-V GCC maintainers also intend to adopt this more-conventional behaviour (see https://github.com/riscv/riscv-elf-psabi-doc/issues/122). Reviewers: asb, MaskRay Reviewed By: MaskRay Subscribers: emaste, krytarowski, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, llvm-commits, bsdjhb Tags: #llvm Differential Revision: https://reviews.llvm.org/D70649	2019-12-03 22:04:54 +00:00
Luís Marques	51b4b17eb7	[RISCV] Implement the TargetLowering::getRegisterByName hook Summary: The hook should work for any RISC-V register. Non-allocatable registers do not need to be reserved; for the remaining registers the hook will only succeed if you pass clang the -ffixed-xX flag. This builds upon D67185, which currently only allows reserving GPRs. Reviewers: asb, lenary Reviewed By: lenary Tags: #llvm Differential Revision: https://reviews.llvm.org/D69130	2019-11-04 11:23:54 +00:00
Sam Elliott	7214f7a79f	[RISCV] Lower llvm.trap and llvm.debugtrap Summary: Until this commit, these have lowered to a call to abort(). `llvm.trap()` now lowers to `unimp`, which should trap on all systems. `llvm.debugtrap()` now lowers to `ebreak`, which is exactly what this instruction is for. Reviewers: asb, luismarques Reviewed By: asb Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69390	2019-10-28 09:54:33 +00:00
Luís Marques	1baa50396d	[RISCV] Add support for half-precision floats Complete fp16 support by ensuring that load extension / truncate store operations are properly expanded. Reviewers: asb, lenary Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D69246	2019-10-25 14:02:02 +01:00
Simon Cook	aed9d6d64a	[RISCV] Add support for -ffixed-xX flags This adds support for reserving GPRs such that the compiler will not choose a register for register allocation. The implementation follows the same design as for AArch64; each reserved register becomes a target feature and used for getting the reserved registers for a given MachineFunction. The backend checks that it does not need to write to any reserved register; if it does a relevant error is generated. Differential Revision: https://reviews.llvm.org/D67185	2019-10-22 21:25:01 +01:00
Shiva Chen	078bec6c48	[RISCV] Support fast calling convention LLVM may annotate a function with fastcc if it has only one caller, there are no other callers outside the module, and the function is not naked and does not take variable arguments. Such fastcc functions can pass their arguments in the caller-saved registers. Differential Revision: https://reviews.llvm.org/D68559 llvm-svn: 374857	2019-10-15 02:04:29 +00:00
Luis Marques	aae97bfd0c	[RISCV] Rename FPRs and use Register arithmetic The new names for FPRs ensure that the Register values within the same class are enumerated consecutively (the order is determined by the `LessRecordRegister` function object). Where there were tables mapping between 32- and 64-bit FPRs (and vice versa) this patch replaces them with Register arithmetic. The enumeration order between different register classes is expected to continue to be arbitrary, although it does impact the conversion from the (overloaded) asm FPR names to Register values, and therefore might require updates to the target if the sorting algorithm is changed. Static asserts were added to ensure that changes to the ordering that would impact the current implementation are detected. Differential Revision: https://reviews.llvm.org/D67423 llvm-svn: 373096	2019-09-27 15:49:10 +00:00
Guillaume Chatelet	18f805a7ea	[Alignment][NFC] Remove unneeded llvm:: scoping on Align types llvm-svn: 373081	2019-09-27 12:54:21 +00:00
Luis Marques	2d0cd6cac8	[RISCV] Fix static analysis issues Unlikely to be problematic but still worth fixing. Differential Revision: https://reviews.llvm.org/D67640 llvm-svn: 372391	2019-09-20 13:48:02 +00:00
Guillaume Chatelet	ad1cea0dda	[Alignment][NFC] Use Align with TargetLowering::setPrefFunctionAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: nemanjai, javed.absar, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, ychen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67267 llvm-svn: 371212	2019-09-06 15:03:49 +00:00
Guillaume Chatelet	4fc3ad9e13	[Alignment][NFC] Use Align with TargetLowering::setMinFunctionAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jyknight, sdardis, nemanjai, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67229 llvm-svn: 371200	2019-09-06 12:48:34 +00:00
Guillaume Chatelet	aff45e4b23	[LLVM][Alignment] Make functions using log of alignment explicit Summary: This patch renames functions that take or return alignment as log2; this will help with the transition to llvm::Align. The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment. A few renames uncovered dubious assignments: - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using log2(align). This patch fixes it and updates the documentation. - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation. - `MachineFunction` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation. Reviewers: lattner, thegameg, courbet Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65945 llvm-svn: 371045	2019-09-05 10:00:22 +00:00
Jim Lin	b77aa1d248	[RISCV] Enable tail call opt for variadic function Summary: Tail call opt can treat variadic function call the same as normal function call Reviewers: mgrang, asb, lenary, lewis-revill Reviewed By: lenary Subscribers: luismarques, pzheng, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66278 llvm-svn: 370835	2019-09-04 02:03:36 +00:00
Shiva Chen	b39876d8cd	[RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall The patch fixed the issue that RV64 didn't clear the upper bits when returning a complex floating-point value with the lp64 ABI. float _Complex complex_add(float _Complex a, float _Complex b) { return a + b; } RealResult = zero_extend(RealA + RealB) ImageResult = ImageA + ImageB Return (RealResult | (ImageResult << 32)) The patch introduces shouldExtendTypeInLibCall target hook to suppress the AssertZext generation when lowering floating LibCall. Thanks to Eli's comments from the Bugzilla https://bugs.llvm.org/show_bug.cgi?id=42820 Differential Revision: https://reviews.llvm.org/D65497 llvm-svn: 370275	2019-08-28 23:40:37 +00:00
Benjamin Kramer	dc5f805d31	Do a sweep of symbol internalization. NFC. llvm-svn: 369803	2019-08-23 19:59:23 +00:00
Luis Marques	fa06e95898	[RISCV] Convert registers from unsigned to Register Only in public interfaces that have not yet been converted should there remain registers with unsigned type. Differential Revision: https://reviews.llvm.org/D66252 llvm-svn: 369114	2019-08-16 14:27:50 +00:00
Lewis Revill	7abf863f76	[RISCV] Lower inline asm constraint A for RISC-V This allows arguments with the constraint A to be lowered to input nodes for RISC-V, which implies a memory address stored in a register. This patch adds the minimal amount of code required to get operands with the right constraints to compile. https://reviews.llvm.org/D54296 llvm-svn: 369095	2019-08-16 10:28:34 +00:00
Daniel Sanders	3836874dbb	[risc-v] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Depends on D65919 Reviewers: lenary Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for full review was: https://reviews.llvm.org/D65962 llvm-svn: 368629	2019-08-12 22:41:02 +00:00
Sam Elliott	fee242aed4	[RISCV] Fix ICE in isDesirableToCommuteWithShift Summary: Ana Pazos reported a bug where we were not checking that an APInt would fit into 64-bits before calling `getSExtValue()`. This caused asserts when compiling large constants, such as i128s, as happens when compiling compiler-rt. This patch adds a testcase and makes the callback less error-prone. Reviewers: apazos, asb, luismarques Reviewed By: luismarques Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66081 llvm-svn: 368572	2019-08-12 13:51:00 +00:00
Sam Elliott	856d5c5817	[RISCV] Allow ABI Names in Inline Assembly Constraints Summary: Clang will replace references to registers using ABI names in inline assembly constraints with references to architecture names, but other frontends do not. LLVM uses the regular assembly parser to parse inline asm, so inline assembly strings can contain references to registers using their ABI names. This patch adds support for parsing constraints using either the ABI name or the architectural register name. This means we do not need to implement the ABI name replacement code in every single frontend, especially those like Rust which are a very thin shim on top of LLVM IR's inline asm, and that constraints can more closely match the assembly strings they refer to. Reviewers: asb, simoncook Reviewed By: simoncook Subscribers: hiraditya, rbar, johnrusso, JDevlieghere, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65947 llvm-svn: 368303	2019-08-08 14:59:16 +00:00
Shiva Chen	b12056bd33	[RISCV] Custom legalize i32 operations for RV64 to reduce signed extensions Differential Revision: https://reviews.llvm.org/D65434 llvm-svn: 367960	2019-08-06 00:24:00 +00:00
Guillaume Chatelet	c97a3d15d2	[LLVM][Alignment] Introduce Alignment Type Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jfb, jakehehrlich Reviewed By: jfb Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65514 llvm-svn: 367828	2019-08-05 11:02:05 +00:00
Bill Wendling	41a2847a9a	Emit diagnostic if an inline asm constraint requires an immediate Summary: An inline asm call can result in an immediate after inlining. Therefore emit a diagnostic here if constraint requires an immediate but one isn't supplied. Reviewers: joerg, mgorny, efriedma, rsmith Reviewed By: joerg Subscribers: asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, MaskRay, jyknight, dylanmckay, javed.absar, fedor.sergeev, jrtc27, Jim, krytarowski, eraman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60942 llvm-svn: 367750	2019-08-03 05:52:47 +00:00
Sam Elliott	9e6b2e1605	[RISCV] Support 'f' Inline Assembly Constraint Summary: This adds the 'f' inline assembly constraint, as supported by GCC. An 'f'-constrained operand is passed in a floating point register. Exactly which kind of floating-point register (32-bit or 64-bit) is decided based on the operand type and the available standard extensions (-f and -d, respectively). This patch adds support in both the clang frontend, and LLVM itself. Reviewers: asb, lewis-revill Reviewed By: asb Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65500 llvm-svn: 367403	2019-07-31 09:45:55 +00:00
Simon Cook	8d7ec4d644	[RISCV] Add support for lowering floating point inlineasm clobbers This adds the required extension to RISC-V's getRegForInlineAsmConstraint in order to be able to correctly distringuish between the 32 and 64-bit floating point registers when the generic fX name appears in inlineasm clobber contraints. It also adds a check to validate that callee saved floating point registers are only saved in this case when a hard-float ABI is selected. Differential Revision: https://reviews.llvm.org/D64751 llvm-svn: 367397	2019-07-31 09:07:21 +00:00
Alex Bradbury	b8d352a08b	[RISCV] Reset NoPHIS MachineFunctionProperty in emitSelectPseudo We inserted PHIs where there were none before, so the property must be reset. This error was found on an EXPENSIVE_CHECKS build. llvm-svn: 366412	2019-07-18 07:52:41 +00:00
Sam Elliott	114d2db49b	[RISCV] Fix ICE in isDesirableToCommuteWithShift Summary: There was an error being thrown from isDesirableToCommuteWithShift in some tests. This was tracked down to the method being called before legalisation, with an extended value type, not a machine value type. In the case I diagnosed, the error was only hit with an instruction sequence involving `i24`s in the add and shift. `i24` is not a Machine ValueType, it is instead an Extended ValueType which was causing the issue. I have added a test to cover this case, and fixed the error in the callback. Reviewers: asb, luismarques Reviewed By: asb Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64425 llvm-svn: 365511	2019-07-09 16:24:16 +00:00
Alex Bradbury	0b9addb8c0	[RISCV] Specify registers used in DWARF exception handling Defines RISCV registers for getExceptionPointerRegister() and getExceptionSelectorRegister(). Differential Revision: https://reviews.llvm.org/D63411 Patch by Edward Jones. Modified by Alex Bradbury to add CHECK lines to exception-pointer-register.ll. llvm-svn: 365301	2019-07-08 09:16:47 +00:00
Sam Elliott	b2c9eed0d7	[RISCV] Support @llvm.readcyclecounter() Intrinsic On RISC-V, the `cycle` CSR holds a 64-bit count of the number of clock cycles executed by the core, from an arbitrary point in the past. This matches the intended semantics of `@llvm.readcyclecounter()`, which we currently leave to the default lowering (to the constant 0). With this patch, we will now correctly lower this intrinsic to the intended semantics, using the user-space instruction `rdcycle`. On 64-bit targets, we can directly lower to this instruction. On 32-bit targets, we need to do more, as `rdcycle` only returns the low 32-bits of the `cycle` CSR. In this case, we perform a custom lowering, based on the PowerPC lowering, using `rdcycleh` to obtain the high 32-bits of the `cycle` CSR. This custom lowering inserts a new basic block which detects overflow in the high 32-bits of the `cycle` CSR during reading (because multiple instructions are required to read). The emitted assembly matches the suggested assembly in the RISC-V specification. Differential Revision: https://reviews.llvm.org/D64125 llvm-svn: 365201	2019-07-05 12:35:21 +00:00
Lewis Revill	39263ac5d1	[RISCV] Add lowering of global TLS addresses This patch adds lowering for global TLS addresses for the TLS models of InitialExec, GlobalDynamic, LocalExec and LocalDynamic. LocalExec support required using a 4-operand add instruction, which uses the fourth operand to express a relocation on the symbol. The necessary fixup is emitted when the instruction is emitted. Differential Revision: https://reviews.llvm.org/D55305 llvm-svn: 363771	2019-06-19 08:40:59 +00:00
Sam Elliott	9f155bc6e5	[RISCV] Prevent re-ordering some adds after shifts Summary: DAGCombine will normally turn a `(shl (add x, c1), c2)` into `(add (shl x, c2), c1 << c2)`, where `c1` and `c2` are constants. This can be prevented by a callback in TargetLowering. On RISC-V, materialising the constant `c1 << c2` can be more expensive than materialising `c1`, because materialising the former may take more instructions, and may use a register, where materialising the latter would not. This patch implements the hook in RISCVTargetLowering to prevent this transform, in the cases where: - `c1` fits into the immediate field in an `addi` instruction. - `c1` takes fewer instructions to materialise than `c1 << c2`. In future, DAGCombine could do the check to see whether `c1` fits into an add immediate, which might simplify more targets hooks than just RISC-V. Reviewers: asb, luismarques, efriedma Reviewed By: asb Subscribers: xbolva00, lebedev.ri, craig.topper, lewis-revill, Jim, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62857 llvm-svn: 363736	2019-06-18 20:38:08 +00:00
Lewis Revill	74c8364954	[RISCV] Lower calls through PLT This patch adds support for generating calls through the procedure linkage table where required for a given ExternalSymbol or GlobalAddress callee. Differential Revision: https://reviews.llvm.org/D55304 llvm-svn: 363686	2019-06-18 14:29:45 +00:00
Lewis Revill	a5240361dd	[RISCV] Add lowering of addressing sequences for PIC This patch allows lowering of PIC addresses by using PC-relative addressing for DSO-local symbols and accessing the address through the global offset table for non-DSO-local symbols. Differential Revision: https://reviews.llvm.org/D55303 llvm-svn: 363058	2019-06-11 12:57:47 +00:00
Lewis Revill	28a5cadb3a	[RISCV] Lower inline asm constraints I, J & K for RISC-V This validates and lowers arguments to inline asm nodes which have the constraints I, J & K, with the following semantics (equivalent to GCC): I: Any 12-bit signed immediate. J: Immediate integer zero only. K: Any 5-bit unsigned immediate. Differential Revision: https://reviews.llvm.org/D54093 llvm-svn: 363054	2019-06-11 12:42:13 +00:00
Sam Elliott	f720647ddd	[RISCV] Support Bit-Preserving FP in F/D Extensions Summary: This allows some integer bitwise operations to instead be performed by hardware fp instructions. This is correct because the RISC-V spec requires the F and D extensions to use the IEEE-754 standard representation, and fp register loads and stores to be bit-preserving. This is tested against the soft-float ABI, but with hardware float extensions enabled, so that the tests also ensure the optimisation also fires in this case. Reviewers: asb, luismarques Reviewed By: asb Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62900 llvm-svn: 362790	2019-06-07 12:20:14 +00:00
Luis Marques	20d2424016	[RISCV] Custom lower SHL_PARTS, SRA_PARTS, SRL_PARTS When not optimizing for minimum size (-Oz) we custom lower wide shifts (SHL_PARTS, SRA_PARTS, SRL_PARTS) instead of expanding to a libcall. Differential Revision: https://reviews.llvm.org/D59477 llvm-svn: 358498	2019-04-16 14:38:32 +00:00
Lewis Revill	24a74096a4	Test commit: Remove double variable assignment llvm-svn: 357601	2019-04-03 15:54:30 +00:00
Alex Bradbury	44668ae7c7	[RISCV] Attach VK_RISCV_CALL to symbols upon creation This patch replaces the addition of VK_RISCV_CALL in RISCVMCCodeEmitter by creating the RISCVMCExpr when tail/call are parsed, or in the codegen case when the callee symbols are created. This required adding a new CallSymbol operand to allow only adding VK_RISCV_CALL to tail/call instructions. This patch will allow further expansion of parsing and codegen to easily include PLT symbols which must generate the R_RISCV_CALL_PLT relocation. Differential Revision: https://reviews.llvm.org/D55560 Patch by Lewis Revill. llvm-svn: 357396	2019-04-01 14:53:17 +00:00
Alex Bradbury	da20f5ca74	[RISCV] Generate address sequences suitable for mcmodel=medium This patch adds an implementation of a PC-relative addressing sequence to be used when -mcmodel=medium is specified. With absolute addressing, a 'medium' codemodel may cause addresses to be out of range. This is because while 'medium' implies a 2 GiB addressing range, this 2 GiB can be at any offset as opposed to 'small', which implies the first 2 GiB only. Note that LLVM/Clang currently specifies code models differently to GCC, where small and medium imply the same functionality as GCC's medlow and medany respectively. Differential Revision: https://reviews.llvm.org/D54143 Patch by Lewis Revill. llvm-svn: 357393	2019-04-01 14:42:56 +00:00
Luis Marques	3091884e25	[RISCV] Add seto pattern expansion Adds a `seto` pattern expansion. Without it the lowerings of `fcmp one` and `fcmp ord` would be inefficient due to an unoptimized double negation. Differential Revision: https://reviews.llvm.org/D59699 llvm-svn: 357378	2019-04-01 09:54:14 +00:00
Alex Bradbury	0b2803ee65	[RISCV] Add codegen support for ilp32f, ilp32d, lp64f, and lp64d ("hard float") ABIs This patch adds support for the RISC-V hard float ABIs, building on top of rL355771, which added basic target-abi parsing and MC layer support. It also builds on some re-organisations and expansion of the upstream ABI and calling convention tests which were recently committed directly upstream. A number of aspects of the RISC-V hard float ABIs require frontend support (e.g. flattening of structs and passing int+fp for fp+fp structs in a pair of registers), and will be addressed in a Clang patch. As can be seen from the tests, it would be worthwhile extending RISCVMergeBaseOffsets to handle constant pool as well as global accesses. Differential Revision: https://reviews.llvm.org/D59357 llvm-svn: 357352	2019-03-30 17:59:30 +00:00
Alex Bradbury	9681b01c21	[RISCV] Add DAGCombine for (SplitF64 (ConstantFP x)) The SplitF64 node is used on RV32D to convert an f64 directly to a pair of i32 (necessary as bitcasting to i64 isn't legal). When performed on a ConstantFP, this will result in a FP load from the constant pool followed by a store to the stack and two integer loads from the stack (necessary as there is no way to directly move between f64 FPRs and i32 GPRs on RV32D). It's always cheaper to just materialise integers for the lo and hi parts of the FP constant, so do that instead. llvm-svn: 357341	2019-03-30 09:15:47 +00:00
Alex Bradbury	dab1f6fc4e	[RISCV] Add basic RV32E definitions and MC layer support The RISC-V ISA defines RV32E as an alternative "base" instruction set encoding, that differs from RV32I by having only 16 rather than 32 registers. This patch adds basic definitions for RV32E as well as MC layer support (assembling, disassembling) and tests. The only supported ABI on RV32E is ILP32E. Add a new RISCVFeatures::validate() helper to RISCVUtils which can be called from codegen or MC layer libraries to validate the combination of TargetTriple and FeatureBitSet. Other targets have similar checks (e.g. erroring if SPE is enabled on PPC64 or oddspreg + o32 ABI on Mips), but they either duplicate the checks (Mips), or fail to check for both codegen and MC codepaths (PPC). Codegen for the ILP32E ABI support and RV32E codegen are left for a future patch/patches. Differential Revision: https://reviews.llvm.org/D59470 llvm-svn: 356744	2019-03-22 11:21:40 +00:00
Alex Bradbury	b9e78c3994	[RISCV] Optimize emission of SELECT sequences This patch optimizes the emission of a sequence of SELECTs with the same condition, avoiding the insertion of unnecessary control flow. Such a sequence often occurs when a SELECT of values wider than XLEN is legalized into two SELECTs with legal types. We have identified several use cases where the SELECTs could be interleaved with other instructions. Therefore, we extend the sequence to include non-SELECT instructions if we are able to detect that the non-SELECT instructions do not impact the optimization. This patch supersedes https://reviews.llvm.org/D59096, which attempted to address this issue by introducing a new SelectionDAG node. Hat tip to Eli Friedman for his feedback on how to best handle this issue. Differential Revision: https://reviews.llvm.org/D59355 Patch by Luís Marques. llvm-svn: 356741	2019-03-22 10:45:03 +00:00
Alex Bradbury	2c6c84e52c	[RISCV][NFC] Convert some MachineBasicBlock::iterator(MI) to MI.getIterator() llvm-svn: 355864	2019-03-11 20:43:29 +00:00
Alex Bradbury	62c8a57a74	[RISCV][NFC] Minor refactoring of CC_RISCV Immediately check if we need to early-exit as we have a return value that can't be returned directly. Also tweak following if/else. llvm-svn: 355773	2019-03-09 11:16:27 +00:00
Alex Bradbury	bd0eff316a	[RISCV][NFC] Split out emitSelectPseudo from EmitInstrWithCustomInserter It's cleaner and more consistent to have a separate helper function here. llvm-svn: 355772	2019-03-09 09:30:14 +00:00
Alex Bradbury	fea4957177	[RISCV] Support -target-abi at the MC layer and for codegen This patch adds proper handling of -target-abi, as accepted by llvm-mc and llc. Lowering (codegen) for the hard-float ABIs will follow in a subsequent patch. However, this patch does add MC layer support for the hard float and RVE ABIs (emission of the appropriate ELF flags https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#-file-header). ABI parsing must be shared between codegen and the MC layer, so we add computeTargetABI to RISCVUtils. A warning will be printed if an invalid or unrecognized ABI is given. Differential Revision: https://reviews.llvm.org/D59023 llvm-svn: 355771	2019-03-09 09:28:06 +00:00
Alex Bradbury	db67be889d	[RISCV][NFC] IsEligibleForTailCallOptimization -> isEligibleForTailCallOptimization Also clang-format the modified hunks. llvm-svn: 354584	2019-02-21 14:31:41 +00:00
Alex Bradbury	7539fa2c2d	[RISCV] Implement RV64D codegen This patch: * Adds necessary RV64D codegen patterns * Modifies CC_RISCV so it will properly handle f64 types (with soft float ABI) Note that in general there is no reason to try to select fcvt.w[u].d rather than fcvt.l[u].d for i32 conversions because fptosi/fptoui produce poison if the input won't fit into the target type. Differential Revision: https://reviews.llvm.org/D53237 llvm-svn: 352833	2019-02-01 03:53:30 +00:00
Alex Bradbury	d834d8301d	[RISCV] Add RV64F codegen support This requires a little extra work due to the fact i32 is not a legal type. When call lowering happens post-legalisation (e.g. when an intrinsic was inserted during legalisation), a bitcast from f32 to i32 can't be introduced. This is similar to the challenges with RV32D. To handle this, we introduce target-specific DAG nodes that perform bitcast+anyext for f32->i64 and trunc+bitcast for i64->f32. Differential Revision: https://reviews.llvm.org/D53235 llvm-svn: 352807	2019-01-31 22:48:38 +00:00
Alex Bradbury	0092df0669	[RISCV] Add target DAG combine for bitcast fabs/fneg on RV32FD DAGCombiner::visitBITCAST will perform: fold (bitconvert (fneg x)) -> (xor (bitconvert x), signbit) fold (bitconvert (fabs x)) -> (and (bitconvert x), (not signbit)) As shown in double-bitmanip-dagcombines.ll, this can be advantageous. But RV32FD doesn't use bitcast directly (as i64 isn't a legal type), and instead uses RISCVISD::SplitF64. This patch adds an equivalent DAG combine for SplitF64. llvm-svn: 352247	2019-01-25 21:55:48 +00:00
Alex Bradbury	456d3798d6	[RISCV] Custom-legalise i32 SDIV/UDIV/UREM on RV64M Follow the same custom legalisation strategy as used in D57085 for variable-length shifts (see that patch summary for more discussion). Although we may lose out on some late-stage DAG combines, I think this custom legalisation strategy is ultimately easier to reason about. There are some codegen changes in rv64m-exhaustive-w-insts.ll but they are all neutral in terms of the number of instructions. Differential Revision: https://reviews.llvm.org/D57096 llvm-svn: 352171	2019-01-25 05:11:34 +00:00
Alex Bradbury	299d690a50	[RISCV] Custom-legalise 32-bit variable shifts on RV64 The previous DAG combiner-based approach had an issue with infinite loops between the target-dependent and target-independent combiner logic (see PR40333). Although this was worked around in rL351806, the combiner-based approach is still potentially brittle and can fail to select the 32-bit shift variant when profitable to do so, as demonstrated in the pr40333.ll test case. This patch instead introduces target-specific SelectionDAG nodes for SHLW/SRLW/SRAW and custom-lowers variable i32 shifts to them. pr40333.ll is a good example of how this approach can improve codegen. This adds DAG combine that does SimplifyDemandedBits on the operands (only lower 32-bits of first operand and lower 5 bits of second operand are read). This seems better than implementing SimplifyDemandedBitsForTargetNode as there is no guarantee that would be called (and it's not for e.g. the anyext return test cases). Also implements ComputeNumSignBitsForTargetNode. There are codegen changes in atomic-rmw.ll and atomic-cmpxchg.ll but the new instruction sequences are semantically equivalent. Differential Revision: https://reviews.llvm.org/D57085 llvm-svn: 352169	2019-01-25 05:04:00 +00:00
Matt Arsenault	39508331ef	Reapply "IR: Add fp operations to atomicrmw" This reapplies commits r351778 and r351782 with RISCV test fixes. llvm-svn: 351850	2019-01-22 18:18:02 +00:00

909 Commits