llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	c579ab53bd	[RISCV] Move vfma_vl+fneg_vl matching to DAG combine. This patch adds 3 new _VL RISCVISD opcodes to represent VFMA_VL with different portions negated. It also adds a DAG combine to peek through FNEG_VL to create these new opcodes. This is modeled after similar code from X86. This makes the isel patterns more regular and reduces the size of the isel table by ~37K. The test changes look like regressions, but they point to a bug that was already there. We aren't able to commute a masked FMA instruction to improve register allocation because we always use a mask undisturbed policy. Prior to this patch we matched two multiply operands in a different order and hid this issue for these test cases, but a different test still could have encountered it. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D128310	2022-06-24 00:00:37 -07:00
Craig Topper	f912d21e67	[RISCV] Add RISCVISD opcodes for the rest of getAddr. This adds RISCVISD opccodes for LA, LA_TLS_IE, and LA_TLS_GD to remove creation of MachineSDNodes form getAddr. This makes the code consistent with the previous patches that added RISCVISD::HI, ADD_LO, LLA, and TPREL_ADD. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D128325	2022-06-22 09:21:07 -07:00
Craig Topper	0efbf5bfbb	[RISCV] Move the passthru operand for RISCVISD::VRGATHER*_VL nodes. NFC Put it before the VL instead of as the first operand. I want to add passthru to more operands, but the commutable ones like VADD_VL require the commutable operands to be operand 0 and 1. So we can't have the passthru as operand 0 for those.	2022-06-21 14:01:02 -07:00
Craig Topper	e01353f816	[RISCV] Add RISCVISD opcode for PseudoAddTPRel. Use it along with RISCVISD::HI and ADD_LO to avoid emitting MachineSDNodes during lowering.	2022-06-20 20:56:52 -07:00
Craig Topper	16d3a82de5	[RISCV] Add merge operand to RISCVISD::VRGATHER_VL nodes. Use it in place of VSELECT_VL+VRGATHER_VL. This simplifies the isel patterns. Overall, I think trying to match select+op to create masked instructions in isel doesn't scale. We either need to do it in DAG combine, pre-isel peepole, or post-isel peephole. I don't yet know which is the right answer, but for this case it seemed best to be able to request the masked form directly from lowering. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D128023	2022-06-20 18:58:24 -07:00
Craig Topper	9d7b01dc95	[RISCV] Implement RISCVTargetLowering::getTargetConstantFromLoad. This allows computeKnownBits to see the constant being loaded. This recovers the rv64zbp test case changes from D127520. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127679	2022-06-16 15:11:18 -07:00
Craig Topper	5afdceb82b	[RISCV] Add RISCVISD opcode for PseudoLLA. Rather than emitting a MachineSDNode from lowering. Let isel match it. This is consistent with the RISCVISD::HI and ADD_LO nodes that were also added. Having them both the same will make D127679 consistent. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127714	2022-06-16 15:11:03 -07:00
Craig Topper	4191de262f	[RISCV] Don't emit LUI/ADDI MachineSDNodes from getAddr Instead add RISCVISD opcodes that will be selected to LUI/ADDI during isel. I'm looking into maybe moving doPeepholeLoadStoreADDI into isel. Having the ADDI as a RISCVISD node will make it visible to isel. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127713	2022-06-16 14:56:07 -07:00
Craig Topper	cef03e3dcd	[RISCV] Move creation of constant pools from isel to lowering. This simplifies the isel code by removing the manual load creation. It also improves our ability to use 0 strided loads for vector splats. There is an assumption here that Mask and ShiftedMask constants are cheap enough that they don't become constant pool loads so that our isel optimizations involving And still work. I believe those constants are 3 instructions in the worst case. The rv64zbp-intrinsic.ll changes is a regression caused by intrinsics being expanded to RISCVISD also occuring during lowering. So the optimizations were only happening during the last DAGCombine, which can't see through the load. I believe we can fix this test by implementing TargetLowering::getTargetConstantFromLoad for RISC-V or by adding the intrinsic to computeKnownBitsForTargetNode to enable earlier DAG combine. Since Zbp is not a ratified extension, I don't view these as blocking this patch. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127520	2022-06-13 09:07:57 -07:00
Shao-Ce SUN	862f30a428	[RISCV] Add ISD::EH_DWARF_CFA Based on D24038. LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin. Reviewed By: StephenFan Differential Revision: https://reviews.llvm.org/D126181	2022-06-08 22:03:30 +08:00
Craig Topper	aaad507546	[RISCV] Return false from isOffsetFoldingLegal instead of reversing the fold in lowering. When lowering GlobalAddressNodes, we were removing a non-zero offset and creating a separate ADD. It already comes out of SelectionDAGBuilder with a separate ADD. The ADD was being removed by DAGCombiner. This patch disables the DAG combine so we don't have to reverse it. Test changes all look to be instruction order changes. Probably due to different DAG node ordering. Differential Revision: https://reviews.llvm.org/D126558	2022-05-27 11:05:18 -07:00
Paul Walker	258dac43d6	[SVE] Enable use of 32bit gather/scatter indices for fixed length vectors Differential Revision: https://reviews.llvm.org/D125193	2022-05-22 12:32:30 +01:00
Craig Topper	0ebb02b90a	[RISCV] Override TargetLowering::shouldProduceAndByConstByHoistingConstFromShiftsLHSOfAnd. This hook determines if SimplifySetcc transforms (X & (C l>>/<< Y)) ==/!= 0 into ((X <</l>> Y) & C) ==/!= 0. Where C is a constant and X might be a constant. The default implementation favors doing the transform if X is not a constant. Otherwise the code is left alone. There is a provision that if the target supports a bit test instruction then the transform will favor ((1 << Y) & X) ==/!= 0. RISCV does not say it has a variable bit test operation. RISCV with Zbs does have a BEXT instruction that performs (X >> Y) & 1. Without Zbs, (X >> Y) & 1 still looks preferable to ((1 << Y) & X) since we can fold use ANDI instead of putting a 1 in a register for SLL. This patch overrides this hook to favor bit extract patterns and otherwise falls back to the "do the transform if X is not a constant" heuristic. I've added tests where both C and X are constants with both the shl form and lshr form. I've also added a test for a switch statement that lowers to a bit test. That was my original motivation for looking at this. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D124639	2022-05-11 11:13:17 -07:00
jacquesguan	2509dcd58a	[RISCV] Add rvv codegen support for vp.fpext. This patch adds rvv codegen support for vp.fpext. The lowering of fp_round, vp.fptrunc, fp_extend and vp.fpext share most code so use a common lowering function to handle these four. And this patch changes the intermediate cast from ISD::FP_EXTEND/ISD::FP_ROUND to the RVV VL version op RISCVISD::FP_EXTEND_VL and RISCVISD::FP_ROUND_VL for scalable vectors. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D123975	2022-05-11 03:28:25 +00:00
Lian Wang	dc0ae8ce18	[RISCV] Support VP_SETCC mask operations Support VP_SETCC mask operations, turn it to logical operation. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124438	2022-04-28 08:52:29 +00:00
jacquesguan	25445b94db	[RISCV] Add rvv codegen support for vp.fptrunc. This patch adds rvv codegen support for vp.fptrunc. The lowering of fp_round and vp.fptrunc share most code so use a common lowering function to handle those two, similar to vp.trunc. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123841	2022-04-19 01:56:18 +00:00
jacquesguan	1aa4f0bb6c	[RISCV][VP] Add RVV codegen for vp.trunc. Differential Revision: https://reviews.llvm.org/D123579	2022-04-15 02:29:53 +00:00
Craig Topper	2ce2562876	[RISCV][SelectionDAG] Add a hook to sign extend i32 ConstantInt operands of phis on RV64. Materializing constants on RISCV is simpler if the constant is sign extended from i32. By default i32 constant operands of phis are zero extended. This patch adds a hook to allow RISCV to override this for i32. We have an existing isSExtCheaperThanZExt, but it operates on EVT which we don't have at these places in the code. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122951	2022-04-11 14:38:39 -07:00
Craig Topper	e13a44b460	[RISCV] Add lowering for vp.sext and vp.zext. Including mask vector inputs. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D123150	2022-04-06 09:59:49 -07:00
Craig Topper	e68257fcee	[RISCV][SelectionDAG] Enable TargetLowering::hasBitTest for masks that fit in ANDI. Modified DAGCombiner to pass the shift the bittest input and the shift amount to hasBitTest. This matches the other call to hasBitTest in TargetLowering.h This is an alternative to D122454. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D122458	2022-03-28 12:46:36 -07:00
Craig Topper	cfe533da05	[RISCV] Add lowering for vp.fptosi and vp.sitofp. This as an alternative version of D120641. Starting from the code here https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/-/raw/EPI/llvm/lib/Target/RISCV/RISCVISelLowering.cpp but with some modifications to how the interim types are calculated, and adding support for f16. Still need to add fptosi for mask vectors. Lots of masked isel patterns added so we can pass the mask through the type changes. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D122512	2022-03-28 11:06:41 -07:00
Lehua Ding	1648852c98	[RISCV][RVV] Fix vslide1up/down intrinsics overflow bug for SEW=64 on RV32 Reviewed By: craig.topper, kito-cheng Differential Revision: https://reviews.llvm.org/D120899	2022-03-13 18:06:09 +08:00
Craig Topper	d53707508a	[RISCV] Remove RISCVISD::VLE_VL/VSE_VL. Use intrinsics instead. Similar to what we do for other loads/stores, use the intrinsic version that we already have custom isel for. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D121166	2022-03-09 22:44:28 -08:00
Craig Topper	845bfcede1	[RISCV] Rename 'SplatOperand' to 'ScalarOperand'. NFC vslide1up/down have this flag set, but the value isn't a splat. Rename for clarity. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D121037	2022-03-07 11:28:32 -08:00
Craig Topper	b9d6e8c441	[RISCV] Lower VECTOR_SPLICE to RVV instructions. This lowers VECTOR_SPLICE of scalable vectors to a slidedown follow by a slideup. Fixed vectors are encouraged to use shufflevector instruction. The equivalent patch for fixed vectors is D119039. I've used a tail agnostic slidedown and limited the VL to only the elements that will not be overwritten by the slideup. The slideup uses VLMax for its VL. It unfortunately uses tail undisturbed policy but it isn't required as there is no tail. We just need the merge operand to carry the bits for the lower portion of the result. Care was taken to ensure that either the slideup or slidedown will be able to use a .vi instruction when the immediate is small. Which one uses the immediate depends on the sign of the immediate. Reviewed By: frasercrmck, ABataev Differential Revision: https://reviews.llvm.org/D119303	2022-03-01 10:10:13 -08:00
Craig Topper	a975ca97c3	[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X). Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118974	2022-02-24 09:19:01 -08:00
Craig Topper	c7d6448d03	[DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable. Internally to DAGCombiner the SDValues were passed by non-const reference despite not being modified. They were then passed by const reference to TLI. This patch passes them by value which is consistent with the vast majority of code. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120420	2022-02-23 12:40:45 -08:00
Zakk Chen	eeb7754f68	[RISCV] Add the passthru operand for vmv.vv/vmv.vx/vfmv.vf IR intrinsics. Add the passthru operand for VMV_V_X_VL, VFMV_V_F_VL and SPLAT_VECTOR_SPLIT_I64_VL also. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D119688	2022-02-17 06:38:14 -08:00
Zakk Chen	b784719904	[RISCV] Add the passthru operand for RVV nomask binary intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Add passthru operand for VSLIDE1UP_VL and VSLIDE1DOWN_VL to support i64 scalar in rv32. The masked VSLIDE1 would only emit mask undisturbed policy regardless of giving mask agnostic policy until InsertVSETVLI supports mask agnostic. Reviewed by: craig.topper, rogfer01 Differential Revision: https://reviews.llvm.org/D117989	2022-02-15 18:36:18 -08:00
Craig Topper	c1cef111a3	Revert "[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X)." This reverts commit `673d68cd92`. This hadn't been reviewed yet.	2022-02-05 12:51:01 -08:00
Craig Topper	673d68cd92	[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X). Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Differential Revision: https://reviews.llvm.org/D118974	2022-02-05 12:42:12 -08:00
Craig Topper	2349fb0312	[RISCV] Remove RISCVISD::SPLAT_VECTOR_I64 in favor of RISCVISD::VMV_V_X_VL. SPLAT_VECTOR_I64 has the same semantics as RISCVISD::VMV_V_X_VL, it just assumed VLMax instead of carrying a VL operand. Include order of RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td has been swapped to avoid moving riscv_vmv_v_x_vl into RISCVInstrInfoVSDPatterns.td and to allow moving other "_vl" SDNodes back to RISCVInstrInfoVVLPatterns.td Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118841	2022-02-03 08:30:25 -08:00
Craig Topper	b73d151a11	[RISCV] Add DAG combines to transform ADD_VL/SUB_VL into widening add/sub. This adds or reuses ISD opcodes for vadd.wv, vaddu.wv, vadd.vv, vaddu.vv and a similar set for sub. I've included support for narrowing scalar splats that have known sign/zero bits similar to what was done for MUL_VL. The conversion to vwadd.vv proceeds in two phases. First we'll form a vwadd.wv by narrowing one of the operands. Then we'll visit the vwadd.wv to try to narrow the other operand. This turned out to be simpler than catching all the cases in one step. The forming of of vwadd.wv can happen for either operand for add, but only the right hand side for sub since sub isn't commutable. An interesting quirk is that ADD_VL and VZEXT_VL/VSEXT_VL are formed during vector op legalization, but VMV_V_X_VL isn't usually formed until op legalization when BUILD_VECTORS are handled. This leads to VWADD_W_VL forming in one DAG combine round, and then a later DAG combine round sees the VMV_V_X_VL and needs to commute the operands to get the splat in position. This alone necessitated a VWADD_W_VL combine function which made forming vwadd.vv in two stages an easy choice. I've left out trying hard to form vwadd.wx instructions for now. It would only save an extend in the scalar domain which isn't as interesting. Might need to review the test coverage a bit. Most of the vwadd.wv instructions are coming from vXi64 tests on rv64. The tests were copy pasted from the existing multiply tests. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D117954	2022-02-02 10:03:08 -08:00
Craig Topper	524545317c	[RISCV] Remove RISCVISD::BREV8 and use RISCVISD::GREV instead. We already have an ISD opcode for the more general GREV/GREVI instructon. We can just use it with the encoding that corresponds to the behavior of brev8. This is similar to what we do for orc.b where we use the GORC ISD opcode.	2022-01-29 22:45:43 -08:00
Craig Topper	d8f929a567	[RISCV] Custom legalize BITREVERSE with Zbkb. With Zbkb, a bitreverse can be split into a rev8 and a brev8. Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D118430	2022-01-28 23:11:12 -08:00
Chenbing.Zheng	6d6c44a3f3	[RISCV] Add support for matching vwmulsu from fixed vectors According to riscv-v-spec-1.0, widening signed(vs2)-unsigned integer multiply vwmulsu.vv vd, vs2, vs1, vm # vector-vector vwmulsu.vx vd, vs2, rs1, vm # vector-scalar It is worth noting that signed op is only for vs2. For vwmulsu.vv, we can swap two ops, and don't care which is sign extension, but for vwmulsu.vx signExt can not be a vector extended from scalar (rs1). I specifically added two functions ending with _swap in the test case. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118215	2022-01-28 02:33:30 +00:00
Fraser Cormack	af773a1818	[RISCV][VP] Lower VP_MERGE to RVV instructions This patch adds lowering of the llvm.vp.merge.* intrinsic (ISD::VP_MERGE) to RVV vmerge/vfmerge instructions. It introduces a special pseudo form of vmerge which allows a tied merge operand, allowing us to specify the tail elements as being equal to the "on false" operand, using a tied-def constraint and a "tail undisturbed" policy. While this strategy allows us to often lower the intrinsic to just one instruction, it may be less efficient in fixed-vector types as the number of tail elements may extend far beyond the length of the fixed vector. Another strategy could be to use a vmerge/vfmerge instruction with an AVL equal to the length of the vector type, and manipulate the condition operand such that mask elements greater than the operation's EVL are false. I've also observed inefficient codegen in which our 'VF' patterns don't match raw floating-point SPLAT_VECTORs, which occur in scalable-vector code. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117561	2022-01-24 11:05:05 +00:00
Craig Topper	fa8bb22466	[RISCV] Optimize vector_shuffles that are interleaving the lowest elements of two vectors. RISCV only has a unary shuffle that requires places indices in a register. For interleaving two vectors this means we need at least two vrgathers and a vmerge to do a shuffle of two vectors. This patch teaches shuffle lowering to use a widening addu followed by a widening vmaccu to implement the interleave. First we extract the low half of both V1 and V2. Then we implement (zext(V1) + zext(V2)) + (zext(V2) * zext(2^eltbits - 1)) which simplifies to (zext(V1) + zext(V2) * 2^eltbits). This further simplifies to (zext(V1) + zext(V2) << eltbits). Then we bitcast the result back to the original type splitting the wide elements in half. We can only do this if we have a type with wider elements available. Because we're using extends we also have to be careful with fractional lmuls. Floating point types are supported by bitcasting to/from integer. The tests test a varied combination of LMULs split across VLEN>=128 and VLEN>=512 tests. There a few tests with shuffle indices commuted as well as tests for undef indices. There's one test for a vXi64/vXf64 vector which we can't optimize, but verifies we don't crash. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D117743	2022-01-20 14:44:47 -08:00
Craig Topper	aa7fc02feb	Recommit "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering." This reverts the revert commit `e328385739`. Accidental demanded bits change has been removed. The demanded bits code itself was remove in a pre-commit since it isn't tested. Original commit message: Previous we used the fshl/fshr operand ordering for simplicity. This made things confusing when D117468 proposed adding intrinsics for the instructions. We can't just use the generic funnel shifting intrinsics because fsl/fsr have different functionality that should be exposed to software. Now we use rs1, rs3, rs2/shamt order which matches the instruction printing order and the order used in this intrinsic header https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h	2022-01-18 10:52:43 -08:00
Craig Topper	e328385739	Revert "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering." This reverts commit `b634f8a663`. I broke the SimplifyDemandedBits code, but we don't have tests.	2022-01-18 10:36:03 -08:00
Craig Topper	b634f8a663	[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering. Previous we used the fshl/fshr operand ordering for simplicity. This made things confusing when D117468 proposed adding intrinsics for the instructions. We can't just use the generic funnel shifting intrinsics because fsl/fsr have different functionality that should be exposed to software. Now we use rs1, rs3, rs2/shamt order which matches the instruction printing order and the order used in this intrinsic header https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h	2022-01-18 09:47:28 -08:00
David Sherwood	f4515ab858	Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants" This reverts commit `197f3c0deb`. Reverting after miscompilation errors discovered with ffmpeg.	2022-01-18 08:40:20 +00:00
Han-Kuan Chen	ec9cb3a79c	[RISCV] Provide VLOperand in td. Currently, users expected VL is the last operand. However, since some intrinsics has tail policy in the last operand, this rule cannot be used anymore. Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D117452	2022-01-17 20:25:47 -08:00
Han-Kuan Chen	3fc4b5896a	[RISCV] Make SplatOperand start from 0. Current SplatOperand starts from 1 because operand 0 (or 1) is intrinsic id in SelectionDAG. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117453	2022-01-17 20:14:59 -08:00
David Sherwood	197f3c0deb	[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction. New tests added here: CodeGen/AArch64/sve-vector-splat.ll I believe we see some code quality improvements in these existing tests too: CodeGen/AArch64/reduce-and.ll CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding. Differential Revision: https://reviews.llvm.org/D114357	2022-01-17 11:08:57 +00:00
David Sherwood	ba471ba8d2	Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants" This reverts commit `31009f0b5a`. It seems to be causing SVE VLA buildbot failures and has introduced a genuine regression. Reverting for now.	2022-01-13 15:59:43 +00:00
David Sherwood	31009f0b5a	[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction. New tests added here: CodeGen/AArch64/sve-vector-splat.ll I believe we see some code quality improvements in these existing tests too: CodeGen/AArch64/dag-numsignbits.ll CodeGen/AArch64/reduce-and.ll CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding. Differential Revision: https://reviews.llvm.org/D114357	2022-01-13 09:43:07 +00:00
Lian Wang	16877c5d2c	[RISCV] Add bfp and bfpw intrinsic in zbf extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116994	2022-01-13 02:53:00 +00:00
wangpc	c6430fade3	[RISCV] Generate 32 bits jumptable entries when code model is small The code can only address the whole RV32 address space or the lower 2 GiB of the RV64 address space in small code model, so 32 bits entry is enough. Cache hit ratio and code size have some improvements. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116435	2022-01-11 18:20:37 +08:00
wangpc	98d51c2542	[RISCV] Override TargetLowering::BuildSDIVPow2 to generate SELECT When `Zbt` is enabled, we can generate SELECT for division by power of 2, so that there is no data dependency. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D114856	2022-01-11 15:54:35 +08:00

1 2 3 4 5

236 Commits