llvm-project

Commit Graph

Author	SHA1	Message	Date
Fraser Cormack	3e678cb772	[RISCV] Don't emit fractional VIDs with negative steps We can't shift-right negative numbers to divide them, so avoid emitting such sequences. Use negative numerators as a proxy for this situation, since the indices are always non-negative. An alternative strategy could be to add a compiler flag to emit division instructions, which would at least allow us to test the VID sequence matching itself. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123796	2022-04-21 07:00:34 +01:00
Craig Topper	186d5c8af5	[RISCV] Make getInstSeqCost handle other Zb* instructions. We haven't been updating this as Zb* instructions have been used for immediate materialization. They will hit the default case and trigger an llvm_unreachable. Instead of trying to list them all, assume instructions that aren't explicitly listed aren't compressible. Spotted while looking at integer materialization for other reasons. I haven't seen a crash from this yet.	2022-04-20 22:08:04 -07:00
Craig Topper	6db0afb44e	[RISCV] Fold (xor (sllw 1, x), -1) -> (rolw ~1, x). There's an existing generic combine that does this for legal types. This patch adds a RISCV specific combine for W instructions. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D123983	2022-04-19 15:03:43 -07:00
Fraser Cormack	c5cac48549	[RISCV] Fix lowering of BUILD_VECTORs as VID sequences This patch fixes a bug when lowering BUILD_VECTOR via VID sequences. After adding support for fractional steps in D106533, elements with zero steps may be skipped if no step has yet been computed. This allowed certain sequences to slip through the cracks, being identified as VID sequences when in fact they are not. The fix for this is to perform a second loop over the BUILD_VECTOR to validate the entire sequence once the step has been computed. This isn't the most efficient, but on balance the code is more readable and maintainable than doing back-validation during the first loop. Fixes the tests introduced in D123785. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123786	2022-04-19 07:43:38 +01:00
jacquesguan	25445b94db	[RISCV] Add rvv codegen support for vp.fptrunc. This patch adds rvv codegen support for vp.fptrunc. The lowering of fp_round and vp.fptrunc share most code so use a common lowering function to handle those two, similar to vp.trunc. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123841	2022-04-19 01:56:18 +00:00
Lian Wang	545d353b3c	[RISCV][NFC] Refactor VL patterns for vnsrl and vnsra Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D123274	2022-04-15 07:42:59 +00:00
jacquesguan	1aa4f0bb6c	[RISCV][VP] Add RVV codegen for vp.trunc. Differential Revision: https://reviews.llvm.org/D123579	2022-04-15 02:29:53 +00:00
Lian Wang	3100893f63	[RISCV] Remove sext_inreg+riscv_grev/riscv_gorc isel patterns Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123565	2022-04-14 08:16:32 +00:00
Lian Wang	38706dd940	[RISCV][NFC] Refactor patterns for Multiply Add instructions Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D123355	2022-04-14 08:00:00 +00:00
wangpc	d0828c5af9	[RISCV][NFC] Use addExpr() instead of createExpr() It seems to be neater. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D123675	2022-04-14 10:48:25 +08:00
Liqin Weng	8265679018	[RISCV][NFC] Refactor the type promotion of fsl/fsr/becompress/bdecompress/bfp Reviewed By: asb, jrtc27, craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D123181	2022-04-13 08:52:04 +00:00
Craig Topper	057c063c9b	[RISCV] Add a encodeLMUL function to RISCVVType. NFC This moves the encoding handling out of the assembly parser. Reviewed By: khchen, frasercrmck Differential Revision: https://reviews.llvm.org/D123553	2022-04-12 13:39:47 -07:00
Craig Topper	2ce2562876	[RISCV][SelectionDAG] Add a hook to sign extend i32 ConstantInt operands of phis on RV64. Materializing constants on RISCV is simpler if the constant is sign extended from i32. By default i32 constant operands of phis are zero extended. This patch adds a hook to allow RISCV to override this for i32. We have an existing isSExtCheaperThanZExt, but it operates on EVT which we don't have at these places in the code. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122951	2022-04-11 14:38:39 -07:00
Craig Topper	76192182d0	[RISCV] Remove riscv-v-fixed-length-vector-elen-max command line option. This was added before Zve extensions were defined. I think users should use Zve32x or Zve32f now. Though we will lose support for limiting ELEN to 16 or 8, but I hope no one was using that. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D123418	2022-04-11 10:14:48 -07:00
Craig Topper	c266e50430	[RISCV] Remove ExtZvl enum from RISCVSubtarget. NFC Having an enum with names that contain the string representation of their value doesn't add any value. We can just use the numbers. Reviewed By: kito-cheng, frasercrmck Differential Revision: https://reviews.llvm.org/D123417	2022-04-11 10:01:17 -07:00
LiaoChunyu	505fce5a9e	[RISCV] Add basic code modeling for llvm.experimental.stepvector intrinsic Scalable vectors llvm.experimental.stepvector intrinsic will crash due to an invalid cost when run the code through the loopunroll. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D122782	2022-04-11 10:19:23 +08:00
Craig Topper	4e561a581f	[RISCV] Remove unnecessary cast to i8* when converting gather/scatter to strided load/store. Not sure why I thought this necessary at the time.	2022-04-09 20:05:03 -07:00
Craig Topper	70046438d0	[RISCV] Only try LUI+SHADD+ADDI for int materialization if LUI+ADDI+SHADD failed. There's an assert in LUI+SHADD+ADDI materialization that makes sure the lower 12 bits aren't zero since that case should have been handled as LUI+ADDI+SHADD. But nothing prevented the LUI+SH*ADD+ADDI checks from running after the earlier code handled it. The sequence would be the same length or longer so it wouldn't replace the earlier sequence, but the assert happened before that was checked. The vector holding the sequence also wasn't reset before the second check so that guaranteed the sequence would never be found to be shorter. This patch fixes this by only trying the second expansion when the earlier fails. Fixes PR54812. Reviewed By: benshi001 Differential Revision: https://reviews.llvm.org/D123406	2022-04-09 08:52:15 -07:00
Fraser Cormack	34e1b4774a	[RISCV] Select unmasked FP setcc insts via ISel post-process Similar to D123217 but for the floating-point patterns. No change in generated output, while reducing the generated table size. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D123291	2022-04-08 17:13:43 +01:00
Craig Topper	1903b99154	[RISCV] Always select (and (srl X, C), Mask) as (srli (slli X, C2), C3). SLLI is always compressible to C.SLLI as long as the source and dest register is the same. ANDI and SRLI are only compressible if the register is x8-x15. By using SLLI we have a better chance of generating shorter code. I had to exclude one exclusion for the BEXTI case so that it's pattern match could still fire. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D123336	2022-04-08 09:04:04 -07:00
Kito Cheng	9c5aedfbf5	[RISCV] Fixing stack offset for RVV object with vararg in stack. We found LLVM generate wrong stack offset for RVV object when stack having variable argument, that cause by we didn't count vaarg part during calculate RVV stack objects. Also update the stack layout diagram for including vaarg in the diagram. Stack layout ref: https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/riscv.cc#L3941 Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D123180	2022-04-08 12:01:16 +08:00
Kito Cheng	690085c9b7	[RISCV] Store/restore RISCVMachineFunctionInfo into MIR YAML file RISCVMachineFunctionInfo has some fields like VarArgsFrameIndex and VarArgsSaveSize are calculated at ISel lowering stage, those info are not contained in MIR files, that cause test cases rely on those field can't not reproduce correctly by MIR dump files. This patch adding the MIR read/write for those fields. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D123178	2022-04-08 11:55:48 +08:00
jacquesguan	a55c19c44b	[RISCV][NFC] Use defvar to simplify pattern definations. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D123292	2022-04-08 02:51:30 +00:00
Craig Topper	d98bea87ef	[RISCV] Add more .vx patterns for VLMax integer setccs. This patch synchronizes the structure of the templates with those in RISCVInstrInfoVVLPatterns.td so that we get patterns with .vx on the left hand side. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D123255	2022-04-07 09:17:43 -07:00
Craig Topper	82662b753d	[RISCV] Add swapped patterns to VPatIntegerSetCCVL_VIPlus1. This matches VPatIntegerSetCCVL_VI_Swappable. But as noted in the FIXME this may only be needed due to lack of canonicalization on VP_SETCC. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D123239	2022-04-07 09:17:08 -07:00
Luís Marques	d09d297c5d	[RISCV] Fix crash for section alignment with .option norvc The existing code wasn't getting the subtarget info from the fragment, so the current status of RVC would be ignored. This would cause a crash for the new test case when the target then reported it couldn't write the requested number of code alignment bytes. Differential Revision: https://reviews.llvm.org/D122236	2022-04-07 12:02:27 +01:00
Fraser Cormack	8ebc9b1560	[RISCV] Select unmasked integer setcc insts via ISel post-process This patch has no effect on the generated code, whilst mitigating the increase in ISel table size caused by the recent addition of masked patterns. I aim to do the same for floating-point patterns once D123051 lands, giving us a reason to use masked floating-point patterns. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D123217	2022-04-07 09:30:19 +01:00
Fraser Cormack	8216255c9f	[RISCV][VP] Add basic RVV codegen for vp.fcmp This patch adds the necessary infrastructure to lower vp.fcmp via ISD::VP_SETCC to RVV instructions. Most notably this patch adds cond-code legalization for VP_SETCC, reusing the existing TargetLowering::LegalizeSetCCCondCode by passing in additional SDValue parameters for the Mask and EVL. This method then uses VP operations to legalize the condcode. There is still a general lack of canonicalization on VP_SETCC as opposed to SETCC which results in worse code than is theoretically possible. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123051	2022-04-07 09:16:07 +01:00
Liqin Weng	f891123556	[RISCV] Add CMOV isel pattern for (select (setgt X, Imm), Y, Z) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122644	2022-04-07 05:55:53 +00:00
Lian Wang	1b547799c5	[RISCV] Supplement patterns for vnsrl.wx/vnsra.wx when splat shift is sext or zext Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122786	2022-04-07 02:21:41 +00:00
Craig Topper	e13a44b460	[RISCV] Add lowering for vp.sext and vp.zext. Including mask vector inputs. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D123150	2022-04-06 09:59:49 -07:00
Fraser Cormack	6be5e875be	[RISCV][VP] Add basic RVV codegen for vp.icmp This patch adds the minimum required to successfully lower vp.icmp via the new ISD::VP_SETCC node to RVV instructions. Regular ISD::SETCC goes through a lot of canonicalization which targets may rely on which has not hereto been ported to VP_SETCC. It also supports expansion of individual condition codes and a non-boolean return type. Support for all of that will follow in later patches. In the case of RVV this largely isn't a problem as the vector integer comparison instructions are plentiful enough that it can lower all VP_SETCC nodes on legal integer vectors except for boolean vectors, which regular SETCC folds away immediately into logical operations. Floating-point VP_SETCC operations aren't as well supported in RVV and the backend relies on condition code expansion, so support for those operations will come in later patches. Portions of this code were taken from the VP reference patches. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122743	2022-04-06 16:51:22 +01:00
Craig Topper	3c831c9b28	[RISCV] Add support for vp.fptosi where the result is a mask type. We can do this conversion by converting the same sized integer type, then compare the result with 0. The conversion is undefined if the converted FP value doesn't fit in an i1. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D122678	2022-04-05 09:48:04 -07:00
Craig Topper	d970e96c53	[RISCV] Add lowering for vp.fptoui and vp.uitofp. This is a straightforward extension of D122512 to unsigned integers.	2022-04-01 18:28:46 -07:00
Craig Topper	fa630e7594	[RISCV][AMDGPU][TargetLowering] Special case overflow expansion for (uaddo X, 1). If we expand (uaddo X, 1) we previously expanded the overflow calculation as (X + 1) <u X. This potentially increases the live range of X and can prevent X+1 from reusing the register that previously held X. Since we're adding 1, overflow only occurs if X was UINT_MAX in which case (X+1) would be 0. So this patch adds a special case to expand the overflow calculation to (X+1) == 0. This seems to help with uaddo intrinsics that get introduced by CodeGenPrepare after LSR. Alternatively, we could block the uaddo transform in CodeGenPrepare for this case. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D122933	2022-04-01 13:14:10 -07:00
Lian Wang	62dd3674bc	[RISCV] Supplement SDNode patterns for vfwmul/vfwadd/vfwsub Reviewed By: jacquesguan Differential Revision: https://reviews.llvm.org/D122720	2022-04-01 03:09:50 +00:00
Fraser Cormack	ee51aefba0	[RISCV][NFC] Minor formatting fix	2022-03-31 16:15:22 +01:00
Fraser Cormack	a276d1f44b	[RISCV][NFC] Fix formatting on one line	2022-03-31 13:17:37 +01:00
ShihPo Hung	2f1261abe4	[RISCV][RVV] Add Uses = [FRM] and mayRaiseFPException = true to RVV instructions This patch adds Uses = [FRM] and mayRaiseFPException = true to following instructions: VFADD, VFSUB, VFRSUB, VFMUL, VFDIV, VFRDIV VFWADD, VFWSUB, VFWMUL VFMADD, VFMACC, VFMSAC, VFMSUB VFNMADD, VFNMACC, VFNMSAC, VVFNMSUB VFWMACC, VFWMSAC, VFWNMACC, VFWNMSAC VFSQRT, VFREC7 VFREDOSUM, VFREDUSUM, VFWREDOSUM, VFWREDUSUM and only adds mayRaiseFPException = true to following instructions: VFRSQRT7, VFMIN, VFMAX, VFREDMIN, VFREDMAX VMFEQ, VMFNE, VMFLT,VMFLE, VMFGT, VMFGE Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D121087	2022-03-31 01:33:17 -07:00
Fraser Cormack	893d63fbdc	[RISCV][NFC] Fix comment to refer to correct file	2022-03-31 08:59:10 +01:00
Lian Wang	b3851e9931	[RISCV] Add VL patterns for vfwmul/vfwadd/vfwsub Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D122369	2022-03-31 07:08:58 +00:00
Craig Topper	4477500533	[RISCV] ISel (and (shift X, C1), C2)) to shift pair in more cases Previously, these isel optimizations were disabled if the AND could be selected as a ANDI instruction. This patch disables the optimizations only if the immediate is valid for C.ANDI. If we can't use C.ANDI, we might be able to compress the shift instructions instead. I'm not checking the C extension since we have relatively poor test coverage of the C extension. Without C extension the code size should be equal. My only concern would be if the shift+andi had better latency/throughput on a particular CPU. I did have to add a peephole to match SRLIW if the input is zexti32 to prevent a regression in rv64zbp.ll. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D122701	2022-03-30 11:46:42 -07:00
Craig Topper	7417eb29ce	[RISCV] Use getSplatBuildVector instead of getSplatVector for fixed vectors. The splat_vector will be legalized to build_vector eventually anyway. This patch makes it take fewer steps. Unfortunately, this results in some codegen changes. It looks like it comes down to how the nodes were ordered in the topological sort for isel. Because the build_vector is created earlier we end up with a different ordering of nodes. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D122185	2022-03-30 11:36:34 -07:00
Liqin Weng	4cb85da811	[RISCV] Add CMIX isel pattern for (xor (and (xor rs1, rs3), rs2), rs3) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122702	2022-03-30 16:51:09 +08:00
Fraser Cormack	75047577d6	[RISCV] Trim RVV isel pats matchable via DAG post-process In D122512, several masked patterns were added to support lowering of vector-predicated float-to-int and int-to-float conversions. With the introduction of these patterns, all of the old "unmasked" patterns are matchable via the DAG post-process introduced in D118810, once the relevant opcode entries are set up in the helper table. Locally this reduces the generated isel table by 4%. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D122637	2022-03-30 08:56:38 +01:00
Liqin Weng	7f81765898	[RISCV][NFC] Add immediate tests for the icmp instruction Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122651	2022-03-30 02:51:26 +00:00
Zakk Chen	b578330754	[RISCV] Use maskedoff to decide mask policy for masked compare and vmsbf/vmsif/vmsof. masked compare and vmsbf/vmsif/vmsof are always tail agnostic, we could check maskedoff value to decide mask policy rather than have a addtional policy operand. Reviewed By: craig.topper, arcbbb Differential Revision: https://reviews.llvm.org/D122456	2022-03-29 18:05:33 -07:00
Zakk Chen	10b2760da0	Revert "[RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR" This reverts commit `10fd2822b7`. I have a better implementation for those operations without the additional policy operand. masked compare and vmsbf/vmsif/vmsof are always tail agnostic so we could assume undef maskedoff is mask agnostic. Differential Revision: https://reviews.llvm.org/D122455	2022-03-29 18:05:33 -07:00
Liqin Weng	d660c0d793	[RISCV] Optimize LI+SLT to SLTI+XORI for immediates in specific range This transform will reduce one GPR. Reviewed By: craig.topper, benshi001 Differential Revision: https://reviews.llvm.org/D122051	2022-03-29 14:46:49 +08:00
Craig Topper	45e85feba6	[RISCV] Pull APInt/computeKnonwbits specifics out of computeGREVOrGORC. NFC This function now takes a uint64_t instead of an APInt. The caller is responsible for masking the shift amount, extracting and inserting into the KnownBits APInts, and inverting to compute zeros. This is less code and cleaner division of responsibilities.	2022-03-28 20:53:54 -07:00
Shao-Ce SUN	662b9fa02c	[NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D122557	2022-03-29 09:53:24 +08:00
Craig Topper	01203918d1	[RISCV] Add computeKnownBits support for RISCVISD::GORC. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D121575	2022-03-28 16:56:33 -07:00
Craig Topper	e68257fcee	[RISCV][SelectionDAG] Enable TargetLowering::hasBitTest for masks that fit in ANDI. Modified DAGCombiner to pass the shift the bittest input and the shift amount to hasBitTest. This matches the other call to hasBitTest in TargetLowering.h This is an alternative to D122454. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D122458	2022-03-28 12:46:36 -07:00
Craig Topper	cfe533da05	[RISCV] Add lowering for vp.fptosi and vp.sitofp. This as an alternative version of D120641. Starting from the code here https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/-/raw/EPI/llvm/lib/Target/RISCV/RISCVISelLowering.cpp but with some modifications to how the interim types are calculated, and adding support for f16. Still need to add fptosi for mask vectors. Lots of masked isel patterns added so we can pass the mask through the type changes. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D122512	2022-03-28 11:06:41 -07:00
Kazu Hirata	6212871968	[Target] Apply clang-tidy fixes for readability-redundant-member-init (NFC)	2022-03-27 22:22:37 -07:00
Maksim Panchenko	4ae9745af1	[Disassember][NFCI] Use strong type for instruction decoder All LLVM backends use MCDisassembler as a base class for their instruction decoders. Use "const MCDisassembler " for the decoder instead of "const void ". Remove unnecessary static casts. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D122245	2022-03-25 18:53:59 -07:00
Dávid Bolvanský	9a738c477e	[NFCI] Fix set-but-unused warning in RISCVAsmParser.cpp	2022-03-24 08:33:40 +01:00
jacquesguan	8910ac400c	[RISCV] Add patterns for vector widening integer multiply Add patterns for vector widening integer multiply instructions Differential Revision: https://reviews.llvm.org/D117385	2022-03-24 15:26:08 +08:00
Vasileios Porpodas	39aa202aff	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit `e6ead19b77`.	2022-03-23 18:32:17 -07:00
Craig Topper	6c90a654bb	[RISCV] Simplify some code in lowering vector int<->fp conversions. NFC Don't call EltVT.getSizeInBits() or SrcEltVT.getSizeInBits() a second time. They are already in EltSize or SrcEltSize variables. Refactor some comparisons to use multiply instead of division.	2022-03-23 12:09:05 -07:00
Arthur Eubanks	e6ead19b77	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash." This reverts commit `27bd8f9492`. Causes crashes, see comments in D121973	2022-03-23 10:57:45 -07:00
luxufan	5800fb41a6	[RISCV] Remove check and update test file in D121183 Differential Revision: https://reviews.llvm.org/D122290	2022-03-24 00:48:52 +08:00
luxufan	227496dc09	[RISCV] Generate correct ELF EFlags when .ll file has target-abi attribute In the past, when construct RISCVAsmBackend, MCTargetOptions.ABIName would be passed and stored in RISCVAsmBackend. But MCTargetOptions.ABIName can only be specified by -target-abi xxx in command line, if the .ll file has target-abi attribute, the codegen module will ignore it. And the generated object file would have incorrect EFlags value. https://github.com/llvm/llvm-project/issues/50591 also caused by this problem. This patch override the AsmPrinter::emitFunctionEntryLabel function and use it to set the target abi value that get from .ll file's target-abi attribute. And storing the target-abi in RISCVTargetStreamer instead of RISCVAsmBackend. Differential Revision: https://reviews.llvm.org/D121183	2022-03-24 00:48:52 +08:00
Vasileios Porpodas	27bd8f9492	Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash. Original review: https://reviews.llvm.org/D121354 This reverts commit `f7d7d2a08d`.	2022-03-22 16:41:55 -07:00
Arthur Eubanks	f7d7d2a08d	Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads."" This reverts commit `79613185d3`. Causes crashes, see comments in https://reviews.llvm.org/D121973.	2022-03-22 13:33:49 -07:00
Craig Topper	51940d69cb	[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32. On RV32, we need to type legalize i64 scalar arguments to intrinsics. We usually do this by splatting the value into a vector separately. If the scalar happens to be sign extended, we can continue using a .vx intrinsic. We already special cased sign extended constants, this extends it to any sign extended value. I've only added tests for one case of vadd. Most intrinsics go through the same check. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D122186	2022-03-22 10:29:06 -07:00
Craig Topper	9b0f227d7b	[TableGen][RISCV] Add InstAliases with zero_reg to cover unmasked vnot.v, vncvt.x.x.w, vneg.v, etc. The mask being NoRegister prevented the existing aliases from matching since NoRegister isn't in the VMV0 register class. To workaround this I've added new aliases that look for zero_reg. I had to motify tablegen to generate matching code for zero_reg. And as a consequence, I had to change the EmitPriority for an ARM alias that used zero_reg that started printing. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D121496	2022-03-22 10:14:43 -07:00
Zakk Chen	10fd2822b7	[RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR intrinsics. Those operations are updated under a tail agnostic policy, but they could have mask agnostic or undisturbed. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D120228	2022-03-22 07:47:21 -07:00
Zakk Chen	9ab18cc535	[RISCV] Add policy operand for masked vid and viota IR intrinsics. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D120227	2022-03-22 02:32:31 -07:00
Zakk Chen	abb5a985e9	[RISCV] Support mask policy for RVV IR intrinsics. Add the UsesMaskPolicy flag to indicate the operations result would be effected by the mask policy. (ex. mask operations). It means RISCVInsertVSETVLI should decide the mask policy according by mask policy operand or passthru operand. If UsesMaskPolicy is false (ex. unmasked, store, and reduction operations), the mask policy could be either mask undisturbed or agnostic. Currently, RISCVInsertVSETVLI sets UsesMaskPolicy operations default to MA, otherwise to MU to keep the current mask policy would not be changed for unmasked operations. Add masked-tama, masked-tamu, masked-tuma and masked-tumu test cases. I didn't add all operations because most of implementations are using the same pseudo multiclass. Some tests maybe be duplicated in different tests. (ex. masked vmacc with tumu shows in vmacc-rv32.ll and masked-tumu) I think having different tests only for policy would make the testing clear. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120226	2022-03-22 01:19:16 -07:00
Yeting Kuo	ecd7a0132a	[RISCV] Add basic cost model for vector casting To perform the cost model of vector casting, the patch consider most vector casts as their scalar form and consider those vector form of free scalr castings as 1. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121771	2022-03-22 14:17:08 +08:00
Vasileios Porpodas	79613185d3	Recommit "[SLP] Fix lookahead operand reordering for splat loads." Original review: https://reviews.llvm.org/D121354 The original commit `9136145eb0` broke the build on several targets. Differential Revision: https://reviews.llvm.org/D121973	2022-03-21 15:57:32 -07:00
Craig Topper	cc5b0868ff	Revert "[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32." This reverts commit `8c4937b33f`. Committed by mistake.	2022-03-21 14:58:11 -07:00
Craig Topper	d4aeb5000f	[RISCV] Simplify some code. NFC	2022-03-21 14:50:56 -07:00
Craig Topper	19de2e8db6	[RISCV] Remove stray slash from comment. NFC	2022-03-21 14:50:56 -07:00
Craig Topper	8c4937b33f	[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32. On RV32, we need to type legalize i64 scalar arguments to intrinsics. We usually do this by splatting the value into a vector separately. If the scalar happens to be sign extended, we can continue using a .vx intrinsic. We already special cased sign extended constants, this extends it to any sign extended value. I've only added tests for one case of vadd. Most intrinsics go through the same check. I can add more tests if we're concerned. Differential Revision: https://reviews.llvm.org/D122186	2022-03-21 14:50:55 -07:00
Mohammed Nurul Hoque	7afa44f5f5	[RISCV] Add more sign-extending ops to MIR sext.w pass. This patch adds single-bit and bit-counting ops to list of sign-extending ops. A single-bit write propagates sign-extendedness if it's not in the sign-bits. Bit extraction and bit counting always outputs a small number, so sign-extended. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121152	2022-03-18 18:21:17 +08:00
Jessica Clarke	63ea7797dd	[RISCV] Fix buildbot breakage by explicitly instantiating templates RISCVISelDAGToDAG's selectImm uses RISCVTargetLowering::getAddr (specifically the ConstantPoolSDNode) as of `41454ab256` ("[RISCV] Use constant pool for large integers"), but nothing explicitly instantiates any of the templates, the only reason they exist is because of the various lowering methods in RISCVISelLowering.cpp that themselves use the methods. However, with inlining, those can end up not existing as real functions and thus not be exported, leading to link errors. Up until now this hasn't happened, but for whatever reason D121654 has triggered this on the sanitizer-ppc64be-linux buildbot, giving: ../../../../lib/libLLVMRISCVCodeGen.a(RISCVISelDAGToDAG.cpp.o): In function `selectImm(llvm::SelectionDAG, llvm::SDLoc const&, llvm::MVT, long, llvm::RISCVSubtarget const&)': RISCVISelDAGToDAG.cpp:(.text._ZL9selectImmPN4llvm12SelectionDAGERKNS_5SDLocENS_3MVTElRKNS_14RISCVSubtargetE+0x3d8): undefined reference to `llvm::SDValue llvm::RISCVTargetLowering::getAddr<llvm::ConstantPoolSDNode>(llvm::ConstantPoolSDNode, llvm::SelectionDAG&, bool) const' collect2: error: ld returned 1 exit status Fix this by explicitly instantiating getAddr in its four different forms so separate translation units can reliably use it. Fixes: `41454ab256` ("[RISCV] Use constant pool for large integers")	2022-03-18 02:22:17 +00:00
Craig Topper	bbd2ecf9f0	[RISCV] Add +experimental-zvfh extension to cover half types in vectors. Currently we allow half types in vectors if the scalar Zfh extension is enabled. This behavior is not inline with the vector spec. For f32 and f64 types, the Zve32f, Zve64f, Zve64d, and V explicitly control the availablity of floating point types in vectors. In order to make our compiler compliant, we either need to remove all support for half in vectors or we need an extension to control it. Draft spec here https://github.com/riscv/riscv-v-spec/pull/780 Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D121345	2022-03-17 10:04:02 -07:00
Craig Topper	7e15303062	[RISCV] Simplify scalable vector case in lowerVectorMaskExt. Since we have SPLAT_VECTOR_PARTS these days, I don't think we need to go through extra lengths to avoid introducing an illegal scalar type. We can just call getConstant using the scalable vector type and let it create either a SPLAT_VECTOR or a SPLAT_VECTOR_PARTS. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D121645	2022-03-17 09:43:13 -07:00
Lian Wang	214afc7116	[RISCV] Add patterns for vnsrl.wi and vnsra.wi instructions Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121675	2022-03-17 07:22:32 +00:00
Lian Wang	b26abcad81	[RISCV][NFC] Replace redundant code with VLOpFrag Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121783	2022-03-17 02:05:21 +00:00
Craig Topper	2e10671ec7	[RISCV] Improve detection of when to skip (and (srl x, c2) c1) -> (srli (slli x, c3-c2), c3) isel. We have a special case to skip this transform if c1 is 0xffffffff and x is sext_inreg in order to use sraiw+zext.w. But we were only checking that we have a sext_inreg opcode, not how many bits are being sign extended. This commit adds a check that it is a sext_inreg from i32 so we know for sure that an sraiw can be created.	2022-03-16 14:54:34 -07:00
Jessica Clarke	659363c0cc	[RISCV] Ensure PseudoLA* can be hoisted Since we mark the pseudos as mayLoad but do not provide any MMOs, isSafeToMove conservatively returns false, stopping MachineLICM from hoisting the instructions. PseudoLA_TLS_GD does not actually expand to a load, so stop marking that as mayLoad to allow it to be hoisted, and for the others make sure to add MMOs during lowering to indicate they're GOT loads and thus can be freely moved. Fixes https://github.com/llvm/llvm-project/issues/54372 Reviewed By: MaskRay, arichardson Differential Revision: https://reviews.llvm.org/D121654	2022-03-16 18:45:36 +00:00
Shengchen Kan	37b378386e	[NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments	2022-03-16 20:25:42 +08:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
Haocong.Lu	6a54776fe0	[RISCV] Select SRLI+SLLI for AND with leading ones mask Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask. It's useful in RV64 when the mask exceeds simm32 (cannot be generated by LUI). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121598	2022-03-16 02:10:57 +00:00
Craig Topper	06c5d74090	[RISCV] Remove lowerSPLAT_VECTOR This code handles fixed vector SPLAT_VECTOR, but is never called in any tests. We only form fixed vector splat vectors for vXi64 on RV32 as part of DAGCombine. This will be type legalized to SPLAT_VECTOR_PARTS. So the Custom handling for SPLAT_VECTOR is never needed. This patch makes SPLAT_VECTOR for vXi64 'Legal' on RV32 so that DAGCombine will create it, but there's no need for Custom handler. It will still be type legalized to SPLAT_VECTOR_PARTS. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D121673	2022-03-15 08:22:13 -07:00
Yeting Kuo	ae7c6647f3	[RISCV] Add basic code modeling for fixed length vector reduction. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121447	2022-03-14 11:04:31 +08:00
Craig Topper	eeb3bfd74a	[RISCV] Merge ReplaceNodeResults code for SHFL and GREV/GORC. NFC	2022-03-13 18:42:26 -07:00
Lehua Ding	1648852c98	[RISCV][RVV] Fix vslide1up/down intrinsics overflow bug for SEW=64 on RV32 Reviewed By: craig.topper, kito-cheng Differential Revision: https://reviews.llvm.org/D120899	2022-03-13 18:06:09 +08:00
Craig Topper	fd4d584d6b	[RISCV] Add DAGCombine to fold (bitreverse (bswap X)) to brev8 with Zbkb. If the type is less than XLenVT, type legalization will turn this into (srl (bitreverse (bswap (srl (bswap X), C))), C). We can't completely recover from these shifts. They introduce zeros into the upper bits of the result and we can't easily tell if they are needed. By doing a DAG combine early, we avoid introducing these shifts.	2022-03-12 16:39:39 -08:00
Craig Topper	43f668b98e	[RISCV] Move GORCIW/GREVIW formation to isel patterns. Type legalize narrow RISCVISD::GREV/GORC with constant to a larger type without switching to W. Detect sext_inreg+gorci/grevi with a uimm5 immediate during isel to emit GREVIW/GORCIW. This allows us to better propagate known bits information through extended bits after type legalization. It will also simplify a change I'm considering for BREV8 with Zbkb. A future patch will add computeKnownBits support for GORC. A further improvement here would be to use hasAllWUsers and doPeepholeSExtW like we do for SLLIW, but I don't think we have the test coverage for that yet.	2022-03-11 18:02:47 -08:00
Craig Topper	d0969e485c	[RISCV] Optimize vfmv.s.f intrinsic with scalar 0.0 to vmv.s.x with x0. We already do this for RISCVISD::VFMV_S_F_VL and the vfmv.v.f intrinsic. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D121429	2022-03-11 10:05:43 -08:00
Craig Topper	e9d4922543	[RISCV] Add tablegen helper classes to create PatFrag to check for one use. NFC Reduces code and the class can be instantiated in isel patterns to avoid creating more *_oneuse classes.	2022-03-10 23:14:21 -08:00
Craig Topper	337d49da84	[RISCV] Fix typo in comment. NFC	2022-03-10 22:00:18 -08:00
Eric Tang	336c92d5e8	[RISCV] Add alias for HFENCE.VVMA Signed-off-by: Eric Tang <eric.tang@starfivetech.com> Differential Revision: https://reviews.llvm.org/D120878	2022-03-11 13:32:52 +08:00
Craig Topper	1f3a8d58a6	[RISCV] Use ZERO_EXTEND instead of ANY_EXTEND when promoting i32 RISCVISD::SHFL. NFC We know the shift amount is a constant with bit 31 clear. anyext of constant will be either zext or sext which will produce the same result here. But we really shouldn't rely on that. It would be valid to put a random number in the upper bits. Our isel patterns expect the upper bits to be 0 so we should ask for it explicitly.	2022-03-10 20:57:04 -08:00
Craig Topper	9ce6b1ca86	[RISCV] Remove performANY_EXTENDCombine. This doesn't appear to be needed any more. I did some inspecting of the gcc torture suite and SPEC2006 with this removed and didn't find any meaningful changes. I think we're more aggressive about forming ADDIW now using sign_extend_inreg during type legalization and hasAllWUsers in isel. This probably helps catch the cases this helped with before.	2022-03-10 11:29:31 -08:00
Craig Topper	e0e8edf823	[RISCV] Add isel patterns for masked RISCVISD::FMA_VL with RISCVISD::FNEG_VL. This helps us form vfnmsub, vfnmadd, and vfmusb from masked VP intrinsics. I've used "srcvalue" for the mask parameter in the fneg nodes. We can't match "V0" because that doesn't ensure the mask the is the same. Instead it matches two different nodes and generates two copies to V0 of those separate values. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D120287	2022-03-10 10:05:42 -08:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit `7f230feeea`. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
Luke	0803dba7dd	[RISCV] Add fixed-length vector instrinsics for segment load Inspired by reviews.llvm.org/D107790. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119834	2022-03-10 16:23:40 +08:00
Craig Topper	d53707508a	[RISCV] Remove RISCVISD::VLE_VL/VSE_VL. Use intrinsics instead. Similar to what we do for other loads/stores, use the intrinsic version that we already have custom isel for. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D121166	2022-03-09 22:44:28 -08:00
Craig Topper	edd6632127	[RISCV] Support 'generic' as a valid CPU name. Most other targets support 'generic', but RISCV issues an error. This can require a special case in tools that use LLVM that aren't clang. This patch treats "generic" the same as an empty string and remaps it to generic-rv/rv64 based on the triple. Unfortunately, it has to be added to RISCV.td because MCSubtargetInfo is constructed and parses the CPU before RISCVSubtarget's constructor gets a chance to remap it. The CPU will then reparsed and the state in the MCSubtargetInfo subclass will be updated again. Fixes PR54146. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D121149	2022-03-09 16:43:22 -08:00
Shao-Ce SUN	365c858a5d	[RISCV] Share PatFprFpr classes for F, D, and Zfh Inspired by D115469 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121066	2022-03-08 13:02:04 +08:00
jacquesguan	e55b9b0d0a	[RISCV] Add patterns for vector widening floating-point reduction instructions. Add patterns for vector widening floating-point reduction instructions. Differential Revision: https://reviews.llvm.org/D120390	2022-03-08 10:53:49 +08:00
Craig Topper	845bfcede1	[RISCV] Rename 'SplatOperand' to 'ScalarOperand'. NFC vslide1up/down have this flag set, but the value isn't a splat. Rename for clarity. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D121037	2022-03-07 11:28:32 -08:00
Zakk Chen	3be907621f	[RISCV] Fix incorrect optimization for masked vmsgeu.vi with 0 immediate. vmsgeu.vi with 0 is always true, but in the masked with mask undisturbed policy, we still need to keep inactive elelemt which come from maskedoff. We could return mask directly if it's mask agnostic policy in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121080	2022-03-06 19:22:35 -08:00
Benjamin Kramer	fbce4a7803	Drop some more global std::maps. NFCI.	2022-03-06 13:28:29 +01:00
Craig Topper	bd5f124716	[RISCV] Add SimplifyDemandedBits support for FSR/FSL/FSRW/FSLW.	2022-03-05 21:26:51 -08:00
Zakk Chen	33b61c5678	[RISCV] Fix incorrect codegen introduced by D119688. We should not emit a tail agnostic vlse for a tail undisturbed vmv.s.x In D119688: - if (IsScalarMove && !Node->getOperand(0).isUndef()) + bool HasPassthruOperand = Node->getOpcode() != ISD::SPLAT_VECTOR; + if (HasPassthruOperand && !IsScalarMove && !Node->getOperand(0).isUndef()) break; The IsScalarMove check in the if statement had been changed. Differential Revision: https://reviews.llvm.org/D120963	2022-03-05 06:10:26 -08:00
Craig Topper	1e569e3b7b	[RISCV] Add CMOV isel pattern for (select (setgt X, -1), Y, Z) setgt X, -1 is the canonical form of setge X, 0. We can swap the select operands and use setlt X, X0 when selecting CMOV. This avoid materializing the -1 in a register.	2022-03-04 22:35:13 -08:00
Craig Topper	232f57319d	[RISCV] Move vslide1up/down intrinsics into lowerVectorIntrinsicSplats. NFC Rename to lowerVectorIntrinsicScalars. This allows us to share the code that checks if the scalar needs to be type legalized.	2022-03-04 18:21:53 -08:00
Craig Topper	3d4e83f17d	[RISCV] With Zbb, fold (sext_inreg (abs X)) -> (max X, (negw X)) With Zbb, abs is expanded to (max X, neg) by default. If X has 33 or more sign bits, we can expand it a little early using negw instead of neg to save a sext_inreg. If X started as a 32 bit value, type legalization would have inserted a sext before the abs so X having 33 sign bits should always be true. Note: I've used ISD::FREEZE here since we increase the number of uses. Our default expansion for ABS doesn't do that, but I think that's a bug. We can't do this with custom type legalization because ISD::FREEZE doesn't propagate sign bits so later DAG combine won't expand be able to see optmize it. Alives2 https://alive2.llvm.org/ce/z/Gx3RNe Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D120597	2022-03-03 15:42:29 -08:00
Alex Tsao	89f15fc687	[RISCV] Add cost modelling for masked memory op The patch adds very basic cost model for masked memory op on scalable vector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117884	2022-03-03 20:47:58 +08:00
jacquesguan	44a430354d	[RISCV] Fold store of vmv.f.s to a vse with VL=1. This patch support the FP part of D109482. Differential Revision: https://reviews.llvm.org/D120235	2022-03-03 16:35:19 +08:00
Craig Topper	6cb42cd666	[RISCV] More correctly ignore Zfinx register classes in getRegForInlineAsmConstraint. Until Zfinx is supported in CodeGen we need to convert all Zfinx register classes to GPR. Remove the zfinx-types.ll test which didn't test anything meaningful since -mattr=zfinx isn't implemented completely in llc. Follow up to D93298.	2022-03-02 11:22:46 -08:00
Craig Topper	a1f8349d77	[RISCV] Don't combine ROTR ((GREV x, 24), 16)->(GREV x, 8) on RV64. This miscompile was introduced in D119527. This was a special pattern for rotate+bswap on RV32. It doesn't work for RV64 since the rotate needs to be half the bitwidth. The equivalent pattern for RV64 is ROTR ((GREV x, 56), 32) so match that instead. This could be generalized further as noted in the new FIXME. Reviewed By: Chenbing.Zheng Differential Revision: https://reviews.llvm.org/D120686	2022-03-02 09:47:06 -08:00
Nikita Popov	98cfcae4e9	Revert "[RISCV] Add cost modelling for masked memory op" This reverts commit `76f243b53b`. The newly added test fails.	2022-03-02 17:32:10 +01:00
Alex Tsao	76f243b53b	[RISCV] Add cost modelling for masked memory op The patch adds very basic cost model for masked memory op on scalable vector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117884	2022-03-02 22:48:41 +08:00
Shao-Ce SUN	0e38b29543	[RISCV] add the MC layer support of Zfinx extension This patch added the MC layer support of Zfinx extension. Authored-by: StephenFan Co-Authored-by: Shao-Ce Sun Reviewed By: asb Differential Revision: https://reviews.llvm.org/D93298	2022-03-02 14:25:19 +08:00
Mircea Trofin	cb2160760e	[nfc][codegen] Move RegisterBank[Info].h under CodeGen This wraps up from D119053. The 2 headers are moved as described, fixed file headers and include guards, updated all files where the old paths were detected (simple grep through the repo), and `clang-format`-ed it all. Differential Revision: https://reviews.llvm.org/D119876	2022-03-01 21:53:25 -08:00
Craig Topper	b9d6e8c441	[RISCV] Lower VECTOR_SPLICE to RVV instructions. This lowers VECTOR_SPLICE of scalable vectors to a slidedown follow by a slideup. Fixed vectors are encouraged to use shufflevector instruction. The equivalent patch for fixed vectors is D119039. I've used a tail agnostic slidedown and limited the VL to only the elements that will not be overwritten by the slideup. The slideup uses VLMax for its VL. It unfortunately uses tail undisturbed policy but it isn't required as there is no tail. We just need the merge operand to carry the bits for the lower portion of the result. Care was taken to ensure that either the slideup or slidedown will be able to use a .vi instruction when the immediate is small. Which one uses the immediate depends on the sign of the immediate. Reviewed By: frasercrmck, ABataev Differential Revision: https://reviews.llvm.org/D119303	2022-03-01 10:10:13 -08:00
Lian Wang	db85cd729a	[RISCV] Add FMV_W_X and FMV_H_X instrutions to hasAllNBitUsers Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120699	2022-03-01 08:13:59 +00:00
lian wang	5d91a8a707	[RISCV] Add schedule class for Zbp extension and Zbr extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120012	2022-03-01 07:35:59 +00:00
Lian Wang	e2c150ab52	[RISCV][NFC] Move defined non_imm12 to proper place in RISCVInstrInfoZb.td Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120656	2022-03-01 01:45:30 +00:00
Craig Topper	e83db8c001	[RISCV] Only enable combineROTR_ROTL_RORW_ROLW with Zbp. I think the immediate values we check for on the GREV nodes already protect this, but better to be explicit.	2022-02-28 12:47:36 -08:00
Craig Topper	b083157b7b	[RISCV] Don't call combineROTR_ROTL_RORW_ROLW for SLLW/SRLW/SRAW nodes. NFC I think the function does the correct thing internally, but it's confusing to read.	2022-02-28 11:05:10 -08:00
Craig Topper	f46890711f	[RISCV] Custom type legalize i32 ISD::ABS on RV64 without Zbb. Default type legalization will create sext_inreg+abs, but we may not be able to remove the sext_inreg. Instead this patch expands abs during type legalization to Y = sraiw X, 31; subw(xor X, Y), Y) which doesn't require the input to be sign extended. This gives a big improvement for some neg-abs tests where the abs is used more than the the neg. Previously the abs was expanded a different way before and after type legalization. Now they are expanded in a similar way enabling more CSE. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D120636	2022-02-28 09:30:27 -08:00
eric.tang	b496a172e4	[RISCV] Support hypervisor extention instructions According to privileged spec version-20211203 Add the following hypervisor instructions: - HLV.B HLV.BU - HLV.H HLV.HU HLVX.HU - HLV.W HLV.WU HLVX.WU - HLV.D - HSV.B HSV.H HSV.W HSV.D Signed-off-by: eric.tang <eric.tang@starfivetech.com> Differential Revision: https://reviews.llvm.org/D117733	2022-02-28 14:02:43 +08:00
eric.tang	386c5be92a	[RISCV] Support Sinval extension and hypervisor memory management fence instructions According to Privileged spec version-20211203 Add Supervisor Memory-Management Instructions: - SINVAL.VMA, SFENCE.W.INVAL, SFENCE.INVAL.IR Add Hypervisor Memory-Management Instructions: - HFENCE.VVMA, HFENCE.GVMA, HINVAL.VVMA, HINVAL.GVMA Signed-off-by: eric.tang <eric.tang@starfivetech.com> Differential Revision: https://reviews.llvm.org/D117654	2022-02-28 14:02:43 +08:00
Eric Tang	cf80ef1393	[RISCV] Change GPRMemAtomic to GPRMemZeroOffset for general usage Not only some AMO instructions but also other instructions need to process (${gpr}) or 0(${gpr}), where the 0 is be silently ignored. This patch does some changes for general usage. Signed-off-by: Eric Tang <eric.tang@starfivetech.com> Differential Revision: https://reviews.llvm.org/D120017	2022-02-28 14:02:43 +08:00
Chenbing Zheng	7f811ce127	[RISCV] Optimize (sext.w, srli) to sraiw with Zba. In this patch, we add a more narrower exclusion for zeroext (srl x) -> srli (slli x), so that it provides an opportunity for the selection of sraiw. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120467	2022-02-28 10:34:35 +08:00
Jessica Clarke	6aa8521fdb	[RISCV] Fix parseBareSymbol to not double-parse top-level operators By failing to lex the token we end up both parsing it as a binary operator ourselves and parsing it as a unary operator when calling parseExpression on the RHS. For plus this is harmless but for minus this parses "foo - 4" as "foo - -4", effectively treating a top-level minus as a plus. Fixes https://github.com/llvm/llvm-project/issues/54105 Reviewed By: asb, MaskRay Differential Revision: https://reviews.llvm.org/D120635	2022-02-27 20:48:52 +00:00
Jameson Nash	c4b1a63a1b	mark getTargetTransformInfo and getTargetIRAnalysis as const Seems like this can be const, since Passes shouldn't modify it. Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D120518	2022-02-25 14:30:44 -05:00
Haocong.Lu	865fe131f8	[RISCV] Fix a mistake in PostprocessISelDAG With the condition N->use_empty(), the root node of DAG always misses peephole optimization. So a dummy node is needed. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119934	2022-02-25 12:38:31 +00:00
Chenbing Zheng	b20e80aa59	[RISCV] DAG Combine vcpop and vfirst with VL=0 to li imm vcpop and vfirst are still useful when VL=0. vcpop equivalents to li 0 and vfirst equivalents to li -1, since no mask elements are active. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120302	2022-02-25 14:44:25 +08:00
Zakk Chen	4e115b7d88	[RISCV] Update computeTargetABI from llc as well as clang Clang computes the default ABI if -mabi is empty and encode it in LLVM IR module flag since D105555. For correctness, llc need to give the same target-abi (Options.MCOptions.ABIName) with ABI encoded in IR. The getSubtargetImpl already has a check for them only if Options.MCOptions.ABIName is not empty. In order to get more robustness we could have a check for explicit ABI, but now we have two different logic to compute the default ABI. The front-end ABI is defautl to the ilp32/ilp32e/lp64, and ilp32d/lp64d when hardware support for extension D. The backend ABI is default to the ilp32/ilp32e/lp64. Reviewed by: asb, jrtc27 Differential Revision: https://reviews.llvm.org/D118333	2022-02-24 21:55:44 -08:00
lian wang	f37d21ed20	[RISCV] Add schedule class for Zbt extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119808	2022-02-25 01:57:20 +00:00
Qihan Cai	0d058ed3d6	[RISCV] Change rvv version to 1.0 and remove ratify notice This patch changes the version of V extension from 0.1 to 1.0 in RISCVInstrInfoVPseudos.td, RISCVInstrInfoVSDPatterns.td, RISCVInstrInfoVVLPatterns.td, RISCVInstrInfoV.td Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120525	2022-02-25 11:38:20 +11:00
Craig Topper	506ac29632	[RISCV] Add 'i64' to some isel so tablegen will remove them for RV32. NFC Saves a 100 bytes or so from the isel table.	2022-02-24 15:10:05 -08:00
Craig Topper	a975ca97c3	[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X). Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118974	2022-02-24 09:19:01 -08:00
Shao-Ce SUN	78b5f0fb05	[NFC][RISCV] Reuse ISD::NodeType in float extension Reviewed By: asb Differential Revision: https://reviews.llvm.org/D120412	2022-02-24 19:57:55 +08:00
Nikita Popov	c7fe6f9c92	Revert "[RISCV] add the MC layer support of Zfinx extension" This reverts commit `7798ecca9c`. As reported in https://reviews.llvm.org/D93298#3331641 and following, this causes assertion failures with inline assembly.	2022-02-24 12:14:31 +01:00
lian wang	e1d4d1c242	[RISCV] Add schedule class for Zbm and Zbe extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119805	2022-02-24 08:49:25 +00:00
Chenbing.Zheng	2ae92e19eb	[RISCV][NFC] Add helper function isVectorConfigInstr to reduce Repeated code. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119924	2022-02-24 05:59:12 +00:00
Craig Topper	5b7ac107b1	[RISCV] Use SelectionDAG::getFreeze to simplify some code. NFC	2022-02-23 21:13:01 -08:00
Craig Topper	c7d6448d03	[DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable. Internally to DAGCombiner the SDValues were passed by non-const reference despite not being modified. They were then passed by const reference to TLI. This patch passes them by value which is consistent with the vast majority of code. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120420	2022-02-23 12:40:45 -08:00
Alex Bradbury	c5bcfb983e	[RISCV] Avoid infinite loop between DAGCombiner::visitMUL and RISCVISelLowering::transformAddImmMulImm See https://github.com/llvm/llvm-project/issues/53831 for a full discussion. The basic issue is that DAGCombiner::visitMUL and RISCVISelLowering;:transformAddImmMullImm get stuck in a loop, as the current checks in transformAddImmMulImm aren't sufficient to avoid all cases where DAGCombiner::isMulAddWithConstProfitable might trigger a transformation. This patch makes transformAddImmMulImm bail out if C0 (the constant used for multiplication) has more than one use. Differential Revision: https://reviews.llvm.org/D120332	2022-02-23 11:05:46 +00:00
jacquesguan	5acd9c49a8	[RISCV] Add patterns for vector widening integer reduction instructions Add patterns for vector widening integer reduction instructions. Differential Revision: https://reviews.llvm.org/D117643	2022-02-22 14:14:05 +08:00
Zakk Chen	f7dfc5d1af	[RISCV] Optimize tail agnostic vmv.s.x which don't need to select tail value. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120250	2022-02-21 14:53:37 -08:00
Craig Topper	90d240553d	[RISCV] Teach shouldSinkOperands to sink splat operands of vp.fma intrinsics. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D120167	2022-02-21 11:52:59 -08:00
Lian Wang	4abe484525	[RISCV][NFC] Add sched for some instructions in Zb extension Add sched to brev8, zip and unzip instruction. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120009	2022-02-21 09:58:08 +08:00
Zakk Chen	17d5ba5bc7	[RISCV][NFC] Remove unused multiclass def.	2022-02-18 23:58:56 -08:00
Craig Topper	5489969550	[RISCV] Add IsRV32 to the isel pattern for ZIP_RV32/UNZIP_RV32. NFC I think the i32 in the pattern prevents this from matching on RV64, but using IsRV32 is safer. Add tests for RV64 to make sure we don't print zip or unzip because we incorrectly picked ZIP_RV32/UNZIP_RV32.	2022-02-18 22:38:14 -08:00
Zakk Chen	ca78312407	[RISCV] Add the policy operand for nomask vector Multiply-Add IR intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. The nomask vector Multiply-Add need a policy operand because merge value could not be undef. Reviewed By: monkchiang Differential Revision: https://reviews.llvm.org/D119727	2022-02-17 09:12:46 -08:00
Craig Topper	bbee9e77f3	[RISCV] Match shufflevector corresponding to slideup. This generalizes isElementRotate to work when there's only a single slide needed. I've removed matchShuffleAsSlideDown which is now redundant. Reviewed By: frasercrmck, khchen Differential Revision: https://reviews.llvm.org/D119759	2022-02-17 08:19:10 -08:00
Craig Topper	954fe404ab	[RISCV] Fix incorrect MemOperand copy converting splat+load to vlse. Due to an incorrect copy/paste from load intrinsic handling we checked if the splat node was a MemSDNode which of course it isn't. Instead get the MemOperand from the LoadSDNode for the source of the splat. This enables LICM to see the load is loop invariant and hoist it out of the loop. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D120014	2022-02-17 08:15:50 -08:00
Zakk Chen	eeb7754f68	[RISCV] Add the passthru operand for vmv.vv/vmv.vx/vfmv.vf IR intrinsics. Add the passthru operand for VMV_V_X_VL, VFMV_V_F_VL and SPLAT_VECTOR_SPLIT_I64_VL also. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D119688	2022-02-17 06:38:14 -08:00
Shao-Ce SUN	7798ecca9c	[RISCV] add the MC layer support of Zfinx extension This patch added the MC layer support of Zfinx extension. Authored-by: StephenFan Co-Authored-by: Shao-Ce Sun Reviewed By: asb Differential Revision: https://reviews.llvm.org/D93298	2022-02-17 21:54:13 +08:00
Zakk Chen	093ecccdab	[RISCV] Add the passthru operand for vadc/vsbc/vmerge/vfmerge IR intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D119686	2022-02-17 02:21:39 -08:00
Ben Shi	0b93e90971	Revert "[RISCV] LUI used for address computation should not isAsCheapAsAMove" This reverts commit `23a5073600`. Although this patch achieved better codegen in most cases, it is really important to accurately describe the cost of instructions. So I revert it.	2022-02-17 17:27:37 +08:00
Jessica Paquette	67ab4c010b	[MachineOutliner] NFC: Update LRU stuff for RISCV I missed it in my grep. Fixes broken buildbot.`	2022-02-16 12:01:59 -08:00
Jessica Paquette	6d58f4ab07	[MachineOutliner] NFC: Hide LRU-related stuff behind helper functions It's not particularly user-friendly to have to call `initLRU` everywhere. Also, it wasn't particularly great that the LRU for registers used in a sequence was also initialized by `initLRU`. This patch hides this stuff behind some helper functions: * `isAvailableAcrossAndOutOfSeq` * `isAnyUnavailableAcrossOrOutOfSeq` * `isAvailableInsideSeq` This allows the user to avoid calling `initLRU` explicitly. Also, it allows us to separate initializing the used-in-sequence LRU from the main LRU. Since both ARM and AArch64 check LR liveness in `insertOutlinedCall`, this refactor requires that we de-const the Candidate there. Some other quality-of-code improvements: * LRUs in outliner::Candidate now have more descriptive names * Use `Register` instead of `unsigned` in some places * Improve readability in some places by using ranges rather than `std::for_each` This is a preparatory commit for a larger compile time related change for the AArch64 outliner.	2022-02-16 11:39:07 -08:00
Craig Topper	cfbbcc544c	[RISCV] Improve lowering of SHL_PARTS/SRL_PARTS/SRA_PARTS. Part of the shift lowering creates a (sub XLEN-1, ShAmt). When this value is used we know that ShAmt is [0..XLEN-1]. Since XLEN is a power of 2 we can replace the sub with an xor. This allows us to use XORI instead of LI+SUB. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D119411	2022-02-16 09:22:11 -08:00
Zakk Chen	e8973dd389	[RISCV] Add the passthru operand for some RVV nomask unary and nullary intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. My plan is to handle more complex operations in follow-up patches. Reviewers: frasercrmck Differential Revision: https://reviews.llvm.org/D118253	2022-02-15 22:34:06 -08:00
Shao-Ce SUN	2aed07e96c	[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter` Reviewed By: skan Differential Revision: https://reviews.llvm.org/D119846	2022-02-16 13:10:09 +08:00
Shao-Ce SUN	9cc49c1951	Revert "[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`" This reverts commit `fe25c06cc5`.	2022-02-16 11:57:49 +08:00
Shao-Ce SUN	fe25c06cc5	[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter` For ten years, it seems that `MCRegisterInfo` is not used by any target. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D119846	2022-02-16 11:47:17 +08:00
Zakk Chen	b784719904	[RISCV] Add the passthru operand for RVV nomask binary intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Add passthru operand for VSLIDE1UP_VL and VSLIDE1DOWN_VL to support i64 scalar in rv32. The masked VSLIDE1 would only emit mask undisturbed policy regardless of giving mask agnostic policy until InsertVSETVLI supports mask agnostic. Reviewed by: craig.topper, rogfer01 Differential Revision: https://reviews.llvm.org/D117989	2022-02-15 18:36:18 -08:00
Craig Topper	ab6e02dded	[RISCV] Match vwmulsu_vx with scalar splat input. This is a more generic version of D119110 that uses MaskedValueIsZero to do the matching and SimplifyDemandedBits to remove any unneeded AND instructions. Tests were taken from D119110. Reviewed By: Chenbing.Zheng Differential Revision: https://reviews.llvm.org/D119622	2022-02-15 08:45:21 -08:00
Craig Topper	d132b47bb9	[RISCV] Replace llvm_unreachable with report_fatal_error. Parsing errors aren't handled earlier in all cases. A simple example is llc -mtriple=riscv64 -mattr=+zve32f. If F or Finx is not also specified, this will hit a parse error. Use a fatal_error so that the error is conveyed to the user.	2022-02-15 08:40:37 -08:00
jacquesguan	bfb4c0c370	[RISCV] Recover the implication between Zve* extensions and the V extension. This revision recover the implication between Zve* extensions and the V extension. Differential Revision: https://reviews.llvm.org/D119210	2022-02-14 15:52:07 +08:00
Craig Topper	478c237e21	[RISCV] Fix incorrect extend type in vwmulsu combine. While matching widening multiply, if we matched an extend from i8->i32, i16->i64 or i8->i64, we need to reintroduce a narrower extend. If we're matching a vwmulsu we need to use a sext for op0 and a zext for op1. This bug exists in LLVM 14 and will need to be backported. Differential Revision: https://reviews.llvm.org/D119618	2022-02-12 12:47:20 -08:00
Dimitry Andric	7af3d4ab3d	Revert "[RISCV] Enable shrink wrap by default" This reverts commit `5ebdb07e7e`. Enabling shrink wrap by default can cause assertions or crashes, and these should first be investigated and fixed. For now, reverting the change so it can be cherry-picked into 14.0.0 is the safest choice.	2022-02-12 19:04:12 +01:00
Haocong.Lu	23a5073600	[RISCV] LUI used for address computation should not isAsCheapAsAMove A LUI instruction with flag RISCVII::MO_HI is usually used in conjunction with ADDI, and jointly complete address computation. To bind the cost evaluation of address computation, the LUI should not be regarded as a cheap move separately, which is consistent with ADDI. In this test case, it improves the unroll-loop code that the rematerialization of array's base address miss MachineCSE with Heuristics #1 at isProfitableToCSE. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D118216	2022-02-12 07:14:38 +00:00
Chenbing.Zheng	9e975e558b	[RISCV][NFC] Move some combine patterns to DAG combine. Move some combine patterns to DAG combine，and it dealt with fixme left in RISCVInstrInfoZb.td. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119527	2022-02-12 02:52:21 +00:00
Craig Topper	541c9ba842	[RISCV] Insert VSETVLI at the end of a basic block if we didn't produce BlockInfo.Exit. This is an alternative to D118667 that instead of fixing the store to match phase 1, it tries to detect the mismatch with the expected value at the end of the block. This inserts a vsetvli after the vse to satisfy the requirement of the other basic block. We still have serious design issues in the pass, that is going to require some rethinking. Differential Revision: https://reviews.llvm.org/D119518	2022-02-11 09:34:16 -08:00
Craig Topper	f35ac872b8	Revert "[RISCV] Fix a vsetvli insertion bug involving loads/stores." and "[RISCC] Add missing words to comment. NFC" This reverts commit `f943c58cae`. and commit `7eb7810727`. This introduced a new bug that appears to be easier to hit. Differential Revision: https://reviews.llvm.org/D119517	2022-02-11 09:34:16 -08:00
Zakk Chen	d224be3b99	[RISCV] Add the policy operand for some masked RVV ternary IR intrinsics. Masked reduction intrinsics are specical cases which don't need to have policy operand. The mask only affects which elements are read. It doesn't effect the destination register. The reduction intrinsics have a dedicated destination operand. If it is undef, we use tail agnostic. If it not undef we use tail undisturbed. Co-Authored-by: Craig Topper <craig.topper@sifive.com> Differential Revision: https://reviews.llvm.org/D117681	2022-02-11 05:02:03 -08:00
serge-sans-paille	06943537d9	Cleanup MCParser headers As usual with that header cleanup series, some implicit dependencies now need to be explicit: llvm/MC/MCParser/MCAsmParser.h no longer includes llvm/MC/MCParser/MCAsmLexer.h Preprocessed lines to build llvm on my setup: after: 1068185081 before: 1068324320 So no compile time benefit to expect, but we still get the looser coupling between files which is great. Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D119359	2022-02-11 10:39:29 +01:00
Fangrui Song	8eb750189c	[RISCV] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds	2022-02-10 20:10:12 -08:00
Craig Topper	b0e77d5e48	[RISCV] Lower the shufflevector equivalent of vector.splice We can lower a vector splice to a vslidedown and a vslideup. The majority of the matching code here came from X86's code for matching PALIGNR and VPALIGND/Q. The slidedown and slideup lowering don't really require it to be concatenation, but it happened to be an interesting pattern with existing analysis code I could use. This helps with cases where the scalar loop optimizer forwarded a load result from a previous loop iteration. For example, this happens if the loop uses x[i] and x[i+1] on the same iteration. The scalar optimizer will forward x[i+1] load from the previous loop to satisfy x[i] on this loop. When this get vectorized it results in one element of a vector being forwarded from the previous loop to be concatenated with elements loaded on this iteration. Whether that's more efficient than doing a shifted loaded or reloading the single scalar and using vslide1up is an interesting question. But that's not something the backend can help with. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D119039	2022-02-10 09:39:35 -08:00
Craig Topper	b861ddf365	[RISCV] Move the creation of VLMaxSentinel to isel. Use X0 during lowering. The VLMaxSentinel is represented as TargetConstant, but that's included in isa<ConstantSDNode>. To keep constant VLs and VLMax separate as long as possible, use the X0 register during lowering and only convert to VLMaxSentinel during isel. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118845	2022-02-10 09:28:44 -08:00
Craig Topper	727cd5205f	[RISCV] Remove stale comment. NFC Now that we pre-process SPLAT_VECTOR to VFMV_V_F_VL, these patterns handled scalable vectors and vectors converted from fixed. These are also used by vp.fma lowering.	2022-02-10 09:04:32 -08:00
Fraser Cormack	fd43d99c93	[RISCV] Pre-process FP SPLAT_VECTOR to RISCVISD::VFMV_V_F_VL This patch builds on top of D119197 to canonicalize floating-point SPLAT_VECTOR as RISCVISD::VFMV_V_F_VL as a pre-process ISel step. This primarily benefits scalable-vector VP code, where our VP patterns only match VFMV_V_F_VL to reduce the burden on our ISel patterns, but where at the same time, scalable-vector code doesn't custom-legalize SPLAT_VECTOR. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117670	2022-02-10 09:56:00 +00:00
Chenbing.Zheng	c5d3b231e0	[RISCV] Add support for matching vwmaccsu/vwmaccus from fixed vectors Add pattern to match add and widening mul to vwmacc, and two multipliers are sext and zext. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119314	2022-02-10 01:59:31 +00:00
Craig Topper	c45c1b130b	[RISCV] Teach RISCVDAGToDAGISel::selectShiftMask to replace sub from constant with neg. If the shift amount is (sub C, X) where C is 0 modulo the size of the shift, we can replace it with neg or negw. Similar is is done for AArch64 and X86. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D119089	2022-02-09 12:33:01 -08:00
Craig Topper	09629215c2	[RISCV] Add a really basic cost model for SK_Splice. While testing scalable vectors I found that if we generate a vector splice intrinsic and run the code through the loop unroller, we'll crash due to an invalid cost. This adds a basic cost based on the 2 slide instructions used by the lowering in D119303. We probably need to factor LMUL into this, but that's true for arithmetic instructions too. So I've ignored for the moment. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D119316	2022-02-09 11:43:31 -08:00
Craig Topper	279b3b8179	[RISCV][VP] Lower VP_FMA to RVV instructions. We already had FMA_VL node, but we didn't have masked patterns. I have not added the fneg variations. I'll do those after I add llvm.vp.fneg. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119196	2022-02-09 11:33:12 -08:00
Craig Topper	63e711549c	[RISCV] Lower VP_FNEG to RVV instructions Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119269	2022-02-09 10:56:39 -08:00
Craig Topper	e305b1de7e	[RISCV] Pre-process integer ISD::SPLAT_VECTOR to RISCISD::VMV_V_X_VL before isel. This allows us to remove some isel patterns that exist for both operations. Saving nearly 3000 bytes from the isel table. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119197	2022-02-09 08:10:21 -08:00
Lian Wang	af2cd94555	[RISCV][NFC] Remove useless code Reviewed By: craig.topper, asb Differential Revision: https://reviews.llvm.org/D119317	2022-02-09 19:17:25 +08:00
serge-sans-paille	ef736a1c39	Cleanup LLVMMC headers There's a few relevant forward declarations in there that may require downstream adding explicit includes: llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h Counting preprocessed lines required to rebuild llvm-project on my setup: before: 1052436830 after: 1049293745 Which is significant and backs up the change in addition to the usual benefits of decreasing coupling between headers and compilation units. Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D119244	2022-02-09 11:09:17 +01:00
Fraser Cormack	6449bea508	[RISCV] Select unmasked RVV pseudos in a DAG post-process This patch drops TableGen patterns matching all-ones masked RVV pseudos in the case where there are fallback patterns matching the generic masked forms to "_MASK" pseudos. This optimization is now performed with a SelectionDAG post-processing step which peephole-optimizes these same pseudos with all-ones masks and swaps them out to their unmasked pseudos. This cuts our generated ISel table down by around ~5% (~110kB) in lieu of a far smaller auto-generated table to help with the peephole. This only targets our custom RISCVISD::*_VL binary operator nodes, which use the one form for both masked and unmasked variants. A similar approach could be used for our intrinsics but we'd need to do some work, e.g., to represent unmasked intrinsics as true-masked intrinsics at the IR or ISel level. At a rough estimate, this could save us a further 9% on the size of our ISel table for the binary intrinsic patterns alone. There is no observable impact on our tests. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118810	2022-02-09 07:50:15 +00:00
Zakk Chen	cfe7f69036	[RISCV][NFC] Refactor RISCVISAInfo. 1. Remove computeDefaultABIFromArch and add computeDefaultABI in RISCVISAInfo. 2. Add parseFeatureBits which may used in D118333. Differential Revision: https://reviews.llvm.org/D119250	2022-02-08 18:37:43 -08:00
jacquesguan	5e71bbfb6c	[RISCV] Add patterns for vector widening floating-point fused multiply-add instructions Add patterns for vector widening floating-point fused multiply-add instructions. Differential Revision: https://reviews.llvm.org/D117546	2022-02-09 10:34:39 +08:00
Fraser Cormack	62c4ac764b	[RISCV] Optimize splats of extracted vector elements This patch adds an optimization to splat-like operations where the splatted value is extracted from a identically-sized vector. On RVV we can splat that via vrgather.vx/vrgather.vi without dropping to scalar beforehand. We do have a similar VECTOR_SHUFFLE-specific optimization but that only works on fixed-length vector types and for those with a constant splat lane. This patch extends this optimization to make it work on scalable-vector types and on unknown extract indices. It is performed during fixed-vector BUILD_VECTOR lowering and during a new DAGCombine on SPLAT_VECTOR for scalable vectors. Reviewed By: craig.topper, khchen Differential Revision: https://reviews.llvm.org/D118456	2022-02-08 10:35:25 +00:00
wangpc	c53d99c37d	[RISCV] Split f64 undef into two i32 undefs So that no store instruction will be generated. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118222	2022-02-08 13:42:15 +08:00
Craig Topper	2c26cfdef7	[RISCV] Use splat_vector instead of SplatPat in widening FP instruction patterns. NFCI We use splat_vector for FP nodes without VL, not SplatPat which handles splat_vector and integer VMV_V_X_VL. Reduces isel table size by a few hundred bytes.	2022-02-07 15:53:27 -08:00
Kazu Hirata	3a3cb929ab	[llvm] Use = default (NFC)	2022-02-06 22:18:35 -08:00
Craig Topper	c1cef111a3	Revert "[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X)." This reverts commit `673d68cd92`. This hadn't been reviewed yet.	2022-02-05 12:51:01 -08:00
Craig Topper	673d68cd92	[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X). Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Differential Revision: https://reviews.llvm.org/D118974	2022-02-05 12:42:12 -08:00
Craig Topper	5f35009996	[RISCV] Remove a ComputeNumSignBits call from an isel special case. Only isel (and (srl (sexti32 Y), c2), c1) -> (srliw (sraiw Y, 31), c3 - 32) when there is a sext_inreg present. Don't both checking for Y having 32 sign bits.	2022-02-04 23:26:53 -08:00
Craig Topper	d752ea9a72	[RISCV] Remove exclusions for zext.h/zext.w from our (and (srl X, C1), C2) selection code. This code tries to replace the pattern with a pair of shifts, but we were excluding if the And could be a zext.h or zext.w. The SLLI/SRL pair is more compressible and doesn't come with much down side. We do regress one test case in rv64i-exhaustive-w-insts.ll but we can probably add a narrower exclusion for that case.	2022-02-04 17:10:48 -08:00
Craig Topper	1d8bbe3d25	[RISCV] Implement a basic version of AArch64RedundantCopyElimination pass. Using AArch64's original implementation for reference, this patch implements a pass to remove unneeded copies of X0. This pass runs after register allocation and looks to see if a register is implied to be 0 by a branch in the predecessor basic block. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118160	2022-02-04 10:43:46 -08:00
Craig Topper	234e54bdd8	[RISCV] Add more types of shuffles isShuffleMaskLegal. Add the vslidedown and interleave patterns that I recently implemented. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118952	2022-02-04 09:13:13 -08:00
Craig Topper	c83905a308	[RISCV] Add inline expansion for vector fround. This avoids a crash for scalable vectors and or scalarization for fixed vectors. The algorithm is different enough that I don't think it makes sense to merge with ceil/floor/trunc. Algorithm is adapted from gcc's X86 SSE2 output. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117247	2022-02-04 09:12:09 -08:00
serge-sans-paille	ffe8720aa0	Reduce dependencies on llvm/BinaryFormat/Dwarf.h This header is very large (3M Lines once expended) and was included in location where dwarf-specific information were not needed. More specifically, this commit suppresses the dependencies on llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used, this has a decent impact on number of preprocessed lines generated during compilation of LLVM, as showcased below. This is achieved by moving some definitions back to the .cpp file, no performance impact implied[0]. As a consequence of that patch, downstream user may need to manually some extra files: llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h In some situations, codes maybe relying on the fact that llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden dependency now needs to be explicit. $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions \| wc -l after: 10978519 before: 11245451 Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup [0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions Differential Revision: https://reviews.llvm.org/D118781	2022-02-04 11:44:03 +01:00
Craig Topper	237eb37260	[RISCV] Add FMV_X_W and FMV_X_H to RISCVSExtWRemoval. Add -target-abi to sextw-removal.ll RUN lines to show benefit on new test case.	2022-02-03 09:40:47 -08:00
Craig Topper	997a86b99c	[RISCV] Remove createVirtualRegister from RISCVInstrInfo::movImm. Based on the discussion in D61884, this was done to enable compressed instructions by giving freedom to pick a compressible register. Integer materializing can generate LUI, ADDI, ADDIW, SLLI and some Zb* instructions. C.LI, C.LUI, C.ADDI, C.ADDIW, and C.SLLI all have a 5-bit register encoding. The Zb* instructions aren't compressible. Based on that I don't think compressibility of the register is a concern. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118741	2022-02-03 08:34:26 -08:00
Craig Topper	2349fb0312	[RISCV] Remove RISCVISD::SPLAT_VECTOR_I64 in favor of RISCVISD::VMV_V_X_VL. SPLAT_VECTOR_I64 has the same semantics as RISCVISD::VMV_V_X_VL, it just assumed VLMax instead of carrying a VL operand. Include order of RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td has been swapped to avoid moving riscv_vmv_v_x_vl into RISCVInstrInfoVSDPatterns.td and to allow moving other "_vl" SDNodes back to RISCVInstrInfoVVLPatterns.td Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118841	2022-02-03 08:30:25 -08:00
Shao-Ce SUN	005fd8aa70	[RISCV] Add support for Zihintpause extention Add support for the 'pause' hint instruction as an alias for 'fence w, 0'. To do this allow the 'fence' operands pred and succ to be set to 0 (the empty set). This will also allow future hints to be encoded as 'fence 0, <x>' and 'fence <x>, 0'. This patch revised from @mundaym's D93019. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D117789	2022-02-03 20:55:47 +08:00
Craig Topper	abc6716038	[RISCV] Remove unused variables. NFC	2022-02-02 19:23:16 -08:00
Craig Topper	f1720abb54	[RISCV] Cleanup some places that assumed VLMaxSentinel and -1 constant mean the same thing. NFCI VLMaxSentintel happens to be represented as -1 TargetConstant. A user provided -1 would be an ISD::Constant. We shouldn't assume that they are the same thing. I'm still not entirely convinced that we should be treating -1 from the user as VLMAX. Also fix one place that failed to use XLenVT for the VLMaxSentinel, using MVT::i64 in code that only executes on RV32.	2022-02-02 12:23:12 -08:00
Craig Topper	b73d151a11	[RISCV] Add DAG combines to transform ADD_VL/SUB_VL into widening add/sub. This adds or reuses ISD opcodes for vadd.wv, vaddu.wv, vadd.vv, vaddu.vv and a similar set for sub. I've included support for narrowing scalar splats that have known sign/zero bits similar to what was done for MUL_VL. The conversion to vwadd.vv proceeds in two phases. First we'll form a vwadd.wv by narrowing one of the operands. Then we'll visit the vwadd.wv to try to narrow the other operand. This turned out to be simpler than catching all the cases in one step. The forming of of vwadd.wv can happen for either operand for add, but only the right hand side for sub since sub isn't commutable. An interesting quirk is that ADD_VL and VZEXT_VL/VSEXT_VL are formed during vector op legalization, but VMV_V_X_VL isn't usually formed until op legalization when BUILD_VECTORS are handled. This leads to VWADD_W_VL forming in one DAG combine round, and then a later DAG combine round sees the VMV_V_X_VL and needs to commute the operands to get the splat in position. This alone necessitated a VWADD_W_VL combine function which made forming vwadd.vv in two stages an easy choice. I've left out trying hard to form vwadd.wx instructions for now. It would only save an extend in the scalar domain which isn't as interesting. Might need to review the test coverage a bit. Most of the vwadd.wv instructions are coming from vXi64 tests on rv64. The tests were copy pasted from the existing multiply tests. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D117954	2022-02-02 10:03:08 -08:00
Craig Topper	5a5037c602	[RISCV] Fix some 80 column violations in ComputeNumSignBitsForTargetNode. NFC	2022-02-01 21:43:11 -08:00
Craig Topper	f943c58cae	[RISCC] Add missing words to comment. NFC	2022-02-01 07:39:51 -08:00
Craig Topper	7eb7810727	[RISCV] Fix a vsetvli insertion bug involving loads/stores. The first phase of the analysis can avoid a vsetvli if an earlier instruction in the block used an SEW and LMUL that when combined with the EEW of the load/store would produce the desired EMUL. If we avoided a vsetvli this will affect the global analysis we do in the second phase. The third phase where we really insert the vsetvlis needs to agree with the first phase. If it doesn't we can insert vsetvlis that invalidate the global analysis. In the test case there is a VSETVLI in the preheader that sets SEW=64 and LMUL=1. Inside the loop there is a VADD with SEW=64 and LMUL=1. This VADD is followed by a store that wants wants SEW=32 LMUL=1/2. Because it has EEW=32 as part of the opcode the SEW=64 LMUL=1 from the VADD can be become EMUL=1 for the store. So the first phase determines no vsetvli is needed. The third phase manages CurInfo differently than BBInfo.Change from the first phase. CurInfo is only updated when we see a vsetvli or insert a vsetvli. This was done to allow predecessor block information from the global analysis to be applied to multiple instructions. Since the loop body has no vsetvli we won't update CurInfo for either the VADD or the VSE. This prevented us from checking the store vsetvli elision for the VSE resulting in a vsetvli SEW=32 LMUL=1/2 being emitted which invalidated the global analysis. To mitigate this, I've added a BBLocalInfo variable that more closely matches the first phase propagation. This gets updated based on the VADD and prevents emitting a vsetvli for the store like we did in the first phase. I wonder if we should do an earlier phase to handle the load/store case by adding more pseudo opcodes and changing the SEW/LMUL for those instructions before the insertion analysis. That might be more robust than trying to guarantee two phases make the same decision. Fixes the test from D118629. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118667	2022-02-01 07:29:01 -08:00
Shao-Ce SUN	a2a7fc7ea5	[RISCV] Adjust some comments.	2022-02-01 22:53:54 +08:00
Craig Topper	2e45e8abb1	[RISCV] Add a fatal error if ISD::VSCALE is used with Zvl32b. We convert VLEN to vscale by dividing by RVVBitsPerBlock which is currently 64. This is only correct if VLEN is evenly divisible by 64. With only Zvl32b we can't assume that. This patch adds a fatal_error to prevent generating code that may be broken. We probably need to look at how we size stack frame objects too. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118583	2022-01-31 09:13:14 -08:00
Craig Topper	09606d6a63	[RISCV] Update the computeKnownBitsForTargetNode for RISCVISD::READ_VLENB to consider Zve/Zvl. We had previously hardcoded this to assume that vector registers are 128 bits. This was true when only V existed, but after Zve extensions were added this became incorrect. This patch adjusts it to support 128, 64, or 32 bit vectors depending on Zvl. The 128-bit limit is artificial, but we don't have any test coverage showing that we larger values so I was being conservative. None of our lit tests depend on this code today due to the custom lowering of ISD::VSCALE that inserts the appropriate left or right shift to convert from VLENB to VSCALE. That code was added after this code in computeKnownBitsForTargetNode. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118582	2022-01-31 09:13:14 -08:00
Craig Topper	aae947e860	[RISCV] Separate the Zfhmin and Zfh extensions. The spec doesn't seem to be written as if Zfh implies Zfhmin. They seem to be separate extensions. This patch moves the instructions from Zfhmin to be enabled with either the Zfh or Zfhmin extensions. Reviewed By: achieveartificialintelligence Differential Revision: https://reviews.llvm.org/D118581	2022-01-31 09:06:43 -08:00
Nikita Popov	0801940c17	[RISCV] Avoid pointer element type access for masked atomicrmw intrinsics masked.atomicrmw.*.i32 intrinsics access an i32 (and then possibly mask it), so hardcode MVT::i32 as the access type here, rather than determining it from the pointer element type. Differential Revision: https://reviews.llvm.org/D118336	2022-01-31 09:28:39 +01:00
Craig Topper	5fbc3cda9e	[RISCV] Use existing variable intead of calling getOperand again. NFCI This is a slight change because I'm using the ANY_EXTEND result instead of the original operand, but getNode should constant fold. While there, add a comment about why the code specifically checks for a ConstantSDNode.	2022-01-30 18:42:19 -08:00
Craig Topper	744be8c502	[RISCV] Lower riscv_zip/unzip intrinsic to RISCVISD::SHFL/UNSHFL. These are special versions of the more general shfli/unshfli instructions. We can use the general ISD opcodes with the correct immediates.	2022-01-30 13:27:41 -08:00
Craig Topper	e1075186a6	[RISCV] Custom lower brev8 intrinsic to RISCVISD::GREV. We can use the RISCVISD::GREV encoding that swaps the bits in each byte. This allows it to use the existing computeKnownBits support for RISCVISD::GREV.	2022-01-30 12:41:09 -08:00
Craig Topper	524545317c	[RISCV] Remove RISCVISD::BREV8 and use RISCVISD::GREV instead. We already have an ISD opcode for the more general GREV/GREVI instructon. We can just use it with the encoding that corresponds to the behavior of brev8. This is similar to what we do for orc.b where we use the GORC ISD opcode.	2022-01-29 22:45:43 -08:00
Craig Topper	0405ac0150	[RISCV] Rerrange RISCVInstrInfoZB.td to better group related wthings. NFC Especially placing W instructions/patterns near their non-W versions.	2022-01-29 21:16:15 -08:00
Craig Topper	815786eb67	[RISCV] Use RVBUnary to simplify ZEXT_H_RV32/ZEXT_H_RV64 definitions. NFC	2022-01-29 18:28:14 -08:00
Craig Topper	8faf2a0638	[RISCV] Correct predicate orc.b pattern to not include Zbkb. This was incorrectly lumped in when the predicate was changed for the rotate instructions.	2022-01-29 00:10:54 -08:00
Craig Topper	d8f929a567	[RISCV] Custom legalize BITREVERSE with Zbkb. With Zbkb, a bitreverse can be split into a rev8 and a brev8. Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D118430	2022-01-28 23:11:12 -08:00
jacquesguan	1276678982	[RISCV] Improve extract_vector_elt for fixed mask registers. Now the backend promotes mask vector to an i8 vector and extract element from that. We could bitcast to a widen element vector, and extract from it to GPR, then use I instruction to extract the certain bit. Differential Revision: https://reviews.llvm.org/D117389	2022-01-29 11:07:53 +08:00
Craig Topper	06bd56d47d	[RISCV] Update comments about getInstSizeInBytes hard-coding the number of bytes. After D118175, we get the information from the tablegen definition. Differential Revision: https://reviews.llvm.org/D118488	2022-01-28 09:51:49 -08:00
Craig Topper	ea05ee9059	[RISCV] Preserve VL when truncating i64 gather/scatter indices on RV32. We were creating a truncate with the default for the type, but for VP intrinsics we have a VL that we should use. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118406	2022-01-28 09:25:30 -08:00
Craig Topper	de0c2d75bf	[RISCV] Use tablegen size for getInstSizeInBytes. Fix the pseudos to have the correct size in the MCInstrDesc description. Inspired by D118009 and D117970. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118175	2022-01-28 09:21:28 -08:00
Kito Cheng	a9d5bb926d	[RISCV] Use __extendhfsf2/__truncsfhf2 for fp16 <-> fp32 `__gnu_h2f_ieee` and `__gnu_f2h_ieee` are introduce by ARM and set that as default name for fp16 and fp32 conversion in LLVM. However RISC-V GCC using default naming scheme for that, which is `__extendhfsf2` and `__truncsfhf2` for that, that cause runtime ABI incompatible issue. Although we didn't have formal runtime ABI spec to specify those naming convention yet, but I think it would be great to fix the incompatible issue first. And I've plan to create a runtime ABI spec undere psABI spec this year. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118207	2022-01-29 00:01:00 +08:00
Alex Bradbury	588f121ada	[RISCV][NFC] Make Zb* instruction naming match the convention used elsewhere in the RISC-V backend Where the instruction mnemonic contains a dot, we name the corresponding instruction in the .td file using a _ in the place of the dot. e.g. LR_W rather than LRW. This commit updates RISCVInstrInfoZb.td to follow that convention.	2022-01-28 15:20:37 +00:00
Chenbing.Zheng	6d6c44a3f3	[RISCV] Add support for matching vwmulsu from fixed vectors According to riscv-v-spec-1.0, widening signed(vs2)-unsigned integer multiply vwmulsu.vv vd, vs2, vs1, vm # vector-vector vwmulsu.vx vd, vs2, rs1, vm # vector-scalar It is worth noting that signed op is only for vs2. For vwmulsu.vv, we can swap two ops, and don't care which is sign extension, but for vwmulsu.vx signExt can not be a vector extended from scalar (rs1). I specifically added two functions ending with _swap in the test case. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118215	2022-01-28 02:33:30 +00:00
Craig Topper	70e1cc6792	[RISCV] Prefer vmslt.vx v0, v8, zero over vmsle.vi v0, v8, -1. At least when starting from a vmslt.vx intrinsic or ISD::SETLT. We don't handle the case where the user used vmsle.vx intrinsic with -1.	2022-01-27 11:48:27 -08:00
Fraser Cormack	84e85e025e	[SelectionDAG][VP] Provide expansion for VP_MERGE This patch adds support for expanding VP_MERGE through a sequence of vector operations producing a full-length mask setting up the elements past EVL/pivot to be false, combining this with the original mask, and culminating in a full-length vector select. This expansion should work for any data type, though the only use for RVV is for boolean vectors, which themselves rely on an expansion for the VSELECT. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118058	2022-01-27 09:00:41 +00:00
Wu Xinlong	6a4d3f37b5	[RISCV] fix dead code fix dead code mentioned on https://reviews.llvm.org/D98136 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118323	2022-01-27 16:00:01 +08:00
Wu Xinlong	615d71d9a3	[RISCV][CodeGen] Implement IR Intrinsic support for K extension This revision implements IR Intrinsic support for RISCV Scalar Crypto extension according to the specification of version [[ https://github.com/riscv/riscv-crypto/releases/tag/v1.0.0-scalar \| 1.0]] Co-author：@ksyx & @VincentWu & @lihongliang & @achieveartificialintelligence Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102310	2022-01-27 15:53:35 +08:00
Craig Topper	b3bec6e453	[RISCV] Use vnsrl.wx with x0 instead of vnsrl.vi for truncate. This matches what the spec uses for the vncvt.x.x.w assembly pseudoinstruction. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D118295	2022-01-26 18:38:13 -08:00
Craig Topper	f487a76430	[RISCV] Add hasStdExtZbp() to hasAndNotCompare.	2022-01-26 13:54:05 -08:00
Craig Topper	b3d94b199c	[RISCV] Remove references to 'B' extension from AssemblerPredicate and SubtargetFeature strings. For Zba/Zbb/Zbc/Zbs I've removed the 'B' completely and used the extension names as presented at the start of Chapter 1 of the 1.0.0 Bitmanipulation spec. For the unratified extensions, I've replaced 'B' with 'Zb' and otherwise left them unchanged. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D117822	2022-01-26 11:08:29 -08:00
Benjamin Kramer	f15014ff54	Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17" This reverts commit `ef82063207`. - It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat	2022-01-26 16:55:53 +01:00
serge-sans-paille	ef82063207	Rename llvm::array_lengthof into llvm::size to match std::size from C++17 As a conquence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).	2022-01-26 16:17:45 +01:00
jacquesguan	267711e38b	[RISCV] Fix support of vlen = 64. In the Zve* extensions, the vlen could be 64. This patch change the vlen constraint of low bound to 64. Differential Revision: https://reviews.llvm.org/D118217	2022-01-26 16:31:21 +08:00
Zakk Chen	510710d037	[RISCV][NFC] Add getVLOperand for RVV intrinsics. Use the VLOperand information to get the VL. Differential Revision: https://reviews.llvm.org/D118156	2022-01-25 17:37:58 -08:00
Zakk Chen	9273378b85	[RISCV] Add the passthru operand for RVV nomask load intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Co-Authored-by: Hsiangkai Wang <Hsiangkai@gmail.com> Reviewers: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D117647	2022-01-25 17:31:36 -08:00
eopXD	b089e4072a	[RISCV] Don't allow i64 vector div by constant to use mulh with Zve64x EEW=64 of mulh and its vairants requires V extension. Authored by: Craig Topper <craig.topper@sifive.com> @craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117947	2022-01-25 09:55:05 -08:00
Nikita Popov	aa97bc116d	[NFC] Remove uses of PointerType::getElementType() Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType(). This is part of D117885, in preparation for deprecating the API.	2022-01-25 09:44:52 +01:00
Craig Topper	fd0a4bc76b	[RISCV] Add missing space to 'clang-format on' directive. NFC Without a space after the comment characters it seems to be ignored.	2022-01-24 17:00:37 -08:00
Craig Topper	cd2a9ff397	[RISCV] Select int_riscv_vsll with shift of 1 to vadd.vv. Add might be faster than shift. We can't do this earlier without using a Freeze instruction. This is the intrinsic version of D106689. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118013	2022-01-24 08:04:53 -08:00
Fraser Cormack	d42678b453	[RISCV] Add side-effect-free vsetvli intrinsics This patch introduces new intrinsics that enable the use of vsetvli in contexts where only the returned vector length is of interest. The pre-existing intrinsics are marked with side-effects, which prevents even trivial optimizations on/across them. These intrinsics are intended to be used in situations where the vector length is fed in turn to RVV intrinsics or to vector-predication intrinsics during loop vectorization, for example. Those codegen paths ensure that instructions are generated with their own implicit vsetvli, so the vector length and vtype can be relied upon to be correct. No corresponding C builtins are planned at this stage, though that is a possibility for the future if the need arises. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117910	2022-01-24 13:52:08 +00:00
SForeKeeper	70f83f3084	[RISCV] add support for zbkx subextension in MC layer. This patch adds support for zbkx extension from K extension(v1.0.0) in MC layer. Instructions with same functionality and same encoding is defined in the bitmanip extension. It defines {Xperm8, Xperm4} as instruction aliases for xperm.* in Zbp extension. When Zbkx is enabled while Zbp is not, xperm.h will not be available. When Zbkx and Zbp are both enabled, the instructions will be decoded in Zbp format. [[ https://reviews.llvm.org/D94999 \| D94999 ]] this is the patch that introduces xperm.* instructions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117889	2022-01-24 20:38:46 +08:00
Fraser Cormack	af773a1818	[RISCV][VP] Lower VP_MERGE to RVV instructions This patch adds lowering of the llvm.vp.merge.* intrinsic (ISD::VP_MERGE) to RVV vmerge/vfmerge instructions. It introduces a special pseudo form of vmerge which allows a tied merge operand, allowing us to specify the tail elements as being equal to the "on false" operand, using a tied-def constraint and a "tail undisturbed" policy. While this strategy allows us to often lower the intrinsic to just one instruction, it may be less efficient in fixed-vector types as the number of tail elements may extend far beyond the length of the fixed vector. Another strategy could be to use a vmerge/vfmerge instruction with an AVL equal to the length of the vector type, and manipulate the condition operand such that mask elements greater than the operation's EVL are false. I've also observed inefficient codegen in which our 'VF' patterns don't match raw floating-point SPLAT_VECTORs, which occur in scalable-vector code. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117561	2022-01-24 11:05:05 +00:00
Fraser Cormack	e7926e8d97	[RISCV] Match VF variants for masked VFRDIV/VFRSUB This patch follows up on D117697 to help the simple binary operations behave similarly in the presence of masks. It also enables CGP sinking support for vp.fdiv and vp.fsub intrinsics, now that VFRDIV and VFRSUB are consistently matched with a LHS splat for masked and unmasked variants. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117783	2022-01-24 10:59:43 +00:00
Chenbing.Zheng	9aaa74aeef	[RISCV] Add patterns of SET[U]LT_VI for STECC forms This patch optmizes "li a0, 5 vmsgt[u].vx v10, v8, a0" -> "vmsgt[u].vi v10, v8, 5" Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118014	2022-01-24 08:50:49 +00:00
jacquesguan	ba16e3c31f	[RISCV] Decouple Zve* extensions and the V extension. According to the spec, there are some difference between V and Zve64d. For example, the vmulh integer multiply variants that return the high word of the product (vmulh.vv, vmulh.vx, vmulhu.vv, vmulhu.vx, vmulhsu.vv, vmulhsu.vx) are not included for EEW=64 in Zve64, but V extension does support these instructions. So we should decouple Zve extensions and the V extension. Differential Revision: https://reviews.llvm.org/D117854	2022-01-24 14:55:21 +08:00
Wu Xinlong	e29d8fb169	[RISCV] Initially support the K-extension instructions on the LLVM MC layer This commit is currently implementing supports for scalar cryptography extension for LLVM according to version v1.0.0 of [K Ext specification](https://github.com/riscv/riscv-crypto/releases)(scala crypto has been ratified already). Currently, we are implementing the MC (Machine Code) layer of his extension and the majority of work is done under `llvm/lib/Target/RISCV` directory. There are also some test files in `llvm/test/MC/RISCV` directory. Remove the subfeature of Zbk* which conflict with b extensions to reduce the size of the patch. (Zbk* will be resubmit after this patch has been merged) Co-author：@ksyx & @VincentWu & @lihongliang & @achieveartificialintelligence Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98136	2022-01-24 14:45:35 +08:00
Jim Lin	3f24cdec25	[RISCV][NFC] Remove tailing whitespaces in RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td	2022-01-24 10:49:43 +08:00
Craig Topper	413684313d	[RISCV] Adjust the header comment in RISCVInstrInfoZb.td to better integrate Zbk* extensions. The Zbk* extensions have some overlap with Zb so have been placed in this file. Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D117958	2022-01-23 11:42:52 -08:00
eopXD	3cf15af2da	[RISCV] Remove experimental prefix from rvv-related extensions. Extensions affected: +v, +zve, +zvl Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117860	2022-01-22 20:18:40 -08:00
Craig Topper	d44b6be6ea	[RISCV] Don't Custom legalize f16/f32/f64 bitcasts if those types aren't Legal.	2022-01-22 11:55:18 -08:00
Alex Fan	e796eaf2af	[RISCV][RFC] add MC support for zbkc subextension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117874	2022-01-22 10:23:01 +08:00
Craig Topper	0379459fc5	[RISCV] Strengthen a SDTypeProfile. Fix formatting.	2022-01-21 13:01:53 -08:00
Craig Topper	48132bb1e4	[RISCV] Simplify interface to combineMUL_VLToVWMUL. NFC Instead of passing the both the SDNode* and 2 of the operands in two different orders, just pass the SDNode * and a bool to indicate which operand order to test. While there rename to combineMUL_VLToVWMUL_VL.	2022-01-21 11:43:06 -08:00
Craig Topper	11754a4dbb	[RISCV] Use RVBUnary in more places to simplify some tablegen declarations. NFCI	2022-01-21 10:55:35 -08:00
Fraser Cormack	4d268dc94a	[RISCV] Enable CGP to sink splat operands of VP intrinsics This patch brings better splat-matching to our VP support, by sinking splat operands of VP intrinsics back into the same block as the VP operation. The list of VP intrinsics we are interested in matches that of the regular instructions. Some optimization is still lacking. For instance, our VL nodes aren't recognized as commutative, so splats must be on the RHS. Because of this, we limit our sinking of splats to just the RHS operand for now. Improvement in this regard can come in another patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117703	2022-01-21 11:30:37 +00:00
wangpc	8def89b5dc	[RISCV] Set CostPerUse to 1 iff RVC is enabled After D86836, we can define multiple cost values for different cost models. So here we set CostPerUse to 1 iff RVC is enabled to avoid potential impact on RA. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117741	2022-01-21 14:44:26 +08:00
Craig Topper	7b3d307288	[RISCV] Add isel patterns for grevi, shfli, and unshfli to brev8/zip/unzip instructions. Zbkb supports some encodings of the general grevi, shfli, and unshfli instructions legal, so we added separate instructions for those encodings to improve the diagnostics for assembler and disassembler. To be consistent we should always use these separate instructions whenever those specific encodings of grevi/shfli/unshfli occur. So this patch adds specific isel patterns to override the generic isel patterns for these cases. Similar was done for rev8 and zext.h for Zbb previously.	2022-01-20 20:43:52 -08:00
Wu Xinlong	7ee1c162cc	[RISCV][RFC] add inst support of zbkb This commit add instructions supports of `zbkb` which defined in scalar cryptography extension version v1.0.0 (has been ratified already). Most of the zbkb directives reuse parts of the zbp and zbb directives, so this patch just modified some of the inst aliases and predicates. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117640	2022-01-21 11:49:36 +08:00
Hsiangkai Wang	ad06e65dc4	[RISCV] Fix the bug in the register allocator caused by reserved BP. Originally, hasRVVFrameObject() will scan all the stack objects to check whether if there is any scalable vector object on the stack or not. However, it causes errors in the register allocator. In issue 53016, it returns false before RA because there is no RVV stack objects. After RA, it returns true because there are spilling slots for RVV values during RA. The compiler will not reserve BP during register allocation and generate BP access in the PEI pass due to the inconsistent behavior of the function. The function is changed to use hasStdExtV() as the return value. It is not precise, but it can make the register allocation correct. Refer to https://github.com/llvm/llvm-project/issues/53016. Differential Revision: https://reviews.llvm.org/D117663	2022-01-21 01:23:01 +00:00
Craig Topper	cfae2c65db	[RISCV] Factor Zve32 support into RISCVSubtarget::getMaxELENForFixedLengthVectors. This is needed to properly limit fractional LMULs for Zve32. Add new RUN Zve32 RUN lines to the existing tests for the -riscv-v-fixed-length-vector-elen-max command line option.	2022-01-20 16:31:12 -08:00
Craig Topper	5e88f527da	[RISCV] Remove RISCVSubtarget::hasStdExtV() and hasStdExtZve(). NFC All code should use one of the cleaner named hasVInstructions functions. Fix the two uses that weren't and delete the methods so no new uses can be created.	2022-01-20 15:05:09 -08:00
Craig Topper	fa8bb22466	[RISCV] Optimize vector_shuffles that are interleaving the lowest elements of two vectors. RISCV only has a unary shuffle that requires places indices in a register. For interleaving two vectors this means we need at least two vrgathers and a vmerge to do a shuffle of two vectors. This patch teaches shuffle lowering to use a widening addu followed by a widening vmaccu to implement the interleave. First we extract the low half of both V1 and V2. Then we implement (zext(V1) + zext(V2)) + (zext(V2) * zext(2^eltbits - 1)) which simplifies to (zext(V1) + zext(V2) * 2^eltbits). This further simplifies to (zext(V1) + zext(V2) << eltbits). Then we bitcast the result back to the original type splitting the wide elements in half. We can only do this if we have a type with wider elements available. Because we're using extends we also have to be careful with fractional lmuls. Floating point types are supported by bitcasting to/from integer. The tests test a varied combination of LMULs split across VLEN>=128 and VLEN>=512 tests. There a few tests with shuffle indices commuted as well as tests for undef indices. There's one test for a vXi64/vXf64 vector which we can't optimize, but verifies we don't crash. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D117743	2022-01-20 14:44:47 -08:00
Craig Topper	dd7b69a61f	[RISCV] Remove HadStdExtV and HasStdZve* Predicates from tablegen. No instructions should be using these. Everything should use HasVInstructions* Predicates. Remove them so that they can't be used by accident.	2022-01-20 12:54:20 -08:00
Craig Topper	7a275dc354	[RISCV] Remove Zvlsseg extension. This string no longer appears in the Vector Extension specification. The segment load/store instructions are just part of the vector instruction set. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D117724	2022-01-20 12:40:07 -08:00
Craig Topper	94e69fbb4f	[RISCV] Add DAG combine to fold (fp_to_int_sat (ffloor X)) -> (select X == nan, 0, (fcvt X, rdn)) Similar for ceil, trunc, round, and roundeven. This allows us to use static rounding modes to avoid a libcall. This is similar to D116771, but for the saturating conversions. This optimization is done for AArch64 as isel patterns. RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven so the operations don't stick around until isel to enable a pattern match. Thus I've implemented a DAG combine. I'm only handling saturating to i64 or i32. This could be extended to other sizes in the future. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116864	2022-01-20 11:35:37 -08:00
Fraser Cormack	ca36cc56ac	[RISCV] Match RVV VF variants also through masked operations This brings floating-point RVV vector/scalar support more in line with the integer vector patterns, which can already match '.vx' instructions with masked operations. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117697	2022-01-20 12:08:02 +00:00
Fraser Cormack	5a12024b95	[RISCV] Optimize lowering of floating-point -0.0 This idea has come up in several reviews -- D115978 and D105902 -- so I can't take any credit for the idea. Instead of using a constant pool to lower -0.0, we can emit a sequence of two instructions: fmv.[hwd].x freg, zero fsgnjn.[hsd] freg, freg, freg This is only done when the floating-point type is legal. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117687	2022-01-20 11:46:28 +00:00
Chenbing.Zheng	0be3da1fab	[RISCV] Add intrinsic for Zbt extension RV32: fsl, fsr, fsri RV64: fsl, fsr, fsri, fslw, fsrw, fsriw Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117468	2022-01-20 08:27:05 +00:00
eopXD	8eae99dfe5	[RISCV] Add the zve extension according to the v1.0 spec `zve` is the new standard vector extension to specify varying degrees of vector support for embedding processors. The `zve` extension is related to the `zvl` extension and other updates that are added in v1.0. According to https://github.com/riscv-non-isa/riscv-c-api-doc/pull/21, Clang defines macro `__riscv_v_max_elen`, `__riscv_v_max_elen_fp` for `zve` and it can be used by applications that uses the vector extension. Authored by: Zakk Chen <zakk.chen@sifive.com> @khchen Co-Authored by: Eop Chen <eop.chen@sifive.com> @eopXD Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D112408	2022-01-19 23:48:28 -08:00
Mohammed Nurul Hoque	21c79be5d7	[RISCV] Add patterns to MIR sign-extension removal pass. This patch adds a few instruction patterns that generate sign-extended values or propagate them, adding to the pass introduced in https://reviews.llvm.org/D116397 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117465	2022-01-19 17:33:58 -08:00
Luís Marques	a767ae2c5c	[RISCV] Fix incomplete asm statement parsing For instructions without operands, the final `AsmToken::EndOfStatement` wasn't being consumed. In the context of inline assembly, the resulting empty statements would cause extraneous empty lines to be emitted. Fix the issue by consuming the `EndOfStatement` token. Differential Revision: https://reviews.llvm.org/D117565	2022-01-19 21:56:21 +00:00
Craig Topper	4060b81e76	[RISCV] Obey -riscv-v-fixed-length-vector-elen-max when lowering mask BUILD_VECTORs. We may not be allowed to use vXiXLen vectors. Consult ELEN to determine what is allowed. This will become even more important when Zve32 is added. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D117518	2022-01-19 10:47:37 -08:00
Jim Lin	d6b0734837	[NFC] Use Register instead of unsigned	2022-01-19 20:17:04 +08:00
eopXD	9f27941c2f	[RISCV] Add patterns for vector narrowing integer right shift instructions Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117454	2022-01-18 22:30:13 -08:00
Craig Topper	5a6c622afd	[RISCV] Remove special case for constant shift amount in FSHL/FSHR lowering to FSL/FSR. Remove fshl/fshr with constant shift amount isel patterns. Replace with fsr/fsl with constant isel patterns. This hack was trying to preserve as much optimization opportunity for fshl/fshr by constant as possible, but the conversion to RISCVISD::FSR/FSL happens so late it probably isn't worth much. The new isel patterns are needed by D117468 anyway.	2022-01-18 11:47:50 -08:00
Craig Topper	aa7fc02feb	Recommit "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering." This reverts the revert commit `e328385739`. Accidental demanded bits change has been removed. The demanded bits code itself was remove in a pre-commit since it isn't tested. Original commit message: Previous we used the fshl/fshr operand ordering for simplicity. This made things confusing when D117468 proposed adding intrinsics for the instructions. We can't just use the generic funnel shifting intrinsics because fsl/fsr have different functionality that should be exposed to software. Now we use rs1, rs3, rs2/shamt order which matches the instruction printing order and the order used in this intrinsic header https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h	2022-01-18 10:52:43 -08:00
Craig Topper	b3a0ec7645	[RISCV] Remove DemandedBits handling for FSR/FSL until we have test cases for it. Testing may be easier after D117468. Right now we get demanded bits optimizations done on ISD::FSHL/FSHR before they become FSR/FSL. This makes it hard to test.	2022-01-18 10:52:43 -08:00
Craig Topper	e328385739	Revert "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering." This reverts commit `b634f8a663`. I broke the SimplifyDemandedBits code, but we don't have tests.	2022-01-18 10:36:03 -08:00
Craig Topper	b634f8a663	[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering. Previous we used the fshl/fshr operand ordering for simplicity. This made things confusing when D117468 proposed adding intrinsics for the instructions. We can't just use the generic funnel shifting intrinsics because fsl/fsr have different functionality that should be exposed to software. Now we use rs1, rs3, rs2/shamt order which matches the instruction printing order and the order used in this intrinsic header https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h	2022-01-18 09:47:28 -08:00
David Sherwood	f4515ab858	Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants" This reverts commit `197f3c0deb`. Reverting after miscompilation errors discovered with ffmpeg.	2022-01-18 08:40:20 +00:00
Lian Wang	5ceb4f5446	[RISCV] Add instruction schedule for Zbc extension and Zbs extension Zbc extension: CLMUL/CLMULR/CLMULH are grouped together, defined one schedule class. Zbs extension: BCLR/BSET/BINV/BEXT are grouped together, defined one schedule class. BCLRI/BSETI/BINVI/BEXTI are grouped together, defined one schedule class. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117538	2022-01-18 07:31:50 +00:00
jacquesguan	1090000b63	[RISCV] Add patterns for vector widening floating-point multiply Add patterns for vector widening floating-point multiply Differential Revision: https://reviews.llvm.org/D117530	2022-01-18 14:52:43 +08:00
Han-Kuan Chen	ec9cb3a79c	[RISCV] Provide VLOperand in td. Currently, users expected VL is the last operand. However, since some intrinsics has tail policy in the last operand, this rule cannot be used anymore. Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D117452	2022-01-17 20:25:47 -08:00
Han-Kuan Chen	3fc4b5896a	[RISCV] Make SplatOperand start from 0. Current SplatOperand starts from 1 because operand 0 (or 1) is intrinsic id in SelectionDAG. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117453	2022-01-17 20:14:59 -08:00
jacquesguan	c29d6c410e	[RISCV] Add patterns for vector widening floating-point add/subtract instructions Add patterns for Vector Widening Floating-Point Add/Subtract Instructions Differential Revision: https://reviews.llvm.org/D117466	2022-01-18 10:33:56 +08:00
Craig Topper	116af698e2	[RISCV] When expanding CONCAT_VECTORS, don't create INSERT_SUBVECTORS for undef subvectors. For fixed vectors, the undef will get expanded to an all zeros build_vector. We don't want that so suppress creating the insert_subvector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117379	2022-01-17 14:40:59 -08:00
Craig Topper	9c410838d2	[RISCV] Legalize fixed length (insert_subvector undef, X, 0) to a scalable insert. We were considering this legal, but later the undef would become an all zeros vector. This would cause us to need to re-legalize the insert later into a vslideup with zero vector. This patch catches the case and directly legalizes it to a scalable insert. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117377	2022-01-17 14:31:30 -08:00
David Sherwood	197f3c0deb	[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction. New tests added here: CodeGen/AArch64/sve-vector-splat.ll I believe we see some code quality improvements in these existing tests too: CodeGen/AArch64/reduce-and.ll CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding. Differential Revision: https://reviews.llvm.org/D114357	2022-01-17 11:08:57 +00:00
Lian Wang	85def34f5e	[RISCV] Add scheduler for bfp instruction in Zbf extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117290	2022-01-17 09:17:18 +00:00
Kito Cheng	cc35161dc7	[RISCV] Add initial support for getRegUsageForType and getNumberOfRegisters Those two TTI hooks are used during vectorization for calculating register pressure, the default implementation isn't consider for LMUL, and that's also definitly wrong value for register number (all register class are 8 registers). So in this patch we tried to: 1. Calculate right register usage for vector type and scalar type. 2. Return right number of register for general purpose register and vector register. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116890	2022-01-17 15:27:54 +08:00
eopXD	5a457782a2	[RISCV] Add patterns for vector widening integer multiply-add instructions Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117404	2022-01-16 18:37:13 -08:00
Craig Topper	4c1e1e05cb	[RISCV] Add RISCVISD::BFPW to ComputeNumSignBitsForTargetNode.	2022-01-15 15:23:49 -08:00
Fraser Cormack	877d1b3d07	[SelectionDAG][VP] Add splitting/widening for VP_LOAD and VP_STORE Original patch by @hussainjk. This patch was split off from D109377 to keep vector legalization (widening/splitting) separate from vector element legalization (promoting). While the original patch added a third overload of SelectionDAG::getVPStore, this patch takes the liberty of collapsing those all down to 1, as three overloads seems excessive for a little-used node. The original patch also used ModifyToType in places, but that method still crashes on scalable vector types. Seeing as the other VP legalization methods only work when all operands need identical widening, this patch follows in that vein. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117235	2022-01-15 11:41:29 +00:00
Alex Bradbury	0ee679e22c	[RISCV] Add CSRs defined in the recently ratified Sstc extension The 'RISC-V "stimecmp / vstimecmp" Extension' was ratified at the end of last year though hasn't yet been integrated in the main specification documents (see <https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>). RISC-V "stimecmp / vstimecmp" Extension <https://github.com/riscv/riscv-time-compare/releases/download/v0.5.4/Sstc.pdf>. Differential Revision: https://reviews.llvm.org/D117311	2022-01-15 08:36:04 +00:00
Alex Bradbury	1ca79823e0	[RISCV] Add CSRs defined in the recently ratified Smstateen extension The "RISC-V State Enable Extension" was ratified at the end of at the end of last year though hasn't yet been integrated in the main specification documents (see <https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>). This commit adds the CSRs defined by this extension in <https://github.com/riscv/riscv-state-enable/releases/download/v0.6.3/Smstateen.pdf>. Differential Revision: https://reviews.llvm.org/D117310	2022-01-15 08:35:47 +00:00
Alex Bradbury	f00a98a0a9	[RISCV] Add CSRs defined in the recently ratified Sscofpmf extension The "RISC-V Count Overflow and Mode-Based Filtering Extension" was ratified at the end of last year though hasn't yet been integrated in the main specification documents (see <https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>). This commit adds the CSRs defined by this extension in <https://github.com/riscv/riscv-count-overflow/releases/download/v0.5.2/Sscofpmf.pdf>. Differential Revision: https://reviews.llvm.org/D117308	2022-01-15 08:35:13 +00:00
Chenbing.Zheng	fdd33a0c75	[RISCV][NFC] Add a function to customLegalizeToWOp by Intrinsic These cases follow the same pattern, so they can be combined to a unqiue function. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117378	2022-01-15 08:28:08 +00:00
eopXD	26bb1b1dab	[RISCV] Add the zvl extension according to the v1.0 spec `zvl` is the new standard vector extension that specifies the minimum vector length of the vector extension. The `zvl` extension is related to the `zve` extension and other updates that are added in v1.0. According to https://github.com/riscv-non-isa/riscv-c-api-doc/pull/21, Clang defines macro `__riscv_v_min_vlen` for `zvl` and it can be used for applications that uses the vector extension. LLVM checks whether the option `riscv-v-vector-bits-min` (if specified) matches the `zvl*` extension specified. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D108694	2022-01-14 23:01:48 -08:00
Lian Wang	21dad9a522	[RISCV][NFC] Add IsRV64 predicate in xperm.w pattern Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117191	2022-01-15 04:22:16 +00:00
jacquesguan	b148348ad4	[RISCV] Add patterns for vector widening integer add/subtract Add patterns for vector widening integer add/subtract instructions Differential Revision: https://reviews.llvm.org/D117188	2022-01-15 09:41:07 +08:00
Shao-Ce SUN	a0a76fee0c	[RISCV] update zfh and zfhmin extention to v1.0 `zfh` and `zfhmin` have been ratified, with version 1.0. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117098	2022-01-15 09:21:24 +08:00
Craig Topper	2baa1dffd1	[RISCV] Add basic support for matching shuffles to vslidedown.vi. Specifically the unary shuffle case where the elements being shifted in are undef. This handles the shuffles produce by expanding llvm.reduce.mul. I did not reduce the VL which would increase the number of vsetvlis, but may improve the execution speed. We'd also want to narrow the multiplies so we could share vsetvlis between the vslidedown.vi and the next multiply. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117239	2022-01-14 09:04:54 -08:00
Craig Topper	ac6b4896ea	[RISCV] Honor the VT when converting float point register names to register class for inline assembly. It appears the code here was written for the inline asm clobbering a specific register, but it also gets used for named input and output registers. For the input and output case, we should honor the VT so we don't insert conversion instructions around the inline assembly. For the clobber, case we need to pick the largest register class. Reviewed By: asb, jrtc27 Differential Revision: https://reviews.llvm.org/D117279	2022-01-14 09:04:00 -08:00
Alex Bradbury	4a4a652f34	[RISCV][NFC] Use TableGen 'foreach' to simplify repetitive CSR definitions Make the definitions of hpmcounter3-hpmcounter31, hpmcounter3h-hpmcounter31h, mhpmcounter3-mhpmcounter31, mhpmcounter3h-mhpmcounter31h, pmpaddr0-pmpaddr63, mhpmevent3-31, and pmpcfg0-15 substantially less repetitive using a foreach loop. Differential Revision: https://reviews.llvm.org/D117227	2022-01-14 11:59:39 +00:00
jacquesguan	88c0e0806b	[RISCV] Improve i64 splat vector lowering in RV32. We could use vmv.v.i/vmv.v.x whose eew is 32 to lower the i64 splat vector if the i64 constant scalar could be splitted into two same i32 scalar. Differential Revision: https://reviews.llvm.org/D117079	2022-01-14 14:06:01 +08:00
David Sherwood	ba471ba8d2	Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants" This reverts commit `31009f0b5a`. It seems to be causing SVE VLA buildbot failures and has introduced a genuine regression. Reverting for now.	2022-01-13 15:59:43 +00:00
David Sherwood	31009f0b5a	[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction. New tests added here: CodeGen/AArch64/sve-vector-splat.ll I believe we see some code quality improvements in these existing tests too: CodeGen/AArch64/dag-numsignbits.ll CodeGen/AArch64/reduce-and.ll CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding. Differential Revision: https://reviews.llvm.org/D114357	2022-01-13 09:43:07 +00:00
Lian Wang	16877c5d2c	[RISCV] Add bfp and bfpw intrinsic in zbf extension Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116994	2022-01-13 02:53:00 +00:00
Alex Bradbury	33d008b169	[RISCV] Update recently ratified Zb{a,b,c,s} extensions to no longer be experimental Agreed policy is that RISC-V extensions that have not yet been ratified should be marked as experimental, and enabling them requires the use of the -menable-experimental-extensions flag when using clang alongside the version number. These extensions have now been ratified, so this is no longer necessary, and the target feature names can be renamed to no longer be prefixed with "experimental-". Differential Revision: https://reviews.llvm.org/D117131	2022-01-12 19:33:44 +00:00
Craig Topper	632c263eb3	[RISCV] Add RISCVProcFamilyEnum and add SiFive7. Use it to remove explicit string compares from unrolling preferences. I'm of two minds on this. Ideally, we would define things in terms of architectural or microarchitectural features, but it's hard to do that with things like unrolling preferences without just ending up with FeatureSiFive7UnrollingPreferences. Having a proc enum is consistent with ARM and AArch64. X86 only has a few and is trying to move away from it. Reviewed By: asb, mcberg2021 Differential Revision: https://reviews.llvm.org/D117060	2022-01-12 09:34:02 -08:00
Shao-Ce SUN	edb9175de6	[RISCV][llvm] Update CSRs According the newest RISC-V Privileged Spec, updated CSRs. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116645	2022-01-12 20:14:04 +08:00
Craig Topper	63b17eb9ec	[RISCV] Add strictfp support for compares. This adds support for STRICT_FSETCC(quiet) and STRICT_FSETCCS(signaling). FEQ matches well to STRICT_FSETCC oeq. FLT/FLE matches well to STRICT_FSETCCS olt/ole. Others require commuting operands or multiple instructions. STRICT_FSETCC olt/ole/ogt/oge/ult/ule/ugt/uge uses FLT/FLE, but we need to save/restore FFLAGS around them to avoid spurious exceptions. I've implemented pseudo instructions with a CustomInserter to insert the save/restore CSR instructions. Unfortunately, this doesn't honor exceptions for signaling NANs but I'm not sure if signaling nans are really supported by the constrained intrinsics. STRICT_FSETCC one and ueq expand to a pair of FLT instructions with a save/restore of fflags around each. This could be improved in the future. There may be some opportunities to generate better code for strict comparisons mixed with nonans fast math flags. I've left FIXMEs in the .td files for that. Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com> Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D116694	2022-01-11 20:01:41 -08:00
Craig Topper	be1cc64cc1	[RISCV] Add DAG combine to fold (fp_to_int (ffloor X)) -> (fcvt X, rdn) Similar for ceil, trunc, round, and roundeven. This allows us to use static rounding modes to avoid a libcall. This optimization is done for AArch64 as isel patterns. RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven so the operations don't stick around until isel to enable a pattern match. Thus I've implemented a DAG combine. We only handle XLen types except i32 on RV64. i32 will be type legalized to a RISCVISD node. All other types will be type legalized to XLen and maintain the FP_TO_SINT/UINT ISD opcode. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116771	2022-01-11 09:05:57 -08:00
wangpc	c6430fade3	[RISCV] Generate 32 bits jumptable entries when code model is small The code can only address the whole RV32 address space or the lower 2 GiB of the RV64 address space in small code model, so 32 bits entry is enough. Cache hit ratio and code size have some improvements. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116435	2022-01-11 18:20:37 +08:00
wangpc	98d51c2542	[RISCV] Override TargetLowering::BuildSDIVPow2 to generate SELECT When `Zbt` is enabled, we can generate SELECT for division by power of 2, so that there is no data dependency. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D114856	2022-01-11 15:54:35 +08:00
Chenbing.Zheng	9ea772ff81	[RISCV] Block vmsgeu.vi with 0 immediate in Isel For vmsgeu.vi with 0, we know this is always true. So we can replace it with vmset.m (unmasked) or vmset.m+vmand.mm (masked). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116584	2022-01-11 03:04:44 +00:00
jacquesguan	d0554ae4cf	[RISCV] Select vl op to X0 when it is equal to ~0. Now the backend will select ~0 vl to a register and load instruction, we could use X0 to replace it. Differential Revision: https://reviews.llvm.org/D116798	2022-01-11 10:56:25 +08:00
Haocong.Lu	bd653f6406	[RISCV] Use shift for zero extension when Zbb and Zbp are not enabled Now AND is used for zero extension when both Zbb and Zbp are not enabled. It may be better to use shift operation if the trailing ones mask exceeds simm12. This patch optimzes LUI+ADDI+AND to SLLI+SRLI. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116720	2022-01-11 02:37:03 +00:00
jacquesguan	b607cd3928	[RISCV] Use vmv.s.x to build one element splat vector. When we want to create an splat vector that only the first element is initialized, we could use vmv.s.x or vfmv.s.f to build it. Differential Revision: https://reviews.llvm.org/D116277	2022-01-11 10:21:18 +08:00
Craig Topper	b645bcd98a	[RISCV] Generalize (srl (and X, 0xffff), C) -> (srli (slli X, (XLen-16), (XLen-16) + C) optimization. This can be generalized to (srl (and X, C2), C) -> (srli (slli X, (XLen-C3), (XLen-C3) + C). Where C2 is a mask with C3 trailing ones. This can avoid constant materialization for C2. This is beneficial even when C2 can be selected to ANDI because the SLLI can become C.SLLI, but C.ANDI cannot cover all the immediates of ANDI. This also enables CSE in some cases of i8 sdiv by constant codegen.	2022-01-09 23:37:10 -08:00
Craig Topper	296e8cae5c	[RISCV] Isel (sra (sext_inreg X, i16), C) -> (srai (slli X, (XLen-16), (XLen-16) + C). Similar for (sra (sext_inreg X, i8), C). With Zbb, sext_inreg of i8 and i16 are legal for sext.b and sext.h. This transform makes the Zbb codegen the same as without Zbb. The shifts are more compressible. This also exposes an opportunity for CSE with another slli in the i16 sdiv by constant codegen.	2022-01-09 21:23:43 -08:00
jacquesguan	6b8362eb8d	[RISCV] Disable EEW=64 for index values when XLEN=32. Disable EEW=64 for vector index load/store when XLEN=32. Differential Revision: https://reviews.llvm.org/D106518	2022-01-10 10:51:27 +08:00
Craig Topper	2dd52f840b	[RISCV] Fold (srl (and X, 0xffff), C)->(srli (slli X, (XLen-16), (XLen-16) + C) even with Zbb/Zbp. We can use zext.h with Zbb, but srli/slli may offer more opportunities for compression.	2022-01-09 18:42:03 -08:00
Kazu Hirata	435a5a3652	[llvm] Fix bugprone argument comments (NFC) Identified with bugprone-argument-comment.	2022-01-08 11:56:38 -08:00
Craig Topper	042394b69e	[RISCV] Add a command line option to control the LMUL used by TTI's getRegisterBitWidth. By default we return the width of an LMUL=1 register. We can enable testing with larger LMUL values by returning a larger bit width. This patch adds a RISCV specific option to provide a LMUL which will be multiplied by the LMUL=1 bit width. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D116339	2022-01-07 20:02:10 -08:00
Kito Cheng	f142c45f1e	[RISCV] Set getMinVectorRegisterBitWidth to 16 if enable fixed length vector code gen for RVV getMinVectorRegisterBitWidth means what vector types is supported in this target, and actually RISC-V support all fixed length vector types with vector length less than `getMinRVVVectorSizeInBits`, so set it to 16, means 2 x i8, that is minimal fixed length vector size in theory. That also fixed one issue, some testcase migth become non-vectorizable when `-riscv-v-vector-bits-min` set to larger value, because the vector size is smaller than `-riscv-v-vector-bits-min`. For example, following code can vectorize by SLP with `-riscv-v-vector-bits-min=128` or `-riscv-v-vector-bits-min=256`, but can't vectorize `-riscv-v-vector-bits-min=512` or larger: ``` void foo(double *da) { da[0] = 0; da[1] = 1; da[2] = 2; da[3] = 3; } ``` Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116534	2022-01-08 11:16:21 +08:00
Baoshan Pang	af931a51b9	[RISCV] Materializing constants with 'rori' Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116574	2022-01-07 15:39:22 -08:00
Lian Wang	e8f1dfe923	[RISCV] Supplement PACKH instruction pattern Optimize (rs1 & 255) \| ((rs2 & 255) << 8) -> (PACKH rs1, rs2). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116791	2022-01-07 17:59:19 +08:00
Kazu Hirata	f3a344d212	[Target] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-06 22:01:44 -08:00
wangpc	91cf2a9b6c	[RISCV][NFC] Use sub operator to generate register list There are several duplicated lines for generating GPRXXX's register list that can be eliminated by using `sub` operator. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116729	2022-01-07 12:29:58 +08:00
Shao-Ce SUN	808c0987c3	[NFC][RISCV] Make the macro names more uniform Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116719	2022-01-07 11:09:41 +08:00
Liqin Weng	92153a9aa7	[RISCV] Support immediate vtype of VSETVLI/VSETIVLI in asm parser Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D115133	2022-01-07 02:26:41 +00:00
Craig Topper	ec4dd862bf	[RISCV] Use simm5_plus1_nonzero in isel patterns for vmsgeu.vi/vmsltu.vi intrinsics. The 0 immediate can't be selected to vmsgtu.vi/vmsleu.vi by decrementing the immediate. To prevent his we had special patterns that provided alternate lowering for the 0 cases. This relied on tablegen prioritizing the 0 pattern over the sim5_plus1 range. This patch introduces simm5_plus1_nonzero that excludes 0. It also excludes the special case for vmsltu.vi since we can just use vmsltu.vx and let the 0 be selected to X0. This is an alternative to some of the changes in D116584. Reviewed By: Chenbing.Zheng, asb Differential Revision: https://reviews.llvm.org/D116723	2022-01-06 08:27:27 -08:00
Craig Topper	56ca11e31e	[RISCV] Add an MIR pass to replace redundant sext.w instructions with copies. Function calls and compare instructions tend to cause sext.w instructions to be inserted. If we make good use of W instructions, these operations can often end up being redundant. We don't always detect these during SelectionDAG due to things like phis. There also some cases caused by failure to turn extload into sextload in SelectionDAG. extload selects to LW allowing later sext.ws to become redundant. This patch adds a pass that examines the input of sext.w instructions trying to determine if it is already sign extended. Either by finding a W instruction, other instructions that produce a sign extended result, or looking through instructions that propagate sign bits. It uses a worklist and visited set to search as far back as necessary. Reviewed By: asb, kito-cheng Differential Revision: https://reviews.llvm.org/D116397	2022-01-06 08:23:42 -08:00
Craig Topper	75117fb340	[RISCV] Don't advertise i32->i64 zextload as free for RV64. The zextload hook is only used to determine whether to insert a zero_extend or any_extend for narrow types leaving a basic block. Returning true from this hook tends to cause any load whose output leaves the basic block to become an LWU instead of an LW. Since we tend to prefer sexts for i32 compares on RV64, this can cause extra sext.w instructions to be created in other basic blocks. If we use LW instead of LWU this gives the MIR pass from D116397 a better chance of removing them. Another option might be to teach getPreferredExtendForValue in FunctionLoweringInfo.cpp about our preference for sign_extend of i32 compares. That would cause SIGN_EXTEND to be chosen for any value used by a compare instead of using the isZExtFree heuristic. That will require code to convert from the llvm::Type* to EVT/MVT as well as querying the type legalization actions to get the promoted type in order to call TargetLowering::isSExtCheaperThanZExt. That seemed like many extra steps when no other target wants it. Though it would avoid us needing to lean on the MIR pass in some cases. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116567	2022-01-06 08:13:42 -08:00
Craig Topper	808c662665	[RISCV] Change RISCVISD::FCVT*RTZ opcodes to take rounding mode as an operand. Pre-work for a future change that will use these opcodes with other rounding modes. Differential Revision: https://reviews.llvm.org/D116724	2022-01-06 08:12:12 -08:00
Craig Topper	fd992aac19	[RISCV] Use macros to reduce repetive switch cases. NFC These 3 switches map LMUL enum to instruction names. These follow a regular pattern. Use a macro to reduce the number of source code lines. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D116631	2022-01-05 09:00:48 -08:00
Craig Topper	df2e728b77	[RISCV] Teach RISCVGatherScatterLowering to handle more complex recurrence start values. Previously we only recognized strided loads/store when the initial value for the phi was a strided constant vector. This patch extends the support to a strided_constant added to a splatted value. The rewritten loop will add the splat value to the first element of the strided constant vector to use as the scalar start value. The stride is unaffected. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D115958	2022-01-04 10:13:34 -08:00
Kazu Hirata	e5947760c2	Revert "[llvm] Remove redundant member initialization (NFC)" This reverts commit `fd4808887e`. This patch causes gcc to issue a lot of warnings like: warning: base class ‘class llvm::MCParsedAsmOperand’ should be explicitly initialized in the copy constructor [-Wextra]	2022-01-03 11:28:47 -08:00
Jim Lin	d38637a0e6	[RISCV] Fix the code alignment for GroupFloatVectors. NFC Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116520	2022-01-03 17:08:53 +08:00
Victor Perez	5527139302	[RISCV][VP] Add RVV codegen for [nX]vXi1 vp.select Expand [nX]vXi1 vp.select the same way as [nX]vXi1 vselect. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D115546	2022-01-02 23:12:32 -08:00
Craig Topper	bc091e0862	[RISCV] Prune more unnecessary vector pseudo instructions. NFC For floating point specific vector instructions, we don't need pseudos for mf8. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D116460	2022-01-02 23:00:42 -08:00
Kazu Hirata	41bfac6aed	[Target] Remove unused forward declarations (NFC)	2022-01-02 10:20:15 -08:00
Craig Topper	4602f4169a	[RISCV] Prune unnecessary vector pseudo instructions. NFC For .vf instructions, we don't need MF8 pseudos for f16. We don't need MF8 or MF4 pseudos for f32. Or MF8, MF4, MF2 for f64. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D116437	2022-01-01 19:53:53 -08:00
Kazu Hirata	fd4808887e	[llvm] Remove redundant member initialization (NFC) Identified with readability-redundant-member-init.	2022-01-01 16:18:18 -08:00
Craig Topper	6f45fe9851	[RISCV] Use MxListW instead of MxList[0-5]. NFC Better to use the named list instead of assuming the size of MxList.	2021-12-31 00:22:55 -08:00
Craig Topper	8811a87e8c	[RISCV] Use defvar to simplify some code. NFC Rather than wrapping a def around a list, we can just make a defvar of the list.	2021-12-30 23:48:39 -08:00
wangpc	41454ab256	[RISCV] Use constant pool for large integers For large integers (for example, magic numbers generated by TargetLowering::BuildSDIV when dividing by constant), we may need about 4~8 instructions to build them. In the same time, it just takes two instructions to load constants (with extra cycles to access memory), so it may be profitable to put these integers into constant pool. Reviewed By: asb, craig.topper Differential Revision: https://reviews.llvm.org/D114950	2021-12-31 14:48:48 +08:00
jacquesguan	05f82dc877	[RISCV] Fix incorrect cases of vmv.s.f in the VSETVLI insert pass. Fix incorrect cases of vmv.s.f and add test cases for it. Differential Revision: https://reviews.llvm.org/D116432	2021-12-31 14:17:03 +08:00
Craig Topper	15787ccd45	[RISCV] Add support for STRICT_LRINT/LLRINT/LROUND/LLROUND. Tests for other strict intrinsics. This patch adds isel support for STRICT_LRINT/LLRINT/LROUND/LLROUND. It also adds test cases for f32 and f64 constrained intrinsics that correspond to the intrinsics in float-intrinsics.ll and double-intrinsics.ll. Support for promoting the integer argument of STRICT_FPOWI was added. I've skipped adding tests for f16 intrinsics, since we don't have libcalls for them and we have inconsistent support for promoting them in LegalizeDAG. This will need to be examined more closely. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D116323	2021-12-30 11:54:32 -08:00
jacquesguan	128c6ed73b	[RISCV] Teach VSETVLInsert to eliminate redundant vsetvli for vmv.s.x and vfmv.s.f. Differential Revision: https://reviews.llvm.org/D116307	2021-12-30 17:16:18 +08:00
jacquesguan	1dd5e6fed5	[RISCV] Use vmv.s.x instead of vfmv.s.f when the floating point scalar is 0. Use integer vector scalar move instruction when move 0 to avoid add a integer-float move instruction. Differential Revision: https://reviews.llvm.org/D116365	2021-12-30 10:16:54 +08:00
Chenbing.Zheng	43c8296cda	[RISCV] Refactor immediate comparison instructions patterns The patterns of the immediate comparison instruction is rewrite here, and put similar code to a class. Do not change any function of the original code, making the code more concise. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116215	2021-12-30 09:31:01 +08:00
Craig Topper	015ff729cb	[RISCV] Add a few more instructions to hasAllNBitUsers.	2021-12-29 09:17:47 -08:00
Kazu Hirata	5a667c0e74	[llvm] Use nullptr instead of 0 (NFC) Identified with modernize-use-nullptr.	2021-12-28 08:52:25 -08:00
Hsiangkai Wang	a1c7ddf926	[RISCV] Support passing scalable vectur values through the stack. After consuming all vector registers, the scalable vector values will be passed indirectly. The pointer values will be saved in general registers. If all general registers are used up, we will report an error to notify users the compiler does not support passing scalable vector values through the stack. In this patch, we remove the restriction. After all general registers are used up, we use the stack to save the pointers which point to the indirect passed scalable vector values. Differential Revision: https://reviews.llvm.org/D116310	2021-12-28 09:26:36 +08:00
Hsiangkai Wang	5d47e7d768	[RISCV] Convert whole register copies as the source defined explicitly. The implicit defines may come from a partial define in an instruction. It does not mean the defining instruction and the COPY instruction have the same vl and vtype. When the source comes from the implicit defines, do not convert the whole register copies to vmv.v.v. Differential Revision: https://reviews.llvm.org/D115866	2021-12-27 13:59:49 +08:00
Shao-Ce SUN	70a98008ea	[RISCV] Reduce repetitive codes in flw, fsw Trying to improve code reuse in F,D,Zfh *.td files. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116089	2021-12-27 09:29:35 +08:00
Kazu Hirata	e7774f499b	Use static_assert instead of assert (NFC) Identified with misc-static-assert.	2021-12-26 14:26:44 -08:00
Jim Lin	02478a26f2	[RISCV] Use DAG variable directly instead of DCI.DAG Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116087	2021-12-24 13:06:55 +08:00
Craig Topper	0a35211b34	[RISCV] Don't allow vector types to be used with inline asm 'r' constraint The 'r' constraint uses the GPR class. There is generic support for bitcasting and extending/truncating non-integer VTs to the required integer VT. This doesn't work for scalable vectors and instead crashes. To prevent this, explicitly reject vectors. Fixed vectors might work without crashing, but it doesn't seem worthwhile to allow. While there remove an unnecessary level of indentation in the "vr" and "vm" constraint handling. Differential Revision: https://reviews.llvm.org/D115810	2021-12-23 20:32:36 -06:00
Victor Perez	10b3675aa9	[RISCV][VP] Lower mask vector VP AND/OR/XOR to RVV instructions For fixed and scalable vectors, each intrinsic x is lowered to vmx.mm, dropping the mask, which is safe to do as masked-off elements are undef anyway. Differential Revision: https://reviews.llvm.org/D115339	2021-12-23 15:02:32 -06:00
Craig Topper	7704c503ec	[RISCV] Use positive 0.0 for the neutral element in fadd reductions if nsz is present. -0.0 requires a constant pool. +0.0 can be made with vmv.v.x x0. Not doing this in getNeutralElement for fear of changing other targets. Differential Revision: https://reviews.llvm.org/D115978	2021-12-23 10:38:00 -06:00
Craig Topper	b7b260e19a	[RISCV] Support strict FP conversion operations. This adds support for strict conversions between fp types and between integer and fp. NOTE: RISCV has static rounding mode instructions, but the constrainted intrinsic metadata is not used to select static rounding modes. Dynamic rounding mode is always used. Differential Revision: https://reviews.llvm.org/D115997	2021-12-23 09:40:58 -06:00
Craig Topper	a9486a40f7	[RISCV] Disable interleaving scalar loops in the loop vectorizer. The loop vectorizer can interleave scalar loops even if it doesn't vectorize them. I don't believe we intended to enable this when we enabled interleaving for vector instructions. Disable interleaving for VF=1 like X86 and AMDGPU already do. Test lifted from AMDGPU. Differential Revision: https://reviews.llvm.org/D115975	2021-12-23 08:37:24 -06:00
jacquesguan	28a3e7dea2	[RISCV] Override hasAndNotCompare to use more andn when have Zbb extension. Enable transform (X & Y) == Y ---> (~X & Y) == 0 and (X & Y) != Y ---> (~X & Y) != 0 when have Zbb extension to use more andn instruction. Differential Revision: https://reviews.llvm.org/D115922	2021-12-23 10:42:20 +08:00
Shao-Ce SUN	68bc6d7cae	[RISCV] Remove Zvamo Extention Based on D111692. Zvamo is not part of the 1.0 V spec. Remove it. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D115709	2021-12-20 10:28:39 +08:00
Michael Berg	f95ee6074a	[RISCV] Add target specific loop unrolling and peeling preferences Both these preference helper functions have initial support with this change. The loop unrolling preferences are set with initial settings to control thresholds, size and attributes of loops to unroll with some tuning done. The peeling preferences may need some tuning as well as the initial support looks much like what other architectures utilize. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D113798	2021-12-18 12:54:50 -08:00
Craig Topper	be41996f4f	[RISCV} Add FSGNJ_H to isAsCheapAsAMove and isCopyInstrImpl. This matches FSGNJ_S and FSGNJ_D.	2021-12-17 09:14:20 -08:00
Craig Topper	66bbefeb13	[RISCV] Revert Zfhmin related changes that aren't tested and depend on f16 being a legal type. Our Zfhmin support is only MC layer, but these are CodeGen layer interfaces. If f16 isn't a Legal type for CodeGen with Zfhmin, then these interfaces should keep their non-Zfh behavior. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D115822	2021-12-16 08:55:28 -08:00
jacquesguan	7dfbf0b60f	[RISCV] Fold (and (not (srl X, C)), 1) to (xor (bexti X, C), 1) when have Zbs extension. When have Zbs extension, we could use bexti to fold (and (not (srl X, C)), 1) to (xor (bexti X, C), 1). Differential Revision: https://reviews.llvm.org/D115629	2021-12-16 15:01:05 +08:00
jacquesguan	d3c2ad154e	[RISCV] Fix whole vector register move instruction's vector register constraint. According to the v-spec, the source and destination VR of vmv<nr>r.v should be aligned for the VR group size. Differential Revision: https://reviews.llvm.org/D115720	2021-12-16 10:58:55 +08:00
Craig Topper	3926893439	[RISCV] Add isel support for scalar STRICT_FADD/FSUB/FMUL/FDIV/FSQRT. Test that STRICT_FMINNUM/FMAXNUM are lowered to libcalls for f32/f64. The RISC-V instructions don't match the behavior of fmin/fmax libcalls with respect to SNaN. Promoting FMINNUM/FMAXNUM for f16 needs more work outside of the RISC-V backend. Reviewed By: asb, arcbbb Differential Revision: https://reviews.llvm.org/D115680	2021-12-14 10:50:55 -08:00
Craig Topper	3f1c403a2b	[RISCV] Use AdjustInstrPostInstrSelection to insert a FRM dependency for scalar FP instructions with dynamic rounding mode. In order to support constrained FP intrinsics we need to model FRM dependency. Whether or not a instruction uses FRM is based on a 3 bit field in the instruction. Because of this we can't add 'Uses = [FRM]' to the tablegen descriptions. This patch examines the immediate after isel and adds an implicit use of FRM. This idea came from Roger Ferrer Ibanez. Other ideas: We could be overly conservative and just pretend all instructions with frm field read the FRM register. Or we could have pseudoinstructions for CodeGen with rounding mode. Reviewed By: asb, frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D115555	2021-12-14 10:17:57 -08:00
Craig Topper	d4d76409d1	[RISCV] Add mayRaiseFPException to RISCV scalar FP instructions. FRM dependency will be added in a future patch. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D115540	2021-12-14 09:53:30 -08:00
Craig Topper	7598ac5ec5	[RISCV] Convert (splat_vector (load)) to vlse with 0 stride. We already do this for splat nodes that carry a VL, but not for splats that use VLMAX. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D115483	2021-12-14 09:14:03 -08:00
Craig Topper	3cda38796c	[RISCV] Add rs2 encoding to the FPUnaryOp_r and FPUnaryOp_r_frm template arguments. Instead of having unary instruction include a 'let' in their class body, add rs2val as a template parameter. Then we can use a let in FPUnaryOp_r and FPUnaryOp_r_frm. This reduces the overall verbosity of the FP files. Reviewed By: achieveartificialintelligence Differential Revision: https://reviews.llvm.org/D115537	2021-12-13 21:38:42 -08:00
Nelson Chu	10a71981e9	[RISCV] Support named opcodes in .insn directive. This patch is one of the TODO of commit, `283879793d` We build the GenericTable for these opcodes, and also extend class RISCVOpcode, to store the names of opcodes. Then we call the parseInsnDirectiveOpcode to parse the opcode filed in .insn directive. We only allow users to write the recognized opcode names, or just write the immediate values in the 7 bits range. Documentation: https://sourceware.org/binutils/docs-2.37/as/RISC_002dV_002dFormats.html Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D115224	2021-12-13 20:59:33 -08:00
Fangrui Song	a6a07a514b	[MachineOutliner] Don't outline functions starting with PATCHABLE_FUNCTION_ENTER/FENTRL_CALL MachineOutliner may outline a "patchable-function-entry" function whose body has a TargetOpcode::PATCHABLE_FUNCTION_ENTER MachineInstr. This is incorrect because the special code sequence must stay unchanged to be used at run-time. Avoid outlining PATCHABLE_FUNCTION_ENTER. While here, avoid outlining FENTRY_CALL too (which doesn't reproduce currently) to allow phase ordering flexibility. Fixes #52635 Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D115614	2021-12-13 13:24:29 -08:00
Craig Topper	b18b2a01ef	[RISCV] Don't use VLMAX for start value splat in reduction lowering. The reduction instructions only reads the first element. The execution time for a splat may take longer with a larger VL. We should use the smallest VL we can. Reviewed By: frasercrmck, HsiangKai Differential Revision: https://reviews.llvm.org/D115536	2021-12-13 09:06:42 -08:00
Craig Topper	80ed2f6b36	[RISCV] Share tablegen classes for F, D, and Zfh. Other simplifications. NFC By adding the register class and funct as template parameters we can share the classes with all 3 extensions. I've used "let SchedRW =" to avoid repeating scheduler classes on multiple lines where we previously inherited from the Sched class. A subsequent patch will add mayRaiseFPException and FRM dependencies. Reducing the number of classes means less repeating for those changes. This of course conflicts with the Zfinx patch D93298. Reviewed By: achieveartificialintelligence Differential Revision: https://reviews.llvm.org/D115469	2021-12-10 09:35:51 -08:00
Craig Topper	5861cf77da	[RISCV] Remove FCSR from RISCVRegisterInfo. We only used this to mark it as a reserved register. But that's not important if we don't do anything else with it. I think if we were ever to do anything with it, we would need to model it as a super register of FRM and FFLAGS. But it might be easier to reference both FRM and FFLAGS in implicit defs/uses for anything we were to do with "fcsr". Reviewed By: sepavloff Differential Revision: https://reviews.llvm.org/D115455	2021-12-10 09:24:13 -08:00
eopXD	a4bf1b449d	[RISCV] Unify depedency check and extension implication parsing logics Originially there are two places that does parsing - `parseArchString` and `parseFeatures`, each with its code on dependency check and implication. This patch extracts common parts of the two as functions of `RISCVISAInfo` and let them 2 use it. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D112359	2021-12-09 21:16:04 -08:00
Craig Topper	6f7de819b9	[RISCV] Use MULHU for more division by constant cases. D113805 improved handling of i32 divu/remu on RV64. The basic idea from that can be extended to (mul (and X, C2), C1) where C2 is any mask constant. We can replace the and with an SLLI by shifting by the number of leading zeros in C2 if we also shift C1 left by XLen - lzcnt(C1) bits. This will give the full product XLen additional trailing zeros, putting the result in the output of MULHU. If we can't use ANDI, ZEXT.H, or ZEXT.W, this will avoid materializing C2 in a register. The downside is it make take 1 additional instruction to create C1. But since that's not on the critical path, it can hopefully be interleaved with other operations. The previous tablegen pattern is replaced by custom isel code. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D115310	2021-12-09 09:10:14 -08:00
Kito Cheng	39c861719b	[RISCV] Fix vm operand constraint to fit GCC's behavior - `vm` constraint is used for masking operand, which always v0. - Update testcase, only masking operand should use `vm`, vector mask operations should just use `vr` for any vector register. - Revise the description of `vm` constraint. - This patch also fix issue on RISCVRegisterInfo.td and RISCVISelLowering.cpp. RISCVRegisterInfo.td: - The first VT in the list must be the largest total size since the SelectionDAGBuilder uses the first register in the list as the canonical type for the register. RISCVISelLowering.cpp: - Fix RISCVTargetLowering::splitValueIntoRegisterParts and RISCVTargetLowering::joinRegisterPartsIntoValue for handling vectors with different total size, that will happened on fractional LMUL since fractional LMUL is always occupy one vector register. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D112599	2021-12-09 14:46:49 +08:00
Craig Topper	de8d26ac02	[RISCV] Improve tracking of EndLoc in the assembly parser. The SMLoc::getFromPointer(S.getPointer() - 1) pattern used in several place didn't make sense since that puts the End before the Start location. This patch corrects this to properly calculate the end location as we parse. Unsure how much this matters, a lot of these are for custom operand parsing. If the custom parsing succeeds, the instruction matching probably won't fail, so the end loc won't be used to build a range for a diagnostic. I've also fixed the creation functions for the CSR and VType operands to assign the EndLoc of the RISCVOperand class using the StartLoc instead of an empty SMLoc. I don't think these will ever be accessed, but it makes the code look less like we forgot to assign it. This is consistent with some operands on the ARM target. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D115192	2021-12-08 15:32:44 -08:00
Michael Berg	3e363f14e1	Revert "[RISCV] Add target specific loop unrolling and peeling preferences" This reverts commit `8487981a72`.	2021-12-07 15:13:42 -08:00
Michael Berg	8487981a72	[RISCV] Add target specific loop unrolling and peeling preferences Both these preference helper functions have initial support with this change. The loop unrolling preferences are set with initial settings to control thresholds, size and attributes of loops to unroll with some tuning done. The peeling preferences may need some tuning as well as the initial support looks much like what other architectures utilize. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D113798	2021-12-07 15:06:42 -08:00
Craig Topper	622d689480	[RISCV] Revise RISCVInstPrinter::printVTypeI to not assume there are 3 invalid vtype bits. Instead of checking [10:8]. Check for non-zero in 8 and above. Addresses a post-commit comment from @jrtc27 in D114581.	2021-12-07 09:40:50 -08:00
Craig Topper	2a9b2444d9	[RISCV] Replace uses of RISCVOpcode<0b0010011> and RISCVOpcode<0b0011011> with existing named objects. NFC These are already instantiated with names as OPC_OP_IMM and OPC_OP_IMM_32. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D115172	2021-12-07 08:07:14 -08:00
Evandro Menezes	2ba7698423	[RISCV] Add scheduling resources for Vector pseudo instructions. Add the scheduling resources for the V extension pseudo instructions. Authored-by: Evandro Menezes <evandro.menezes@sifive.com> Differential Revision: https://reviews.llvm.org/D113353	2021-12-07 09:14:28 +08:00
Craig Topper	acdbd34cfb	[RISCV] Loosen some restrictions on lowering constant BUILD_VECTORs using vid.v. The immediate size check on StepNumerator did not take into account that vmul.vi does not exist. It also did not account for power of 2 constants that can be done with vshl.vi. This patch fixes this by moving the conversion from mul to shift further up. Then we can consider the immediates separately for MUL vs SHL. For MUL I've allowed simm12 which requires a single addi before a vmul.vx. For SHL I've allowed any uimm5 which works with vshl.vi. We could relax these further in the future. This is a starting point that allows us to emit the same number of instructions we were already using for smaller numerators. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D115081	2021-12-06 09:34:40 -08:00
Victor Perez	9eb7322748	[RISCV][VP] Add RVV codegen for vp.select Lower vp.select instrinsic to VSELECT_VL. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D114629	2021-12-03 11:02:20 +00:00
Craig Topper	2f6beb7b0e	[RISCV] Add inline expansion for vector ftrunc/fceil/ffloor. This prevents scalarization of fixed vector operations or crashes on scalable vectors. We don't have direct support for these operations. To emulate ftrunc we can convert to the same sized integer and back to fp using round to zero. We don't need to do a convert if the value is large enough to have no fractional bits or is a nan. The ceil and floor lowering would be better if we changed FRM, but we don't model FRM correctly yet. So I've used the trunc lowering with a conditional add or subtract with 1.0 if the truncate rounded in the wrong direction. There are also missed opportunities to use masked instructions. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D113543	2021-12-01 11:25:28 -08:00
Craig Topper	d8f9eaad89	[RISCV] Teach RISCVTargetLowering::shouldSinkOperands to handle udiv/sdiv/urem/srem. The V extension supports .vx instructions for integer division and remainder so we should sink splats for that operand.	2021-11-30 18:47:51 -08:00
David Green	9e8a71caf0	[DAG] Create fptosi.sat from clamped fptosi This adds a fold in DAGCombine to create fptosi_sat from sequences for smin(smax(fptosi(x))) nodes, where the min/max saturate the output of the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN, ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need to be handled similarly. A shouldConvertFpToSat method was added to control when converting may be profitable. The original fptosi will have a less strict semantics than the fptosisat, with less values that need to produce defined behaviour. This especially helps on ARM/AArch64 where the vcvt instructions naturally saturate the result. Differential Revision: https://reviews.llvm.org/D111976	2021-11-30 15:29:14 +00:00
Hans Wennborg	a87782c34d	Revert "[DAG] Create fptosi.sat from clamped fptosi" It causes builds to fail with this assert: llvm/include/llvm/ADT/APInt.h:990: bool llvm::APInt::operator==(const llvm::APInt &) const: Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"' failed. See comment on the code review. > This adds a fold in DAGCombine to create fptosi_sat from sequences for > smin(smax(fptosi(x))) nodes, where the min/max saturate the output of > the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because > it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN, > ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need > to be handled similarly. > > A shouldConvertFpToSat method was added to control when converting may > be profitable. The original fptosi will have a less strict semantics > than the fptosisat, with less values that need to produce defined > behaviour. > > This especially helps on ARM/AArch64 where the vcvt instructions > naturally saturate the result. > > Differential Revision: https://reviews.llvm.org/D111976 This reverts commit `52ff3b0093`.	2021-11-30 15:36:56 +01:00
David Green	52ff3b0093	[DAG] Create fptosi.sat from clamped fptosi This adds a fold in DAGCombine to create fptosi_sat from sequences for smin(smax(fptosi(x))) nodes, where the min/max saturate the output of the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN, ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need to be handled similarly. A shouldConvertFpToSat method was added to control when converting may be profitable. The original fptosi will have a less strict semantics than the fptosisat, with less values that need to produce defined behaviour. This especially helps on ARM/AArch64 where the vcvt instructions naturally saturate the result. Differential Revision: https://reviews.llvm.org/D111976	2021-11-30 11:05:32 +00:00
Ben Shi	29d4230d6b	[RISCV] Decode vtype with reserved fields to raw immediate This patch fixes a crash when doing "llvm-objdump -D --mattr=+experimental-v" against an object file which happens to keep a word that can be decoded to VSETVLI & VSETIVLI with reserved vlmul[2:0]=4. All vtype values with reserved fields (vlmul[2:0]=4, vsew[2:0]=0b1xx, non-zero bits 8/9/10) are printed to raw immediate. Reviewed By: jhenderson, jrtc27, craig.topper Differential Revision: https://reviews.llvm.org/D114581	2021-11-30 08:31:20 +00:00
Craig Topper	b121d23a9c	[RISCV] Promote f16 log/pow/exp/sin/cos/etc. to f32 libcalls. Prevents crashes or cannot select errors. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D113822	2021-11-29 18:49:11 -08:00
Hsiangkai Wang	9a88566537	[RISCV] Fix a bug in RISCVFrameLowering. When we have out-going arguments passing through stack and we do not reserve the stack space in the prologue. Use BP to access stack objects after adjusting the stack pointer before function calls. callseq_start -> sp = sp - reserved_space // // Use FP to access fixed stack objects. // Use BP to access non-fixed stack objects. // call @foo callseq_end -> sp = sp + reserved_space Differential Revision: https://reviews.llvm.org/D114246	2021-11-30 10:39:35 +08:00
Hsiangkai Wang	b0c7421524	[RISCV] Emit DWARF location expression for RVV stack objects. VLENB is the length of a vector register in bytes. We use <vscale x 64 bits> to represent one vector register. The dwarf offset is VLENB * scalable_offset / 8. For the mask vector, it occupies one vector register. Differential Revision: https://reviews.llvm.org/D107432	2021-11-27 15:13:10 +08:00
Hsiangkai Wang	137d3474ca	[RISCV] Reverse the order of loading/storing callee-saved registers. Currently, we restore the return address register as the last restoring instruction in the epilog. The next instruction is `ret` usually. It is a use of return address register. In some microarchitectures, there is load-to-use data hazard. To avoid the load-to-use data hazard, we could separate the load instruction from its use as far as possible. In this patch, we reverse the order of restoring callee-saved registers to increase the distance of `load ra` and `ret` in the epilog. Differential Revision: https://reviews.llvm.org/D113967	2021-11-22 23:02:11 +08:00
wangpc	af0ecfccae	[RISCV] Generate pseudo instruction li Add an alias of `addi [x], zero, imm` to generate pseudo instruction li, which makes assembly mush more readable. For existed tests, users can update them by running script `llvm/utils/update_llc_test_checks.py`. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D112692	2021-11-22 14:01:37 +08:00
Philipp Tomsich	af57a71d18	[RISCV] Don't call setHasMultipleConditionRegisters(), so icmp is sunk On RISC-V, icmp is not sunk (as the following snippet shows) which generates the following suboptimal branch pattern: ``` core_list_find: lh a2, 2(a1) seqz a3, a0 << bltz a2, .LBB0_5 bnez a3, .LBB0_9 << should sink the seqz [...] j .LBB0_9 .LBB0_5: bnez a3, .LBB0_9 << should sink the seqz lh a1, 0(a1) [...] ``` due to an icmp not being sunk. The blocks after `codegenprepare` look as follows: ``` define dso_local %struct.list_head_s* @core_list_find(%struct.list_head_s* readonly %list, %struct.list_data_s* nocapture readonly %info) local_unnamed_addr #0 { entry: %idx = getelementptr inbounds %struct.list_data_s, %struct.list_data_s* %info, i64 0, i32 1 %0 = load i16, i16* %idx, align 2, !tbaa !4 %cmp = icmp sgt i16 %0, -1 %tobool.not37 = icmp eq %struct.list_head_s* %list, null br i1 %cmp, label %while.cond.preheader, label %while.cond9.preheader while.cond9.preheader: ; preds = %entry br i1 %tobool.not37, label %return, label %land.rhs11.lr.ph ``` where the `%tobool.not37` is the result of the icmp that is not sunk. Note that it is computed in the basic-block up until what becomes the `bltz` instruction and the `bnez` is a basic-block of its own. Compare this to what happens on AArch64 (where the icmp is correctly sunk): ``` define dso_local %struct.list_head_s* @core_list_find(%struct.list_head_s* readonly %list, %struct.list_data_s* nocapture readonly %info) local_unnamed_addr #0 { entry: %idx = getelementptr inbounds %struct.list_data_s, %struct.list_data_s* %info, i64 0, i32 1 %0 = load i16, i16* %idx, align 2, !tbaa !6 %cmp = icmp sgt i16 %0, -1 br i1 %cmp, label %while.cond.preheader, label %while.cond9.preheader while.cond9.preheader: ; preds = %entry %1 = icmp eq %struct.list_head_s* %list, null br i1 %1, label %return, label %land.rhs11.lr.ph ``` This is caused by sinkCmpExpression() being skipped, if multiple condition registers are supported. Given that the check for multiple condition registers affect only sinkCmpExpression() and shouldNormalizeToSelectSequence(), this change adjusts the RISC-V target as follows: * we no longer signal multiple condition registers (thus changing the behaviour of sinkCmpExpression() back to sinking the icmp) * we override shouldNormalizeToSelectSequence() to let always select the preferred normalisation strategy for our backend With both changes, the test results remain unchanged. Note that without the target-specific override to shouldNormalizeToSelectSequence(), there is worse code (more branches) generated for select-and.ll and select-or.ll. The original test case changes as expected: ``` core_list_find: lh a2, 2(a1) bltz a2, .LBB0_5 beqz a0, .LBB0_9 << [...] j .LBB0_9 .LBB0_5: beqz a0, .LBB0_9 << lh a1, 0(a1) [...] ``` Differential Revision: https://reviews.llvm.org/D98932	2021-11-19 08:32:59 -08:00
Zi Xuan Wu	24d1673c8b	[llvm-tblgen][RISCV] Make llvm-tblgen RISCVCompressInstEmitter to be common infra across different targets Not only RISCV but also other target such as CSKY, there are compressed instructions mixed with normal instructions. To reuse the basic infra to compress/uncompress and predict instruction, we need reconstruct the RISCVCompressInstEmitter and make it more general and suitable for other target. Differential Revision: https://reviews.llvm.org/D113475	2021-11-18 11:14:27 +08:00
Zarko Todorovski	5b8bbbecfa	[NFC][llvm] Inclusive language: reword and remove uses of sanity in llvm/lib/Target Reworded removed code comments that contain `sanity check` and `sanity test`.	2021-11-17 21:59:00 -05:00
Craig Topper	0274be28d7	[RISCV] Lower vector CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF by converting to FP and extracting the exponent. If we have a large enough floating point type that can exactly represent the integer value, we can convert the value to FP and use the exponent to calculate the leading/trailing zeros. The exponent will contain log2 of the value plus the exponent bias. We can then remove the bias and convert from log2 to leading/trailing zeros. This doesn't work for zero since the exponent of zero is zero so we can only do this for CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF. If we need a value for zero we can use a vmseq and a vmerge to handle it. We need to be careful to make sure the floating point type is legal. If it isn't we'll continue using the integer expansion. We could split the vector and concatenate the results but that needs some additional work and evaluation. Differential Revision: https://reviews.llvm.org/D111904	2021-11-17 10:29:41 -08:00
Jay Foad	3264e95938	[CodeGen] Update LiveIntervals in TargetInstrInfo::convertToThreeAddress Delegate updating of LiveIntervals to each target's convertToThreeAddress implementation, instead of repairing LiveIntervals after the fact in TwoAddressInstruction::convertInstTo3Addr. Differential Revision: https://reviews.llvm.org/D113493	2021-11-17 10:16:47 +00:00
jacquesguan	6405e8b584	[RISCV] Refactor some rvv instructions' definition with foreach. Simplify rvv instructions that use eew in their mnemonic and encoding with foreach. And fix a scheduling bug. Differential Revision: https://reviews.llvm.org/D113453	2021-11-16 15:20:45 +08:00
Craig Topper	391b0ba603	[RISCV] Override TargetLowering::hasAndNot for Zbb. Differential Revision: https://reviews.llvm.org/D113937	2021-11-15 18:44:07 -08:00
Ben Shi	4c3d916c4b	[RISCV] Optimize immediate materialisation with SHADD Use LUI+SHADD+ADDI to compose specific immediates. Reviewed By: craig.topper, luismarques Differential Revision: https://reviews.llvm.org/D113568	2021-11-15 23:34:28 +00:00
Craig Topper	f59307bfdc	[RISCV] Teach needVSETVLIPHI to handle mask register instructions. This handles the case where the mask register instruction input comes from a Phi of vsetvlis. If the VLMAX is the same as the VLMAX required by the mask register instruction, we can avoid a vsetvli. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D113204	2021-11-15 09:57:28 -08:00
Craig Topper	02bed66cd5	[RISCV] Improve codegen for i32 udiv/urem by constant on RV64. The division by constant optimization often produces constants that are uimm32, but not simm32. These constants require 3 or 4 instructions to materialize without Zba. Since these instructions are often used by a multiply with a LHS that needs to be zero extended with an AND, we can switch the MUL to a MULHU by shifting both inputs left by 32. Once we shift the constant left, the upper 32 bits no longer need to be 0 so constant materialization is free to use LUI+ADDIW. This reduces the constant materialization from 4 instructions to 3 in some cases while also reducing the zero extend of the LHS from 2 shifts to 1. Differential Revision: https://reviews.llvm.org/D113805	2021-11-12 14:49:10 -08:00
Craig Topper	ee7a006ce4	[RISCV] Promote f16 ceil/floor/round/roundeven/nearbyint/rint/trunc intrinsics to f32 libcalls. Previously these would crash. I don't think these can be generated directly from C. Not sure if any optimizations can introduce them. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D113527	2021-11-11 08:28:41 -08:00
Craig Topper	4183522e80	[RISCV] Promote f16 frem with Zfh. Add riscv64 coverage for f32 and f64 frem. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D113531	2021-11-10 17:35:07 -08:00
Craig Topper	9ee5cec688	[RISCV] Prevent bad legalizer behavior when bitcasting fixed vectors to i64 on RV32 with Zve32. Similar to D113219, we need to make sure we don't create a vXi64 vector when it isn't legal. This fixes an error found by an expensive checks build.	2021-11-10 11:58:49 -08:00
Craig Topper	57bc7b1089	[RISCV] Prevent crashes when bitcasting between fixed vectors and scalars. Not all scalar element types are allowed in vectors so we may not be able to bitcast to a 1 element vector to use insert/extract. This will become a bigger issue when the Zve extensions are commited. For now, I'm using the ELEN limit to limit the element types. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D113219	2021-11-10 09:21:52 -08:00
Shao-Ce SUN	1c81941f19	[NFC][RISCV] Fix wrong predicates of vfwredsum	2021-11-09 17:19:50 +08:00
Craig Topper	376233113e	[RISCV] Use TargetConstant for CSR number for READ_CSR/WRITE_CSR. This is consistent with what we do for other operands that are required to be constants. I don't think this results in any real changes. The pattern match code for isel treats ConstantSDNode and TargetConstantSDNode the same.	2021-11-08 15:10:24 -08:00
Craig Topper	304edbb553	[RISCV] SMUL_LOHI/UMUL_LOHI should expand for RVV. These and MULHS/MULHU both default to Legal. Targets need to set the ones they don't support to Expand. I think MULHS/MULHU likely has priority in most places so this change probably isn't directly testable. I found it while looking at disabling MULHS/MULHU for nxvXi64 as required for Zve64x. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D113325	2021-11-08 09:38:36 -08:00
Ben Shi	e32cf690df	[RISCV] Optimize (add (mul r, c0), c1) Optimize (add (mul x, c0), c1) -> (add (mul (add x, c1/c0+1), c0), c1%c0-c0), if c1/c0+1 and c1%c0-c0 are simm12, while c1 is not. Optimize (add (mul x, c0), c1) -> (add (mul (add x, c1/c0-1), c0), c1%c0+c0), if c1/c0-1 and c1%c0+c0 are simm12, while c1 is not. Reviewed By: craig.topper, asb Differential Revision: https://reviews.llvm.org/D111141	2021-11-08 02:58:25 +00:00
Bin Cheng	54d891a7d5	[RISCV]: Fix typo by abstracting VWholeLoad* classes This patch abstracts VWholeLoad* classes into VWholeLoadN, simplifies existing code as well as fixes a typo. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D109319	2021-11-06 10:48:03 +08:00
Bin Cheng	d488f1fff2	[RISCV][NFC]: Refactor classes for load/store instructions of RVV This patch refactors classes for load/store of V extension by: - Introduce new class for VUnitStrideLoadFF and VUnitStrideSegmentLoadFF so that uses of L/SUMOP* are not spread around different places. - Reorder classes for Unit-Stride load/store in line with table describing lumop/sumop in riscv-v-spec.pdf. Reviewed By: HsiangKai, craig.topper Differential Revision: https://reviews.llvm.org/D109318	2021-11-06 10:48:03 +08:00
Shao-Ce SUN	5c3d7184b4	[RISCV] Support Zfhmin extension According to RISC-V Unprivileged ISA 15.6. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D111866	2021-11-06 01:41:02 +08:00
Zakk Chen	0649dfebba	[RISCV] Rename some assembler mnemonic and intrinsic functions for RVV 1.0. Rename vpopc/vmandnot/vmornot to vcpop/vmandn/vmorn assembler mnemonic. Reviewed By: frasercrmck, jrtc27, craig.topper Differential Revision: https://reviews.llvm.org/D111062	2021-11-04 10:08:01 -07:00
Craig Topper	5022ac0771	[RISCV] Use HasVInstructions and HasVInstructionsAnyF in more place in TableGen. NFC Change RISCVSubtarget.hasVInstructionAnyF() to call hasVInstructionsF32 so that any changes to hasVInstructionsF32 are reflected. The files were missed in D112496.	2021-11-03 14:32:45 -07:00
Fraser Cormack	d065b03801	[RISCV] Optimize vp.load with an all-ones mask Similar to D110206, this patch optimizes unmasked vp.load intrinsics to avoid the need of a vmset instruction to set the mask. It does so by selecting a riscv_vle intrinsic rather than a riscv_vle_mask intrinsic. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D113022	2021-11-02 17:23:39 +00:00
David Callahan	4ec1b8eeac	[RISCV] Fix invalid kill on callee save A callee save may be live (specifically X1) on entry and so a spill should not mark it killed. Differential Revision: https://reviews.llvm.org/D111285	2021-11-02 11:56:54 +00:00
Craig Topper	ada5458521	[RISCV] Expand scalable vector bswap. Fix crash for bitreverse. Fix LegalizeVectorOps to not try shuffle or unrolling expansions for scalable vectors. Differential Revision: https://reviews.llvm.org/D112236	2021-10-31 10:01:27 -07:00
Craig Topper	aefcd59895	[RISCV] Teach RISCVInsertVSETVLI::needVSETVLI to handle mask register instructions better. If the VL operand of a mask register instruction comes from an explicit vsetvli with a different VTYPE, we can still avoid needing a vsetvli as long as the SEW/LMUL ratio is the same and policy bits match. Differential Revision: https://reviews.llvm.org/D112762	2021-10-29 09:49:36 -07:00
Hsiangkai Wang	7051f73d69	[RISCV] Sync Zvlsseg register order as the same as vector registers. Sync the order of Zvlsseg registers with vector registers to avoid unnecessary register copies between vector instructions and zvlsseg instructions. Differential Revision: https://reviews.llvm.org/D110250	2021-10-28 13:34:53 +08:00
Hsiangkai Wang	0a9b82960c	[RISCV] Use vmv.v.[v\|i] if we know COPY is under the same vl and vtype. If we know the source operand of COPY is defined by a vector instruction with tail agnostic and the same LMUL and there is no vsetvli between COPY and the define instruction to change the vl and vtype, we could use vmv.v.v or vmv.v.i to copy vector registers to get better performance than the whole vector register move instructions. If the source of COPY is from vmv.v.i, we could use vmv.v.i for the COPY. This patch only considers all these instructions within one basic block. Case 1: ``` bb.0: ... VSETVLI # The first VSETVLI before COPY and VOP. ... # Use this VSETVLI to check LMUL and tail agnostic. ... vy = VOP va, vb # Define vy. ... # There is no vsetvli between VOP and COPY. vx = COPY vy ``` Case 2: ``` bb.0: ... VSETVLI # The first VSETVLI before VOP. ... # Use this VSETVLI to check LMUL and tail agnostic. ... vy = VOP va, vb # Define vy. ... # There is no vsetvli to change vl between VOP and COPY. ... VSETVLI # The first VSETVLI before COPY. ... # This VSETVLI does not change vl and vtype. ... vx = COPY vy ``` Co-Authored-by: Zakk Chen <zakk.chen@sifive.com> Co-Authored-by: Kito Cheng <kito.cheng@sifive.com> Differential Revision: https://reviews.llvm.org/D103510	2021-10-28 11:39:04 +08:00

... 7 8 9 10 11 ...

2266 Commits