Based on D24038.
LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D126181
i64 indices aren't supported on Zve32*. Scalarize gathers to prevent
generating illegal instructions.
Since InstCombine will aggressively canonicalize GEP indices to
pointer size, we're pretty much always going to have an i64 index.
Trying to predict when SelectionDAG will find a smaller index from
the TTI hook used by the ScalarizeMaskedMemIntrinPass seems fragile.
To optimize this we probably need an IR pass to rewrite it earlier.
Test RUN lines have also been added to make sure the strided load/store
optimization still works.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127179
MIR support is totally unusable for AMDGPU without this, since the set
of reserved registers is set from fields here.
Add a clone method to MachineFunctionInfo. This is a subtle variant of
the copy constructor that is required if there are any MIR constructs
that use pointers. Specifically, at minimum fields that reference
MachineBasicBlocks or the MachineFunction need to be adjusted to the
values in the new function.
The default RegisterClass is not enough to model RISCV registers.
We define RISC-V's own register class to model the FP registers.
This helps to better estimate the register pressure in the loop vectorizer.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D126854
This fixes an inconsistency between RV32 and RV64. Still considering
trying to do this peephole during isel, but wanted to fix the
inconsistency first.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126986
Test changes are because isBaseWithConstantOffset uses computeKnownBits
and that is able to see that an earlier AND instruction guaranteed
alignment so that we can treat an OR as an ADD.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126970
Previously we had 3 different isel patterns for every scalar load/store
instruction.
This reduces them to a single ComplexPattern that returns the Base
and Offset, or an offset of 0 if no offset was identified.
I've done a similar thing for the 2 isel patterns that match add/or
with FrameIndex and immediate. Using the offset of 0, I was also
able to remove the custom handler for FrameIndex. Happy to split that
into another patch.
We might be able to enhance in the future to remove the post-isel
peephole or the special handling for ADD with constant added by D126576.
A nice side effect is that this removes nearly 3000 bytes from the isel
table.
Differential Revision: https://reviews.llvm.org/D126932
The immediate range check for CSImm12MulBy8 included some values
covered by CSImm12MulBy4. I assume CSImm12MulBy4 had priority due
to pattern order in the td file, but this makes the priority
explicit in the predicate.
If the imm is out of range for an ADDI, we will materialize it in
a register using multiple instructions. If the ADD is used by a
load/store, doPeepholeLoadStoreADDI can try to pull an ADDI from
the constant materialization into the load/store offset. This only
works if the ADD has a single use, otherwise the peephole would have
to rebuild multiple nodes.
This patch instead tries to solve the problem when the add is selected.
We check that the add is only used by loads/stores and if it is
we will select it to (ADDI (ADD X, Imm-Lo12), Lo12). This will enable
the simple case in doPeepholeLoadStoreADDI that can bypass an ADDI
used as a pointer. As a result we can remove the more complicated
peephole from doPeepholeLoadStoreADDI.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126576
Once we've computed the incoming predecessor state, we should use the same compatibility check with knowledge of MI as we did in phase 2 in order to be consistent across all phases.
Differential Revision: https://reviews.llvm.org/D126574
We enable a custom handler to optimize conversions between scalars
and fixed vectors. Unfortunately, the custom handler picks up scalar
to scalar conversions as well. If the scalar types are both legal,
we wouldn't match any of the fixed vector cases and would return SDValue(),
causing LegalizeDAG to expand the bitcast through memory.
This patch fixes this by checking if it's a scalar to scalar conversion
and returns `Op` if both types are legal.
Differential Revision: https://reviews.llvm.org/D126739
If the adjustment doesn't fit in 12 bits, try to break it into
two 12 bit values before falling back to movImm+add/sub.
This is based on a similar idea from isel.
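For illustration, one way such a split works out arithmetically (the adjustment value and the exact split are assumptions, not necessarily what the patch picks):

#include <assert.h>
#include <stdint.h>

/* Sketch only: split a stack adjustment that is too large for a single
   ADDI (simm12 is [-2048, 2047]) into two simm12-sized pieces.  The
   concrete split chosen by the patch may differ. */
static int fits_simm12(int64_t v) { return v >= -2048 && v <= 2047; }

int main(void) {
  int64_t adjust = 4000;                  /* made-up value, too big for one ADDI */
  int64_t first = adjust > 0 ? 2047 : -2048;
  int64_t second = adjust - first;        /* 1953 here */
  assert(!fits_simm12(adjust));
  assert(fits_simm12(first) && fits_simm12(second));
  assert(first + second == adjust);       /* two ADDIs instead of movImm+add */
  return 0;
}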
Reviewed By: luismarques, reames
Differential Revision: https://reviews.llvm.org/D126392
The immediate for LUI is stored as a 20-bit unsigned value. We need
to sign extend it after shifting by 12 to match the instruction
behavior.
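A minimal C sketch of that behavior (the 20-bit value is arbitrary):

#include <assert.h>
#include <stdint.h>

/* Sketch of the LUI behavior described above: the 20-bit field lands in
   bits [31:12], and on RV64 the 32-bit result is sign extended to 64 bits. */
int main(void) {
  uint32_t imm20 = 0xFFFFF;                   /* arbitrary 20-bit field */
  uint32_t shifted = imm20 << 12;             /* 0xFFFFF000 */
  int64_t lui_result =
      (int64_t)shifted - ((shifted & 0x80000000u) ? 0x100000000LL : 0);
  assert(lui_result == (int64_t)-4096);       /* 0xFFFFFFFFFFFFF000 */
  return 0;
}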
If we find an LUI+ADDI on RV64, it means the constant isn't a
simm32. If it was, we would have emitted LUI+ADDIW from constant
materialization. Make sure the constant is a simm32 before folding.
This appears to match gcc.
A future patch will add support for LUI+ADDIW on RV64.
Originally, `OptLevel` wasn't passed into the `MachineFunctionPass`.
This let the default parameter of `SelectionDAGISel`, which is
`CodeGenOpt::Default`, be passed in. OptLevelChanger captures the
optimization level from the parameter rather than the value
within `TargetMachine`. This let the optimization level be
unintentionally overwritten if a value other than `CodeGenOpt::Default`
was passed.
This patch fixes this by passing the optimization level rather
than using the default value.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D126641
This pattern is what we get after DAG combine for C code like this.
short *ptr1, *ptr2, *ptr3;
unsigned diff = ptr1 - ptr2;
return ptr3[diff];
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126588
This is a follow up to address a review comment from D124869. When deciding whether to PRE a vsetvli, we can allow non-LMUL1 vsetvlis.
Differential Revision: https://reviews.llvm.org/D126563
When lowering GlobalAddressNodes, we were removing a non-zero offset and
creating a separate ADD.
It already comes out of SelectionDAGBuilder with a separate ADD. The
ADD was being removed by DAGCombiner.
This patch disables the DAG combine so we don't have to reverse it.
Test changes all look to be instruction order changes. Probably due
to different DAG node ordering.
Differential Revision: https://reviews.llvm.org/D126558
Split off from D125021.
We were duplicating logic across different phases. Since we want to
ensure a consistency of logic across phases for correctness, this patch
combines our multiple compatibility checks into one function to better
convey this.
Several methods were made const too.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126472
A RISCV implementation can choose to implement unaligned load/store support. We currently don't have a way for such a processor to indicate a preference for unaligned load/stores, so add a subtarget feature.
There doesn't appear to be a formal extension for unaligned support. The RISCV Profiles (https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#rva20u64-profile) docs use the name Zicclsm, but a) that doesn't appear to have actually been standardized, and b) isn't quite what we want here anyway due to the perf comment.
Instead, we can follow precedent from other backends and have a feature flag for the existence of misaligned load/stores with sufficient performance that user code should actually use them.
Differential Revision: https://reviews.llvm.org/D126085
This change reorganizes the majority of frame index resolution into a two step process.
Step 1 - Select which base register we're going to use.
Step 2 - Compute the offset from that base register.
The key point is that this allows us to share the step 2 logic for the SP case. This reduces the code duplication, and (I think) makes the code much easier to follow.
I also went ahead and added assertions into phase 2 to catch errors where we select an illegal base pointer. In general, we can't index from a base register to a stack location if that requires crossing a variable and unknown region. In practice, we have two such cases: dynamic stack realign and var sized objects. Note that crossing the scalable region is fine since while variable, it's a known variability which can be expressed in the offset.
Differential Revision: https://reviews.llvm.org/D126403
During insertion of VSETVLI, we have two related bits of code which decide whether we can reuse a previous vsetvli result. As was pointed out in the original review, these cases can allow any prior state for which we know that VL is the same for any value of AVL.
This was originally separated out of a desire for separate tests and review. As it turns out, finding a test case for this has been quite challenging. Most of the cases I tried, we manage to already get through other chains of logic. We do have one correct test change, but that only exercises one of the two changes.
Differential Revision: https://reviews.llvm.org/D126400
We didn't implement RISCVELFStreamer::reset, which caused some very strange
section output for the attribute section. See D15950 for how
ARM implemented that.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D125905
This patch tries to resolve the inconsistency between the direct and intermediate casts caused by D123975.
It replaces ISD::FP_EXTEND and ISD::FP_ROUND with the RVV VL ops in the lowering of direct casts of FP scalable vectors, to unify with the intermediate cast.
It also changes the FP widening patterns to use the VL ops.
Differential Revision: https://reviews.llvm.org/D125364
be2cb8 fixes the case which triggered the revert. Reapply, and let's see if anything else falls out.
Original commit message:
These asserts are believed to hold after several recent miscompiles have been fixed. If you see an assertion failure on this change, please toggle the default back and make sure you file a bug with a reproducer. We may have as yet uncaught miscompiles lurking in this code.
Differential Revision: https://reviews.llvm.org/D125271
This moves mutation entirely out of the main algorithm.
The immediate trigger is that we hit another case of the same issue I thought we'd fixed in 72925d9. It turns out we hadn't considered the cross block case.
As a brief summary, the issue being fixed is that if we mutate a previous vsetvli in phase 3, there's a possibility that some later use of that vsetvli changes "compatibility". In the cross_block_mutate test, this later vsetvli occurs in another block (and is thus visit order dependent too!). This causes us to fail strict asserts. (To be explicit, the current on by default workaround should compensate. It's only when we turn that off that we have problems.)
Now, I want to explicitly call out an alternate workaround. We could leave the mutation in phase 3, and simply restrict it to the case where the previous vsetvli's GPR result is unused. That covers the case we've actually seen. (I'll note that codegen regressions with a simple form of this were significant. We might have to check specifically for the use outside block case to keep them reasonable, which complicates the workaround slightly.)
Personally, I'm at the point where I want the mutation pulled out just for robustness sake. I'm worried there's yet one more form of this bug we haven't thought about.
The other motivation for this change is that it does give us a couple of minor codegen wins. None appear to be hugely significant, but improvements never hurt right?
Differential Revision: https://reviews.llvm.org/D125270
Update test to check MIR after finalize-isel instead of debug output.
This is of course not the only place we should preserve FMF, but
it's the most obvious one.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D126306
This is a straight forward extension of the PRE transform introduced in D124869 to handle the VLMAX case.
The test changes here look quite positive. This surprised me until I realized that all the tests are using @llvm.vscale to figure out the VLMAX, not the llvm.riscv.vsetvlmax intrinsic. If they'd used the latter, these would have been full redundancy cases and fully handled by the data flow. I'm not really sure if use of vscale here is representative or not. If it is, we should probably look at using VSETVLI to lower vscale rather than a raw read of vlenb and some math.
Differential Revision: https://reviews.llvm.org/D126338
When optimizing for size, this pass searches for instructions that are
prevented from being compressed by one of the following:
1. The use of a single uncompressed register.
2. A base register + offset where the offset is too large to be
compressed and the base register may or may not already be compressed.
In the first case, if there is a compressed register available, then the
uncompressed register is copied to the compressed register and its uses
replaced. This is only done if there are enough uses that code size
would be improved.
In the second case, if a compressed register is available, then the
original base register is copied and adjusted such that:
new_base_register = base_register + adjustment
base_register + large_offset = new_base_register + small_offset
and the uses of the base register are replaced with the new base
register. Again this is only done if there are enough uses for code size
to be improved.
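As a rough sketch of the rebasing arithmetic in C (the offsets, the 0-124 range for c.lw's scaled 5-bit offset field, and the particular adjustment are assumptions for illustration, not the pass's exact heuristic):

#include <assert.h>
#include <stdint.h>

/* Illustration only: rebase a load whose offset is too large for the
   compressed encoding.  Assumes c.lw-style offsets (0..124, multiple of 4);
   the real pass's ranges and adjustment choice may differ. */
int main(void) {
  int64_t base = 0x10000;             /* value held in the original base register */
  int64_t large_offset = 400;         /* too large to compress */
  int64_t adjustment = 384;           /* chosen so the remainder fits */
  int64_t new_base = base + adjustment;
  int64_t small_offset = large_offset - adjustment;   /* 16 */
  assert(small_offset >= 0 && small_offset <= 124 && small_offset % 4 == 0);
  assert(base + large_offset == new_base + small_offset);
  return 0;
}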
This pass was authored by Lewis Revill, with large offset optimization
added by Craig Blackmore.
Differential Revision: https://reviews.llvm.org/D92105
We found untested code where negative frame indices were ostensibly
handled despite it being in a block guarded by !MFI.isFixedObjectIndex.
While the implementation of MachineFrameInfo::isFixedObjectIndex
suggests this is possible (i.e., if a frame index was more negative than the negative
of the number of fixed objects), I couldn't find any test in tree -- for any
target -- where a negative frame index wasn't also a fixed object
offset. I couldn't find a way of creating such an object with the
public MachineFrameInfo creation APIs. Even
MachineFrameInfo::getObjectIndexBegin starts counting at the negative
number of fixed objects, so such frame indices wouldn't be covered by
loops using the provided begin/end methods.
Given all this, an assert that any object encountered in the block is
non-negative seems reasonable.
Reviewed By: StephenFan, kito-cheng
Differential Revision: https://reviews.llvm.org/D126278
When the AVL value does not fit in 5 bits, the register in which this value is stored may be dead when we want to forward it. This patch ensures the kill flags on the register are cleared before forwarding.
Patch by: loralb
Differential Revision: https://reviews.llvm.org/D125971
Instead of matching opcodes to know the format to emit, use an
enum value that we can get from the RISCVMatInt::Inst class.
Change the consumers to use fully covered switches so that we get
a compiler warning if a new kind is added. With the opcode checks
it was easier to forget to update one of the 3 consumers.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126317
This patch teaches the VSETVLI insertion pass to perform a very limited form of partial redundancy elimination. The motivating example comes from the fixed length vectorization of a simple loop such as:
for (unsigned i = 0; i < a_len; i++)
a[i] += b;
Without this change, the core vector loop and preheader is as follows:
.LBB0_3: # %vector.ph
andi a1, a6, -8
addi a4, a0, 16
mv a5, a1
.LBB0_4: # %vector.body
# =>This Inner Loop Header: Depth=1
addi a3, a4, -16
vsetivli zero, 4, e32, m1, ta, mu
vle32.v v8, (a3)
vle32.v v9, (a4)
vadd.vx v8, v8, a2
vadd.vx v9, v9, a2
vse32.v v8, (a3)
vse32.v v9, (a4)
addi a5, a5, -8
addi a4, a4, 32
bnez a5, .LBB0_4
The key thing to note here is that the execution of the vsetivli only needs to happen once. Since there's no tail folding happening here, the values of the vector configuration registers are invariant through the loop.
After this patch, we hoist the configuration into the preheader and perform it once.
.LBB0_3: # %vector.ph
andi a1, a6, -8
vsetivli zero, 4, e32, m1, ta, mu
addi a4, a0, 16
mv a5, a1
.LBB0_4: # %vector.body
# =>This Inner Loop Header: Depth=1
addi a3, a4, -16
vle32.v v8, (a3)
vle32.v v9, (a4)
vadd.vx v8, v8, a2
vadd.vx v9, v9, a2
vse32.v v8, (a3)
vse32.v v9, (a4)
addi a5, a5, -8
addi a4, a4, 32
bnez a5, .LBB0_4
Differential Revision: https://reviews.llvm.org/D124869
This reverts commit dfe513ae1b.
Tests have been changed to avoid the type legalization bug being
fixed in D126036.
Original commit message:
This will remove masks on the shift amount. We usually get this with
SimplifyDemandedBits in DAGCombine, but that's restricted to cases
where the AND has a single use. selectShiftMaskXLen does not have
that restriction.
This patch fixes another bug in the RVV frame lowering. While some frame
objects with non-default stack IDs (such as scalable-vector alloca
instructions) are considered in the target-independent max alignment
calculations, others (for example, during calling-convention lowering)
are not. This means we'd occasionally align the base of the stack to
only 16 bytes, with no way to ensure that the RVV section contained
within that is aligned to anything higher.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D125973
This patch addresses several alignment issues in the stack frame when
RVV objects are taken into account.
One bug is that the RVV stack was never guaranteed to keep the alignment
of the stack *as a whole*. We must maintain a 16-byte aligned stack at
all times, especially when calling other functions. With the standard V
extension, this is conveniently happening since VLEN is at least 128 and
always 16-byte aligned. However, we support Zvl64b which does not
guarantee this. To fix this, the RVV stack size is rounded up to be
aligned to 16 bytes. This in practice generally makes us allocate a
stack sized at least 2*VLEN in size, and a multiple of 2.
|------------------------------| -- <-- FP
| 8-byte callee-save | | |
|------------------------------| | |
| one VLENB-sized RVV object | | |
|------------------------------| | |
| 8-byte local variable | | |
|------------------------------| -- <-- SP (must be aligned to 16)
In the example above, with Zvl64b we are decrementing SP by 24 bytes
which does not leave SP correctly aligned. We therefore introduce an
extra VLENB-sized amount used for alignment. This would therefore ensure
the total stack size was 32 bytes (48 for Zvl128b, 80 for Zvl256b, etc):
|------------------------------| -- <-- FP
| 8-byte callee-save | | |
|------------------------------| | |
| one VLENB-sized padding obj | | |
| one VLENB-sized RVV object | | |
|------------------------------| | |
| 8-byte local variable | | |
|------------------------------| -- <-- SP
A new RVV invariant has been introduced in this patch, which is that the
base of the RVV stack itself is now always aligned to 16 bytes, not 8 as
before. This keeps us more in line with the scalar stack and should be
easier to reason about. The calculation of the RVV padding has thus
changed to be the amount required to align the scalar local variable
section to the RVV section's alignment. This amount is further rounded
up when setting up the initial stack to keep everything aligned:
|------------------------------| -- <-- FP
| 8-byte callee-save |
|------------------------------|
| |
| RVV objects |
| (aligned to at least 16) |
| |
|------------------------------|
| RVV padding of 8 bytes |
|------------------------------|
| 8-byte local variable |
|------------------------------| -- <-- SP
In the example above, it's clear that we need 8 bytes of padding to keep
the RVV section aligned to 16 when using SP. But to keep SP *itself*
aligned to 16 we can't decrement the initial stack pointer by 24 - we
have to round up to 32.
With the RVV section correctly aligned, the second bug fixed by
this patch is that RVV objects themselves are now correctly aligned. We
were previously only guaranteeing an alignment of 8 bytes, even if they
required a higher alignment. This is relatively simple and in practice
we see more rounding up of VLEN amounts to account for alignment in
between objects:
|------------------------------|
| RVV object (aligned to 16) |
|------------------------------|
| no padding necessary |
|------------------------------|
| 2*VLENB RVV object (align 16)|
|------------------------------|
| VLENB alignment padding |
|------------------------------|
| RVV object (align 32) |
|------------------------------|
| 3*VLENB alignment padding |
|------------------------------|
| VLENB RVV object (align 32) |
|------------------------------| -- <-- base of RVV section
Note that a lot of the regressions in codegen owing to the new alignment
rules are correct but actually only strictly necessary for Zvl64b (and
Zvl32b but that's not really supported). I plan a follow-up patch to
take the known VLEN into account when padding for alignment.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D125787
Previously, `getRegUsageForType` was implemented using
`getTypeLegalizationCost`. `getRegUsageForType` is used by the loop
vectorizer to estimate the register pressure caused by using a vector
type. However, `getTypeLegalizationCost` currently only appears to
understand splitting and not scalarization, so significantly
underestimates the register requirements.
Instead, use `getNumRegisters`, which understands when scalarization
can occur (via computeRegisterProperties).
This was discovered while investigating D118979 (Set maximum VF with
shouldMaximizeVectorBandwidth), where under fixed-length 512-bit SVE the
loop vectorizer previously ends up costing a v128i1 as 2 v64i*
registers where it actually occupies 128 i32 registers.
I'm sending this patch early for comment; I'm still doing some sanity checking
with LNT. I note that getRegisterClassForType appears to return VectorRC even
though the types in question (large vNi1 types) end up occupying scalar
registers. That might be worth fixing too.
Differential Revision: https://reviews.llvm.org/D125918
We must add padding when using SP or BP to access stack objects.
Checking whether we're missing FP is not sufficient as stack realignment
uses SP too. The test in D125962 explains the specific issue in more
detail.
Split from D125787.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D125964
This reverts commit 86f7d7074a.
The test cases added for this exposed a pre-existing bug that is failing
the expensive checks bot. Reverting so I can revert that patch.
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.
The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.
Differential Revision: https://reviews.llvm.org/D125557
In RISCVTargetTransformInfo, enumerating the processor family is not a good way to decide the unroll preference:
it requires enumerating many subtarget families and is hard to update when a new subtarget is added.
Instead, create a feature to distinguish whether a target wants the default unroll preference or not.
Keep TuneSiFive7 because it's a flag that indicates the subtarget family, which may be used in other places.
Differential Revision: https://reviews.llvm.org/D125741
This will remove masks on the shift amount. We usually get this with
SimplifyDemandedBits in DAGCombine, but that's restricted to cases
where the AND has a single use. selectShiftMaskXLen does not have
that restriction.
This reverts commit 79a66ec97b.
The stronger asserts served their purpose; I stumbled across another bug. Will reapply once this one is also fixed.
The bug appears to be a variant of a previous one:
* We mutate an instruction in one block.
* That mutation changes the phase3 results of another block.
This is very similar to a previous issue, except cross block instead of within a single block.
This patch adds a transform to the local prepass in InsertVSETVLI which canonicalizes an AVL of a register from another vsetvli into an immediate or VLMAX when VTYPE is the same. In this patch, I chose to be conservative and avoid arbitrary vreg forwarding due to profitability concerns about possibly overlapping live ranges.
This has the effect of eliminating vsetvli instructions in loops which are walking either VLMAX or a constant number of lanes per iteration.
Differential Revision: https://reviews.llvm.org/D125812
These asserts are believed to hold after several recent miscompiles have been fixed. If you see an assertion failure on this change, please toggle the default back and make sure you file a bug with a reproducer. We may have as yet uncaught miscompiles lurking in this code.
Differential Revision: https://reviews.llvm.org/D125271
With recent fixes to the dataflow in place, we now never pass
Strict=true to isCompatible, so remove the parameter completely.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D125748
Our current implementation of the InsertVSETVLI dataflow allows phase 3 to arrive at a different block end state than the data flow in phase 1/2 computed. This arises because a block which contains instructions (e.g. load or stores) which don't consume all the incoming bits of the VL/VTYPE can be compatible with multiple incoming states. The algorithm effectively changes the SEW on such instructions, and propagates the prior state forward. As phase 3 uses the block input state for this propagation, but phase 1/2 doesn't, this can result in different block end states.
If we don't correct for it, this discrepancy can result in miscompiles. This was the source of multiple recent bugs. However, by now we have fixes for all known correctness issues.
The basic strategy we use is to insert a compensation vsetvli to bring the block state leaving the block back into consistency with the one computed. This is correct, but results in extra vsetvlis being placed at the end of blocks.
This change adjusts the phase 1/2 algorithm to propagate the incoming block state through the block, allowing the compatibility rules to modify the end state. The algorithm may need to run slightly more iterations, but the end result is consistent with what phase 3 does.
The benefit of doing this is two fold.
First, we reverse some of the code quality regressions introduced by the functional fixes.
Second, we simplify the invariants, and allow the strict assertions to be enabled. Several humans, myself included, have found it quite surprising that this invariant didn't hold already, and arguably that confusion is the cause of several of our recent miscompiles in this code.
The downside to this patch is that the dataflow may require additional iterations to stabilize. In the worst case, we go from O(Edges) to O(E + UniquePaths) as the incoming state (and thus the outgoing one) can now change once for each path from the entry block.
Differential Revision: https://reviews.llvm.org/D125232
We've got a lurking problem with our data flow implementation where different phases disagree, resulting in possible miscompiles. D119518 introduced a workaround, but failed to consider blocks which only contain load/stores compatible with their incoming state.
When I went to rebase and simplify D125232, it turned out that not all of the correctness issues had been fixed yet after all. This is the correctness fix accidentally embedded in the original more complicated version.
Note that the test changes here are mostly regressions. It's worth noting that the simplified version of D125232 exactly reverses all the non-functional diffs in the test caused here. D125232 should be the immediate following commit.
Differential Revision: https://reviews.llvm.org/D125703
The existing redundant copy elimination required a virtual register source, but the same logic works for any physreg where we don't have to worry about clobbers. On RISCV, this helps eliminate redundant CSR reads from VLENB.
Differential Revision: https://reviews.llvm.org/D125564
During early gather/scatter enablement two different approaches
were taken to represent scaled indices:
* A Scale operand whereby byte_offsets = Index * Scale
* An IndexType whereby byte_offsets = Index * sizeof(MemVT.ElementType)
Having multiple representations is bad as shown by this patch which
fixes instances where the two are out of sync. The dedicated scale
operand is more flexible and pervasive so this patch removes the
UNSCALED values from IndexType. This means all indices are scaled
but the scale can be one, hence unscaled. SDNodes now use the scale
operand to answer the "isScaledIndex" question.
I toyed with the idea of keeping the UNSCALED enums and helper
functions but because they will have no uses and force SDNodes to
validate the set of supported values I figured it's best to remove
them. We can re-add them if there's a real need. For similar
reasons I've kept the IndexType enum when a bool could be used, as I
think being explicit looks better.
Depends On D123347
Differential Revision: https://reviews.llvm.org/D123381
When checking the legality of vectorizing a reduction, the hasVInstructions() check is unneeded: RISCV can only do loop vectorization with hasVInstructions().
Reviewed By: kito-cheng, craig.topper
Differential Revision: https://reviews.llvm.org/D125460
This patch replaces some for-each sets with the new ArrayRef argument API. Since it already used an array in the definition, I think this change won't cause any ambiguity.
Differential Revision: https://reviews.llvm.org/D125455
We need to use tail undisturbed for vslideup to implement
the vector insert operation correctly.
Ideally, we could use tail agnostic when inserting a subvector
or element at the end of the vector. This will be in a follow-up
patch.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125545
The name `MCFixedLenDisassembler.h` is out of date after D120958.
Rename it as `MCDecoderOps.h` to reflect the change.
Reviewed By: myhsu
Differential Revision: https://reviews.llvm.org/D124987
When building the final merged node, we were using the original chain
rather than the output chain of the new operation. After some collapsing
of the chain this could cause the loads to be incorrectly scheduled with respect
to later stores.
This was uncovered by SingleSource/Regression/C/gcc-c-torture/execute/pr36038.c
of the llvm testsuite.
https://reviews.llvm.org/D125560
This reverts most of ed242b54c9
I'm seeing failures in our intrinsic testing on qemu that seem
related to this. Reverting while I investigate.
I've left the command line option in place for directed testing.
It defaults to off.
This patch adds minimal support for lowering a read.register intrinsic with vlenb as the argument. Note that vlenb is an implementation constant, so it is never allocatable.
This was split off a patch to eventually replace PseudoReadVLENB with a COPY MI because doing so revealed a couple of optimization opportunities which really seemed to warrant individual patches and tests. To write those patches, I need a way to write the tests involving vlenb, and read.register seemed like the right testing hook.
Differential Revision: https://reviews.llvm.org/D125552
The goal is to support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125323
We've got a lurking problem with our data flow implementation where different phases disagree, resulting in possible miscompiles. D119518 introduced a workaround, but failed to consider blocks without terminators (e.g. fallthroughs).
I have a deeper rework of the algorithm in flight over in D125232, but this patch is specifically a minimal fix for an active miscompile. That change can be reworked over this once landed.
Differential Revision: https://reviews.llvm.org/D125408
riscv_fma_vl doesn't have a tail, so use the tail_agnostic policy.
We were already doing this for some patterns. I think the patterns
with fneg and mask were added later and I copied the tail policy
from the unmasked patterns.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D125424
RVV makes heavy use of subregisters due to LMUL>1 and segment
load/store tuples. Enabling subregister liveness tracking improves the quality
of the register allocation.
I've added a command line option that can be used to turn it off if it causes compile
time or functional issues. I used the command line to keep the old behavior
for one interesting test case that was testing register allocation.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125108
This is a followup to D124231.
We can fold the ADDIW in this pattern if we can prove that LUI+ADDI
would have produced the same result as LUI+ADDIW.
This pattern occurs because constant materialization prefers LUI+ADDIW
for all simm32 immediates. Only immediates in the range
0x7ffff800-0x7fffffff require an ADDIW. Other simm32 immediates
work with LUI+ADDI.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124693
If we have multiple gather/scatter instructions using the same
strided address, we would scalarize it multiple times. I guess
a later pass cleans this up, but I don't know if that's guaranteed.
This patch adds a cache to remember the scalarization we already
created for a previous gather/scatter.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125326
We should consolidate the operand counting and ordering into
RISCVBaseInfo.h and stop spreading it around.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D125344
This hook determines if SimplifySetcc transforms (X & (C l>>/<< Y))
==/!= 0 into ((X <</l>> Y) & C) ==/!= 0, where C is a constant and
X might be a constant.
The default implementation favors doing the transform if X is not
a constant. Otherwise the code is left alone. There is a provision
that if the target supports a bit test instruction then the transform
will favor ((1 << Y) & X) ==/!= 0. RISCV does not say it has a variable
bit test operation.
RISCV with Zbs does have a BEXT instruction that performs (X >> Y) & 1.
Without Zbs, (X >> Y) & 1 still looks preferable to ((1 << Y) & X) since
we can use ANDI instead of putting a 1 in a register for SLL.
This patch overrides this hook to favor bit extract patterns and
otherwise falls back to the "do the transform if X is not a constant"
heuristic.
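As an illustration of the two forms being weighed (restricted to C == 1, the plain bit-test case; the helper names are made up):

#include <assert.h>
#include <stdint.h>

/* Sketch of the equivalent bit tests discussed above for C == 1.  Which
   form is cheaper depends on the available instructions (e.g. BEXT with
   Zbs, or SRL+ANDI without it). */
static int test_bit_shift_mask(uint64_t x, unsigned y) {
  return (x & (1ULL << y)) != 0;     /* needs the 1 in a register for SLL */
}
static int test_bit_extract(uint64_t x, unsigned y) {
  return ((x >> y) & 1) != 0;        /* maps to BEXT, or SRL then ANDI */
}

int main(void) {
  uint64_t x = 0x123456789abcdef0ULL;
  for (unsigned y = 0; y < 64; ++y)
    assert(test_bit_shift_mask(x, y) == test_bit_extract(x, y));
  return 0;
}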
I've added tests where both C and X are constants with both the shl form
and lshr form. I've also added a test for a switch statement that lowers
to a bit test. That was my original motivation for looking at this.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124639
Type legalization will want to turn (srl X, Y) into RISCVISD::SRLW,
which will prevent us from using a BEXT instruction.
I don't think there is any precedent for type promotion checking
users to decide how to promote. Instead, I've added this DAG combine to
do it before type legalization.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D124109
This patch is an alternative to a piece of D125270. If we have one vsetvli which is using as AVL the output of another, and the prior AVL can be proven to produce the same VL value as that defining one, we can use the AVL from the prior instruction. This has the effect of removing a state transition on AVL, and will let us use the cheaper 'vsetvli x0, x0, vtype1' form or possibly even skip emitting it entirely.
This builds on the same infrastructure as D125337, and does the analogous extension to working on abstract states instead of only prior explicit vsetvli instructions. This is where the (relatively minor) code improvements come from.
More importantly, this fixes the last case where the state computed in phase 1 and 2 of the algorithm differs from the state computed during phase 3. Note that such differences can cause miscompiles by creating disagreements about contents of the VL and VTYPE registers at block boundaries.
Doing this transform inside the dataflow can cause the compatibility of a later store to change with regards to the current state. test15 in the diff illustrates this case well. What we have is a vsetvli which is mutated by one following vector op, but whose GPR result is used by another. The compatibility logic walks back to the def in this case, and checks to see if it matches the immediate prior state. In phase 1 and 2, it doesn't, and in phase 3 (after mutation) it does because we remove a transition which caused it to differ.
Differential Revision: https://reviews.llvm.org/D125392
This patch is an alternative to a piece of D125270. Its direct motivation is to fix a wrong code bug (described below), but somewhat unexpectedly, it also results in a significant code quality improvement for idiomatic fixed length vector patterns.
The existing transform is simply wrong in its current location. We are correct about the fact that the scalar move itself can use the previous vsetvli, but we lose track of the fact that later instructions might depend on the state change represented. That is, the actual value of VL in the register is different than the abstract state thinks it is. Not simply due to precision of modeling, but e.g. the VL register could contain 3 when the abstract state says it is 1. This is annoyingly hard to demonstrate in practice due to differences in policy flags on the intrinsics, but this is at least a latent wrong code bug.
The code quality benefit comes from the fact we don't need to tie this to explicit vsetvli instructions at all. We can propagate the abstract state, and reduce a) the number of transitions, or b) the cost of those transitions. It turns out we have a bunch of cases - in tests at least - where fixed length AVLs are known non-zero, and we can leave VL unchanged while changing VTYPE.
Differential Revision: https://reviews.llvm.org/D125337
Rather than VP_SEXT/VP_ZEXT/VP_TRUNC, having
VP_SIGN_EXTEND/VP_ZERO_EXTEND/VP_TRUNCATE better matches their non-VP
counterparts.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125298
The patch makes PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF.
It's useful to get the vtypes of locations of PseudoReadVL without finding the
corresponding VLEFF/VLSEGFF.
It could simplify optimizations in RISCVInsertVSETVLI like D123581.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125199
This patch adds rvv codegen support for vp.fpext. The lowering of fp_round, vp.fptrunc, fp_extend and vp.fpext share most code so use a common lowering function to handle these four.
And this patch changes the intermediate cast from ISD::FP_EXTEND/ISD::FP_ROUND to the RVV VL version op RISCVISD::FP_EXTEND_VL and RISCVISD::FP_ROUND_VL for scalable vectors.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D123975
These can be converted from masked to unmasked instructions by the
post-process DAG step.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125239
This fixes the first of several cases where the state computed in phase 1 and 2 of the algorithm differs from the state computed during phase 3. Note that such differences can cause miscompiles by creating disagreements about contents of the VL and VTYPE registers at block boundaries.
In this particular case, we recognize that for the first vsetvli in a block, that if the AVL is a phi of GPR results from previous vsetvlis and the VTYPE field matches, we can avoid emitting a vsetvli as the register contents don't change. Unfortunately, the abstract state does change and that update was lost.
As noted in the test change, this can actually improve results by preserving information until later state transitions in the block. However, this minor codegen improvement is not the motivation for the patch. The motivation is to avoid a case where we break a key internal correctness invariant.
Differential Revision: https://reviews.llvm.org/D125133
If the incoming state hasn't changed, and the step function is fixed by assumption, then the output state can't have changed.
In the current algorithm, this is a very minor win and mostly allows adding tracing output without being horribly verbose.
This assertion should hold for any reasonable data flow algorithm, but is known not to in several cases today. I'd like to go ahead and land this off-by-default, so that we can collaborate on fixes and have a common definition of success.
Differential: https://reviews.llvm.org/D125035
Enable default outlining when the function has the minsize attribute.
`addr-label.ll` crashed after enabling this, so a barrier is added before
instruction selection as a workaround.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D122213
If the GPR destination register of a VSETVLI instruction is unused, we can replace it with X0. This discards the result, and thus reduces register pressure.
Since after the core insertion/lowering algorithm has run, many user written VSETVLIs will have their GPR result unused (as VTYPE/VLEN is now explicitly read instead), this kicks in for most tests which involve a vsetvli intrinsic for fixed length vectorization. (vscale vectorization generally uses the GPR result to know how far to e.g. advance pointers in a loop and these uses are not removed.) When inserting VSETVLIs to lower pseudos, we prefer the X0 form anyways.
Differential Revision: https://reviews.llvm.org/D124961
No reason to special case simm12, movImm handles all immediates.
This also fixes a bug where we weren't passing the frame-setup/destroy
flag to movImm when we were calling it.
riscv-v-vector-bits-min is primarily used to opt-in to the
autovectorizer. The vector width can be determined from Zvl*b.
This patch adds support for treating -1 as meaning use Zvl*b so we can
still opt-in to autovectorization without needing to repeat a
vector width already given by Zvl*b or -mcpu.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D124960
We can use SH1ADD, SH2ADD, SH3ADD to multiply by 3, 5, and 9 respectively.
We could extend this to 3, 5, or 9 multiplied by a power of 2 by also
emitting a SLLI.
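A small C sketch of that arithmetic (the helpers just model what the instructions compute):

#include <assert.h>
#include <stdint.h>

/* shNadd computes (x << N) + y, so passing the same value twice gives
   x*3, x*5 and x*9; a following SLLI extends this to those constants
   times a power of 2. */
static uint64_t sh1add(uint64_t x, uint64_t y) { return (x << 1) + y; }
static uint64_t sh2add(uint64_t x, uint64_t y) { return (x << 2) + y; }
static uint64_t sh3add(uint64_t x, uint64_t y) { return (x << 3) + y; }

int main(void) {
  uint64_t x = 12345;
  assert(sh1add(x, x) == x * 3);
  assert(sh2add(x, x) == x * 5);
  assert(sh3add(x, x) == x * 9);
  assert((sh1add(x, x) << 2) == x * 12);   /* 3 * 2^2 via SH1ADD then SLLI */
  return 0;
}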
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D124824
The result was totally wrong.
We could use the mask undisturbed result to emulate the mask agnostic result.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D124684
Because of shrink wrapping, the block in which to insert the epilog may not
have any instructions (only debug instructions), and the insertion position
may point to MBB.end(), which doesn't have a DebugLoc. This patch fixes this
problem.
The test program was copied from the issue: https://github.com/llvm/llvm-project/issues/53662
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D123679
This fixes a crash from D124231.
We can't fold
(load (add base, (addi src, off1)), off2)
-> (load (add base, src), off1+off2)
if the src is a FrameIndex. FrameIndex cannot be the operand of an
add.
There was an immediate==0 check that I think was trying to catch
the common case of FrameIndex addis where the immediate is 0, but
they can also appear in non-zero form. Instead explicitly check
for a FrameIndex operand.
Putting a node in this list allows the node to be used as the root
of an isel pattern that would then call the ComplexPattern. The
usual case is to use the ComplexPattern as the operand of another
operator.
AddrFI is never used as a root operation. frameindex is handled
directly with custom code in RISCVISelDAGToDAG::Select. So adding
frameindex to the list here serves no purpose.
It's possible that we have a constant that isn't simm32 so we can't
use LUI+ADDIW, but we can use LUI+ADDI. Because ADDI uses a sign
extended constant, it's possible that after subtracting it out, we
end up with a simm32 that maps to LUI.
This patch detects this case after removing Lo12 and before shifting
the value for SLLI.
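A hypothetical worked example of just the LUI+ADDI piece (leaving out the trailing-zero/SLLI handling); the constant is picked purely for illustration:

#include <assert.h>
#include <stdint.h>

/* Illustration only: a 64-bit constant that is not a simm32, but whose
   value minus its sign-extended low 12 bits is a simm32 with zero low
   bits, i.e. exactly what LUI (which sign extends on RV64) can produce. */
int main(void) {
  int64_t val = INT64_C(-2147485696);          /* INT32_MIN - 2048, not a simm32 */
  int64_t lo12 = (int64_t)((uint64_t)val & 0xFFF);   /* low 12 bits: 0x800 */
  if (lo12 >= 2048)
    lo12 -= 4096;                              /* sign extend: -2048 */
  int64_t hi = val - lo12;                     /* INT32_MIN, a simm32 */
  assert(val < INT32_MIN);                     /* LUI+ADDIW can't make val */
  assert(hi >= INT32_MIN && hi <= INT32_MAX);
  assert(((uint64_t)hi & 0xFFF) == 0);         /* low bits clear: one LUI */
  assert(hi + lo12 == val);                    /* LUI hi ; ADDI lo12 rebuilds val */
  return 0;
}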
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D124222
Add a cost model for the broadcast shuffle in RISCVTTIImpl::getShuffleCost
with scalable vectors. The cost model might not be the best.
For scalable vectors, BasicTTIImpl::getShuffleCost returns an invalid cost,
so this patch relies on the existing cost model in BasicTTIImpl.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D124101
This improves opportunities to use bset/bclr/binv. Unfortunately,
there are no W versions of these instructions so this isn't always
a clear win. If we use SLLW we get free sign extend and shift masking,
but need to put a 1 in a register and can't remove an or/xor. If
we use bset/bclr/binv we remove the immediate materialization and
logic op, but might need a mask on the shift amount and sext.w.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D124096
By clearing the HasDummyMask flag from mask register binary operations
and mask load/store.
HasDummyMask was causing an extra operand to get appended when
converting from MachineInstr to MCInst. This extra operand doesn't
appear in the assembly string so was mostly ignored, but it prevented
the alias instruction printing from working correctly.
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D124424
The default implementation of findCommutedOpIndices picks the
first two source operands. That's exactly what we want for the
scalar FMA instructions.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D124463
Following D118810 that reduced the size of ISel table,
this patch optimizes allone-masked RVV pseudos with TU policy and
swap them out to their unmasked TU pseudos.
Since the UNDEF merge operand is not preserved, we turn it into TA
pseudo regardless of the policy operand.
Reviewed By: craig.topper, frasercrmck
Differential Revision: https://reviews.llvm.org/D121881
vslideup works by leaving elements 0<i<OFFSET undisturbed,
so it needs the destination operand as input for correctness
regardless of policy. Add an operand to indicate the policy.
We also add a policy operand for unmasked vslidedown to keep the interface consistent with vslideup,
because vslidedown only leaves elements 0<i<vstart undisturbed but the user has no way to control vstart.
Reviewed By: rogfer01, craig.topper
Differential Revision: https://reviews.llvm.org/D124186
This fixes llvm-mca's error 'found an unsupported instruction
in the input assembly sequence.' caused by the lack of
scheduling info.
Pseudo function call instructions will be expanded to `auipc`
and `jalr`, so their scheduling info is the combination of the
two.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123578
Backwards search
The sext.w removal pass (before the new patch) checks if the input to sext.w is already in sign-extended form, so it can eliminate it. It does that by checking every definition/source that reaches the sext.w is an instruction that produces a sign-extended value, either by definition (e.g. ADDW), or it propagates sign-extension (e.g. OR) so we check its sources recursively.
Forward search
Sometimes, one of the sources is an instruction that doesn't always produce a sign-extended value, but it has a W-version that does (e.g. ADD / ADDW). If we transform the ADD to ADDW, the sext.w can be removed (assuming other def paths are satisfied), but this transformation is sound only if every use of this ADD/W only requires the lower 32-bits either directly (like sll %x, 32) or they propagate dependency (lower word of output only depends on lower word of input) so we check its uses recursively.
When searching backwards, if an instruction that can be replaced with W-variant is encountered, this pass runs the forward search to verify it can be replaced, then adds it to a list of fixable instructions. After verifying all paths, it replaces the instruction and removes the sext.w.
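A small C model of the property the forward/backward searches rely on (the helper functions are made up for illustration):

#include <assert.h>
#include <stdint.h>

/* Sketch of why a W instruction can satisfy the backwards search: ADDW
   only reads the low 32 bits of its inputs and its result is already
   sign extended from bit 31, so a following sext.w is redundant. */
static int64_t sext32(uint32_t v) {
  return (int64_t)v - ((v & 0x80000000u) ? 0x100000000LL : 0);
}
static int64_t addw(int64_t a, int64_t b) {
  return sext32((uint32_t)a + (uint32_t)b);
}
static int64_t sext_w(int64_t x) { return sext32((uint32_t)x); }

int main(void) {
  int64_t a = INT64_C(0x123456789abcdef0), b = -42;
  int64_t r = addw(a, b);
  assert(sext_w(r) == r);                    /* sext.w after ADDW is a no-op */
  assert(addw(a & 0xFFFFFFFF, b) == r);      /* only the low 32 bits of a matter */
  return 0;
}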
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D119928
This patch adds custom MIR operand comments to VTYPE immediate operands
in VSETVLI instructions and SEW/LMUL operands in vector codegen pseudo
instructions. The result is intended to be more human-readable and
hopefully maintainable when working with MIR, particularly when
writing or reading test cases.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D124187
We saw a failure caused by unwinding with incomplete CFIs, so we
can't outline CFI instructions when they are needed in EH.
This is a recommit of 0d40688, which was reverted in ce83883 as
related precommit test 360d44e caused some errors.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D122634
If there are fewer than 12 trailing zeros, we'll try to use an ADDI
at the end of the sequence. If we strip trailing zeros and end the
sequence with a SLLI we might find a shorter sequence.
Differential Revision: https://reviews.llvm.org/D124148
We saw a failure caused by unwinding with incomplete CFIs, so we
can't outline CFI instructions when they are needed in EH.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D122634
We can't shift-right negative numbers to divide them, so avoid emitting
such sequences. Use negative numerators as a proxy for this situation, since
the indices are always non-negative.
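As a quick illustration of why, in C (assuming the common arithmetic-shift behavior for right shifts of negative values, which C leaves implementation-defined):

#include <assert.h>

/* Sketch of the difference alluded to above: division truncates toward
   zero, while an arithmetic right shift rounds toward negative infinity,
   so a shift is not a correct division for negative values. */
int main(void) {
  assert(-3 / 2 == -1);       /* truncating division */
  assert(-3 >> 1 == -2);      /* arithmetic shift, rounds down */
  assert(6 / 2 == (6 >> 1));  /* for non-negative values the two agree */
  return 0;
}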
An alternative strategy could be to add a compiler flag to emit division
instructions, which would at least allow us to test the VID sequence
matching itself.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123796
We haven't been updating this as Zb* instructions have been used
for immediate materialization. They will hit the default case and
trigger an llvm_unreachable. Instead of trying to list them all,
assume instructions that aren't explicitly listed aren't compressible.
Spotted while looking at integer materialization for other reasons.
I haven't seen a crash from this yet.
There's an existing generic combine that does this for legal types.
This patch adds a RISCV specific combine for W instructions.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D123983
This patch fixes a bug when lowering BUILD_VECTOR via VID sequences.
After adding support for fractional steps in D106533, elements with zero
steps may be skipped if no step has yet been computed. This allowed
certain sequences to slip through the cracks, being identified as VID
sequences when in fact they are not.
The fix for this is to perform a second loop over the BUILD_VECTOR to
validate the entire sequence once the step has been computed. This isn't
the most efficient, but on balance the code is more readable and
maintainable than doing back-validation during the first loop.
Fixes the tests introduced in D123785.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123786
This patch adds rvv codegen support for vp.fptrunc. The lowering of fp_round and vp.fptrunc share most code so use a common lowering function to handle those two, similar to vp.trunc.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123841
Materializing constants on RISCV is simpler if the constant is sign
extended from i32. By default i32 constant operands of phis are
zero extended.
This patch adds a hook to allow RISCV to override this for i32. We
have an existing isSExtCheaperThanZExt, but it operates on EVT which
we don't have at these places in the code.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D122951
This was added before Zve extensions were defined. I think users
should use Zve32x or Zve32f now. Though we will lose support for limiting
ELEN to 16 or 8, I hope no one was using that.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D123418
Having an enum with names that contain the string representation
of their value doesn't add any value. We can just use the numbers.
Reviewed By: kito-cheng, frasercrmck
Differential Revision: https://reviews.llvm.org/D123417
The scalable vector llvm.experimental.stepvector intrinsic
will crash due to an invalid cost when running the code through the loop unroller.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D122782
There's an assert in LUI+SH*ADD+ADDI materialization that makes sure the
lower 12 bits aren't zero since that case should have been handled as
LUI+ADDI+SH*ADD. But nothing prevented the LUI+SH*ADD+ADDI checks from
running after the earlier code handled it.
The sequence would be the same length or longer so it wouldn't replace
the earlier sequence, but the assert happened before that was checked.
The vector holding the sequence also wasn't reset before the second
check so that guaranteed the sequence would never be found to be
shorter.
This patch fixes this by only trying the second expansion when the
earlier one fails.
Fixes PR54812.
Reviewed By: benshi001
Differential Revision: https://reviews.llvm.org/D123406
Similar to D123217 but for the floating-point patterns. No change in
generated output, while reducing the generated table size.
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D123291
SLLI is always compressible to C.SLLI as long as the source and dest
registers are the same.
ANDI and SRLI are only compressible if the register is x8-x15. By
using SLLI we have a better chance of generating shorter code.
I had to add one exclusion for the BEXTI case so that its
pattern match could still fire.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D123336
RISCVMachineFunctionInfo has some fields, like VarArgsFrameIndex and
VarArgsSaveSize, that are calculated at the ISel lowering stage. That info is
not contained in MIR files, which means test cases that rely on those fields
can't be reproduced correctly from MIR dump files.
This patch adds MIR read/write support for those fields.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D123178
This patch synchronizes the structure of the templates with those
in RISCVInstrInfoVVLPatterns.td so that we get patterns with .vx
on the left hand side.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D123255
This matches VPatIntegerSetCCVL_VI_Swappable. But as noted in the
FIXME this may only be needed due to lack of canonicalization on
VP_SETCC.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D123239
The existing code wasn't getting the subtarget info from the fragment,
so the current status of RVC would be ignored. This would cause a crash
for the new test case when the target then reported it couldn't write
the requested number of code alignment bytes.
Differential Revision: https://reviews.llvm.org/D122236
This patch has no effect on the generated code, whilst mitigating the
increase in ISel table size caused by the recent addition of masked
patterns.
I aim to do the same for floating-point patterns once D123051 lands,
giving us a reason to use masked floating-point patterns.
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D123217
This patch adds the necessary infrastructure to lower vp.fcmp via
ISD::VP_SETCC to RVV instructions.
Most notably this patch adds cond-code legalization for VP_SETCC,
reusing the existing TargetLowering::LegalizeSetCCCondCode by passing in
additional SDValue parameters for the Mask and EVL. This method then
uses VP operations to legalize the condcode.
There is still a general lack of canonicalization on VP_SETCC as opposed
to SETCC which results in worse code than is theoretically possible.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123051
This patch adds the minimum required to successfully lower vp.icmp via
the new ISD::VP_SETCC node to RVV instructions.
Regular ISD::SETCC goes through a lot of canonicalization which targets
may rely on, which has not yet been ported to VP_SETCC. It also
supports expansion of individual condition codes and a non-boolean
return type. Support for all of that will follow in later patches.
In the case of RVV this largely isn't a problem as the vector integer
comparison instructions are plentiful enough that it can lower all
VP_SETCC nodes on legal integer vectors except for boolean vectors,
which regular SETCC folds away immediately into logical operations.
Floating-point VP_SETCC operations aren't as well supported in RVV and
the backend relies on condition code expansion, so support for those
operations will come in later patches.
Portions of this code were taken from the VP reference patches.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122743
We can do this conversion by converting to a same-sized integer type, then comparing the result with 0. The conversion is undefined if the converted FP value doesn't fit in an i1.
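As a quick illustration, here is a minimal C++ sketch of that equivalence, assuming the only defined inputs are 0.0 and 1.0 (the function name is made up, not from the patch):
#include <cassert>
#include <cstdint>

// fptoui to i1 is only defined when the FP value fits in one bit (0.0 or
// 1.0); for those inputs, converting to a full-width integer and comparing
// the result with 0 produces the same boolean.
bool fpToI1ViaInt(double f) { return static_cast<uint64_t>(f) != 0; }

int main() {
  assert(fpToI1ViaInt(0.0) == false);
  assert(fpToI1ViaInt(1.0) == true);
  return 0;
}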
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D122678
If we expand (uaddo X, 1) we previously expanded the overflow calculation
as (X + 1) <u X. This potentially increases the live range of X and
can prevent X+1 from reusing the register that previously held X.
Since we're adding 1, overflow only occurs if X was UINT_MAX in which
case (X+1) would be 0. So this patch adds a special case to expand
the overflow calculation to (X+1) == 0.
This seems to help with uaddo intrinsics that get introduced by
CodeGenPrepare after LSR. Alternatively, we could block the uaddo
transform in CodeGenPrepare for this case.
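To illustrate the special case, a small C++ sketch (function names are made up, not from the patch): adding 1 overflows exactly when the input is UINT_MAX, i.e. exactly when the sum wraps to 0.
#include <cassert>
#include <cstdint>

// Generic expansion keeps X live: overflow = (X + 1) <u X.
bool overflowGeneric(uint32_t x) { return x + 1u < x; }
// Special case for adding 1: overflow = (X + 1) == 0.
bool overflowAddOne(uint32_t x) { return x + 1u == 0u; }

int main() {
  for (uint32_t v : {0u, 1u, 0xFFFFFFFEu, 0xFFFFFFFFu})
    assert(overflowGeneric(v) == overflowAddOne(v));
  return 0;
}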
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122933
Previously, these isel optimizations were disabled if the AND could
be selected as a ANDI instruction. This patch disables the optimizations
only if the immediate is valid for C.ANDI. If we can't use C.ANDI,
we might be able to compress the shift instructions instead.
I'm not checking the C extension since we have relatively poor test
coverage of the C extension. Without C extension the code size
should be equal. My only concern would be if the shift+andi had
better latency/throughput on a particular CPU.
I did have to add a peephole to match SRLIW if the input is zexti32
to prevent a regression in rv64zbp.ll.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D122701
The splat_vector will be legalized to build_vector eventually
anyway. This patch makes it take fewer steps.
Unfortunately, this results in some codegen changes. It looks
like it comes down to how the nodes were ordered in the topological
sort for isel. Because the build_vector is created earlier we end up
with a different ordering of nodes.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D122185
In D122512, several masked patterns were added to support lowering of
vector-predicated float-to-int and int-to-float conversions. With the
introduction of these patterns, all of the old "unmasked" patterns are
matchable via the DAG post-process introduced in D118810, once the relevant
opcode entries are set up in the helper table.
Locally this reduces the generated isel table by 4%.
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D122637
Masked compare and vmsbf/vmsif/vmsof are always tail agnostic, so we
can check the maskedoff value to decide the mask policy rather than
having an additional policy operand.
Reviewed By: craig.topper, arcbbb
Differential Revision: https://reviews.llvm.org/D122456
This reverts commit 10fd2822b7.
I have a better implementation for those operations without the
additional policy operand.
Masked compare and vmsbf/vmsif/vmsof are always tail agnostic, so we
can assume an undef maskedoff is mask agnostic.
Differential Revision: https://reviews.llvm.org/D122455
This function now takes a uint64_t instead of an APInt. The caller
is responsible for masking the shift amount, extracting and inserting
into the KnownBits APInts, and inverting to compute zeros.
This is less code and a cleaner division of responsibilities.
Modified DAGCombiner to pass the bit-test input and the shift amount
to hasBitTest. This matches the other call to hasBitTest in TargetLowering.h
This is an alternative to D122454.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D122458
All LLVM backends use MCDisassembler as a base class for their
instruction decoders. Use "const MCDisassembler *" for the decoder
instead of "const void *". Remove unnecessary static casts.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D122245
Don't call EltVT.getSizeInBits() or SrcEltVT.getSizeInBits() a second
time; the values are already in the EltSize and SrcEltSize variables.
Refactor some comparisons to use multiplication instead of division.
Previously, when constructing RISCVAsmBackend, MCTargetOptions.ABIName was passed in and stored in RISCVAsmBackend.
But MCTargetOptions.ABIName can only be specified with -target-abi xxx on the command line; if the .ll file has a target-abi attribute, the codegen module ignores it, and the generated object file ends up with an incorrect EFlags value.
https://github.com/llvm/llvm-project/issues/50591 is also caused by this problem.
This patch overrides AsmPrinter::emitFunctionEntryLabel and uses it to set the target ABI value obtained from the .ll file's target-abi attribute, storing the target ABI in RISCVTargetStreamer instead of RISCVAsmBackend.
Differential Revision: https://reviews.llvm.org/D121183
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.
We already special cased sign extended constants, this extends it
to any sign extended value.
I've only added tests for one case of vadd. Most intrinsics go
through the same check.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D122186
The mask being NoRegister prevented the existing aliases from matching
since NoRegister isn't in the VMV0 register class.
To work around this I've added new aliases that look for zero_reg.
I had to modify tablegen to generate matching code for zero_reg.
As a consequence, I had to change the EmitPriority for an ARM
alias that used zero_reg and had started printing.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D121496
intrinsics.
Those operations are updated under a tail agnostic policy, but they
could be mask agnostic or mask undisturbed.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D120228
Add the UsesMaskPolicy flag to indicate that an operation's result
would be affected by the mask policy (e.g. mask operations).
It means RISCVInsertVSETVLI should decide the mask policy according
to the mask policy operand or the passthru operand.
If UsesMaskPolicy is false (e.g. unmasked, store, and reduction operations),
the mask policy could be either mask undisturbed or agnostic.
Currently, RISCVInsertVSETVLI defaults UsesMaskPolicy operations to
MA, and everything else to MU, so that the current mask policy is not
changed for unmasked operations.
Add masked-tama, masked-tamu, masked-tuma and masked-tumu test cases.
I didn't add all operations because most implementations use the
same pseudo multiclass. Some tests may be duplicated across different
files (e.g. masked vmacc with tumu shows up in both vmacc-rv32.ll and
masked-tumu).
I think having separate tests just for the policy makes the testing
clearer.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120226
To model the cost of vector casts, the patch considers most vector
casts to cost the same as their scalar form, and assigns a cost of 1
to the vector form of free scalar casts.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D121771
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.
We already special cased sign extended constants, this extends it
to any sign extended value.
I've only added tests for one case of vadd. Most intrinsics go
through the same check. I can add more tests if we're concerned.
Differential Revision: https://reviews.llvm.org/D122186
This patch adds single-bit and bit-counting ops to the list of sign-extending ops.
A single-bit write propagates sign-extendedness if it is not in the sign bits.
Bit extraction and bit counting always output a small number, so the result is sign extended.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D121152
RISCVISelDAGToDAG's selectImm uses RISCVTargetLowering::getAddr
(specifically the ConstantPoolSDNode) as of 41454ab256 ("[RISCV] Use
constant pool for large integers"), but nothing explicitly instantiates
any of the templates, the only reason they exist is because of the
various lowering methods in RISCVISelLowering.cpp that themselves use
the methods. However, with inlining, those can end up not existing as
real functions and thus not be exported, leading to link errors. Up
until now this hasn't happened, but for whatever reason D121654 has
triggered this on the sanitizer-ppc64be-linux buildbot, giving:
../../../../lib/libLLVMRISCVCodeGen.a(RISCVISelDAGToDAG.cpp.o): In function `selectImm(llvm::SelectionDAG*, llvm::SDLoc const&, llvm::MVT, long, llvm::RISCVSubtarget const&)':
RISCVISelDAGToDAG.cpp:(.text._ZL9selectImmPN4llvm12SelectionDAGERKNS_5SDLocENS_3MVTElRKNS_14RISCVSubtargetE+0x3d8): undefined reference to `llvm::SDValue llvm::RISCVTargetLowering::getAddr<llvm::ConstantPoolSDNode>(llvm::ConstantPoolSDNode*, llvm::SelectionDAG&, bool) const'
collect2: error: ld returned 1 exit status
Fix this by explicitly instantiating getAddr in its four different forms
so separate translation units can reliably use it.
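A generic C++ sketch of the technique, using hypothetical names rather than the real getAddr signature: explicit instantiation in the defining .cpp file forces out-of-line symbols to be emitted even when every in-file use has been inlined.
// Declared in a header (hypothetical example).
template <typename NodeT> int lowerAddr(NodeT *N);

// Defined in exactly one .cpp file.
template <typename NodeT> int lowerAddr(NodeT *N) { return N ? 1 : 0; }

struct GlobalAddressNode {};
struct ConstantPoolNode {};

// Explicit instantiations: even if every local use of lowerAddr is inlined,
// these force the compiler to emit real definitions that other translation
// units can link against, avoiding "undefined reference" errors.
template int lowerAddr<GlobalAddressNode>(GlobalAddressNode *);
template int lowerAddr<ConstantPoolNode>(ConstantPoolNode *);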
Fixes: 41454ab256 ("[RISCV] Use constant pool for large integers")
Currently we allow half types in vectors if the scalar Zfh extension
is enabled. This behavior is not in line with the vector spec. For f32
and f64 types, the Zve32f, Zve64f, Zve64d, and V extensions explicitly
control the availability of those floating point types in vectors.
In order to make our compiler compliant, we either need to remove all support
for half in vectors or we need an extension to control it.
Draft spec here https://github.com/riscv/riscv-v-spec/pull/780
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D121345
Since we have SPLAT_VECTOR_PARTS these days, I don't think we need
to go through extra lengths to avoid introducing an illegal scalar type.
We can just call getConstant using the scalable vector type and let
it create either a SPLAT_VECTOR or a SPLAT_VECTOR_PARTS.
Reviewed By: frasercrmck, rogfer01
Differential Revision: https://reviews.llvm.org/D121645
We have a special case to skip this transform if c1 is 0xffffffff
and x is sext_inreg in order to use sraiw+zext.w. But we were only
checking that we have a sext_inreg opcode, not how many bits are
being sign extended.
This commit adds a check that it is a sext_inreg from i32 so we know for
sure that an sraiw can be created.
Since we mark the pseudos as mayLoad but do not provide any MMOs,
isSafeToMove conservatively returns false, stopping MachineLICM from
hoisting the instructions. PseudoLA_TLS_GD does not actually expand to a
load, so stop marking that as mayLoad to allow it to be hoisted, and for
the others make sure to add MMOs during lowering to indicate they're GOT
loads and thus can be freely moved.
Fixes https://github.com/llvm/llvm-project/issues/54372
Reviewed By: MaskRay, arichardson
Differential Revision: https://reviews.llvm.org/D121654
Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask.
It is useful on RV64 when the mask exceeds simm32 and so cannot be generated by LUI.
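A C++ sketch of the equivalence, using an arbitrary leading-ones mask of 24 ones followed by 40 zeros (the names and the constant are made up):
#include <cassert>
#include <cstdint>

// AND with a leading-ones mask keeps the top bits and clears the rest,
// which SRLI followed by SLLI also does with two shift instructions.
uint64_t andMask(uint64_t x)   { return x & 0xFFFFFF0000000000ULL; }
uint64_t shiftPair(uint64_t x) { return (x >> 40) << 40; }

int main() {
  for (uint64_t v : {0ULL, 0x123456789ABCDEF0ULL, ~0ULL})
    assert(andMask(v) == shiftPair(v));
  return 0;
}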
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D121598
This code handles fixed vector SPLAT_VECTOR, but is never called in
any tests.
We only form fixed vector splat vectors for vXi64 on RV32 as part
of DAGCombine. This will be type legalized to SPLAT_VECTOR_PARTS.
So the Custom handling for SPLAT_VECTOR is never needed.
This patch makes SPLAT_VECTOR for vXi64 'Legal' on RV32 so that
DAGCombine will create it, but there's no need for Custom handler.
It will still be type legalized to SPLAT_VECTOR_PARTS.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D121673
If the type is less than XLenVT, type legalization will turn this
into (srl (bitreverse (bswap (srl (bswap X), C))), C). We can't
completely recover from these shifts. They introduce zeros into
the upper bits of the result and we can't easily tell if they are
needed. By doing a DAG combine early, we avoid introducing these
shifts.
Type legalize narrow RISCVISD::GREV/GORC with constant to a larger
type without switching to W. Detect sext_inreg+gorci/grevi with a
uimm5 immediate during isel to emit GREVIW/GORCIW.
This allows us to better propagate known bits information through
extended bits after type legalization. It will also simplify a
change I'm considering for BREV8 with Zbkb.
A future patch will add computeKnownBits support for GORC.
A further improvement here would be to use hasAllWUsers and
doPeepholeSExtW like we do for SLLIW, but I don't think we have
the test coverage for that yet.
We already do this for RISCVISD::VFMV_S_F_VL and the vfmv.v.f
intrinsic.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D121429
We know the shift amount is a constant with bit 31 clear. anyext
of constant will be either zext or sext which will produce the
same result here. But we really shouldn't rely on that. It would
be valid to put a random number in the upper bits. Our isel patterns
expect the upper bits to be 0 so we should ask for it explicitly.
This doesn't appear to be needed any more. I did some inspecting
of the gcc torture suite and SPEC2006 with this removed and didn't
find any meaningful changes.
I think we're more aggressive about forming ADDIW now using
sign_extend_inreg during type legalization and hasAllWUsers in isel.
This probably helps catch the cases this helped with before.
This helps us form vfnmsub, vfnmadd, and vfmsub from masked VP
intrinsics.
I've used "srcvalue" for the mask parameter in the fneg nodes. We
can't match "V0" because that doesn't ensure the mask the is the same.
Instead it matches two different nodes and generates two copies to
V0 of those separate values.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D120287
Similar to what we do for other loads/stores, use the intrinsic
version that we already have custom isel for.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D121166
Most other targets support 'generic', but RISCV issues an error.
This can require a special case in tools that use LLVM but aren't
clang.
This patch treats "generic" the same as an empty string and remaps
it to generic-rv32/generic-rv64 based on the triple. Unfortunately, it
has to be added to RISCV.td because MCSubtargetInfo is constructed and
parses the CPU before RISCVSubtarget's constructor gets a chance
to remap it. The CPU will then be reparsed and the state in the
MCSubtargetInfo subclass will be updated again.
Fixes PR54146.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D121149
vslide1up/down have this flag set, but the value isn't a splat.
Rename for clarity.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D121037
vmsgeu.vi with 0 is always true, but under the masked, mask-undisturbed
policy we still need to keep the inactive elements, which come from maskedoff.
We could return mask directly if it's mask agnostic policy in the future.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D121080
We should not emit a tail agnostic vlse for a tail undisturbed vmv.s.x
In D119688:
- if (IsScalarMove && !Node->getOperand(0).isUndef())
+ bool HasPassthruOperand = Node->getOpcode() != ISD::SPLAT_VECTOR;
+ if (HasPassthruOperand && !IsScalarMove &&
!Node->getOperand(0).isUndef())
break;
The IsScalarMove check in the if statement had been changed.
Differential Revision: https://reviews.llvm.org/D120963
setgt X, -1 is the canonical form of setge X, 0. We can swap the
select operands and use setlt X, X0 when selecting CMOV. This
avoids materializing the -1 in a register.
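A scalar C++ sketch of the operand swap (names are made up):
#include <cassert>
#include <cstdint>

// select (setgt X, -1), A, B  ==  select (setlt X, 0), B, A
int32_t selGT(int32_t x, int32_t a, int32_t b) { return x > -1 ? a : b; }
int32_t selLT(int32_t x, int32_t a, int32_t b) { return x < 0 ? b : a; }

int main() {
  for (int32_t v : {-2, -1, 0, 1})
    assert(selGT(v, 10, 20) == selLT(v, 10, 20));
  return 0;
}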
With Zbb, abs is expanded to (max X, neg) by default. If X has 33 or
more sign bits, we can expand it a little early using negw instead of
neg to save a sext_inreg. If X started as a 32 bit value, type
legalization would have inserted a sext before the abs so X having
33 sign bits should always be true.
Note: I've used ISD::FREEZE here since we increase the number of uses.
Our default expansion for ABS doesn't do that, but I think that's a bug.
We can't do this with custom type legalization because ISD::FREEZE
doesn't propagate sign bits, so a later DAG combine won't be able to
optimize it.
Alive2: https://alive2.llvm.org/ce/z/Gx3RNe
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D120597
The patch adds a very basic cost model for masked memory ops on scalable vectors.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D117884
Until Zfinx is supported in CodeGen we need to convert all Zfinx
register classes to GPR.
Remove the zfinx-types.ll test which didn't test anything meaningful
since -mattr=zfinx isn't implemented completely in llc.
Follow up to D93298.
This miscompile was introduced in D119527.
This was a special pattern for rotate+bswap on RV32. It doesn't
work for RV64 since the rotate needs to be half the bitwidth. The
equivalent pattern for RV64 is ROTR ((GREV x, 56), 32) so match
that instead.
This could be generalized further as noted in the new FIXME.
Reviewed By: Chenbing.Zheng
Differential Revision: https://reviews.llvm.org/D120686
This patch added the MC layer support of Zfinx extension.
Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D93298
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.
Differential Revision: https://reviews.llvm.org/D119876
This lowers VECTOR_SPLICE of scalable vectors to a slidedown followed by a slideup.
Fixed vectors are encouraged to use the shufflevector instruction; the equivalent patch
for fixed vectors is D119039.
I've used a tail agnostic slidedown and limited the VL to only the
elements that will not be overwritten by the slideup. The slideup
uses VLMax for its VL. It unfortunately uses tail undisturbed policy
but it isn't required as there is no tail. We just need the merge
operand to carry the bits for the lower portion of the result.
Care was taken to ensure that either the slideup or slidedown will
be able to use a .vi instruction when the immediate is small. Which
one uses the immediate depends on the sign of the immediate.
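Below is a scalar C++ model of the lowering with made-up values; it sketches the idea rather than the RVV code itself: the slidedown supplies the surviving tail of the first operand and the slideup fills the remaining lanes from the second.
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<int> spliceModel(const std::vector<int> &a,
                             const std::vector<int> &b, size_t off) {
  size_t vl = a.size();
  std::vector<int> res(vl);
  // "slidedown" a by off; only the first vl - off lanes are written.
  for (size_t i = 0; i + off < vl; ++i)
    res[i] = a[i + off];
  // "slideup" b by vl - off; the merge operand carries the lower lanes.
  for (size_t i = vl - off; i < vl; ++i)
    res[i] = b[i - (vl - off)];
  return res;
}

int main() {
  std::vector<int> a{1, 2, 3, 4}, b{5, 6, 7, 8};
  assert((spliceModel(a, b, 1) == std::vector<int>{2, 3, 4, 5}));
  return 0;
}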
Reviewed By: frasercrmck, ABataev
Differential Revision: https://reviews.llvm.org/D119303
Default type legalization will create sext_inreg+abs, but we may
not be able to remove the sext_inreg.
Instead this patch expands abs during type legalization to
Y = sraiw X, 31; subw (xor X, Y), Y, which doesn't require the input
to be sign extended.
This gives a big improvement for some neg-abs tests where the
abs is used more than the neg. Previously the abs was expanded
a different way before and after type legalization. Now they are
expanded in a similar way, enabling more CSE.
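A C++ sketch of the expansion, assuming arithmetic right shift of signed values as on RISC-V (names are made up):
#include <cassert>
#include <cstdint>
#include <cstdlib>

int32_t absExpand(int32_t x) {
  int32_t y = x >> 31;   // sraiw X, 31: all ones if negative, else zero
  return (x ^ y) - y;    // subw (xor X, Y), Y
}

int main() {
  for (int32_t v : {0, 1, -1, 42, -42, INT32_MIN + 1})
    assert(absExpand(v) == std::abs(v));
  return 0;
}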
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D120636
Not only some AMO instructions but also other instructions need to
process (${gpr}) or 0(${gpr}), where the 0 is silently ignored.
This patch generalizes the handling for such usage.
Signed-off-by: Eric Tang <eric.tang@starfivetech.com>
Differential Revision: https://reviews.llvm.org/D120017
In this patch, we add a narrower exclusion for
zeroext (srl x) -> srli (slli x), so that it provides an opportunity
for the selection of sraiw.
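A C++ sketch of the zeroext (srl x) -> srli (slli x) rewrite whose exclusion is being narrowed; the shift amount 7 and the names are made up:
#include <cassert>
#include <cstdint>

// zext(i32 (x >> 7)) on RV64 can be done by shifting the low word to the
// top of the register and then logically shifting it back down.
uint64_t zextSrl(uint64_t x)  { return uint64_t(uint32_t(x) >> 7); }
uint64_t srliSlli(uint64_t x) { return (x << 32) >> (32 + 7); }

int main() {
  for (uint64_t v : {0ULL, 0x00000001FFFFFFFFULL, 0xDEADBEEFCAFEF00DULL})
    assert(zextSrl(v) == srliSlli(v));
  return 0;
}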
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120467
By failing to lex the token we end up both parsing it as a binary
operator ourselves and parsing it as a unary operator when calling
parseExpression on the RHS. For plus this is harmless but for minus this
parses "foo - 4" as "foo - -4", effectively treating a top-level minus
as a plus.
Fixes https://github.com/llvm/llvm-project/issues/54105
Reviewed By: asb, MaskRay
Differential Revision: https://reviews.llvm.org/D120635
With the condition N->use_empty(), the root node of the DAG always
misses the peephole optimization, so a dummy node is needed.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D119934
vcpop and vfirst are still useful when VL=0.
vcpop is equivalent to li 0 and vfirst is equivalent to li -1,
since no mask elements are active.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120302
Since D105555, Clang computes the default ABI when -mabi is empty
and encodes it in an LLVM IR module flag.
For correctness, llc needs to use the same target-abi
(Options.MCOptions.ABIName) as the ABI encoded in the IR.
getSubtargetImpl already checks that they match, but only if
Options.MCOptions.ABIName is not empty.
For more robustness we could also check an explicitly given ABI,
but right now we have two different pieces of logic computing the
default ABI.
The front-end ABI defaults to ilp32/ilp32e/lp64, or to
ilp32d/lp64d when there is hardware support for the D extension.
The backend ABI defaults to ilp32/ilp32e/lp64.
Reviewed by: asb, jrtc27
Differential Revision: https://reviews.llvm.org/D118333
This patch changes the version of V extension from 0.1 to 1.0 in RISCVInstrInfoVPseudos.td, RISCVInstrInfoVSDPatterns.td, RISCVInstrInfoVVLPatterns.td, RISCVInstrInfoV.td
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120525
Add a new ISD opcode to represent the sign extending behavior of
vmv.x.h. Keep the previous anyext opcode to allow the existing
(fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing
to generate a sign extend.
For fmv.x.w we are able to match the sext_inreg in an isel pattern,
but a 16-bit sext_inreg is lowered to a shift pair before isel. This
seemed like a larger match than we should do in isel.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D118974
Internally to DAGCombiner the SDValues were passed by non-const
reference despite not being modified. They were then passed by
const reference to TLI.
This patch passes them by value which is consistent with the vast
majority of code.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D120420
See https://github.com/llvm/llvm-project/issues/53831 for a full discussion.
The basic issue is that DAGCombiner::visitMUL and
RISCVISelLowering::transformAddImmMulImm get stuck in a loop, as the
current checks in transformAddImmMulImm aren't sufficient to avoid all
cases where DAGCombiner::isMulAddWithConstProfitable might trigger a
transformation. This patch makes transformAddImmMulImm bail out if C0
(the constant used for multiplication) has more than one use.
Differential Revision: https://reviews.llvm.org/D120332