Commit Graph

2266 Commits

Author SHA1 Message Date
Shao-Ce SUN 862f30a428 [RISCV] Add ISD::EH_DWARF_CFA
Based on D24038.
LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin.

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D126181
2022-06-08 22:03:30 +08:00
Craig Topper 0c66deb498 [RISCV] Scalarize gather/scatter on RV64 with Zve32* extension.
i64 indices aren't supported on Zve32*. Scalarize gathers to prevent
generating illegal instructions.

Since InstCombine will aggressively canonicalize GEP indices to
pointer size, we're pretty much always going to have an i64 index.

Trying to predict when SelectionDAG will find a smaller index from
the TTI hook used by the ScalarizeMaskedMemIntrinPass seems fragile.
To optimize this we probably need an IR pass to rewrite it earlier.

Test RUN lines have also been added to make sure the strided load/store
optimization still works.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D127179
2022-06-07 08:07:50 -07:00
Matt Arsenault cc5a1b3dd9 llvm-reduce: Add cloning of target MachineFunctionInfo
MIR support is totally unusable for AMDGPU without this, since the set
of reserved registers is set from fields here.

Add a clone method to MachineFunctionInfo. This is a subtle variant of
the copy constructor that is required if there are any MIR constructs
that use pointers. Specifically, at minimum fields that reference
MachineBasicBlocks or the MachineFunction need to be adjusted to the
values in the new function.
2022-06-07 10:14:48 -04:00
Philip Reames 3fa5876216 [RISCV] Reorganize getShuffleCost to make it more clear what's going on [nfc] 2022-06-06 10:11:58 -07:00
yanming bc93d51d36 [NFC][RISCV][format] Blank line between functions, remove unnecessary semicolon. 2022-06-06 15:38:14 +08:00
yanming 8d9d8f866a [RISCV] Define risc-v's own register class to model FP Register.
The default RegisterClass is not enough to model RISCV Register.
We define risc-v's own register class to model FP Register.
This helps to better estimate the register pressure in the loop-vectorize.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D126854
2022-06-06 14:43:52 +08:00
Fangrui Song 77e300ffdf [MC] Change EndOfStatement "unexpected tokens in .xxx directive " to "expected newline" 2022-06-05 15:11:01 -07:00
LiaoChunyu f14d18c7a9 [RISCV] Add more patterns for FNMADD
D54205 handles fnmadd: -rs1 * rs2 - rs3
This patch add fnmadd: -(rs1 * rs2 + rs3) (the nsz flag on the FMA)

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D126852
2022-06-04 12:31:45 +08:00
Craig Topper cc3bd43533 [RISCV] Support LUI+ADDIW in doPeepholeLoadStoreADDI.
This fixes an inconsistency between RV32 and RV64. Still considering
trying to do this peephole during isel, but wanted to fix the
inconsistency first.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126986
2022-06-03 18:06:56 -07:00
Craig Topper 170c550ca8 [RISCV] Use SelectionDAG::isBaseWithConstantOffset in scalar load/store address matching.
Test changes are because isBaseWithConstantOffset uses computeKnownBits
and that is able to see that an earlier AND instruction guaranteed
alignment so that we can treat an OR as an ADD.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126970
2022-06-03 10:55:28 -07:00
Craig Topper 4402852002 [RISCV] Reduce scalar load/store isel patterns to a single ComplexPattern. NFCI
Previously we had 3 different isel patterns for every scalar load
store instruction.

This reduces them to a single ComplexPattern that returns the Base
and Offset. Or an offset of 0 if there was no offset identified

I've done a similar thing for the 2 isel patterns that match add/or
with FrameIndex and immediate. Using the offset of 0, I was also
able to remove the custom handler for FrameIndex. Happy to split that
to another patch.

We might be able to enhance in the future to remove the post-isel
peephole or the special handling for ADD with constant added by D126576.

A nice side effect is that this removes nearly 3000 bytes from the isel
table.

Differential Revision: https://reviews.llvm.org/D126932
2022-06-03 09:00:17 -07:00
Craig Topper 1d67adbfbf [RISCV] Give CSImm12MulBy4 PatLeaf priority over CSImm12MulBy8. NFC
The immediate range check for CSImm12MulBy8 included some values
covered by CSImm12MulBy4. I assume CSImm12MulBy4 had priority due
to pattern order in the td file, but this makes the priority
explicit in the predicate.
2022-06-02 20:51:14 -07:00
Craig Topper dbead2388b [RISCV] Add custom isel for (add X, imm) used by load/stores.
If the imm is out of range for an ADDI, we will materialize it in
a register using multiple instructions. If the ADD is used by a
load/store, doPeepholeLoadStoreADDI can try to pull an ADDI from
the constant materialization into the load/store offset. This only
works if the ADD has a single use, otherwise the peephole would have
to rebuild multiple nodes.

This patch instead tries to solve the problem when the add is selected.
We check that the add is only used by loads/stores and if it is
we will select it to (ADDI (ADD X, Imm-Lo12), Lo12). This will enable
the simple case in doPeepholeLoadStoreADDI that can bypass an ADDI
used as a pointer. As a result we can remove the more complicated
peephole from doPeepholeLoadStoreADDI.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126576
2022-06-02 13:45:32 -07:00
Philip Reames 76ac916d63 [RISCV] Inline one copy of needVSETVLI into the other [NFC]
Calling the non-MI version directly was unsound (as fixed in dcdb0bf2), so remove that version to decrease likelyhood of future mistakes.
2022-06-02 13:06:18 -07:00
Philip Reames dcdb0bf25b [RISCV] Fix an inconsistency with compatible load/store handling
Once we've computed the incoming predecessor state, we should use the same compatibility check with knowledge of MI as we did in phase 2 in order to be consistent across all phases.

Differential Revision: https://reviews.llvm.org/D126574
2022-06-02 08:03:51 -07:00
Craig Topper 909a78b3a4 [RISCV] Use MachineRegisterInfo::use_instr_begin instead of use_begin+getParent. NFCI 2022-06-01 15:37:48 -07:00
Craig Topper aeb27f133a [RISCV] Fix i64<->f64 and i32<->f32 bitcasts with VLS vectors enabled.
We enable a custom handler to optimize conversions between scalars
and fixed vectors. Unfortunately, the custom handler picks up scalar
to scalar conversions as well. If the scalar types are both legal,
we wouldn't match any of the fixed vector cases and would return SDValue()
causing the LegalizeDAG to expand the bitcast through memory.

This patch fixes this by checking if it's a scalar to scalar conversion
and returns `Op` if both types are legal.

Differential Revision: https://reviews.llvm.org/D126739
2022-06-01 08:13:49 -07:00
Craig Topper 1b2de79ff4 [RISCV] Use two ADDIs to do some stack pointer adjustments.
If the adjustment doesn't fit in 12 bits, try to break it into
two 12 bit values before falling back to movImm+add/sub.

This is based on a similar idea from isel.

Reviewed By: luismarques, reames

Differential Revision: https://reviews.llvm.org/D126392
2022-05-31 10:25:28 -07:00
Craig Topper 80c4cf6369 [RISCV] Fix a few corner case bugs in RISCVMergeBaseOffsetOpt::matchLargeOffset
The immediate for LUI is stored as 20-bit unsigned value. We need
to sign extend if after shifting by 12 to match the instruction
behavior.

If we find an LUI+ADDI on RV64, it means the constant isn't a
simm32. If it was, we would have emitted LUI+ADDIW from constant
materialization. Make sure the constant is a simm32 before folding.
This appears to match gcc.

A future patch will add support for LUI+ADDIW on RV64.
2022-05-31 09:50:54 -07:00
Fraser Cormack 5a2e640eb7 [RISCV][NFC] Adjust some comments in RISCVInsertVSETVLI
Capitalize the first letter of comments like the others, and a few other
tweaks.
2022-05-31 10:13:15 +01:00
eopXD 2cadf84fc8 [RISCV] Pass OptLevel to `RISCVDAGToDAGISel` correctly
Originally, `OptLevel` isn't passed into the `MachineFunctionPass`.
This lets the default parameter of `SelectionDAGISel`, which is
`CodeGenOpt::Default`, be passed in. OptLevelChanger captures the
optimization level with the parameter, and rather not the value
within `TargetMachine`. This lets the optimization be
unintentionally overwriten if other value than `CodeGenOpt::Default`
passed.

This patch fixes this by passing the optimization level rather
than using the default value.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D126641
2022-05-30 17:22:50 -07:00
Craig Topper 6a6cf2e28d [RISCV] isel (add (and X, 0x1FFFFFFFE), Y) as (SH1ADD (SRLI X, 1), Y)
This pattern is what we get after DAG combine for C code like this.

short *ptr1, *ptr2, *ptr3;
unsigned diff = ptr1 - ptr2;
return ptr3[diff];

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126588
2022-05-29 18:24:07 -07:00
Philip Reames 85b4470035 [RISCV] Allow PRE of vsetvli involving non-1 LMUL
This is a follow up to address a review comment from D124869. When deciding whether to PRE a vsetvli, we can allow non-LMUL1 vsetvlis.

Differential Revision: https://reviews.llvm.org/D126563
2022-05-27 15:49:41 -07:00
Craig Topper b09e54541a [RISCV] Use template version of SignExtend64 for constant extends. NFC
We were inconsistent about which one we used.
2022-05-27 13:11:15 -07:00
Craig Topper d0f65eaa85 [RISCV] Remove unused variables. NFC 2022-05-27 12:13:45 -07:00
Craig Topper aaad507546 [RISCV] Return false from isOffsetFoldingLegal instead of reversing the fold in lowering.
When lowering GlobalAddressNodes, we were removing a non-zero offset and
creating a separate ADD.

It already comes out of SelectionDAGBuilder with a separate ADD. The
ADD was being removed by DAGCombiner.

This patch disables the DAG combine so we don't have to reverse it.
Test changes all look to be instruction order changes. Probably due
to different DAG node ordering.

Differential Revision: https://reviews.llvm.org/D126558
2022-05-27 11:05:18 -07:00
Fraser Cormack 3e450d9cbb [RISCV][NFC] Unify compatibility checks under one function
Split off from D125021.

We were duplicating logic across different phases. Since we want to
ensure a consistency of logic across phases for correctness, this patch
combines our multiple compatibility checks into one function to better
convey this.

Several methods were made const too.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126472
2022-05-27 11:21:54 +01:00
Fangrui Song cef377d75d [RISCV] Simplify code after D125905 2022-05-26 18:13:38 -07:00
Philip Reames 8a3b6ba756 [RISCV] Add a subtarget feature to enable unaligned scalar loads and stores
A RISCV implementation can choose to implement unaligned load/store support. We currently don't have a way for such a processor to indicate a preference for unaligned load/stores, so add a subtarget feature.

There doesn't appear to be a formal extension for unaligned support. The RISCV Profiles (https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#rva20u64-profile) docs use the name Zicclsm, but a) that doesn't appear to actually been standardized, and b) isn't quite what we want here anyway due to the perf comment.

Instead, we can follow precedent from other backends and have a feature flag for the existence of misaligned load/stores with sufficient performance that user code should actually use them.

Differential Revision: https://reviews.llvm.org/D126085
2022-05-26 15:25:47 -07:00
Craig Topper e9ac99b609 [RISCV] Simplfy creation of IndexVT in lowerMaskedGather/lowerMaskedScatter. NFC
The scalar element width is not a factor in how ContainerVT is
determined. We don't need to check the relative size of VT and
IndexVT.
2022-05-26 13:13:32 -07:00
Philip Reames d58cc0839e [RISCV] reorganize getFrameIndexReference to reduce code duplication [nfc]
This change reorganizes the majority of frame index resolution into a two strep process.

    Step 1 - Select which base register we're going to use.
    Step 2 - Compute the offset from that base register.

The key point is that this allows us to share the step 2 logic for the SP case. This reduces the code duplication, and (I think) makes the code much easier to follow.

I also went ahead and added assertions into phase 2 to catch errors where we select an illegal base pointer. In general, we can't index from a base register to a stack location if that requires crossing a variable and unknown region. In practice, we have two such cases: dynamic stack realign and var sized objects. Note that crossing the scalable region is fine since while variable, it's a known variability which can be expressed in the offset.

Differential Revision: https://reviews.llvm.org/D126403
2022-05-26 09:44:58 -07:00
Philip Reames afe49934a6 [RISCV] Allow compatible VTYPE in AVL Reg Forward cases
During insertion of VSETVLI, we have two related bits of code which decide whether we can reuse a previous vsetvli result. As was pointed out in the original review, these cases can allow any prior state for which we know that VL is the same for any value of AVL.

This was originally separated out of a desire for separate tests and review. As it turns out, finding a test case for this has been quite challenging. Most of the cases I tried, we manage to already get through other chains of logic. We do have one correct test change, but that only exercises one of the two changes.

Differential Revision: https://reviews.llvm.org/D126400
2022-05-26 08:50:35 -07:00
Fraser Cormack 2c9983f530 [RISCV][NFC] Add braces to 'else' to match braced 'if' 2022-05-26 10:00:33 +01:00
Kito Cheng e45087fd53 [RISCV] Fix state persistence bugs (PR55548)
We didn't implement RISCVELFStreamer::reset and cause some very strange
section output for attribute section...just reference D15950 to see how
ARM implement that.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D125905
2022-05-26 16:09:00 +08:00
jacquesguan b271488e8b [RISCV] Replace ISD::FP_EXTEND and ISD::FP_ROUND with RVV VL op.
This patch tries to solve the incoordination between the direct and intermediate  cast caused by D123975.
This patch replaces ISD::FP_EXTEND and ISD::FP_ROUND with RVV VL op in the lowering of FP scalable vector direct cast to unify with the intermediate cast.
And it also changes the FP widenning pattern with the VL op.

Differential Revision: https://reviews.llvm.org/D125364
2022-05-26 02:17:31 +00:00
Philip Reames 1f06398e96 Reapply "[RISCV] Enable strict assertions in InsertVSETVLI data flow"
be2cb8 fixes the case which triggered the revert.  Reapply, and let's see if anything else falls out.

Original commit message:

These asserts are believed to hold after several recent miscompiles have been fixed.  If you see an assertion failure on this change, please toggle the default back and make sure you file a bug with a reproducer.  We may have as yet uncaught miscompiles lurking in this code.

Differential Revision: https://reviews.llvm.org/D125271
2022-05-25 11:18:55 -07:00
Philip Reames be2cb824d0 [riscv] Remove mutation of prior vsetvli from insertion dataflow
This moves mutation entirely out of the main algorithm.

The immediate trigger is that we hit another case of the same issue I thought we'd fixed in 72925d9. It turns out we hadn't considered the cross block case.

As a brief summary, the issue being fixed is that if we mutate a previous vsetvli in phase 3, there's a possibility that some later use of that vsetvli changes "compatibility". In the cross_block_mutate test, this later vsetvli occurs in another block (and is thus visit order dependent too!). This causes us to fail strict asserts. (To be explicit, the current on by default workaround should compensate. It's only when we turn that off that we have problems.)

Now, I want to explicitly call out an alternate workaround. We could leave the mutation in phase 3, and simplify restrict it to the case where the previous vsetvli's GPR result is unused. That covers the case we've actually seen. (I'll note that codegen regressions with a simple form of this were significant. We might have to check specifically for the use outside block case to keep them reasonable, which complicates the workaround slightly.)

Personally, I'm at the point where I want the mutation pulled out just for robustness sake. I'm worried there's yet one more form of this bug we haven't thought about.

The other motivation for this change is that it does give us a couple of minor codegen wins. None appear to be hugely significant, but improvements never hurt right?

Differential Revision: https://reviews.llvm.org/D125270
2022-05-25 10:51:14 -07:00
Craig Topper 172149e98c [RISCV] Preserve fast math flags in lowerVPOp.
Update test to check MIR after finalize-isel instead of debug output.

This is of course not the only place we should preserve FMF, but
it's the most obvious one.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D126306
2022-05-25 09:16:07 -07:00
Philip Reames 2a3b6f2cba [RISCV] Hoist VSETVLI vlmax, vtype out of scalable loops
This is a straight forward extension of the PRE transform introduced in D124869 to handle the VLMAX case.

The test changes here look quite positive. This surprised me until I realized that all the tests are using @llvm.vscale to figure out the VLMAX, not the llvm.riscv.vsetvlmax intrinsic. If they'd used the later, these would have been full redundancy cases and fully handled by the data flow. I'm not really sure if use of vscale here is representative or not. If it is, we should probably look at using VSETVLI to lower vscale rather than a raw read of vlenb and some math.

Differential Revision: https://reviews.llvm.org/D126338
2022-05-25 08:00:27 -07:00
Philip Reames dd336b6891 [RISCV] Restructure comment and add clarifying assert to getFrameIndexReference [NFC]
Differential Revision: https://reviews.llvm.org/D126088
2022-05-25 07:59:27 -07:00
Lewis Revill 29a5a7c6d4 [RISCV] Add pre-emit pass to make more instructions compressible
When optimizing for size, this pass searches for instructions that are
prevented from being compressed by one of the following:

1. The use of a single uncompressed register.
2. A base register + offset where the offset is too large to be
   compressed and the base register may or may not already be compressed.

In the first case, if there is a compressed register available, then the
uncompressed register is copied to the compressed register and its uses
replaced. This is only done if there are enough uses that code size
would be improved.

In the second case, if a compressed register is available, then the
original base register is copied and adjusted such that:

new_base_register = base_register + adjustment
base_register + large_offset = new_base_register + small_offset

and the uses of the base register are replaced with the new base
register. Again this is only done if there are enough uses for code size
to be improved.

This pass was authored by Lewis Revill, with large offset optimization
added by Craig Blackmore.

Differential Revision: https://reviews.llvm.org/D92105
2022-05-25 09:25:02 +01:00
Craig Topper 66db5312bd [RISCV] Fix vnsrl/vnsra isel patterns that are dropping VL.
We were incorrectly using VLMax instead of the passed VL.

Reviewed By: khchen, reames

Differential Revision: https://reviews.llvm.org/D126319
2022-05-24 21:38:59 -07:00
Fraser Cormack fd93736657 [RISCV] Replace untested code with assert
We found untested code where negative frame indices were ostensibly
handled despite it being in a block guarded by !MFI.isFixedObjectIndex.

While the implementation of MachineFrameInfo::isFixedObjectIndex
suggests this is possible (i.e., if a frame index was more negative - less than the
number of fixed objects), I couldn't find any test in tree -- for any
target -- where a negative frame index wasn't also a fixed object
offset. I couldn't find a way of creating such a object with the
public MachineFrameInfo creation APIs. Even
MachineFrameInfo::getObjectIndexBegin starts counting at the negative
number of fixed objects, so such frame indices wouldn't be covered by
loops using the provided begin/end methods.

Given all this, an assert that any object encountered in the block is
non-negative seems reasonable.

Reviewed By: StephenFan, kito-cheng

Differential Revision: https://reviews.llvm.org/D126278
2022-05-25 05:03:53 +01:00
Philip Reames 948d931323 [RISCV] Ensure the forwarded AVL register is alive
When the AVL value does not fit in 5 bits, the register in which this value is stored may be dead when we want to forward it. This patch ensure the kill flags on the register are cleared before forwarding.

Patch by: loralb
Differential Revision: https://reviews.llvm.org/D125971
2022-05-24 15:07:42 -07:00
Craig Topper d2ee2c9c8d [RISCV] Add an operand kind to the opcode/imm returned from RISCVMatInt.
Instead of matching opcodes to know the format to emit, use an
enum value that we can get from the RISCVMatInt::Inst class.

Change the consumers to use fully covered switches so that we get
a compiler warning if a new kind is added. With the opcode checks
it was easier to forget to update one of the 3 consumers.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126317
2022-05-24 14:56:29 -07:00
Philip Reames fb948572e0 [riscv] Use getFirstInstrTerminator [nfc] 2022-05-24 14:56:01 -07:00
Philip Reames a95ecb20bc [RISCV] Hoist VSETVLI out of idiomatic fixed length vector loops
This patch teaches the VSETVLI insertion pass to perform a very limited form of partial redundancy elimination. The motivating example comes from the fixed length vectorization of a simple loop such as:

for (unsigned i = 0; i < a_len; i++)
    a[i] += b;

Without this change, the core vector loop and preheader is as follows:

.LBB0_3:                                # %vector.ph
	andi	a1, a6, -8
	addi	a4, a0, 16
	mv	a5, a1
.LBB0_4:                                # %vector.body
                                        # =>This Inner Loop Header: Depth=1
	addi	a3, a4, -16
	vsetivli	zero, 4, e32, m1, ta, mu
	vle32.v	v8, (a3)
	vle32.v	v9, (a4)
	vadd.vx	v8, v8, a2
	vadd.vx	v9, v9, a2
	vse32.v	v8, (a3)
	vse32.v	v9, (a4)
	addi	a5, a5, -8
	addi	a4, a4, 32
	bnez	a5, .LBB0_4

The key thing to note here is that, the execution of the vsetivli only needs to happen once. Since there's no tail folding happening here, the value of the vector configuration registers are invariant through the loop.

After this patch, we hoist the configuration into the preheader and perform it once.

.LBB0_3:                                # %vector.ph
	andi	a1, a6, -8
	vsetivli	zero, 4, e32, m1, ta, mu
	addi	a4, a0, 16
	mv	a5, a1
.LBB0_4:                                # %vector.body
                                        # =>This Inner Loop Header: Depth=1
	addi	a3, a4, -16
	vle32.v	v8, (a3)
	vle32.v	v9, (a4)
	vadd.vx	v8, v8, a2
	vadd.vx	v9, v9, a2
	vse32.v	v8, (a3)
	vse32.v	v9, (a4)
	addi	a5, a5, -8
	addi	a4, a4, 32
	bnez	a5, .LBB0_4

Differential Revision: https://reviews.llvm.org/D124869
2022-05-24 14:56:01 -07:00
Craig Topper 415b9f595d Recommit "[RISCV] Use selectShiftMaskXLen ComplexPattern for isel of rotates."
This reverts commit dfe513ae1b.

Tests have been changed to avoid the type legalization bug being
fixed in D126036.

Original commit message:
This will remove masks on the shift amount. We usually get this with
SimplifyDemandedBits in DAGCombine, but that's restricted to cases
where the AND has a single use. selectShiftMaskXLen does not have
that restriction.
2022-05-24 09:41:04 -07:00
Fraser Cormack 08c9fb8447 [RISCV] Ensure the entire stack is aligned to the RVV stack alignment
This patch fixes another bug in the RVV frame lowering. While some frame
objects with non-default stack IDs (such scalable-vector alloca
instructions) are considered in the target-independent max alignment
calculations, others (for example, during calling-convention lowering)
are not. This means we'd occasionally align the base of the stack to
only 16 bytes, with no way to ensure that the RVV section contained
within that is aligned to anything higher.

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D125973
2022-05-24 06:58:51 +01:00
Fraser Cormack cb8681a2b3 [RISCV] Fix RVV stack frame alignment bugs
This patch addresses several alignment issues in the stack frame when
RVV objects are taken into account.

One bug is that the RVV stack was never guaranteed to keep the alignment
of the stack *as a whole*. We must maintain a 16-byte aligned stack at
all times, especially when calling other functions. With the standard V
extension, this is conveniently happening since VLEN is at least 128 and
always 16-byte aligned. However, we support Zvl64b which does not
guarantee this. To fix this, the RVV stack size is rounded up to be
aligned to 16 bytes. This in practice generally makes us allocate a
stack sized at least 2*VLEN in size, and a multiple of 2.

    |------------------------------| -- <-- FP
    | 8-byte callee-save           | |      |
    |------------------------------| |      |
    | one VLENB-sized RVV object   | |      |
    |------------------------------| |      |
    | 8-byte local variable        | |      |
    |------------------------------| -- <-- SP (must be aligned to 16)

In the example above, with Zvl64b we are decrementing SP by 12 bytes
which does not leave SP correctly aligned. We therefore introduce an
extra VLENB-sized amount used for alignment. This would therefore ensure
the total stack size was 16 bytes (48 for Zvl128b, 80 for Zvl256b, etc):

    |------------------------------| -- <-- FP
    | 8-byte callee-save           | |      |
    |------------------------------| |      |
    | one VLENB-sized padding obj  | |      |
    | one VLENB-sized RVV object   | |      |
    |------------------------------| |      |
    | 8-byte local variable        | |      |
    |------------------------------| -- <-- SP

A new RVV invariant has been introduced in this patch, which is that the
base of the RVV stack itself is now always aligned to 16 bytes, not 8 as
before. This keeps us more in line with the scalar stack and should be
easier to reason about. The calculation of the RVV padding has thus
changed to be the amount required to align the scalar local variable
section to the RVV section's alignment. This amount is further rounded
up when setting up the initial stack to keep everything aligned:

    |------------------------------| -- <-- FP
    | 8-byte callee-save           |
    |------------------------------|
    |                              |
    | RVV objects                  |
    | (aligned to at least 16)     |
    |                              |
    |------------------------------|
    | RVV padding of 8 bytes       |
    |------------------------------|
    | 8-byte local variable        |
    |------------------------------| -- <-- SP

In the example above, it's clear that we need 8 bytes of padding to keep
the RVV section aligned to 16 when using SP. But to keep SP *itself*
aligned to 16 we can't decrement the initial stack pointer by 24 - we
have to round up to 32.

With the RVV section correctly aligned, the second bug fixed by
this patch is that RVV objects themselves are now correctly aligned. We
were previously only guaranteeing an alignment of 8 bytes, even if they
required a higher alignment. This is relatively simple and in practice
we see more rounding up of VLEN amounts to account for alignment in
between objects:

    |------------------------------|
    | RVV object (aligned to 16)   |
    |------------------------------|
    | no padding necessary         |
    |------------------------------|
    | 2*VLENB RVV object (align 16)|
    |------------------------------|
    | VLENB alignment padding      |
    |------------------------------|
    | RVV object (align 32)        |
    |------------------------------|
    | 3*VLENB alignment padding    |
    |------------------------------|
    | VLENB RVV object (align 32)  |
    |------------------------------| -- <-- base of RVV section

Note that a lot of the regressions in codegen owing to the new alignment
rules are correct but actually only strictly necessary for Zvl64b (and
Zvl32b but that's not really supported). I plan a follow-up patch to
take the known VLEN into account when padding for alignment.

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D125787
2022-05-24 06:53:51 +01:00
Peter Waller ade47bdc31 [LV] Improve register pressure estimate at high VFs
Previously, `getRegUsageForType` was implemented using
`getTypeLegalizationCost`.  `getRegUsageForType` is used by the loop
vectorizer to estimate the register pressure caused by using a vector
type.  However, `getTypeLegalizationCost` currently only appears to
understand splitting and not scalarization, so significantly
underestimates the register requirements.

Instead, use `getNumRegisters`, which understands when scalarization
can occur (via computeRegisterProperties).

This was discovered while investigating D118979 (Set maximum VF with
shouldMaximizeVectorBandwidth), where under fixed-length 512-bit SVE the
loop vectorizer previously ends up costing an v128i1 as 2 v64i*
registers where it actually occupies 128 i32 registers.

I'm sending this patch early for comment, I'm still doing some sanity checking
with LNT.  I note that getRegisterClassForType appears to return VectorRC even
though the type in question (large vNi1 types) end up occupying scalar
registers. That might be worth fixing too.

Differential Revision: https://reviews.llvm.org/D125918
2022-05-23 07:57:45 +00:00
Paul Walker 258dac43d6 [SVE] Enable use of 32bit gather/scatter indices for fixed length vectors
Differential Revision: https://reviews.llvm.org/D125193
2022-05-22 12:32:30 +01:00
Fraser Cormack d60ae47f9d [RISCV] Fix logic for determining RVV stack padding
We must add padding when using SP or BP to access stack objects.
Checking whether we're missing FP is not sufficient as stack realignment
uses SP too. The test in D125962 explains the specific issue in more
detail.

Split from D125787.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D125964
2022-05-20 13:18:52 +01:00
jacquesguan 8fc4fcecb8 [RISCV] Add VL patterns for vector widening floating-point fused multiply-add instructions.
This patch adds VL patterns for vector widening floating-point fused multiply-add instructions to support fixed length vector type.

Differential Revision: https://reviews.llvm.org/D124505
2022-05-20 06:56:48 +00:00
Craig Topper dfe513ae1b Revert "[RISCV] Use selectShiftMaskXLen ComplexPattern for isel of rotates."
This reverts commit 86f7d7074a.

The test cases added for this exposed an pre-existing bug that is failing
the expensive checks bot. Reverting so I can revert that patch.
2022-05-19 14:39:38 -07:00
Jay Foad 6bec3e9303 [APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.

The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.

Differential Revision: https://reviews.llvm.org/D125557
2022-05-19 11:23:13 +01:00
Zi Xuan Wu (Zeson) 861489af1b [NFC][RISCV] Enable TuneNoDefaultUnroll feature to control targets which use default unroll preference
In RISCVTargetTransformInfo, enumerating the processor family is not a good way to predict.
Because it needs to enumerate many subtarget family and is hard to update if add new subtarget.
Instead, create a feature to distinguish whether targets want to use default unroll preference or not.

Keep TuneSiFive7 because it's flag to indicate subtarget family, which may used in other place.

Differential Revision: https://reviews.llvm.org/D125741
2022-05-19 12:21:49 +08:00
Craig Topper 86f7d7074a [RISCV] Use selectShiftMaskXLen ComplexPattern for isel of rotates.
This will remove masks on the shift amount. We usually get this with
SimplifyDemandedBits in DAGCombine, but that's restricted to cases
where the AND has a single use. selectShiftMaskXLen does not have
that restriction.
2022-05-18 10:23:29 -07:00
Philip Reames d4545e6fa0 Revert "[RISCV] Enable strict assertions in InsertVSETVLI data flow"
This reverts commit 79a66ec97b.

The stronger asserts served their purpose; I stumbled across another bug.  Will reapply once this one is also fixed.

The bug appears to be a variant of a previous one:
* We mutate an instruction in one block.
* That mutation changes the phase3 results of another block.

This is very similiar to a previous issue, except cross block instead of within a single block.
2022-05-17 15:53:13 -07:00
Philip Reames 118c5d1c97 [RISCV] Minor reorganization of VSETVLIInfo::operator== for readability [NFC] 2022-05-17 12:05:17 -07:00
Philip Reames 11a7e77c95 [RISCV] Canonicalize AVL=setvli to AVL=Imm or AVL=VLMAX
This patch adds a transform to the local prepass in InsertVSETVLI which canonicalizes an AVL of a register from another vsetvli into immediate or VLMAX when VTYPE is the same. In this patch, I chose to be conservative and avoid arbitrary vreg forwarding due to profitability concerns about possibility overlapping live ranges.

This has the effect of eliminating vsetvli instructions in loops which are walking either VLMAX or a constant number of lanes per iteration.

Differential Revision: https://reviews.llvm.org/D125812
2022-05-17 11:46:22 -07:00
Philip Reames 79a66ec97b [RISCV] Enable strict assertions in InsertVSETVLI data flow
These asserts are believed to hold after several recent miscompiles have been fixed.  If you see an assertion failure on this change, please toggle the default back and make sure you file a bug with a reproducer.  We may have as yet uncaught miscompiles lurking in this code.

Differential Revision: https://reviews.llvm.org/D125271
2022-05-17 11:12:31 -07:00
Fraser Cormack 8430b82741 [RISCV] Drop notion of "strict" vsetvli compatibility
With recent fixes to the dataflow in place, we now never pass
Strict=true to isCompatible, so remove the parameter completely.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D125748
2022-05-17 15:24:23 +01:00
Fraser Cormack f00f894d5d [RISCV][NFC] Reword split SP adjustment comments 2022-05-17 10:03:21 +01:00
Fraser Cormack 05ad4d4f38 [RISCV][NFC] Fix comment typos in split SP adjustment 2022-05-17 09:56:54 +01:00
Philip Reames 1474880353 [RISCV] Use classic dataflow for VSETVLI insertion
Our current implementation of the InsertVSETVLI dataflow allows phase 3 to arrive at a different block end state than the data flow in phase 1/2 computed. This arises because a block which contains instructions (e.g. load or stores) which don't consume all the incoming bits of the VL/VTYPE can be compatible with multiple incoming states. The algorithm effectively changes the SEW on such instructions, and propagates the prior state forward. As phase 3 uses the block input state for this propagation, but phase 1/2 doesn't, this can result in different block end states.

If we don't correct for it, this discrepancy can result in miscompiles. This was the source of multiple recent bugs. However, by now we have fixes for all known correctness issues.

The basic strategy we use is to insert a compensation vsetvli to bring the block state leaving the block back into consistency with the one computed. This is correct, but results in extra vsetvlis being placed at the end of blocks.

This change adjusts the phase 1/2 algorithm to propagate the incoming block state through the block, allowing the compatibility rules to modify the end state. The algorithm may need to run slightly more iterations, but the end result is consistent with what phase 3 does.

The benefit of doing this is two fold.

First, we reverse some of the code quality introductions introduced in the functional fixes.

Second, we simplify the invariants, and allow the strict assertions to be enabled. Several humans, myself included, have found it quite surprising that invariant didn't hold already, and arguably that confusion is the cause of several of our recent miscompiles in this code.

The downside to this patch is that the dataflow may require additional iterations to stabilize. In the worse case, we go from O(Edges) to O(E + UniquePaths) as the incoming state (and thus the outgoing one) can now change once for each path from the entry block.

Differential Revision: https://reviews.llvm.org/D125232
2022-05-16 17:06:27 -07:00
Philip Reames 3d17c91709 [RISCV] Fix missing vsetvli in transparent block case
We've got a lurking problem with our data flow implementation where different phases disagree, resulting in possible miscompiles. D119518 introduced a workaround, but failed to consider blocks which only contain load/stores compatible with their incoming state.

When I went to rebase and simplify D125232, it turned out that not all of the correctness issues had been fixed yet after all. This is the correctness fix accidentally embedded in the original more complicated version.

Note that the test changes here are mostly regressions. It's worth noting that the simplified version of D125232 exactly reverses all the non-functional diffs in the test caused here. D125232 should be the immediate following commit.

Differential Revision: https://reviews.llvm.org/D125703
2022-05-16 17:06:27 -07:00
Philip Reames 7dbf2e7b57 Teach PeepholeOpt to eliminate redundant copy from constant physreg (e.g VLENB on RISCV)
The existing redundant copy elimination required a virtual register source, but the same logic works for any physreg where we don't have to worry about clobbers.  On RISCV, this helps eliminate redundant CSR reads from VLENB.

Differential Revision: https://reviews.llvm.org/D125564
2022-05-16 16:38:30 -07:00
Paul Walker 7dd05ba9ed [SelectionDAG] Remove duplicate "is scaled" information from gather/scatter SDNodes.
During early gather/scatter enablement two different approaches
were taken to represent scaled indices:

* A Scale operand whereby byte_offsets = Index * Scale
* An IndexType whereby byte_offsets = Index * sizeof(MemVT.ElementType)

Having multiple representations is bad as shown by this patch which
fixes instances where the two are out of sync. The dedicated scale
operand is more flexible and pervasive so this patch removes the
UNSCALED values from IndexType. This means all indices are scaled
but the scale can be one, hence unscaled. SDNodes now use the scale
operand to answer the "isScaledIndex" question.

I toyed with the idea of keeping the UNSCALED enums and helper
functions but because they will have no uses and force SDNodes to
validate the set of supported values I figured it's best to remove
them. We can re-add them if there's a real need. For similar
reasons I've kept the IndexType enum when a bool could be used as I
think being explicitly looks better.

Depends On D123347

Differential Revision: https://reviews.llvm.org/D123381
2022-05-16 20:47:52 +01:00
Philip Reames e2df48bb23 [RISCV] Add further trace output to InsertVSETLVI 2022-05-16 09:15:32 -07:00
Liqin.Weng d95513ae3a [RISCV] remove useless code
When legality check for vectoring reduction, hasVInstructions() check be unneeded. RISCV can only loop vectorization with hasVInstructions()

Reviewed By: kito-cheng, craig.topper

Differential Revision: https://reviews.llvm.org/D125460
2022-05-16 12:54:03 +00:00
jacquesguan a8426ada49 [RISCV][NFC] Replace for-each with array argument call.
This patch replaces some for-each set with the new arrayref argument API, since it already used an array in defination, I think this change won't cause any ambiguity.

Differential Revision: https://reviews.llvm.org/D125455
2022-05-16 02:12:48 +00:00
Zakk Chen 1878f240c9 [RISCV] Fix incorrect use of tail agnostic vslideup.
We need to use tail undisturbed for vslideup to implement
vector insert operation correctly.

Ideally, we cound use the tail agnostic when insert subvector
or element at the end of the vector. This will be in follow-up
patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125545
2022-05-15 18:32:21 -07:00
Sheng c644488a8b Rename `MCFixedLenDisassembler.h` as `MCDecoderOps.h`
The name `MCFixedLenDisassembler.h` is out of date after D120958.

Rename it as `MCDecoderOps.h` to reflect the change.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D124987
2022-05-15 08:44:58 +08:00
Craig Topper 5a19fbad83 [RISCV] Remove unneeded check for ISD::VSCALE operand being a constant. NFC
ISD::VSCALE only allows constant operands.
2022-05-14 13:45:03 -07:00
Roger Ferrer Ibanez 189ca6958e [RISCV] Use the new chain when converting a fixed RVV load
When building the final merged node, we were using the original chain
rather than the output chain of the new operation. After some collapsing
of the chain this could cause the loads be incorrectly scheduled respect
to later stores.

This was uncovered by SingleSource/Regression/C/gcc-c-torture/execute/pr36038.c
of the llvm testsuite.

https://reviews.llvm.org/D125560
2022-05-13 22:21:08 +00:00
Craig Topper a2918976cd Revert "[RISCV] Enable subregister liveness tracking for RVV."
This reverts most of ed242b54c9

I'm seeing failures in our intrinsic testing on qemu that seem
related to this. Reverting while I investigate.

I've left the command line option in place for directed testing.
It defaults to off.
2022-05-13 10:59:58 -07:00
Philip Reames 853fa8ee22 [RISCV] Address post-commit feedback from af5e09b 2022-05-13 09:51:23 -07:00
Philip Reames af5e09b7d9 [RISCV] Add llvm.read.register support for vlenb
This patch adds minimal support for lowering an read.register intrinsic with vlenb as the argument. Note that vlenb is an implementation constant, so it is never allocatable.

This was split off a patch to eventually replace PseudoReadVLENB with a COPY MI because doing so revealed a couple of optimization opportunities which really seemed to warrant individual patches and tests. To write those patches, I need a way to write the tests involving vlenb, and read.register seemed like the right testing hook.

Differential Revision: https://reviews.llvm.org/D125552
2022-05-13 09:12:02 -07:00
Zakk Chen 7dfc56c107 [RISCV] Add the passthru operand for RVV unmasked segment load IR intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125323
2022-05-13 02:16:40 -07:00
Philip Reames 52b5f1f7d4 [RISCV] Extend dataflow workaround from D119518 to fallthrough blocks
We've got a lurking problem with our data flow implementation where different phases disagree, resulting in possible miscompiles. D119518 introduced a workaround, but failed to consider blocks without terminators (e.g. fallthroughs).

I have a deeper rework of the algorithm in flight over in D125232, but this patch is specifically a minimal fix for an active miscompile. That change can be reworked over this once landed.

Differential Revision: https://reviews.llvm.org/D125408
2022-05-12 10:45:59 -07:00
Craig Topper 40e9654511 [RISCV] Use tail agnostic policy when selecting riscv_fma_vl to instructions
riscv_fma_vl doesn't have a tail, so use the tail_agnostic policy.

We were already doing this for some patterns. I think the patterns
with fneg and mask were added later and I copied the tail policy
from the unmasked patterns.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D125424
2022-05-12 09:09:24 -07:00
Craig Topper ed242b54c9 [RISCV] Enable subregister liveness tracking for RVV.
RVV makes heavy use of subregisters due to LMUL>1 and segment
load/store tuples. Enabling subregister liveness tracking improves the quality
of the register allocation.

I've added a command line that can be used to turn it off if it causes compile
time or functional issues. I used the command line to keep the old behavior
for one interesting test case that was testing register allocation.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125108
2022-05-11 12:49:03 -07:00
Craig Topper 5c7ec998a9 [RISCV] Fold addiw from (add X, (addiw (lui C1, C2))) into load/store address
This is a followup to D124231.

We can fold the ADDIW in this pattern if we can prove that LUI+ADDI
would have produced the same result as LUI+ADDIW.

This pattern occurs because constant materialization prefers LUI+ADDIW
for all simm32 immediates. Only immediates in the range
0x7ffff800-0x7fffffff require an ADDIW. Other simm32 immediates
work with LUI+ADDI.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D124693
2022-05-11 12:47:13 -07:00
Craig Topper f499ec6b3d [RISCV] Add caching to the gather/scatter to strided load/store conversion.
If we have multiple gather/scatter instructions using the same the
same strided address we would scalarize it multiple times. I guess
a later pass cleans this up, but I don't know if that's guaranteed.

This patch adds a cache to remember the scalarization we already
created for a previous gather/scatter.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125326
2022-05-11 11:47:27 -07:00
Craig Topper 09f48c6b80 [RISCV] Move implementation of getVLOpNum and getSEWOpNum from RISCVInsertVSETVLI to RISCVBaseInfo.h. NFC
We should consolidate the operand counting and ordering into
RISCVBaseInfo.h and stop spreading it around.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D125344
2022-05-11 11:14:58 -07:00
Craig Topper 0ebb02b90a [RISCV] Override TargetLowering::shouldProduceAndByConstByHoistingConstFromShiftsLHSOfAnd.
This hook determines if SimplifySetcc transforms (X & (C l>>/<< Y))
==/!= 0 into ((X <</l>> Y) & C) ==/!= 0. Where C is a constant and
X might be a constant.

The default implementation favors doing the transform if X is not
a constant. Otherwise the code is left alone. There is a provision
that if the target supports a bit test instruction then the transform
will favor ((1 << Y) & X) ==/!= 0. RISCV does not say it has a variable
bit test operation.

RISCV with Zbs does have a BEXT instruction that performs (X >> Y) & 1.
Without Zbs, (X >> Y) & 1 still looks preferable to ((1 << Y) & X) since
we can fold use ANDI instead of putting a 1 in a register for SLL.

This patch overrides this hook to favor bit extract patterns and
otherwise falls back to the "do the transform if X is not a constant"
heuristic.

I've added tests where both C and X are constants with both the shl form
and lshr form. I've also added a test for a switch statement that lowers
to a bit test. That was my original motivation for looking at this.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D124639
2022-05-11 11:13:17 -07:00
Craig Topper 0781742785 [RISCV] Add a DAG combine to pre-promote (i32 (and (srl X, Y), 1)) with Zbs on RV64.
Type legalization will want to turn (srl X, Y) into RISCVISD::SRLW,
which will prevent us from using a BEXT instruction.

I don't think there is any precedent for type promotion checking
users to decide how to promote. Instead, I've added this DAG combine to
do it before type legalization.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D124109
2022-05-11 10:49:16 -07:00
Philip Reames 72925d98bf [riscv] Canonicalize vsetvli (vsetvli avl, vtype1) vtype2 transitionsas reviewed
This patch is an alternative to a piece of D125270. If we have one vsetvli which is using as AVL the output of another, and the prior AVL can be proven to produce the same VL value as that defining one, we can use the AVL from the prior instruction. This has the effect of removing a state transition on AVL, and will let us use the cheaper 'vsetvli x0, x0, vtype1' form or possible even skip emitting it entirely.

This builds on the same infrastructure as D125337, and does the analogous extension to working on abstract states instead of only prior explicit vsetvli instructions. This is where the (relatively minor) code improvements come from.

More importantly, this fixes the last case where the state computed in phase 1 and 2 of the algorithm differs from the state computed during phase 3. Note that such differences can cause miscompiles by creating disagreements about contents of the VL and VTYPE registers at block boundaries.

Doing this transform inside the dataflow can cause the compatibility of a later store to change with regards to the current state. test15 in the diff illustrates this case well. What we have is a vsetvli which is mutated by one following vector op, but whose GPR result is used by another. The compatibility logic walks back to the def in this case, and checks to see if it matches the immediate prior state. In phase 1 and 2, it doesn't, and in phase 3 (after mutation) it does because we remove a transition which caused it to differ.

Differential Revision: https://reviews.llvm.org/D125392
2022-05-11 10:45:29 -07:00
Philip Reames cc0283a635 [riscv] Prefer to use previous VL for scalar move instructionsK
This patch is an alternative to a piece of D125270. Its direct motivation is to fix a wrong code bug (described below), but somewhat unexpectedly, it also results in a significant code quality improvement for idiomatic fixed length vector patterns.

The existing transform is simply wrong in its current location. We are correct about the fact that the scalar move itself can use the previous vsetvli, but we loose track of the fact that later instructions might depend on the state change represented. That is, the actual value of VL in the register is different than the abstract state thinks it is. Not simply due to precision of modeling, but e.g. the VL register could contain 3 when the abstract state says it is 1. This is annoying hard to demonstrate in practice due to differences in policy flags on the intrinsics, but this is at least a latent wrong code bug.

The code quality benefit comes from the fact we don't need to tie this to explicit vsetvli instructions at all. We can propagate the abstract state, and reduce a) the number of transitions, or b) the cost of those transitions. It turns out we have a bunch of cases - in tests at least - where fixed length AVLs are known non-zero, and we can leave VL unchanged while changing VTYPE.

Differential Revision: https://reviews.llvm.org/D125337
2022-05-11 07:37:50 -07:00
Fraser Cormack 27c7e922fe [RISCV][NFC] Rename variable to appease code style 2022-05-11 12:41:25 +01:00
Fraser Cormack 874b802a6d [RISCV][NFC] Move variable down closer to its first use 2022-05-11 12:33:01 +01:00
Fraser Cormack c1d48b35d8 [SelectionDAG][VP] Rename VP sext/zext/trunc ISD opcodes
Rather than VP_SEXT/VP_ZEXT/VP_TRUNC, having
VP_SIGN_EXTEND/VP_ZERO_EXTEND/VP_TRUNCATE better matches their non-VP
counterparts.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125298
2022-05-11 10:25:51 +01:00
Yeting Kuo 4537aae0d5 [RISCV] Make PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF.
The patch make PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF.
It's useful to get the vtypes of locations of PseudoReadVL without finding the
corresponding VLEFF/VLSEGFF.
It could simplify optimizations in RISCVInsertVSETVLI like D123581.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125199
2022-05-11 14:07:58 +08:00
jacquesguan 2509dcd58a [RISCV] Add rvv codegen support for vp.fpext.
This patch adds rvv codegen support for vp.fpext. The lowering of fp_round, vp.fptrunc, fp_extend and vp.fpext share most code so use a common lowering function to handle these four.
And this patch changes the intermediate cast from ISD::FP_EXTEND/ISD::FP_ROUND to the RVV VL version op RISCVISD::FP_EXTEND_VL and RISCVISD::FP_ROUND_VL for scalable vectors.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D123975
2022-05-11 03:28:25 +00:00
Philip Reames 7731935ffc [riscv] Consolidate logic for SEW/VL operand offset calculations [nfc] 2022-05-10 15:06:26 -07:00
Philip Reames 413052310a [riscv] Minor style cleanup so that code more obviously matches comments [nfc] 2022-05-10 14:20:26 -07:00
Fraser Cormack 0b2e7a7c72 [RISCV][NFC] Remove else after continue 2022-05-10 11:15:50 +01:00
Fraser Cormack 3b9a231d25 [RISCV] Remove two unmasked RVV patterns
These can be selected to unmasked from masked instructions by the
post-process DAG step.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125239
2022-05-09 16:54:24 +01:00
Philip Reames 70ad96ca5e [riscv, InsertVSETVLI] Rename InstrInfo to Require to more clearly indicate purpose [nfc] 2022-05-09 06:40:33 -07:00
Philip Reames 7ed16e7c51 [riscv] Fix state tracking bug on vsetvli (phi of vsetvli) peephole
This fixes the first of several cases where the state computed in phase 1 and 2 of the algorithm differs from the state computed during phase 3. Note that such differences can cause miscompiles by creating disagreements about contents of the VL and VTYPE registers at block boundaries.

In this particular case, we recognize that for the first vsetvli in a block, that if the AVL is a phi of GPR results from previous vsetvlis and the VTYPE field matches, we can avoid emitting a vsetvli as the register contents don't change. Unfortunately, the abstract state does change and that update was lost.

As noted in the test change, this can actually improve results by preserving information until later state transitions in the block. However, this minor codegen improvement is not the motivation for the patch. The motivation is to avoid cases a case where we break a key internal correctness invariant.

Differential Revision: https://reviews.llvm.org/D125133
2022-05-09 06:21:45 -07:00
Philip Reames c7c3f58544 [riscv] Use early return to reduce nesting for InsertVSETVLI [nfc] 2022-05-06 13:10:05 -07:00
Philip Reames 99a41005fe [riscv] Add early return to InsertVSETLI fixed point step [nfc]
If the income state hasn't changed, and the step function is fixed by assumption, then the output state can't have changed.

In the current algorithm, this is a very minor win and mostly allows adding tracing output without being horrible verbose.
2022-05-06 13:08:11 -07:00
Philip Reames dee9b01d83 [riscv] Add some minimal tracing output to InsertVSETVLI
Only available with -debug.  Main purpose is simplifying an upcoming change, and providing tools for debugging problems.
2022-05-06 13:08:11 -07:00
Philip Reames f486119ce9 [riscv] Add strict asserts for VSETVLI insertion algorithm to help catch bugs
This assertion should hold for any reasonable data flow algorithm, but is known not to in several cases today. I'd like to go ahead and land this off-by-default, so that we can collaborate on fixes and have a common definition of success.

Differential: https://reviews.llvm.org/D125035
2022-05-06 10:28:22 -07:00
wangpc 4ff5e8184c [RISCV] Enable MachineOutliner by default under -Oz for RISCV
Enable default outlining when the function has the minsize attribute.

`addr-label.ll` crashed after enabling this, so a barrier is added before
instruction selection as a workaround.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D122213
2022-05-06 17:37:45 +08:00
Philip Reames 042a7a5f0d [riscv] Use X0 for destination of VSETVLI instruction if result unused
If the GPR destination register of a VSETVLI instruction is unused, we can replace it with X0. This discards the result, and thus reduces register pressure.

Since after the core insertion/lowering algorithm has run, many user written VSETVLIs will have their GPR result unused (as VTYPE/VLEN is now explicitly read instead), this kicks in for most tests which involve a vsetvli intrinsic for fixed length vectorization. (vscale vectorization generally uses the GPR result to know how far to e.g. advance pointers in a loop and these uses are not removed.)  When inserting VSETVLIs to lower psuedos, we prefer the X0 form anyways.

Differential Revision: https://reviews.llvm.org/D124961
2022-05-05 07:39:45 -07:00
Lian Wang 8bb10436ab [RISCV][NFC] Use true_mask replace riscv_vmset_vl in defined patterns.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124660
2022-05-05 03:05:52 +00:00
Craig Topper 60cb489685 [RISCV] Use movImm went multiplying by simm12 in getVLENFactoredAmount.
No reason to special case simm12, movImm handles all immediates.

This also fixe a bug that we weren't passing the frame-setup/destroy
flag to movImm when we were calling it.
2022-05-04 17:23:22 -07:00
Philip Reames 18ed2ee80c [RISCV] Add a version of insertVSETVLI which uses an iterator [NFC]
This is to simplify the final version of D124869.
2022-05-04 14:48:31 -07:00
Craig Topper 411bb42eed [RISCV] Add a special case to treat riscv-v-vector-bits-min=-1 as meaning use Zvl*b value.
riscv-v-vector-bits-min is primarily used to opt-in to the
autovectorizer. The vector width can be determined from Zvl*b.

This patch adds support treating -1 as meaning use Zvl*b so we can
still opt-in to autovectorization without needing to repeat a
vector width already given by Zvl*b or -mcpu.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D124960
2022-05-04 14:26:45 -07:00
Craig Topper 1d6430b9e2 [RISCV] Update isLegalAddressingMode for RVV.
RVV instructions only support base register addressing.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D124820
2022-05-03 19:49:11 -07:00
Craig Topper 9cce9a126c [RISCV] Make use of SHXADD instructions in RVV spill/reload code.
We can use SH1ADD, SH2ADD, SH3ADD to multipy by 3, 5, and 9 respectively.

We could extend this to 3, 5, or 9 multiplied by a power 2 by also
emitting a SLLI.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D124824
2022-05-03 19:35:21 -07:00
Craig Topper 0971819740 [RISCV] Don't lookup TII in RISCVInstrInfo::getVLENFactoredAmount. NFCI
We're already inside of our implementation of TII.
2022-05-03 19:35:21 -07:00
Weverything 5afd20806d [riscv] Mark function as used to avoid unused warning. 2022-05-03 18:51:23 -07:00
Philip Reames 2982d0032b Fix a buildbot warning [nfc] 2022-05-03 14:40:27 -07:00
Philip Reames be50b8c185 [riscv] Add debug printing support for VSETVLIInfo class [nfc] 2022-05-03 14:00:17 -07:00
Hsiangkai Wang eaaa31ff2c [RISCV][TargetLowering] Special case overflow expansion for (uaddo X, C).
Follow-up to D122933.

Differential Revision: https://reviews.llvm.org/D124374
2022-05-03 03:51:36 +00:00
Craig Topper 72a66358f6 [RISCV] Add isCommutable to FADD/FMUL/FMIN/FMAX/FEQ.
Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D123972
2022-05-02 20:21:16 -07:00
Zakk Chen 5807e59a0a [RISCV] Fix incorrect codegen for masked vmsge{u}.vx with mask agnostic.
The result was totally wrong.
We could use mask undisturbed result to emulate the mask agnostic result.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124684
2022-05-02 17:57:29 -07:00
Fangrui Song 2019c9b1c8 [RISCV] Lower case the first letter of LowerRISCVMachineOperandToMCOperand. NFC 2022-05-01 14:13:55 -07:00
luxufan e098281c27 [RISCV] Don't getDebugLoc for the end node of MBB iterator
Because of shrink wrapping, the block to insert epilog may don't have
instructions (Only debug instructions). And the position to insert may
point to MBB.end() that don't have a DebugLoc. This patch fix this
problem.

The test program was copied from the issue:https://github.com/llvm/llvm-project/issues/53662

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D123679
2022-04-30 16:00:20 +08:00
Yeting Kuo c069e37019 [RISCV] Add DAGCombine to fold base operation and reduction.
Transform (<bop> x, (reduce.<bop> vec, splat(neutral_element))) to
(reduce.<bop> vec, splat (x)).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122563
2022-04-30 14:07:05 +08:00
Craig Topper f91690f7db [RISCV] Don't merge addi into load/store address if addi has a FrameIndex operand.
This fixes a crash from D124231.

We can't fold
  (load (add base, (addi src, off1)), off2)
     -> (load (add base, src), off1+off2)
if the src is a FrameIndex. FrameIndex cannot be the operand of an
add.

There was an immediate==0 check that I think was trying to catch
the common case of FrameIndex addis where the immediate is 0, but
they can also appear in non-zero form. Instead explicitly check
for a FrameIndex operand.
2022-04-29 18:22:20 -07:00
Craig Topper 5aa1a7b307 [RISCV] Remove 'frameindex' from list for ComplexPattern. NFC
Putting a node in this list allows the node to be used as the root
of an isel pattern that would then call the ComplexPattern. The
usual case is to use the ComplexPattern as the operand of another
operator.

AddrFI is never used as a root operation. frameindex is handled
directly with custom code in RISCVISelDAGToDAG::Select. So adding
frameindex to the list here serves no purpose.
2022-04-29 17:41:07 -07:00
Philip Reames 3ea191ed03 [RISCV] Factor repeating code into getMaskTypeFor(VT) [nfc] 2022-04-29 10:00:57 -07:00
Philip Reames f927be0df8 [RISCV] Extract getAllOnesMask helper [nfc] 2022-04-29 09:30:18 -07:00
Craig Topper 5c38373125 [RISCV] Improve constant materialization for cases that can use LUI+ADDI instead of LUI+ADDIW.
It's possible that we have a constant that isn't simm32 so we can't
use LUI+ADDIW, but we can use LUI+ADDI. Because ADDI uses a sign
extended constant, it's possible that after subtracting it out, we
end up with a simm32 that maps to LUI.

This patch detects this case after removing Lo12 and before shifting
the value for SLLI.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D124222
2022-04-29 08:58:32 -07:00
LiaoChunyu 03a3654203 [RISCV] Add cost model for SK_Broadcast
Add cost model for broadcast shuffle in RISCVTTIImpl::getShuffleCost
with scalable vector. The cost model might not the best.

For scalable vector, BasicTTIImpl::getShuffleCost return invalid cost,
so this patch relies on the existing cost model in BasicTTIImpl.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124101
2022-04-29 13:28:02 +08:00
Hsiangkai Wang c62b014db9 [RISCV] Merge addi into load/store as there is a ADD between them
This patch adds peephole optimizations for the following patterns:

(load (add base, (addi src, off1)), off2)
   -> (load (add base, src), off1+off2)
(store val, (add base, (addi src, off1)), off2)
   -> (store val, (add base, src), off1+off2)

Differential Revision: https://reviews.llvm.org/D124231
2022-04-29 04:33:05 +00:00
Craig Topper ec11fbb1d6 [RISCV] Use default promotion for (i32 (shl 1, X)) on RV64 when Zbs is enabled.
This improves opportunities to use bset/bclr/binv. Unfortunately,
there are no W versions of these instrcutions so this isn't always
a clear win. If we use SLLW we get free sign extend and shift masking,
but need to put a 1 in a register and can't remove an or/xor. If
we use bset/bclr/binv we remove the immediate materializationg and
logic op, but might need a mask on the shift amount and sext.w.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D124096
2022-04-28 09:58:30 -07:00
Craig Topper 8631a5e712 [RISCV] Fix alias printing for vmnot.m
By clearing the HasDummyMask flag from mask register binary operations
and mask load/store.

HasDummyMask was causing an extra operand to get appended when
converting from MachineInstr to MCInst. This extra operand doesn't
appear in the assembly string so was mostly ignored, but it prevented
the alias instruction printing from working correctly.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D124424
2022-04-28 08:33:52 -07:00
Lian Wang dc0ae8ce18 [RISCV] Support VP_SETCC mask operations
Support VP_SETCC mask operations, turn it to logical operation.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124438
2022-04-28 08:52:29 +00:00
Craig Topper c2614b31d9 [RISCV] Add isCommutable to scalar FMA instructions.
The default implementation of findCommutedOpIndices picks the
first two source operands. That's exactly what we want for the
scalar FMA instructions.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D124463
2022-04-27 11:07:18 -07:00
Jim Lin 9de7b93bc0 [RISCV][NFC] Update and add missing closed curly bracket comment in RISCVInstrInfoZb.td 2022-04-27 15:08:51 +08:00
ShihPo Hung 6b55f133fb [RISCV][RVV] Select unmasked TU RVV pseudos in a DAG post-process
Following D118810 that reduced the size of ISel table,
this patch optimizes allone-masked RVV pseudos with TU policy and
swap them out to their unmasked TU pseudos.

Since the UNDEF merge operand is not preserved, we turn it into TA
pseudo regardless of the policy operand.

Reviewed By: craig.topper, frasercrmck
Differential Revision: https://reviews.llvm.org/D121881
2022-04-26 20:14:54 -07:00
Vasileios Porpodas fa8a9fea47 Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`"
This reverts commit 6a9bbd9f20.

Code review: https://reviews.llvm.org/D124202
2022-04-26 14:02:40 -07:00
Shao-Ce SUN c59473aacc [NFC][RISCV][CodeGen] Use ArrayRef in TargetLowering functions
Based on D123467.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D123653
2022-04-26 23:53:00 +08:00
Craig Topper 40f1af4760 [RISCV] Add isCommutable to ADD/ADDW/MUL/AND/OR/XOR/MIN/MAX/CLMUL
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D123970
2022-04-25 10:53:41 -07:00
Zakk Chen ffe03ff75c [RISCV] Fix incorrect policy implement for unmasked vslidedown and vslideup.
vslideup works by leaving elements 0<i<OFFSET undisturbed.
so it need the destination operand as input for correctness
regardless of policy. Add a operand to indicate policy.

We also add policy operand for unmaksed vslidedown to keep the interface consistent with vslideup
because vslidedown have only undisturbed at 0<i<vstart but user have no way to control of vstart.

Reviewed By: rogfer01, craig.topper

Differential Revision: https://reviews.llvm.org/D124186
2022-04-25 09:18:41 -07:00
wangpc 7a21a0525a [RISCV] Add sched to pseudo function call instructions
To fix llvm-mca's error of 'found an unsupported instruction
in the input assembly sequence.' caused by the lack of
scheduling info.

Pseudo function call instructions will be expanded to `auipc`
and `jalr`, so their scheduling info are the combination of
two.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123578
2022-04-24 14:58:18 +08:00
Mohammed Nurul Hoque 5dd99f71aa [RISCV] transform MI to W variant to remove sext.w
Backwards search
The sext.w removal pass (before the new patch) checks if the input to sext.w is already in sign-extended form, so it can eliminate it. It does that by checking every definition/source that reaches the sext.w is an instruction that produces a sign-extended value, either by definition (e.g. ADDW), or it propagates sign-extension (e.g. OR) so we check its sources recursively.

Forward search
Sometimes, one of the sources is an instruction that doesn't always produce a sign-extended value, but it has a W-version that does (e.g. ADD / ADDW). If we transform the ADD to ADDW, the sext.w can be removed (assuming other def paths are satisfied), but this transformation is sound only if every use of this ADD/W only reqruires the lower 32-bits either directly (like sll %x, 32) or they propagate dependency (lower word of output only depends on lower word of input) so we check its uses recursively.

When searching backwards, if an instruction that can be replaced with W-variant is encountered, this pass runs the forward search to verify it can be replaced, then adds it to a list of fixable instructions. After verifying all paths, it replaces the instruction and removes the sext.w.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119928
2022-04-22 10:59:26 -07:00
Fraser Cormack 98db7ea262 [RISCV][NFC] Adjust some formatting in VL patterns 2022-04-22 17:19:27 +01:00
Fraser Cormack 2b0fedc2dd [RISCV] Print human-readable VTYPE/SEW/LMUL in MIR
This patch adds custom MIR operand comments to VTYPE immediate operands
in VSETVLI instructions and SEW/LMUL operands in vector codegen pseudo
instructions. The result is intended to be more human-readable and
hopefully maintainable when working with MIR, particularly when
writing or reading test cases.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124187
2022-04-22 17:13:18 +01:00
wangpc 5c3ea07848 [RISCV] Do not outline CFI instructions when they are needed in EH
We saw a failure caused by unwinding with incomplete CFIs, so we
can't outline CFI instructions when they are needed in EH.

This is a recommit of 0d40688, which was reverted in ce83883 as
related precommit test 360d44e caused some errors.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D122634
2022-04-22 12:28:19 +08:00
Ping Deng 7493d9ffb6 [RISCV][NFC] Use defvar to simplify pattern definations.
Reviewed By: jacquesguan, frasercrmck

Differential Revision: https://reviews.llvm.org/D123839
2022-04-22 02:45:14 +00:00
Craig Topper 9534811aa8 [RISCV] Teach generateInstSeqImpl to generate BSETI for single bit cases.
If the immediate has one bit set, but isn't a simm32 we can try
the BSETI instruction from Zbs.
2022-04-21 12:08:34 -07:00
Craig Topper 98b866892d [RISCV] Add special case to constant materialization to remove trailing zeros first.
If there are fewer than 12 trailing zeros, we'll try to use an ADDI
at the end of the sequence. If we strip trailing zeros and end the
sequence with a SLLI we might find a shorter sequence.

Differential Revision: https://reviews.llvm.org/D124148
2022-04-21 09:43:32 -07:00
wangpc ce83883691 Revert "[RISCV] Do not outline CFI instructions when they are needed in EH"
This reverts commit 0d40688925.
2022-04-21 16:23:10 +08:00
wangpc 0d40688925 [RISCV] Do not outline CFI instructions when they are needed in EH
We saw a failure caused by unwinding with incomplete CFIs, so we
can't outline CFI instructions when they are needed in EH.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D122634
2022-04-21 16:13:22 +08:00
Fraser Cormack 3e678cb772 [RISCV] Don't emit fractional VIDs with negative steps
We can't shift-right negative numbers to divide them, so avoid emitting
such sequences. Use negative numerators as a proxy for this situation, since
the indices are always non-negative.

An alternative strategy could be to add a compiler flag to emit division
instructions, which would at least allow us to test the VID sequence
matching itself.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123796
2022-04-21 07:00:34 +01:00
Craig Topper 186d5c8af5 [RISCV] Make getInstSeqCost handle other Zb* instructions.
We haven't been updating this as Zb* instructions have been used
for immediate materialization. They will hit the default case and
trigger an llvm_unreachable. Instead of trying to list them all,
assume instructions that aren't explicitly listed aren't compressible.

Spotted while looking at integer materialization for other reasons.
I haven't seen a crash from this yet.
2022-04-20 22:08:04 -07:00
Craig Topper 6db0afb44e [RISCV] Fold (xor (sllw 1, x), -1) -> (rolw ~1, x).
There's an existing generic combine that does this for legal types.
This patch adds a RISCV specific combine for W instructions.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D123983
2022-04-19 15:03:43 -07:00
Fraser Cormack c5cac48549 [RISCV] Fix lowering of BUILD_VECTORs as VID sequences
This patch fixes a bug when lowering BUILD_VECTOR via VID sequences.
After adding support for fractional steps in D106533, elements with zero
steps may be skipped if no step has yet been computed. This allowed
certain sequences to slip through the cracks, being identified as VID
sequences when in fact they are not.

The fix for this is to perform a second loop over the BUILD_VECTOR to
validate the entire sequence once the step has been computed. This isn't
the most efficient, but on balance the code is more readable and
maintainable than doing back-validation during the first loop.

Fixes the tests introduced in D123785.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123786
2022-04-19 07:43:38 +01:00
jacquesguan 25445b94db [RISCV] Add rvv codegen support for vp.fptrunc.
This patch adds rvv codegen support for vp.fptrunc. The lowering of fp_round and vp.fptrunc share most code so use a common lowering function to handle those two, similar to vp.trunc.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123841
2022-04-19 01:56:18 +00:00
Lian Wang 545d353b3c [RISCV][NFC] Refactor VL patterns for vnsrl and vnsra
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D123274
2022-04-15 07:42:59 +00:00
jacquesguan 1aa4f0bb6c [RISCV][VP] Add RVV codegen for vp.trunc.
Differential Revision: https://reviews.llvm.org/D123579
2022-04-15 02:29:53 +00:00
Lian Wang 3100893f63 [RISCV] Remove sext_inreg+riscv_grev/riscv_gorc isel patterns
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123565
2022-04-14 08:16:32 +00:00
Lian Wang 38706dd940 [RISCV][NFC] Refactor patterns for Multiply Add instructions
Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D123355
2022-04-14 08:00:00 +00:00
wangpc d0828c5af9 [RISCV][NFC] Use addExpr() instead of createExpr()
It seems to be neater.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D123675
2022-04-14 10:48:25 +08:00
Liqin Weng 8265679018 [RISCV][NFC] Refactor the type promotion of fsl/fsr/becompress/bdecompress/bfp
Reviewed By: asb, jrtc27, craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D123181
2022-04-13 08:52:04 +00:00
Craig Topper 057c063c9b [RISCV] Add a encodeLMUL function to RISCVVType. NFC
This moves the encoding handling out of the assembly parser.

Reviewed By: khchen, frasercrmck

Differential Revision: https://reviews.llvm.org/D123553
2022-04-12 13:39:47 -07:00
Craig Topper 2ce2562876 [RISCV][SelectionDAG] Add a hook to sign extend i32 ConstantInt operands of phis on RV64.
Materializing constants on RISCV is simpler if the constant is sign
extended from i32. By default i32 constant operands of phis are
zero extended.

This patch adds a hook to allow RISCV to override this for i32. We
have an existing isSExtCheaperThanZExt, but it operates on EVT which
we don't have at these places in the code.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122951
2022-04-11 14:38:39 -07:00
Craig Topper 76192182d0 [RISCV] Remove riscv-v-fixed-length-vector-elen-max command line option.
This was added before Zve extensions were defined. I think users
should use Zve32x or Zve32f now. Though we will lose support for limiting
ELEN to 16 or 8, but I hope no one was using that.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D123418
2022-04-11 10:14:48 -07:00
Craig Topper c266e50430 [RISCV] Remove ExtZvl enum from RISCVSubtarget. NFC
Having an enum with names that contain the string representation
of their value doesn't add any value. We can just use the numbers.

Reviewed By: kito-cheng, frasercrmck

Differential Revision: https://reviews.llvm.org/D123417
2022-04-11 10:01:17 -07:00
LiaoChunyu 505fce5a9e [RISCV] Add basic code modeling for llvm.experimental.stepvector intrinsic
Scalable vectors llvm.experimental.stepvector intrinsic
will crash due to an invalid cost when run the code through the loopunroll.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D122782
2022-04-11 10:19:23 +08:00
Craig Topper 4e561a581f [RISCV] Remove unnecessary cast to i8* when converting gather/scatter to strided load/store.
Not sure why I thought this necessary at the time.
2022-04-09 20:05:03 -07:00
Craig Topper 70046438d0 [RISCV] Only try LUI+SH*ADD+ADDI for int materialization if LUI+ADDI+SH*ADD failed.
There's an assert in LUI+SH*ADD+ADDI materialization that makes sure the
lower 12 bits aren't zero since that case should have been handled as
LUI+ADDI+SH*ADD. But nothing prevented the LUI+SH*ADD+ADDI checks from
running after the earlier code handled it.

The sequence would be the same length or longer so it wouldn't replace
the earlier sequence, but the assert happened before that was checked.

The vector holding the sequence also wasn't reset before the second
check so that guaranteed the sequence would never be found to be
shorter.

This patch fixes this by only trying the second expansion when the
earlier fails.

Fixes PR54812.

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D123406
2022-04-09 08:52:15 -07:00
Fraser Cormack 34e1b4774a [RISCV] Select unmasked FP setcc insts via ISel post-process
Similar to D123217 but for the floating-point patterns. No change in
generated output, while reducing the generated table size.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D123291
2022-04-08 17:13:43 +01:00
Craig Topper 1903b99154 [RISCV] Always select (and (srl X, C), Mask) as (srli (slli X, C2), C3).
SLLI is always compressible to C.SLLI as long as the source and dest
register is the same.

ANDI and SRLI are only compressible if the register is x8-x15. By
using SLLI we have a better chance of generating shorter code.

I had to exclude one exclusion for the BEXTI case so that it's
pattern match could still fire.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D123336
2022-04-08 09:04:04 -07:00
Kito Cheng 9c5aedfbf5 [RISCV] Fixing stack offset for RVV object with vararg in stack.
We found LLVM generate wrong stack offset for RVV object when stack
having variable argument, that cause by we didn't count vaarg part during
calculate RVV stack objects.

Also update the stack layout diagram for including vaarg in the diagram.

Stack layout ref:
https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/riscv.cc#L3941

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D123180
2022-04-08 12:01:16 +08:00
Kito Cheng 690085c9b7 [RISCV] Store/restore RISCVMachineFunctionInfo into MIR YAML file
RISCVMachineFunctionInfo has some fields like VarArgsFrameIndex and
VarArgsSaveSize are calculated at ISel lowering stage, those info are
not contained in MIR files, that cause test cases rely on those field
can't not reproduce correctly by MIR dump files.

This patch adding the MIR read/write for those fields.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D123178
2022-04-08 11:55:48 +08:00
jacquesguan a55c19c44b [RISCV][NFC] Use defvar to simplify pattern definations.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D123292
2022-04-08 02:51:30 +00:00
Craig Topper d98bea87ef [RISCV] Add more .vx patterns for VLMax integer setccs.
This patch synchronizes the structure of the templates with those
in RISCVInstrInfoVVLPatterns.td so that we get patterns with .vx
on the left hand side.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D123255
2022-04-07 09:17:43 -07:00
Craig Topper 82662b753d [RISCV] Add swapped patterns to VPatIntegerSetCCVL_VIPlus1.
This matches VPatIntegerSetCCVL_VI_Swappable. But as noted in the
FIXME this may only be needed due to lack of canonicalization on
VP_SETCC.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D123239
2022-04-07 09:17:08 -07:00
Luís Marques d09d297c5d [RISCV] Fix crash for section alignment with .option norvc
The existing code wasn't getting the subtarget info from the fragment,
so the current status of RVC would be ignored. This would cause a crash
for the new test case when the target then reported it couldn't write
the requested number of code alignment bytes.

Differential Revision: https://reviews.llvm.org/D122236
2022-04-07 12:02:27 +01:00
Fraser Cormack 8ebc9b1560 [RISCV] Select unmasked integer setcc insts via ISel post-process
This patch has no effect on the generated code, whilst mitigating the
increase in ISel table size caused by the recent addition of masked
patterns.

I aim to do the same for floating-point patterns once D123051 lands,
giving us a reason to use masked floating-point patterns.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D123217
2022-04-07 09:30:19 +01:00
Fraser Cormack 8216255c9f [RISCV][VP] Add basic RVV codegen for vp.fcmp
This patch adds the necessary infrastructure to lower vp.fcmp via
ISD::VP_SETCC to RVV instructions.

Most notably this patch adds cond-code legalization for VP_SETCC,
reusing the existing TargetLowering::LegalizeSetCCCondCode by passing in
additional SDValue parameters for the Mask and EVL. This method then
uses VP operations to legalize the condcode.

There is still a general lack of canonicalization on VP_SETCC as opposed
to SETCC which results in worse code than is theoretically possible.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123051
2022-04-07 09:16:07 +01:00
Liqin Weng f891123556 [RISCV] Add CMOV isel pattern for (select (setgt X, Imm), Y, Z)
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122644
2022-04-07 05:55:53 +00:00
Lian Wang 1b547799c5 [RISCV] Supplement patterns for vnsrl.wx/vnsra.wx when splat shift is sext or zext
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122786
2022-04-07 02:21:41 +00:00
Craig Topper e13a44b460 [RISCV] Add lowering for vp.sext and vp.zext.
Including mask vector inputs.

Reviewed By: frasercrmck, rogfer01

Differential Revision: https://reviews.llvm.org/D123150
2022-04-06 09:59:49 -07:00
Fraser Cormack 6be5e875be [RISCV][VP] Add basic RVV codegen for vp.icmp
This patch adds the minimum required to successfully lower vp.icmp via
the new ISD::VP_SETCC node to RVV instructions.

Regular ISD::SETCC goes through a lot of canonicalization which targets
may rely on which has not hereto been ported to VP_SETCC. It also
supports expansion of individual condition codes and a non-boolean
return type. Support for all of that will follow in later patches.

In the case of RVV this largely isn't a problem as the vector integer
comparison instructions are plentiful enough that it can lower all
VP_SETCC nodes on legal integer vectors except for boolean vectors,
which regular SETCC folds away immediately into logical operations.

Floating-point VP_SETCC operations aren't as well supported in RVV and
the backend relies on condition code expansion, so support for those
operations will come in later patches.

Portions of this code were taken from the VP reference patches.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122743
2022-04-06 16:51:22 +01:00
Craig Topper 3c831c9b28 [RISCV] Add support for vp.fptosi where the result is a mask type.
We can do this conversion by converting the same sized integer type, then compare the result with 0. The conversion is undefined if the converted FP value doesn't fit in an i1.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D122678
2022-04-05 09:48:04 -07:00
Craig Topper d970e96c53 [RISCV] Add lowering for vp.fptoui and vp.uitofp.
This is a straightforward extension of D122512 to unsigned integers.
2022-04-01 18:28:46 -07:00
Craig Topper fa630e7594 [RISCV][AMDGPU][TargetLowering] Special case overflow expansion for (uaddo X, 1).
If we expand (uaddo X, 1) we previously expanded the overflow calculation
as (X + 1) <u X. This potentially increases the live range of X and
can prevent X+1 from reusing the register that previously held X.

Since we're adding 1, overflow only occurs if X was UINT_MAX in which
case (X+1) would be 0. So this patch adds a special case to expand
the overflow calculation to (X+1) == 0.

This seems to help with uaddo intrinsics that get introduced by
CodeGenPrepare after LSR. Alternatively, we could block the uaddo
transform in CodeGenPrepare for this case.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122933
2022-04-01 13:14:10 -07:00
Lian Wang 62dd3674bc [RISCV] Supplement SDNode patterns for vfwmul/vfwadd/vfwsub
Reviewed By: jacquesguan

Differential Revision: https://reviews.llvm.org/D122720
2022-04-01 03:09:50 +00:00
Fraser Cormack ee51aefba0 [RISCV][NFC] Minor formatting fix 2022-03-31 16:15:22 +01:00
Fraser Cormack a276d1f44b [RISCV][NFC] Fix formatting on one line 2022-03-31 13:17:37 +01:00
ShihPo Hung 2f1261abe4 [RISCV][RVV] Add Uses = [FRM] and mayRaiseFPException = true to RVV instructions
This patch adds Uses = [FRM] and mayRaiseFPException = true to following
instructions:

VFADD, VFSUB, VFRSUB, VFMUL, VFDIV, VFRDIV
VFWADD, VFWSUB, VFWMUL
VFMADD, VFMACC, VFMSAC, VFMSUB
VFNMADD, VFNMACC, VFNMSAC, VVFNMSUB
VFWMACC, VFWMSAC,
VFWNMACC, VFWNMSAC
VFSQRT, VFREC7
VFREDOSUM, VFREDUSUM,
VFWREDOSUM, VFWREDUSUM
and only adds mayRaiseFPException = true to following instructions:

VFRSQRT7,
VFMIN, VFMAX, VFREDMIN, VFREDMAX
VMFEQ, VMFNE, VMFLT,VMFLE, VMFGT, VMFGE

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D121087
2022-03-31 01:33:17 -07:00
Fraser Cormack 893d63fbdc [RISCV][NFC] Fix comment to refer to correct file 2022-03-31 08:59:10 +01:00
Lian Wang b3851e9931 [RISCV] Add VL patterns for vfwmul/vfwadd/vfwsub
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D122369
2022-03-31 07:08:58 +00:00
Craig Topper 4477500533 [RISCV] ISel (and (shift X, C1), C2)) to shift pair in more cases
Previously, these isel optimizations were disabled if the AND could
be selected as a ANDI instruction. This patch disables the optimizations
only if the immediate is valid for C.ANDI. If we can't use C.ANDI,
we might be able to compress the shift instructions instead.

I'm not checking the C extension since we have relatively poor test
coverage of the C extension. Without C extension the code size
should be equal. My only concern would be if the shift+andi had
better latency/throughput on a particular CPU.

I did have to add a peephole to match SRLIW if the input is zexti32
to prevent a regression in rv64zbp.ll.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D122701
2022-03-30 11:46:42 -07:00
Craig Topper 7417eb29ce [RISCV] Use getSplatBuildVector instead of getSplatVector for fixed vectors.
The splat_vector will be legalized to build_vector eventually
anyway. This patch makes it take fewer steps.

Unfortunately, this results in some codegen changes. It looks
like it comes down to how the nodes were ordered in the topological
sort for isel. Because the build_vector is created earlier we end up
with a different ordering of nodes.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D122185
2022-03-30 11:36:34 -07:00
Liqin Weng 4cb85da811 [RISCV] Add CMIX isel pattern for (xor (and (xor rs1, rs3), rs2), rs3)
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122702
2022-03-30 16:51:09 +08:00
Fraser Cormack 75047577d6 [RISCV] Trim RVV isel pats matchable via DAG post-process
In D122512, several masked patterns were added to support lowering of
vector-predicated float-to-int and int-to-float conversions. With the
introduction of these patterns, all of the old "unmasked" patterns are
matchable via the DAG post-process introduced in D118810, once the relevant
opcode entries are set up in the helper table.

Locally this reduces the generated isel table by 4%.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D122637
2022-03-30 08:56:38 +01:00
Liqin Weng 7f81765898 [RISCV][NFC] Add immediate tests for the icmp instruction
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122651
2022-03-30 02:51:26 +00:00
Zakk Chen b578330754 [RISCV] Use maskedoff to decide mask policy for masked compare and vmsbf/vmsif/vmsof.
masked compare and vmsbf/vmsif/vmsof are always tail agnostic, we could
check maskedoff value to decide mask policy rather than have a addtional
policy operand.

Reviewed By: craig.topper, arcbbb

Differential Revision: https://reviews.llvm.org/D122456
2022-03-29 18:05:33 -07:00
Zakk Chen 10b2760da0 Revert "[RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR"
This reverts commit 10fd2822b7.

I have a better implementation for those operations without the
additional policy operand.
masked compare and vmsbf/vmsif/vmsof are always tail agnostic so we could
assume undef maskedoff is mask agnostic.

Differential Revision: https://reviews.llvm.org/D122455
2022-03-29 18:05:33 -07:00
Liqin Weng d660c0d793 [RISCV] Optimize LI+SLT to SLTI+XORI for immediates in specific range
This transform will reduce one GPR.

Reviewed By: craig.topper, benshi001

Differential Revision: https://reviews.llvm.org/D122051
2022-03-29 14:46:49 +08:00
Craig Topper 45e85feba6 [RISCV] Pull APInt/computeKnonwbits specifics out of computeGREVOrGORC. NFC
This function now takes a uint64_t instead of an APInt. The caller
is responsible for masking the shift amount, extracting and inserting
into the KnownBits APInts, and inverting to compute zeros.

This is less code and cleaner division of responsibilities.
2022-03-28 20:53:54 -07:00
Shao-Ce SUN 662b9fa02c [NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122557
2022-03-29 09:53:24 +08:00
Craig Topper 01203918d1 [RISCV] Add computeKnownBits support for RISCVISD::GORC.
Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D121575
2022-03-28 16:56:33 -07:00
Craig Topper e68257fcee [RISCV][SelectionDAG] Enable TargetLowering::hasBitTest for masks that fit in ANDI.
Modified DAGCombiner to pass the shift the bittest input and the shift amount
to hasBitTest. This matches the other call to hasBitTest in TargetLowering.h

This is an alternative to D122454.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D122458
2022-03-28 12:46:36 -07:00
Craig Topper cfe533da05 [RISCV] Add lowering for vp.fptosi and vp.sitofp.
This as an alternative version of D120641. Starting from the code here
https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/-/raw/EPI/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
but with some modifications to how the interim types are calculated,
and adding support for f16.

Still need to add fptosi for mask vectors.

Lots of masked isel patterns added so we can pass the mask through
the type changes.

Reviewed By: frasercrmck, arcbbb

Differential Revision: https://reviews.llvm.org/D122512
2022-03-28 11:06:41 -07:00
Kazu Hirata 6212871968 [Target] Apply clang-tidy fixes for readability-redundant-member-init (NFC) 2022-03-27 22:22:37 -07:00
Maksim Panchenko 4ae9745af1 [Disassember][NFCI] Use strong type for instruction decoder
All LLVM backends use MCDisassembler as a base class for their
instruction decoders. Use "const MCDisassembler *" for the decoder
instead of "const void *". Remove unnecessary static casts.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D122245
2022-03-25 18:53:59 -07:00
Dávid Bolvanský 9a738c477e [NFCI] Fix set-but-unused warning in RISCVAsmParser.cpp 2022-03-24 08:33:40 +01:00
jacquesguan 8910ac400c [RISCV] Add patterns for vector widening integer multiply
Add patterns for vector widening integer multiply instructions

Differential Revision: https://reviews.llvm.org/D117385
2022-03-24 15:26:08 +08:00
Vasileios Porpodas 39aa202aff Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit e6ead19b77.
2022-03-23 18:32:17 -07:00
Craig Topper 6c90a654bb [RISCV] Simplify some code in lowering vector int<->fp conversions. NFC
Don't call EltVT.getSizeInBits() or SrcEltVT.getSizeInBits() a second
time. They are already in EltSize or SrcEltSize variables.

Refactor some comparisons to use multiply instead of division.
2022-03-23 12:09:05 -07:00
Arthur Eubanks e6ead19b77 Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash."
This reverts commit 27bd8f9492.

Causes crashes, see comments in D121973
2022-03-23 10:57:45 -07:00
luxufan 5800fb41a6 [RISCV] Remove check and update test file in D121183
Differential Revision: https://reviews.llvm.org/D122290
2022-03-24 00:48:52 +08:00
luxufan 227496dc09 [RISCV] Generate correct ELF EFlags when .ll file has target-abi attribute
In the past, when construct RISCVAsmBackend, MCTargetOptions.ABIName would be passed and stored in RISCVAsmBackend.
But MCTargetOptions.ABIName can only be specified by -target-abi xxx in command line, if the .ll file has target-abi attribute, the codegen module will ignore it. And the generated object file would have incorrect EFlags value.

https://github.com/llvm/llvm-project/issues/50591 also caused by this problem.

This patch override the AsmPrinter::emitFunctionEntryLabel function and use it to set the target abi value that get from .ll file's target-abi attribute. And storing the target-abi in RISCVTargetStreamer instead of RISCVAsmBackend.

Differential Revision: https://reviews.llvm.org/D121183
2022-03-24 00:48:52 +08:00
Vasileios Porpodas 27bd8f9492 Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit f7d7d2a08d.
2022-03-22 16:41:55 -07:00
Arthur Eubanks f7d7d2a08d Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads.""
This reverts commit 79613185d3.

Causes crashes, see comments in https://reviews.llvm.org/D121973.
2022-03-22 13:33:49 -07:00
Craig Topper 51940d69cb [RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32.
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.

We already special cased sign extended constants, this extends it
to any sign extended value.

I've only added tests for one case of vadd. Most intrinsics go
through the same check.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D122186
2022-03-22 10:29:06 -07:00
Craig Topper 9b0f227d7b [TableGen][RISCV] Add InstAliases with zero_reg to cover unmasked vnot.v, vncvt.x.x.w, vneg.v, etc.
The mask being NoRegister prevented the existing aliases from matching
since NoRegister isn't in the VMV0 register class.

To workaround this I've added new aliases that look for zero_reg.
I had to motify tablegen to generate matching code for zero_reg.
And as a consequence, I had to change the EmitPriority for an ARM
alias that used zero_reg that started printing.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D121496
2022-03-22 10:14:43 -07:00
Zakk Chen 10fd2822b7 [RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR
intrinsics.

Those operations are updated under a tail agnostic policy, but they
could have mask agnostic or undisturbed.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120228
2022-03-22 07:47:21 -07:00
Zakk Chen 9ab18cc535 [RISCV] Add policy operand for masked vid and viota IR intrinsics.
Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120227
2022-03-22 02:32:31 -07:00
Zakk Chen abb5a985e9 [RISCV] Support mask policy for RVV IR intrinsics.
Add the UsesMaskPolicy flag to indicate the operations result
would be effected by the mask policy. (ex. mask operations).

It means RISCVInsertVSETVLI should decide the mask policy according
by mask policy operand or passthru operand.
If UsesMaskPolicy is false (ex. unmasked, store, and reduction operations),
the mask policy could be either mask undisturbed or agnostic.
Currently, RISCVInsertVSETVLI sets UsesMaskPolicy operations default to
MA, otherwise to MU to keep the current mask policy would not be changed
for unmasked operations.

Add masked-tama, masked-tamu, masked-tuma and masked-tumu test cases.
I didn't add all operations because most of implementations are using
the same pseudo multiclass. Some tests maybe be duplicated in different
tests. (ex. masked vmacc with tumu shows in vmacc-rv32.ll and masked-tumu)
I think having different tests only for policy would make the testing
clear.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120226
2022-03-22 01:19:16 -07:00
Yeting Kuo ecd7a0132a [RISCV] Add basic cost model for vector casting
To perform the cost model of vector casting, the patch consider most vector
casts as their scalar form and consider those vector form of free scalr castings
as 1.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121771
2022-03-22 14:17:08 +08:00
Vasileios Porpodas 79613185d3 Recommit "[SLP] Fix lookahead operand reordering for splat loads."
Original review: https://reviews.llvm.org/D121354

The original commit 9136145eb0 broke the build on several targets.

Differential Revision: https://reviews.llvm.org/D121973
2022-03-21 15:57:32 -07:00
Craig Topper cc5b0868ff Revert "[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32."
This reverts commit 8c4937b33f.

Committed by mistake.
2022-03-21 14:58:11 -07:00
Craig Topper d4aeb5000f [RISCV] Simplify some code. NFC 2022-03-21 14:50:56 -07:00
Craig Topper 19de2e8db6 [RISCV] Remove stray slash from comment. NFC 2022-03-21 14:50:56 -07:00
Craig Topper 8c4937b33f [RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32.
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.

We already special cased sign extended constants, this extends it
to any sign extended value.

I've only added tests for one case of vadd. Most intrinsics go
through the same check. I can add more tests if we're concerned.

Differential Revision: https://reviews.llvm.org/D122186
2022-03-21 14:50:55 -07:00
Mohammed Nurul Hoque 7afa44f5f5 [RISCV] Add more sign-extending ops to MIR sext.w pass.
This patch adds single-bit and bit-counting ops to list of sign-extending ops.

A single-bit write propagates sign-extendedness if it's not in the sign-bits.

Bit extraction and bit counting always outputs a small number, so sign-extended.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121152
2022-03-18 18:21:17 +08:00
Jessica Clarke 63ea7797dd [RISCV] Fix buildbot breakage by explicitly instantiating templates
RISCVISelDAGToDAG's selectImm uses RISCVTargetLowering::getAddr
(specifically the ConstantPoolSDNode) as of 41454ab256 ("[RISCV] Use
constant pool for large integers"), but nothing explicitly instantiates
any of the templates, the only reason they exist is because of the
various lowering methods in RISCVISelLowering.cpp that themselves use
the methods. However, with inlining, those can end up not existing as
real functions and thus not be exported, leading to link errors. Up
until now this hasn't happened, but for whatever reason D121654 has
triggered this on the sanitizer-ppc64be-linux buildbot, giving:

  ../../../../lib/libLLVMRISCVCodeGen.a(RISCVISelDAGToDAG.cpp.o): In function `selectImm(llvm::SelectionDAG*, llvm::SDLoc const&, llvm::MVT, long, llvm::RISCVSubtarget const&)':
  RISCVISelDAGToDAG.cpp:(.text._ZL9selectImmPN4llvm12SelectionDAGERKNS_5SDLocENS_3MVTElRKNS_14RISCVSubtargetE+0x3d8): undefined reference to `llvm::SDValue llvm::RISCVTargetLowering::getAddr<llvm::ConstantPoolSDNode>(llvm::ConstantPoolSDNode*, llvm::SelectionDAG&, bool) const'
  collect2: error: ld returned 1 exit status

Fix this by explicitly instantiating getAddr in its four different forms
so separate translation units can reliably use it.

Fixes: 41454ab256 ("[RISCV] Use constant pool for large integers")
2022-03-18 02:22:17 +00:00
Craig Topper bbd2ecf9f0 [RISCV] Add +experimental-zvfh extension to cover half types in vectors.
Currently we allow half types in vectors if the scalar Zfh extension
is enabled. This behavior is not inline with the vector spec. For f32
and f64 types, the Zve32f, Zve64f, Zve64d, and V explicitly control
the availablity of floating point types in vectors.

In order to make our compiler compliant, we either need to remove all support
for half in vectors or we need an extension to control it.

Draft spec here https://github.com/riscv/riscv-v-spec/pull/780

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D121345
2022-03-17 10:04:02 -07:00
Craig Topper 7e15303062 [RISCV] Simplify scalable vector case in lowerVectorMaskExt.
Since we have SPLAT_VECTOR_PARTS these days, I don't think we need
to go through extra lengths to avoid introducing an illegal scalar type.
We can just call getConstant using the scalable vector type and let
it create either a SPLAT_VECTOR or a SPLAT_VECTOR_PARTS.

Reviewed By: frasercrmck, rogfer01

Differential Revision: https://reviews.llvm.org/D121645
2022-03-17 09:43:13 -07:00
Lian Wang 214afc7116 [RISCV] Add patterns for vnsrl.wi and vnsra.wi instructions
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121675
2022-03-17 07:22:32 +00:00
Lian Wang b26abcad81 [RISCV][NFC] Replace redundant code with VLOpFrag
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121783
2022-03-17 02:05:21 +00:00
Craig Topper 2e10671ec7 [RISCV] Improve detection of when to skip (and (srl x, c2) c1) -> (srli (slli x, c3-c2), c3) isel.
We have a special case to skip this transform if c1 is 0xffffffff
and x is sext_inreg in order to use sraiw+zext.w. But we were only
checking that we have a sext_inreg opcode, not how many bits are
being sign extended.

This commit adds a check that it is a sext_inreg from i32 so we know for
sure that an sraiw can be created.
2022-03-16 14:54:34 -07:00
Jessica Clarke 659363c0cc [RISCV] Ensure PseudoLA* can be hoisted
Since we mark the pseudos as mayLoad but do not provide any MMOs,
isSafeToMove conservatively returns false, stopping MachineLICM from
hoisting the instructions. PseudoLA_TLS_GD does not actually expand to a
load, so stop marking that as mayLoad to allow it to be hoisted, and for
the others make sure to add MMOs during lowering to indicate they're GOT
loads and thus can be freely moved.

Fixes https://github.com/llvm/llvm-project/issues/54372

Reviewed By: MaskRay, arichardson

Differential Revision: https://reviews.llvm.org/D121654
2022-03-16 18:45:36 +00:00
Shengchen Kan 37b378386e [NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments 2022-03-16 20:25:42 +08:00
serge-sans-paille 989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Haocong.Lu 6a54776fe0 [RISCV] Select SRLI+SLLI for AND with leading ones mask
Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask.
It's useful in RV64 when the mask exceeds simm32 (cannot be generated by LUI).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121598
2022-03-16 02:10:57 +00:00
Craig Topper 06c5d74090 [RISCV] Remove lowerSPLAT_VECTOR
This code handles fixed vector SPLAT_VECTOR, but is never called in
any tests.

We only form fixed vector splat vectors for vXi64 on RV32 as part
of DAGCombine. This will be type legalized to SPLAT_VECTOR_PARTS.
So the Custom handling for SPLAT_VECTOR is never needed.

This patch makes SPLAT_VECTOR for vXi64 'Legal' on RV32 so that
DAGCombine will create it, but there's no need for Custom handler.
It will still be type legalized to SPLAT_VECTOR_PARTS.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D121673
2022-03-15 08:22:13 -07:00
Yeting Kuo ae7c6647f3 [RISCV] Add basic code modeling for fixed length vector reduction.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121447
2022-03-14 11:04:31 +08:00
Craig Topper eeb3bfd74a [RISCV] Merge ReplaceNodeResults code for SHFL and GREV/GORC. NFC 2022-03-13 18:42:26 -07:00
Lehua Ding 1648852c98 [RISCV][RVV] Fix vslide1up/down intrinsics overflow bug for SEW=64 on RV32
Reviewed By: craig.topper, kito-cheng

Differential Revision: https://reviews.llvm.org/D120899
2022-03-13 18:06:09 +08:00
Craig Topper fd4d584d6b [RISCV] Add DAGCombine to fold (bitreverse (bswap X)) to brev8 with Zbkb.
If the type is less than XLenVT, type legalization will turn this
into (srl (bitreverse (bswap (srl (bswap X), C))), C). We can't
completely recover from these shifts. They introduce zeros into
the upper bits of the result and we can't easily tell if they are
needed. By doing a DAG combine early, we avoid introducing these
shifts.
2022-03-12 16:39:39 -08:00
Craig Topper 43f668b98e [RISCV] Move GORCIW/GREVIW formation to isel patterns.
Type legalize narrow RISCVISD::GREV/GORC with constant to a larger
type without switching to W. Detect sext_inreg+gorci/grevi with a
uimm5 immediate during isel to emit GREVIW/GORCIW.

This allows us to better propagate known bits information through
extended bits after type legalization. It will also simplify a
change I'm considering for BREV8 with Zbkb.

A future patch will add computeKnownBits support for GORC.

A further improvement here would be to use hasAllWUsers and
doPeepholeSExtW like we do for SLLIW, but I don't think we have
the test coverage for that yet.
2022-03-11 18:02:47 -08:00
Craig Topper d0969e485c [RISCV] Optimize vfmv.s.f intrinsic with scalar 0.0 to vmv.s.x with x0.
We already do this for RISCVISD::VFMV_S_F_VL and the vfmv.v.f
intrinsic.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D121429
2022-03-11 10:05:43 -08:00
Craig Topper e9d4922543 [RISCV] Add tablegen helper classes to create PatFrag to check for one use. NFC
Reduces code and the class can be instantiated in isel patterns to
avoid creating more *_oneuse classes.
2022-03-10 23:14:21 -08:00
Craig Topper 337d49da84 [RISCV] Fix typo in comment. NFC 2022-03-10 22:00:18 -08:00
Eric Tang 336c92d5e8 [RISCV] Add alias for HFENCE.VVMA
Signed-off-by: Eric Tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D120878
2022-03-11 13:32:52 +08:00
Craig Topper 1f3a8d58a6 [RISCV] Use ZERO_EXTEND instead of ANY_EXTEND when promoting i32 RISCVISD::SHFL. NFC
We know the shift amount is a constant with bit 31 clear. anyext
of constant will be either zext or sext which will produce the
same result here. But we really shouldn't rely on that. It would
be valid to put a random number in the upper bits. Our isel patterns
expect the upper bits to be 0 so we should ask for it explicitly.
2022-03-10 20:57:04 -08:00
Craig Topper 9ce6b1ca86 [RISCV] Remove performANY_EXTENDCombine.
This doesn't appear to be needed any more. I did some inspecting
of the gcc torture suite and SPEC2006 with this removed and didn't
find any meaningful changes.

I think we're more aggressive about forming ADDIW now using
sign_extend_inreg during type legalization and hasAllWUsers in isel.
This probably helps catch the cases this helped with before.
2022-03-10 11:29:31 -08:00
Craig Topper e0e8edf823 [RISCV] Add isel patterns for masked RISCVISD::FMA_VL with RISCVISD::FNEG_VL.
This helps us form vfnmsub, vfnmadd, and vfmusb from masked VP
intrinsics.

I've used "srcvalue" for the mask parameter in the fneg nodes. We
can't match "V0" because that doesn't ensure the mask the is the same.
Instead it matches two different nodes and generates two copies to
V0 of those separate values.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120287
2022-03-10 10:05:42 -08:00
Nico Weber a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeea.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille 7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Luke 0803dba7dd [RISCV] Add fixed-length vector instrinsics for segment load
Inspired by reviews.llvm.org/D107790.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119834
2022-03-10 16:23:40 +08:00
Craig Topper d53707508a [RISCV] Remove RISCVISD::VLE_VL/VSE_VL. Use intrinsics instead.
Similar to what we do for other loads/stores, use the intrinsic
version that we already have custom isel for.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D121166
2022-03-09 22:44:28 -08:00
Craig Topper edd6632127 [RISCV] Support 'generic' as a valid CPU name.
Most other targets support 'generic', but RISCV issues an error.
This can require a special case in tools that use LLVM that aren't
clang.

This patch treats "generic" the same as an empty string and remaps
it to generic-rv/rv64 based on the triple. Unfortunately, it has to
be added to RISCV.td because MCSubtargetInfo is constructed and
parses the CPU before RISCVSubtarget's constructor gets a chance
to remap it. The CPU will then reparsed and the state in the
MCSubtargetInfo subclass will be updated again.

Fixes PR54146.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D121149
2022-03-09 16:43:22 -08:00
Shao-Ce SUN 365c858a5d [RISCV] Share PatFprFpr classes for F, D, and Zfh
Inspired by D115469

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121066
2022-03-08 13:02:04 +08:00
jacquesguan e55b9b0d0a [RISCV] Add patterns for vector widening floating-point reduction instructions.
Add patterns for vector widening floating-point reduction instructions.

Differential Revision: https://reviews.llvm.org/D120390
2022-03-08 10:53:49 +08:00
Craig Topper 845bfcede1 [RISCV] Rename 'SplatOperand' to 'ScalarOperand'. NFC
vslide1up/down have this flag set, but the value isn't a splat.
Rename for clarity.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D121037
2022-03-07 11:28:32 -08:00
Zakk Chen 3be907621f [RISCV] Fix incorrect optimization for masked vmsgeu.vi with 0 immediate.
vmsgeu.vi with 0 is always true, but in the masked with mask undisturbed
policy, we still need to keep inactive elelemt which come from maskedoff.

We could return mask directly if it's mask agnostic policy in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121080
2022-03-06 19:22:35 -08:00
Benjamin Kramer fbce4a7803 Drop some more global std::maps. NFCI. 2022-03-06 13:28:29 +01:00
Craig Topper bd5f124716 [RISCV] Add SimplifyDemandedBits support for FSR/FSL/FSRW/FSLW. 2022-03-05 21:26:51 -08:00
Zakk Chen 33b61c5678 [RISCV] Fix incorrect codegen introduced by D119688.
We should not emit a tail agnostic vlse for a tail undisturbed vmv.s.x

In D119688:
-    if (IsScalarMove && !Node->getOperand(0).isUndef())
+    bool HasPassthruOperand = Node->getOpcode() != ISD::SPLAT_VECTOR;
+    if (HasPassthruOperand && !IsScalarMove &&
!Node->getOperand(0).isUndef())
       break;

The IsScalarMove check in the if statement had been changed.

Differential Revision: https://reviews.llvm.org/D120963
2022-03-05 06:10:26 -08:00
Craig Topper 1e569e3b7b [RISCV] Add CMOV isel pattern for (select (setgt X, -1), Y, Z)
setgt X, -1 is the canonical form of setge X, 0. We can swap the
select operands and use setlt X, X0 when selecting CMOV. This
avoid materializing the -1 in a register.
2022-03-04 22:35:13 -08:00
Craig Topper 232f57319d [RISCV] Move vslide1up/down intrinsics into lowerVectorIntrinsicSplats. NFC
Rename to lowerVectorIntrinsicScalars.

This allows us to share the code that checks if the scalar needs
to be type legalized.
2022-03-04 18:21:53 -08:00
Craig Topper 3d4e83f17d [RISCV] With Zbb, fold (sext_inreg (abs X)) -> (max X, (negw X))
With Zbb, abs is expanded to (max X, neg) by default. If X has 33 or
more sign bits, we can expand it a little early using negw instead of
neg to save a sext_inreg. If X started as a 32 bit value, type
legalization would have inserted a sext before the abs so X having
33 sign bits should always be true.

Note: I've used ISD::FREEZE here since we increase the number of uses.
Our default expansion for ABS doesn't do that, but I think that's a bug.

We can't do this with custom type legalization because ISD::FREEZE
doesn't propagate sign bits so later DAG combine won't expand be
able to see optmize it.

Alives2 https://alive2.llvm.org/ce/z/Gx3RNe

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D120597
2022-03-03 15:42:29 -08:00
Alex Tsao 89f15fc687 [RISCV] Add cost modelling for masked memory op
The patch adds very basic cost model for masked memory op on scalable vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117884
2022-03-03 20:47:58 +08:00
jacquesguan 44a430354d [RISCV] Fold store of vmv.f.s to a vse with VL=1.
This patch support the FP part of D109482.

Differential Revision: https://reviews.llvm.org/D120235
2022-03-03 16:35:19 +08:00
Craig Topper 6cb42cd666 [RISCV] More correctly ignore Zfinx register classes in getRegForInlineAsmConstraint.
Until Zfinx is supported in CodeGen we need to convert all Zfinx
register classes to GPR.

Remove the zfinx-types.ll test which didn't test anything meaningful
since -mattr=zfinx isn't implemented completely in llc.

Follow up to D93298.
2022-03-02 11:22:46 -08:00
Craig Topper a1f8349d77 [RISCV] Don't combine ROTR ((GREV x, 24), 16)->(GREV x, 8) on RV64.
This miscompile was introduced in D119527.

This was a special pattern for rotate+bswap on RV32. It doesn't
work for RV64 since the rotate needs to be half the bitwidth. The
equivalent pattern for RV64 is ROTR ((GREV x, 56), 32) so match
that instead.

This could be generalized further as noted in the new FIXME.

Reviewed By: Chenbing.Zheng

Differential Revision: https://reviews.llvm.org/D120686
2022-03-02 09:47:06 -08:00
Nikita Popov 98cfcae4e9 Revert "[RISCV] Add cost modelling for masked memory op"
This reverts commit 76f243b53b.

The newly added test fails.
2022-03-02 17:32:10 +01:00
Alex Tsao 76f243b53b [RISCV] Add cost modelling for masked memory op
The patch adds very basic cost model for masked memory op on scalable vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117884
2022-03-02 22:48:41 +08:00
Shao-Ce SUN 0e38b29543 [RISCV] add the MC layer support of Zfinx extension
This patch added the MC layer support of Zfinx extension.

Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D93298
2022-03-02 14:25:19 +08:00
Mircea Trofin cb2160760e [nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.

Differential Revision: https://reviews.llvm.org/D119876
2022-03-01 21:53:25 -08:00
Craig Topper b9d6e8c441 [RISCV] Lower VECTOR_SPLICE to RVV instructions.
This lowers VECTOR_SPLICE of scalable vectors to a slidedown follow by a slideup.
Fixed vectors are encouraged to use shufflevector instruction. The equivalent patch
for fixed vectors is D119039.

I've used a tail agnostic slidedown and limited the VL to only the
elements that will not be overwritten by the slideup. The slideup
uses VLMax for its VL. It unfortunately uses tail undisturbed policy
but it isn't required as there is no tail. We just need the merge
operand to carry the bits for the lower portion of the result.

Care was taken to ensure that either the slideup or slidedown will
be able to use a .vi instruction when the immediate is small. Which
one uses the immediate depends on the sign of the immediate.

Reviewed By: frasercrmck, ABataev

Differential Revision: https://reviews.llvm.org/D119303
2022-03-01 10:10:13 -08:00
Lian Wang db85cd729a [RISCV] Add FMV_W_X and FMV_H_X instrutions to hasAllNBitUsers
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120699
2022-03-01 08:13:59 +00:00
lian wang 5d91a8a707 [RISCV] Add schedule class for Zbp extension and Zbr extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120012
2022-03-01 07:35:59 +00:00
Lian Wang e2c150ab52 [RISCV][NFC] Move defined non_imm12 to proper place in RISCVInstrInfoZb.td
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120656
2022-03-01 01:45:30 +00:00
Craig Topper e83db8c001 [RISCV] Only enable combineROTR_ROTL_RORW_ROLW with Zbp.
I think the immediate values we check for on the GREV nodes already
protect this, but better to be explicit.
2022-02-28 12:47:36 -08:00
Craig Topper b083157b7b [RISCV] Don't call combineROTR_ROTL_RORW_ROLW for SLLW/SRLW/SRAW nodes. NFC
I think the function does the correct thing internally, but it's
confusing to read.
2022-02-28 11:05:10 -08:00
Craig Topper f46890711f [RISCV] Custom type legalize i32 ISD::ABS on RV64 without Zbb.
Default type legalization will create sext_inreg+abs, but we may
not be able to remove the sext_inreg.

Instead this patch expands abs during type legalization to
Y = sraiw X, 31; subw(xor X, Y), Y) which doesn't require the input
to be sign extended.

This gives a big improvement for some neg-abs tests where the
abs is used more than the the neg. Previously the abs was expanded
a different way before and after type legalization. Now they are
expanded in a similar way enabling more CSE.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D120636
2022-02-28 09:30:27 -08:00
eric.tang b496a172e4 [RISCV] Support hypervisor extention instructions
According to privileged spec version-20211203

    Add the following hypervisor instructions:
        - HLV.B HLV.BU
        - HLV.H HLV.HU HLVX.HU
        - HLV.W HLV.WU HLVX.WU
        - HLV.D
        - HSV.B HSV.H HSV.W HSV.D

Signed-off-by: eric.tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D117733
2022-02-28 14:02:43 +08:00
eric.tang 386c5be92a [RISCV] Support Sinval extension and hypervisor memory management fence instructions
According to Privileged spec version-20211203

    Add Supervisor Memory-Management Instructions:
        - SINVAL.VMA, SFENCE.W.INVAL, SFENCE.INVAL.IR
    Add Hypervisor Memory-Management Instructions:
        - HFENCE.VVMA, HFENCE.GVMA, HINVAL.VVMA, HINVAL.GVMA

Signed-off-by: eric.tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D117654
2022-02-28 14:02:43 +08:00
Eric Tang cf80ef1393 [RISCV] Change GPRMemAtomic to GPRMemZeroOffset for general usage
Not only some AMO instructions but also other instructions need to
    process (${gpr}) or 0(${gpr}), where the 0 is be silently ignored.

    This patch does some changes for general usage.

Signed-off-by: Eric Tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D120017
2022-02-28 14:02:43 +08:00
Chenbing Zheng 7f811ce127 [RISCV] Optimize (sext.w, srli) to sraiw with Zba.
In this patch, we add a more narrower exclusion for
zeroext (srl x) -> srli (slli x), so that it provides an opportunity
for the selection of sraiw.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120467
2022-02-28 10:34:35 +08:00
Jessica Clarke 6aa8521fdb [RISCV] Fix parseBareSymbol to not double-parse top-level operators
By failing to lex the token we end up both parsing it as a binary
operator ourselves and parsing it as a unary operator when calling
parseExpression on the RHS. For plus this is harmless but for minus this
parses "foo - 4" as "foo - -4", effectively treating a top-level minus
as a plus.

Fixes https://github.com/llvm/llvm-project/issues/54105

Reviewed By: asb, MaskRay

Differential Revision: https://reviews.llvm.org/D120635
2022-02-27 20:48:52 +00:00
Jameson Nash c4b1a63a1b mark getTargetTransformInfo and getTargetIRAnalysis as const
Seems like this can be const, since Passes shouldn't modify it.

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D120518
2022-02-25 14:30:44 -05:00
Haocong.Lu 865fe131f8 [RISCV] Fix a mistake in PostprocessISelDAG
With the condition N->use_empty(), the root node of DAG always
misses peephole optimization. So a dummy node is needed.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119934
2022-02-25 12:38:31 +00:00
Chenbing Zheng b20e80aa59 [RISCV] DAG Combine vcpop and vfirst with VL=0 to li imm
vcpop and vfirst are still useful when VL=0.
vcpop equivalents to li 0 and vfirst equivalents to li -1,
since no mask elements are active.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120302
2022-02-25 14:44:25 +08:00
Zakk Chen 4e115b7d88 [RISCV] Update computeTargetABI from llc as well as clang
Clang computes the default ABI if -mabi is empty
and encode it in LLVM IR module flag since D105555.
For correctness, llc need to give the same target-abi
(Options.MCOptions.ABIName) with ABI encoded in IR.
The getSubtargetImpl already has a check for them only if
Options.MCOptions.ABIName is not empty.

In order to get more robustness we could have a check for
explicit ABI, but now we have two different logic to
compute the default ABI.

The front-end ABI is defautl to the ilp32/ilp32e/lp64, and
ilp32d/lp64d when hardware support for extension D.
The backend ABI is default to the ilp32/ilp32e/lp64.

Reviewed by: asb, jrtc27

Differential Revision: https://reviews.llvm.org/D118333
2022-02-24 21:55:44 -08:00
lian wang f37d21ed20 [RISCV] Add schedule class for Zbt extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119808
2022-02-25 01:57:20 +00:00
Qihan Cai 0d058ed3d6 [RISCV] Change rvv version to 1.0 and remove ratify notice
This patch changes the version of V extension from 0.1 to 1.0 in RISCVInstrInfoVPseudos.td, RISCVInstrInfoVSDPatterns.td,  RISCVInstrInfoVVLPatterns.td, RISCVInstrInfoV.td

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120525
2022-02-25 11:38:20 +11:00
Craig Topper 506ac29632 [RISCV] Add 'i64' to some isel so tablegen will remove them for RV32. NFC
Saves a 100 bytes or so from the isel table.
2022-02-24 15:10:05 -08:00
Craig Topper a975ca97c3 [RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X).
Add a new ISD opcode to represent the sign extending behavior of
vmv.x.h. Keep the previous anyext opcode to allow the existing
(fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing
to generate a sign extend.

For fmv.x.w we are able to match the sext_inreg in an isel pattern,
but a 16-bit sext_inreg is lowered to a shift pair before isel. This
seemed like a larger match than we should do in isel.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118974
2022-02-24 09:19:01 -08:00
Shao-Ce SUN 78b5f0fb05 [NFC][RISCV] Reuse ISD::NodeType in float extension
Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D120412
2022-02-24 19:57:55 +08:00
Nikita Popov c7fe6f9c92 Revert "[RISCV] add the MC layer support of Zfinx extension"
This reverts commit 7798ecca9c.

As reported in https://reviews.llvm.org/D93298#3331641 and
following, this causes assertion failures with inline assembly.
2022-02-24 12:14:31 +01:00
lian wang e1d4d1c242 [RISCV] Add schedule class for Zbm and Zbe extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119805
2022-02-24 08:49:25 +00:00
Chenbing.Zheng 2ae92e19eb [RISCV][NFC] Add helper function isVectorConfigInstr to reduce Repeated code.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119924
2022-02-24 05:59:12 +00:00
Craig Topper 5b7ac107b1 [RISCV] Use SelectionDAG::getFreeze to simplify some code. NFC 2022-02-23 21:13:01 -08:00
Craig Topper c7d6448d03 [DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable.
Internally to DAGCombiner the SDValues were passed by non-const
reference despite not being modified. They were then passed by
const reference to TLI.

This patch passes them by value which is consistent with the vast
majority of code.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120420
2022-02-23 12:40:45 -08:00
Alex Bradbury c5bcfb983e [RISCV] Avoid infinite loop between DAGCombiner::visitMUL and RISCVISelLowering::transformAddImmMulImm
See https://github.com/llvm/llvm-project/issues/53831 for a full discussion.

The basic issue is that DAGCombiner::visitMUL and
RISCVISelLowering;:transformAddImmMullImm get stuck in a loop, as the
current checks in transformAddImmMulImm aren't sufficient to avoid all
cases where DAGCombiner::isMulAddWithConstProfitable might trigger a
transformation. This patch makes transformAddImmMulImm bail out if C0
(the constant used for multiplication) has more than one use.

Differential Revision: https://reviews.llvm.org/D120332
2022-02-23 11:05:46 +00:00