Commit Graph

1330 Commits

Author SHA1 Message Date
Craig Topper 294efcd6f7 [RISCV] Add support for fixed vector masked gather/scatter.
I've split the gather/scatter custom handler to avoid complicating
it with even more differences between gather/scatter.

Tests are the scalable vector tests with the vscale removed and
dropped the tests that used vector.insert. We're probably not
as thorough on the splitting cases since we use 128 for VLEN here
but scalable vector use a known min size of 64.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98991
2021-03-22 10:17:30 -07:00
luxufan 02ffbac844 [RISCV] remove redundant instruction when eliminate frame index
The reason for generating mv a0, a0 instruction is when the stack object offset is large then int<12>. To deal this situation, in the elimintateFrameIndex function, it will
create a virtual register, which needs the register scavenger to scavenge it. If the machine instruction that contains the stack object and the opcode is ADDI(the addi
was generated by frameindexNode), and then this instruction's destination register was the same as the register that was generated by the register scavenger, then the
mv a0, a0 was generated. So to eliminnate this instruction, in the eliminateFrameIndex function, if the instrution opcode is ADDI, then the virtual register can't be created.

Differential Revision: https://reviews.llvm.org/D92479
2021-03-21 18:54:00 +08:00
Jessica Clarke b2bb003774
[RISCV] Update comment in RISCVInstrInfoM.td
Missed in 07ed62b7d5.
2021-03-20 22:35:40 +00:00
Craig Topper 07ed62b7d5 [RISCV] Disable (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization when Zba is enabled.
This optimization is trying to save SRLI instructions needed to
implement the ANDs. If we have zext.w we won't save anything.
Because we don't check that the multiply is the only user of the
AND we might even increase instruction count.
2021-03-20 15:31:45 -07:00
Craig Topper b0d8823a8a [RISCV] Add isel pattern to optimize (mul (and X, 0xffffffff), (and Y, 0xffffffff)) on RV64
This patterns computes the full 64 bit product of a 32x32 unsigned
multiply. This requires a two pairs of SLLI+SRLI to zero the
upper 32 bits of the inputs.

We can do better than this by using two SLLI to move the lower
bits to the upper bits then use MULHU to compute the product. This
is the high half of a full 64x64 product. Since we put 32 0s in the lower
bits of the inputs we know the 128-bit product will have zeros in the
lower 64 bits. So the upper 64 bits, which MULHU computes, will contain
the original 64 bit product we were after.

The same trick would work for (mul (sext_inreg X, i32), (sext_inreg Y, i32))
using MULHS, but sext_inreg is sext.w which is already one instruction so we
wouldn't save anything.

Differential Revision: https://reviews.llvm.org/D99026
2021-03-20 14:55:46 -07:00
Craig Topper d5c1d305b3 [RISCV] Rename WriteShift/ReadShift scheduler classes to WriteShiftImm/ReadShiftImm. Move variable shifts from WriteIALU/ReadIALU to new WriteShiftReg/ReadShiftReg.
Previously only immediate shifts were in WriteShift. Register
shifts were grouped with IALU. Seems likely that immediate shifts
would be as fast or faster than register shifts. And that immediate
shifts wouldn't be any faster than IALU. So if any deserved to be in
their own group it should be register shifts not immediate shifts.

Rather than try to flip them let's just add more granularity
and give each kind their own class. I've used new names for both to
make them unambiguous and to force any downstream implementations to
be forced to put correct information in their scheduler models.

Reviewed By: evandro

Differential Revision: https://reviews.llvm.org/D98911
2021-03-19 20:39:49 -07:00
Craig Topper 5d315691c4 [RISCV] Add missing bitcasts to the results of lowerINSERT_SUBVECTOR and lowerEXTRACT_SUBVECTOR when handling mask vectors.
Found by adding asserts to LegalizeDAG to catch incorrect result
types being returned.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98964
2021-03-19 10:54:33 -07:00
Craig Topper 85f3f6b3cc [RISCV] Lower scalable vector masked loads to intrinsics to match fixed vectors and reduce isel patterns.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98840
2021-03-19 10:39:35 -07:00
Fraser Cormack d399b82e2a [RISCV] Maintain fixed-length info when optimizing BUILD_VECTORs
I'm not sure how I failed to notice this before, but when optimizing
dominant-element BUILD_VECTORs we would lower via the scalable container type,
which lost us the information about the fixed length of the vector types. By
lowering via the fixed-length type we can preserve that information and
eliminate redundant vsetvli instructions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98938
2021-03-19 17:21:06 +00:00
Fraser Cormack 550292ecb1 [RISCV] Fix missing scalable->fixed-length vector conversion
Returning the scalable-vector container type would present problems when
the fixed-length INSERT_VECTOR_ELT was used by later operations.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98776
2021-03-19 16:49:47 +00:00
Hsiangkai Wang aa8d33a6d6 [RISCV] Spilling for Zvlsseg registers.
For Zvlsseg, we create several tuple register classes. When spilling for
these tuple register classes, we need to iterate NF times to load/store
these tuple registers.

Differential Revision: https://reviews.llvm.org/D98629
2021-03-19 07:46:16 +08:00
Craig Topper c9861f722e [RISCV] Correct the output chain in lowerFixedLengthVectorMaskedLoadToRVV
We returned the input chain instead of the output chain from the
new load. This bypasses the load in the chain. I haven't found a
good way to test this yet. IR order prevents my initial attempts
at causing reordering.
2021-03-18 16:34:35 -07:00
Fraser Cormack 3495031a39 [RISCV] Support scalable-vector masked scatter operations
This patch adds support for masked scatter intrinsics on scalable vector
types. It is mostly an extension of the earlier masked gather support
introduced in D96263, since the addressing mode legalization is the
same.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96486
2021-03-18 10:17:50 +00:00
Fraser Cormack 0331399dc9 [RISCV] Support scalable-vector masked gather operations
This patch supports the masked gather intrinsics in RVV.

The RVV indexed load/store instructions only support the "unsigned unscaled"
addressing mode; indices are implicitly zero-extended or truncated to XLEN and
are treated as byte offsets. This ISA supports the intrinsics directly, but not
the majority of various forms of the MGATHER SDNode that LLVM combines to. Any
signed or scaled indexing is extended to the XLEN value type and scaled
accordingly. This is done during DAG combining as widening the index types to
XLEN may produce illegal vectors that require splitting, e.g.
nxv16i8->nxv16i64.

Support for scalable-vector CONCAT_VECTORS was added to avoid spilling via the
stack when lowering split legalized index operands.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96263
2021-03-18 09:26:18 +00:00
Fraser Cormack c2b4600ec8 [RISCV] Support bitcasts of fixed-length mask vectors
Without this patch, bitcasts of fixed-length mask vectors would go
through the stack.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98779
2021-03-18 08:52:42 +00:00
ShihPo Hung fca5d63aa8 [RISCV] Fix isel pattern of masked vmslt[u]
This patch changes the operand order of masked vmslt[u]
from (mask, rs1, scalar, maskedoff, vl)
to (maskedoff, rs1, scalar, mask, vl).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98839
2021-03-17 20:18:11 -07:00
Craig Topper 92b39c6907 [RISCV] Use getTargetExtractSubreg and getTargetInsertSubreg to simplify some code. NFCI 2021-03-17 12:10:19 -07:00
Craig Topper 696ddef569 [RISCV] Support masked load/store for fixed vectors.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98561
2021-03-17 10:26:15 -07:00
Fraser Cormack 70251759a2 [RISCV] Optimize "dominant element" BUILD_VECTORs
This patch adds an optimization path for BUILD_VECTOR nodes where the
majority of the elements are identical. These can be splatted, with the
remaining elements patched up with INSERT_VECTOR_ELTs. The threshold can
be tweaked as required - it is currently conservative. Undef elements
are disregarded when judging the dominance of a particular element. This
allows them to be covered by the splat value.

In addition, vectors of 2 elements are always optimized to a splat (for
the upper element) and an insert at element zero.

This optimization is disabled when optimizing for size.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98700
2021-03-17 10:09:04 +00:00
Fangrui Song 6ab8927931 [RISCV] Support clang -fpatchable-function-entry && GNU function attribute 'patchable_function_entry'
Similar to D72215 (AArch64) and D72220 (x86).

```
% clang -target riscv32 -march=rv64g -c -fpatchable-function-entry=2 a.c && llvm-objdump -dr a.o
...
0000000000000000 <main>:
       0: 13 00 00 00   nop
       4: 13 00 00 00   nop

% clang -target riscv32 -march=rv64gc -c -fpatchable-function-entry=2 a.c && llvm-objdump -dr a.o
...
00000002 <main>:
       2: 01 00         nop
       4: 01 00         nop
```

Recently the mainline kernel started to use -fpatchable-function-entry=8 for riscv (https://git.kernel.org/linus/afc76b8b80112189b6f11e67e19cf58301944814).

Differential Revision: https://reviews.llvm.org/D98610
2021-03-16 10:02:35 -07:00
Craig Topper 229eeb187d [RISCV] Look through copies when trying to find an implicit def in addVSetVL.
The InstrEmitter can sometimes insert a copy after an IMPLICIT_DEF
before connecting it to the vector instruction. This occurs when
constrainRegClass reduces to a class with less than 4 registers.
I believe LMUL8 on masked instructions triggers this since the
result can only use the v8, v16, or v24 register group as the mask
is using v0.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98567
2021-03-16 07:59:09 -07:00
Craig Topper a33ce06cf5 [RISCV] Improve i32 UADDSAT/USUBSAT on RV64.
The default promotion uses zero extends that become shifts. We
cam use sign extend instead which is better for RISCV.

I've used two different implementations based on whether we
have minu/maxu instructions.

Differential Revision: https://reviews.llvm.org/D98683
2021-03-16 07:44:06 -07:00
Craig Topper 41759c3d92 [RISCV] Add RISCVISD::BR_CC similar to RISCVISD::SELECT_CC.
This allows me to introduce similar combines for branches as
we have recently added for SELECT_CC. Some of them are less
useful for standalone setccs and only help branch instructions.
By having a BR_CC node its easier to only affect branches.

I'm using CondCodeSDNode to make isel patterns easier to
write so we can refer to the codes by name. SELECT_CC uses a
constant instead.

I've translated the condition code just like SELECT_CC so
we need less patterns for the swapped conditions. This
includes special cases for X < 1 and X > -1 that get translated
to blez and bgez by using a 0 constant.

computeKnownBitsForTargetNode support for SELECT_CC is added
to allow MaskedValueIsZero to work for cases where the true
and false values of the SELECT_CC are setccs and the
result of the SELECT_CC is used by a BR_CC. This was needed
to avoid regressions in some of the overflow tests.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D98159
2021-03-15 11:54:01 -07:00
Philipp Tomsich 018e96f71f [RISCV] Add isel-patterns to optimize (a < 1) into blez (a <= 0)
The following code-sequence showed up in a testcase (isolated from
SPEC2017) for if-conversion and vectorization when searching for the
maximum in an array:
        addi    a2, zero, 1
        blt     a1, a2, .LBB0_5
which can be expressed as `bge zero,a1,.LBB0_5`/`blez a1,/LBB0_5`.

More generally, we want to express (a < 1) as (a <= 0).

This adds the required isel-pattern and updates the testcases.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98449
2021-03-15 11:32:43 -07:00
Craig Topper 3dc5b533e0 [RISCV] Improve legalization of i32 UADDO/USUBO on RV64.
The default legalization uses zero extends that require pair of shifts
on RISCV. Instead we can take advantage of the fact that unsigned
compares work equally well on sign extended inputs. This allows
us to use addw/subw and sext.w.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D98233
2021-03-15 09:30:23 -07:00
Fraser Cormack 0c5b789c73 [RISCV] Support fixed-length vectors in the calling convention
This patch adds fixed-length vector support to the calling convention
when RVV is used to lower fixed-length vectors. The scheme follows the
regular vector calling convention for the argument/return registers, but
uses scalable vector container types as the LocVTs, and converts to/from
the fixed-length vector value types as required.

Fixed-length vector types may be split when the combination of minimum
VLEN and the maximum allowable LMUL is not large enough to fully contain
the vector. In this case the behaviour differs between fixed-length
vectors passed as parameters and as return values:
1. For return values, vectors must be passed entirely via registers or
via the stack.
2. For parameters, unlike scalar values, split vectors continue to be
passed by value, and are split across multiple registers until there are
no remaining registers. Thus vector parameters may be found partly in
registers and partly on the stack.

As with scalable vectors, the first fixed-length mask vector is passed
via v0. Split mask fixed-length vectors are passed first via v0 and then
via the next available vector register: v8,v9,etc.

The handling of vector return values uses all available argument
registers v8-v23 which does not adhere to the calling convention we're
supposedly implementing, but since this issue affects both fixed-length
and scalable-vector values, it was left as-is.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97954
2021-03-15 10:43:51 +00:00
Hsiangkai Wang a81dff1e58 [RISCV] Support inline asm for vector instructions.
Types of fractional LMUL and LMUL=1 are all using VR register class. When
using inline asm, it will use the first type in the register class as the
type for the register. It is not necessary the same as the value type. We
need to use INSERT_SUBVECTOR/EXTRACT_SUBVECToR/BITCAST to make it legal
to put the value in the corresponding register class.

Differential Revision: https://reviews.llvm.org/D97480
2021-03-15 11:02:18 +08:00
Craig Topper fcdf7f6224 [RISCV] Give an explicit error if 'generic' CPU is passed instead of 'generic-rv32' or 'generic-rv64'. Validate 64Bit feature against the triple.
I encountered a project that uses llvm that passes "generic" by
default. While I could fix that project, I wouldn't be surprised
if other projects did something similar. So it seems like
a good idea to provide a better error here.

I've also added validation of the 64Bit feature against the
triple so that we can catch a mismatched CPU before failing in
a mysterious way. We can make it pretty far in isel because we
calculate XLenVT from the triple and use that to set up the legal
integer type.

Reviewed By: luismarques, khchen

Differential Revision: https://reviews.llvm.org/D98307
2021-03-14 17:21:31 -07:00
luxufan a9b9c64fd4 change rvv frame layout
This patch change the rvv frame layout that proposed in D94465. In patch D94465, In the eliminateFrameIndex function,
to eliminate the rvv frame index, create temp virtual register is needed. This virtual register should be scavenged by class
RegsiterScavenger. If the machine function has other unused registers, there is no problem. But if there isn't unused registers,
we need a emergency spill slot. Because of the emergency spill slot belongs to the scalar local variables field, to access emergency
spill slot, we need a temp virtual register again. This makes the compiler report the "Incomplete scavenging after 2nd pass" error.
So I change the rvv frame layout as follows:

```
|--------------------------------------|
|   arguments passed on the stack      |
|--------------------------------------|<--- fp
|   callee saved registers             |
|--------------------------------------|
|   rvv vector objects(local variables |
|   and outgoing arguments             |
|--------------------------------------|
|   realignment field                  |
|--------------------------------------|
|   scalar local variable(also contains|
|   emergency spill slot)              |
|--------------------------------------|<--- bp
|   variable-sized local variables     |
|--------------------------------------|<--- sp
```

Differential Revision: https://reviews.llvm.org/D97111
2021-03-13 16:05:55 +08:00
Craig Topper 51151828ac [RISCV] Teach normaliseSetCC to canonicalize X > -1 to X >= 0 and X < 1 to 0 >= X.
This allows the use of BGE with X0 instead of puting -1/1 in a
register.

Reviewed By: jrtc27

Differential Revision: https://reviews.llvm.org/D98542
2021-03-12 11:50:10 -08:00
Craig Topper 45d3ed0304 [RISCV] Add support for scalable vector masked load/store.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98460
2021-03-12 10:32:33 -08:00
Fraser Cormack 641f5700f9 [RISCV] Optimize INSERT_VECTOR_ELT sequences
This patch optimizes the codegen for INSERT_VECTOR_ELT in various ways.
Primarily, it removes the use of vslidedown during lowering, and the
vector element is inserted entirely using vslideup with a custom VL and
slide index.

Additionally, lowering of i64-element vectors on RV32 has been optimized
in several ways. When the 64-bit value to insert is the same as the
sign-extension of the lower 32-bits, the codegen can follow the regular
path. When this is not possible, a new sequence of two i32 vslide1up
instructions is used to get the vector element into a vector. This
sequence was suggested by @craig.topper. From there, the value is slid
into the final position for more consistent lowering across RV32 and
RV64.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98250
2021-03-12 09:13:38 +00:00
Fraser Cormack 4d2d5855c7 [RISCV] Fix up stale VECREDUCE comments. NFC.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98399
2021-03-12 08:49:46 +00:00
Craig Topper 1d26bbcf9b [RISCV] Return false from isShuffleMaskLegal except for splats.
We don't support any other shuffles currently.

This changes the bswap/bitreverse tests that check for this in
their expansion code. Previously we expanded a byte swapping
shuffle through memory. Now we're scalarizing and doing bit
operations on scalars to swap bytes.

In the future we can probably use vrgather.vx to do a byte swap
shuffle.
2021-03-11 20:02:49 -08:00
Craig Topper c82f442954 [RISCV] Support fixed vector copysign.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98394
2021-03-11 09:57:24 -08:00
Craig Topper 0dff8a9627 [RISCV] Handle vmv.x.s intrinsic for i64 vectors on RV32.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98372
2021-03-11 09:39:50 -08:00
Craig Topper 9c841cb8e8 [RISCV] Support extract_vector_elt for fixed and scalable masked registers.
This uses a really simple approach of converting to an i8 vector
and extracting. This is probably not the best approach especially
if you know the index is constant.

Other ideas:
-Store to stack temporary using vse1, load as scalar and shift.
-Sort of bitcast the vector to a vector of i8, slide down the
 appropriate 8 bit element, copy to scalar, shift down the
 correct bit within the 8 bits we extracted. Not exactly sure
 how to describe such a bitcast from i1 vector to i8 vector
 within the type system for elements less than 8.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98310
2021-03-11 09:26:44 -08:00
Craig Topper 9106d04554 [RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32.
On riscv32, i64 isn't a legal scalar type but we would like to
support scalable vectors of i64.

This patch introduces a new node that can represent a splat made
of multiple scalar values. I've used this new node to solve the current
crashes we experience when getConstant is used after type legalization.

For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS
when needed and then handling the SPLAT_VECTOR_PARTS later during
LegalizeOps. I've remove the special case I previously put in for
ABS for D97991 as the default expansion is now able to succesfully
use getConstant.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98004
2021-03-10 09:46:18 -08:00
Craig Topper 0c73a506e8 [RISCV] Starting fixing issues that prevent us from testing vXi64 intrinsics on RV32.
Currently we crash in type legalization any time an intrinsic
uses a scalar i64 on RV32.

This patch adds support for type legalizing this to prevent
crashing. I don't promise that it uses the best possible codegen
just that it is functional.

This first version handles 3 cases. vmv.v.x intrinsic, vmv.s.x
intrinsic and intrinsics that take a scalar input, splat it and
then do some operation.

For vmv.v.x we'll either rely on hardware sign extension for
constants or we'll convert it to multiple splats and bit
manipulation.

For vmv.s.x we use a really unoptimal sequence inspired by what
we do for an INSERT_VECTOR_ELT.

For the third case we'll either try to use the .vi form for
constants or convert to a complicated splat and bitmanip and use
the .vv form of the operation.

I've renamed the ExtendOperand field to SplatOperand now use it
specifically for the third case. The first two cases are handled
by custom lowering specifically for those intrinsics.

I haven't updated all tests yet, but I tried to cover a subset
that includes single-width, widening, and narrowing.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97895
2021-03-10 09:45:38 -08:00
Craig Topper 1e39118638 [RISCV] Manually split vector operands to VECREDUCE when handling vXi64 vectors on RV32.
The type legalizer will visit the result before the operands. To
avoid creating an illegal target specific node or falling back to
scalarization, we need to manually split vector operands.

This still doesn't handle the case of non-power of 2 operands
which need to be widened. I'm not sure the type legalizer is
ready for it. I think we would need to insert an
INSERT_SUBVECTOR with the power of 2 type we want, with an undef
first operand, and the non-power of 2 orignal operand as the vector
to insert. Then fill in the neutral elements into the elements the
padded elements. Alternatively we INSERT_SUBVECTOR into a neutral vector.
From there we carry on splitting if needed to get to a legal type
then do the target specific code.

The problem with this is the type legalizer doesn't know how to
widen an insert_subvector yet. We would need to add that including
the handling for a non-undef first vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98292
2021-03-10 09:27:38 -08:00
Craig Topper 351844edf1 [RISCV] Add support for VECTOR_REVERSE for scalable vector types.
I've left mask registers to a future patch as we'll need
to convert them to full vectors, shuffle, and then truncate.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97609
2021-03-09 10:03:45 -08:00
Craig Topper 77ac3166e5 [RISCV] Add support for fixed vector reductions.
I've included tests that require type legalization to split the
vector. The i64 version of these scalarizes on RV32 due to type
legalization visiting the result before the vector type. So we
have to abort our custom expansion to avoid creating target
specific nodes with an illegal type. Then type legalization ends
up scalarizing. We might be able to fix this by doing custom
splitting for large vectors in our handler to get down to a legal
type.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98102
2021-03-09 09:39:59 -08:00
Craig Topper 1c7ad4dd88 [RISCV] Don't modify the SEW immediate on the V extension pseudo instructions after inserting VSETVLI.
Previously we set the value to -1, but the SEW information could
be useful for scheduling.

Reviewed By: frasercrmck, rogfer01

Differential Revision: https://reviews.llvm.org/D98062
2021-03-09 09:02:19 -08:00
Craig Topper 72ecf2f43f [RISCV] Optimize fixed vector ABS. Fix crash on scalable vector ABS for SEW=64 with RV32.
The default fixed vector expansion uses sra+xor+add since it can't
see that smax is legal due to our custom handling. So we select
smax(X, sub(0, X)) manually.

Scalable vectors are able to use the smax expansion automatically
for most cases. It crashes in one case because getConstant can't build a
SPLAT_VECTOR for nxvXi64 when i64 scalars aren't legal. So
we manually emit a SPLAT_VECTOR_I64 for that case.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97991
2021-03-09 08:51:03 -08:00
Craig Topper 478317fbb7 [RISCV] Make the hasStdExtM() check in RISCVInstrInfo::getVLENFactoredAmount emit a diagnostic rather than an assert.
As far as I know we're not enforcing the StdExtM must be enabled
to use the V extension. If we use an assert here and hit this
code in a release build we'll silently emit an invalid instruction.

By using a diagnostic we report the error to the user in release
builds. I think there may still be a later fatal error from
the code emitter though.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97970
2021-03-09 08:50:02 -08:00
ShihPo Hung 5cdb2e9860 [RISCV][MC] Fix nf encoding for vector ld/st whole register
The three bit nf is one less than the number of NFIELDS,
so we manually decrement 1 for VS1/2/4/8R & VL1/2/4/8R.

Reviewed By: craig.topper

Differential revision: https://reviews.llvm.org/D98185
2021-03-08 19:30:24 -08:00
Craig Topper 7a64cc4a76 [RISCV] Make use of DAG.getNeutralElement in lowerVECREDUCE to avoid repeating the same list of constants. NFC
Reviewed By: frasercrmck, khchen

Differential Revision: https://reviews.llvm.org/D98091
2021-03-08 09:11:10 -08:00
Craig Topper a2651266c5 [RISCV] Add explicit i64 types to RV64 isel patterns to stop tablegen from generating unneeded i32 patterns for RV32 HwMode. 2021-03-08 09:06:56 -08:00
Fraser Cormack 18173c57bd [RISCV] Add new entry points to getContainerForFixedLengthVector
While working on adding fixed-length vectors to the calling convention,
it was necessary to be able to query for a fixed-length vector container
type without access to an instance of SelectionDAG.

This patch modifies the "main" getContainerForFixedLengthVector function
to use an instance of TargetLowering rather than SelectionDAG, and
preserves the SelectionDAG overload as a wrapper.

An additional non-static version of the function was also added to
simplify the common case in RISCVTargetLowering.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97925
2021-03-08 09:26:19 +00:00
Craig Topper c91b3c9e63 [RISCV] Fold (select_cc (setlt X, Y), 0, ne, trueV, falseV) -> (select_cc X, Y, lt, trueV, falseV)
A setcc can be created during LegalizeDAG after select_cc has been
created. This combine will enable us to fold these late setccs.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D98132
2021-03-07 09:44:56 -08:00
Craig Topper fdbd5d3206 [RISCV] Fold (select_cc (xor X, Y), 0, eq/ne, trueV, falseV) -> (select_cc X, Y, eq/ne, trueV, falseV)
This pattern occurs when lowering for overflow operations
introduce an xor after select_cc has already been formed.

I had to rework another combine that looked for select_cc of an xor
with 1. That xor will now get combined away so we just need to
look for the RHS of the select_cc being 1.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D98130
2021-03-07 09:29:55 -08:00
Fangrui Song 2d922de3af [MC][RISCV] Support .reloc *, BFD_RELOC_{NONE,32,64}, *
BFD_RELOC_NONE is useful for ld --gc-sections: it provides a generic way indicating a dependency between two sections.
2021-03-05 21:45:11 -08:00
Luke d28297ff68 [RISCV] Enable fixed-length vectorization of LoopVectorizer for RISC-V Vector
By implementing the method "unsigned RISCVTTIImpl::getRegisterBitWidth(bool Vector)",
fixed-length vectorization is enabled when possible. Without this method, the
"#pragma clang loop" directive is needed to enable vectorization(or the cost model
may inform LLVM that "Vectorization is possible but not beneficial").

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97549
2021-03-05 10:54:51 +08:00
Fraser Cormack 8e7ceffd0b [RISCV] Fix crash when inserting large fixed-length subvectors
This patch addresses a compiler crash resulting from passing a
fixed-length type to one that expects scalable vector types. An
assertion was added to prevent this regressing in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97868
2021-03-04 09:27:16 +00:00
Fraser Cormack d8e1d2ebf4 [RISCV] Preserve fixed-length VL on insert_vector_elt in more cases
This patch fixes up one case where the fixed-length-vector VL was
dropped (falling back to VLMAX) when inserting vector elements, as the
code would lower via ISD::INSERT_VECTOR_ELT (at index 0) which loses the
fixed-length vector information.

To this end, a custom node, VMV_S_XF_VL, was introduced to carry the VL
operand through to the final instruction. This node wraps the RVV
vmv.s.x and vmv.s.f instructions, which were being selected by
insert_vector_elt anyway.

There should be no observable difference in scalable-vector codegen.

There is still one outstanding drop from fixed-length VL to VLMAX, when
an i64 element is inserted into a vector on RV32; the splat (which is
custom legalized) has no notion of the original fixed-length vector
type.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97842
2021-03-04 09:21:10 +00:00
Amara Emerson 8a316045ed [AArch64][GlobalISel] Enable use of the optsize predicate in the selector.
To do this while supporting the existing functionality in SelectionDAG of using
PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis
dependencies to the instruction selector pass.

Then, use the predicate to generate constant pool loads for f32 materialization,
if we're targeting optsize/minsize.

Differential Revision: https://reviews.llvm.org/D97732
2021-03-02 12:55:51 -08:00
Fraser Cormack c1695ddf7d [RISCV] Support fixed-length INSERT_VECTOR_ELT
This patch enables support for lowering INSERT_VECTOR_ELT on
fixed-length vector types. The strategy follows that for scalable vector
types.

This patch also includes a quick fix to prevent the compiler infinitely
looping between lowering BUILD_VECTOR as VECTOR_SHUFFLE and back again.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97698
2021-03-02 16:48:38 +00:00
Fraser Cormack de2b70010a [RISCV] Lower CONCAT_VECTORS to INSERT_SUBVECTOR nodes
The default expansion of CONCAT_VECTORS goes through the stack. This
patch avoids that penalty by custom-lowering CONCAT_VECTORS to a series
of INSERT_SUBVECTOR nodes. Futher optimizations are possible, but this
is a good start.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97692
2021-03-02 11:13:59 +00:00
Fraser Cormack 3fea9226ee [RISCV] Support INSERT_SUBVECTOR on vector masks
Like with EXTRACT_SUBVECTOR, INSERT_SUBVECTOR poses a problem
for vector masks as RVV isn't able to slide mask types around. We choose
instead to bitcast to equivalently-sized i8 types where we can, else we
zero-extend, perform the operation, and truncate back down.

One test was left disabled due to a crash in the legalizer.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97559
2021-03-01 12:04:11 +00:00
Fraser Cormack e80ca3af82 [RISCV] Fix INSERT/EXTRACT_SUBVECTOR on fractional LMUL types
This patch fixes a bug where the lowering for INSERT_SUBVECTOR and
EXTRACT_SUBVECTOR would insist on first extracting a register-aligned
LMUL1 vector type before perfoming the slide up/down. This was even if
the vector was a fractional LMUL type, in which case the aligned
EXTRACT_SUBVECTOR was invalid.

This issue only occurred for scalable vector types, but a variety of
tests for both scalable and fixed-length vectors have been added to
ensure this does not regress in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97556
2021-03-01 11:51:05 +00:00
Fraser Cormack 4ea734e6ec [RISCV] Unify scalable- and fixed-vector INSERT_SUBVECTOR lowering
This patch unifies the two disparate paths for lowering INSERT_SUBVECTOR
operations under one roof. Consequently, with this patch it is possible to
support any fixed-length subvector insertion, not just "cast-like" ones.

As before, support for the insertion of mask vectors will come in a
separate patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97543
2021-03-01 11:38:47 +00:00
Fraser Cormack bd4d421688 [RISCV] Support EXTRACT_SUBVECTOR on vector masks
This patch adds support for extracting subvectors from vector masks.
This can be either extracting a scalable vector from another, or a fixed-length
vector from a fixed-length or scalable vector.

Since RVV lacks a way to slide vector masks down on an element-wise
basis and we don't know the true length of the vector registers, in many
cases we must resort to using equivalently-sized i8 vectors to perform
the operation. When this is not possible we fall back and extend to a
suitable i8 vector.

Support was also added for fixed-length truncation to mask types.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97475
2021-03-01 11:20:09 +00:00
Fangrui Song 47c5576d7d ELF: Create unique SHF_GNU_RETAIN sections for llvm.used global objects
If a global object is listed in `@llvm.used`, place it in a unique section with
the `SHF_GNU_RETAIN` flag. The section is a GC root under `ld --gc-sections`
with LLD>=13 or GNU ld>=2.36.

For front ends which do not expect to see multiple sections of the same name,
consider emitting `@llvm.compiler.used` instead of `@llvm.used`.

SHF_GNU_RETAIN is restricted to ELFOSABI_GNU and ELFOSABI_FREEBSD in
binutils. We don't do the restriction - see the rationale in D95749.

The integrated assembler has supported SHF_GNU_RETAIN since D95730.
GNU as>=2.36 supports section flag 'R'.
We don't need to worry about GNU ld support because older GNU ld just ignores
the unknown SHF_GNU_RETAIN.

With this change, `__attribute__((retain))` functions/variables emitted
by clang will get the SHF_GNU_RETAIN flag.

Differential Revision: https://reviews.llvm.org/D97448
2021-02-26 16:38:44 -08:00
Craig Topper b183cbfacd [RISCV] Call SelectBaseAddr on the base pointer in the custom isel for vector loads and stores.
This will allow FrameIndex as the base address instead of
emitting a separate ADDI from isel. eliminateFrameIndex will likely turn
it back into an ADDI, but this makes things consistent with the
SDPatterns and VLPatterns.

I only tested one case for simplicity. I can test more if reviewers
want.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97221
2021-02-26 11:38:23 -08:00
Fraser Cormack 37014db013 [RISCV] Use existing method for the LMUL1 type. NFCI.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97467
2021-02-26 09:44:05 +00:00
Craig Topper d7fca3f0bf [RISCV] Support fixed vector extract_element for FP types. 2021-02-25 16:30:28 -08:00
Craig Topper 95c6824995 [RISCV] Teach CleanupVSETVLI to remove 'vsetvli zero, zero, vtype' when the vtype matches the previous vsetvli or vsetivli
Reviewed By: frasercrmck, arcbbb

Differential Revision: https://reviews.llvm.org/D97408
2021-02-25 07:51:19 -08:00
Craig Topper 25c6b7ddd2 [RISCV] Add isel pattern to match X > -1 to bgez.
Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D97262
2021-02-25 07:42:22 -08:00
Fraser Cormack 0ad86f879f [RISCV] Update RVV ISA section-header comments. NFC.
Some of the section headers had become stale with the transition from
RVV specification version 0.9 to 0.10. This patch brings them up to
date.
2021-02-25 14:15:28 +00:00
Fraser Cormack 02f435db0b [RISCV] Support fixed-length vector i2fp/fp2i conversions
This patch extends the support for scalable-vector int->fp and fp->int
conversions by additionally handling fixed-length vectors.

The existing scalable-vector lowering re-expresses widening/narrowing by
x4+ conversions as standard nodes. The fixed-length vector support slots
in at "the end" of this process by lowering the now equally-sized and
widening/narrowing by x2 nodes to our custom VL versions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97374
2021-02-25 13:47:58 +00:00
Fraser Cormack 9620ce90d7 [RISCV] Support fixed-length vector FP_ROUND & FP_EXTEND
This patch extends the support for vector FP_ROUND and FP_EXTEND by
including support for fixed-length vector types. Since fixed-length
vectors use "VL" nodes and scalable vectors can use the standard nodes,
there is slightly more to do in the fixed-length case. A helper function
was introduced to try and reduce the divergent paths. It is expected
that this function will similarly come in useful for lowering the
int-to-fp and fp-to-int operations for fixed-length vectors.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97301
2021-02-25 12:16:06 +00:00
Fraser Cormack 84413e1947 [RISCV] Support fixed-length vector truncates
This patch extends support for our custom-lowering of scalable-vector
truncates to include those of fixed-length vectors. It does this by
co-opting the custom RISCVISD::TRUNCATE_VECTOR node and adding mask and
VL operands. This avoids unnecessary duplication of patterns and
inflation of the ISel table.

Some truncates go through CONCAT_VECTORS which currently isn't
efficiently handled, as it goes through the stack. This can be improved
upon in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97202
2021-02-25 12:11:34 +00:00
Fraser Cormack 3bc5ed3875 [RISCV] Support fixed-length vector sign/zero extension
This patch adds support for the custom lowering sign- and zero-extension
of fixed-length vector types. It does so through custom nodes. Since the
source and destination types are (necessarily) of different sizes, it is
possible that the source type is legal whilst the larger destination
type isn't. In this case the legalization makes heavy use of
EXTRACT_SUBVECTOR.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97194
2021-02-25 12:05:17 +00:00
Fraser Cormack 821f8bb29a [RISCV] Unify scalable- and fixed-vector EXTRACT_SUBVECTOR lowering
This patch unifies the two disparate paths for lowering
EXTRACT_SUBVECTOR operations under one roof. Consequently, with this
patch it is possible to support any fixed-length subvector extraction,
not just "cast-like" ones.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97192
2021-02-25 11:46:57 +00:00
Craig Topper 159f78fc2f [RISCV] Reuse existing SDLoc and XLenVT in the switch in RISCVISelDAGToDAG::Select. NFC
A SDLoc and XLenVT were already created above the switch.
2021-02-24 21:39:00 -08:00
Craig Topper efcdd598b7 [RISCV] Teach VSETVLI inserter to use VSETIVLI when possible.
We always create the VL operand using a register, but if we can
determine that it came from an ADDI X0, imm with a sufficiently
small immediate, we can use VSETIVLI.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97332
2021-02-24 16:07:33 -08:00
Craig Topper 9bde29629d [RISCV] Use a ComplexPattern for zexti32 to match sexti32.
We just started using a ComplexPattern for sexti32. This updates
zexti32 to match.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D97231
2021-02-24 16:06:29 -08:00
Craig Topper 086670d367 [RISCV] Support fixed vector extract element. Use VL=1 for scalable vector extract element.
I've changed to use VL=1 for slidedown and shifts to avoid extra
element processing that we don't need.

The i64 fixed vector handling on i32 isn't great if the vector type
isn't legal due to an ordering issue in type legalization. If the
vector type isn't legal, we fall back to default legalization
which will bitcast the vector to vXi32 and use two independent extracts.
Doing better will require handling several different cases by
manually inserting insert_subvector/extract_subvector to adjust the type
to a legal vector before emitting custom nodes.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97319
2021-02-24 10:17:00 -08:00
Hsiangkai Wang 53c4c2b9f7 [RISCV] vle1.v/vse1.v should be unmasked instructions.
vle1.v/vse1.v should be unmasked instructions. The vm encoding is 1 for
unmasked instructions.

Differential Revision: https://reviews.llvm.org/D97237
2021-02-23 19:59:22 +08:00
Fraser Cormack dd68f3cf28 [RISCV] Support insertion of misaligned subvectors
This patch extends the support for RVV INSERT_SUBVECTOR to cover those
which don't align to a vector register boundary. Like the support for
EXTRACT_SUBVECTOR in D96959, it accomplishes this by extracting the
nearest register-sized subvector (a subregister operation), then sliding
the vector down with VSLIDEDOWN, inserting the subvector to the first
position, and sliding the vector back up again afterwards.

Unlike subvector extraction, for vectors that occupy less than a full
vector register we must preserve the untouched elements. We do this by
lowering to an LMUL=1 INSERT_SUBVECTOR using the above method and
lowering that to a VSLIDEUP with a zero offset. This uses a
tail-undisturbed policy and so has the effect of "sliding in" the
subvector elements while preserving the surrounding ones.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96972
2021-02-23 10:31:06 +00:00
Craig Topper 3231607ce9 [RISCV] Have sexti32 also recognize AssertZExt from types smaller than i32.
An i64 AssertZExt from a type smaller than i32 has at least 33
leading zeros which mean it has at least 33 sign bits.

Since we have a couple patterns that use two sexti32, I've
switched to a ComplexPattern so tablegen didn't have to generate
9 different permutations.

As noted in the FIXME, maybe we should just call computeNumSignBits,
but we don't have tests that benefit from that yet.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D97130
2021-02-22 14:56:22 -08:00
Craig Topper 1cd2a5a7da [RISCV] Add isel support for bitcasts between fixed vector types.
This should fix the issue reported in D96972.

I don't have a good test case for this without those changes.

Differential Revision: https://reviews.llvm.org/D97082
2021-02-22 12:05:46 -08:00
Craig Topper 1aeb927fed [RISCV] Custom isel the rest of the vector load/store intrinsics.
A previous patch moved the index versions. This moves the rest.
I also removed the custom lowering for VLEFF since we can now
do everything directly in the isel handling.

I had to update getLMUL to handle mask registers to index the
pseudo table correctly for VLE1/VSE1.

This is good for another 15K reduction in llc size.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97097
2021-02-22 09:53:46 -08:00
Craig Topper 1a6c1ac686 [SelectionDAG][RISCV] Teach ComputeNumSignBits to handle SREM.
This also removes a pattern from RISCV that is no longer needed
since the sexti32 on the LHS of the srem in the pattern implies
the result is sign extended so the sign_extend_inreg should be
removed in DAG combine now.

Reviewed By: luismarques, RKSimon

Differential Revision: https://reviews.llvm.org/D97133
2021-02-21 11:13:36 -08:00
Fraser Cormack 3e1317fd32 [RISCV] Support extraction of misaligned subvectors
This patch extends the support for RVV EXTRACT_SUBVECTOR to cover those
which don't align to a vector register boundary. It accomplishes this by
extracting the nearest register-sized subvector (a subregister
operation), then sliding the vector down with VSLIDEDOWN and extracting
the subvector from the first position (a COPY operation).

Since this procedure involves the use of VSCALE and multiplication, the
handling of such operations is done during lowering to simplify the
implementation and make use of DAG combining. This necessitated moving
some helper functions from RISCVISelDAGToDAG to RISCVTargetLowering.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96959
2021-02-20 15:43:54 +00:00
Fraser Cormack 9aa20caee6 [RISCV] Improve register allocation around vector masks
With vector mask registers only allocatable to V0 (VMV0Regs) it is
relatively simple to generate code which uses multiple masks and naively
requires spilling.

This patch aims to improve codegen in such cases by telling LLVM it can
use VRRegs to hold masks. This will prevent spilling in many cases by
having LLVM copy to an available VR register.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97055
2021-02-20 14:47:51 +00:00
Craig Topper 71b68fe532 [RISCV] Teach our custom vector load/store intrinsic isel code to propagate memory operands if we have them.
We don't currently create memory operands for these intrinsics,
but there was a suggestion of using the indexed load/store
intrinsics to implement isel for scalable vector gather/scatter.
That may propagate the memory operand from the gather/scatter
ISD nodes.
2021-02-19 19:12:20 -08:00
Craig Topper 7e54d7304b [RISCV] Remove VPatILoad and VPatIStore multiclasses that are no longer used. NFC 2021-02-19 13:23:08 -08:00
Craig Topper e7c86f4ac4 [RISCV] Use inheritance to reduce some repeated code in tablegen. NFC
The VLX and VSX searchable tables, share the same format so we
can have a common base class for them.
2021-02-19 10:42:18 -08:00
Craig Topper 7f5b3886e4 [RISCV] Remove unneeded indexed segment load/store vector pseudo instruction.
We had more combinations of data and index lmuls than we needed.

Also add some asserts to verify that the IndexVT and data VT have
the same element count when we isel these pseudo instructions.
2021-02-19 10:28:48 -08:00
Craig Topper d056d5decf [RISCV] Use custom isel for vector indexed load/store intrinsics.
There are many legal combinations of index and data VTs supported
for these intrinsics. This results in a lot of isel patterns in
RISCVGenDAGISel.inc.

By adding a separate table similar to what we use for segment
load/stores, we can more efficiently manually select these
intrinsics. We should also be able to reuse this table scalable
vector gather/scatter.

This reduces the llc binary size by ~56K.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D97033
2021-02-19 10:10:06 -08:00
Craig Topper dbf910f0d9 [RISCV] Prevent selecting a 0 VL to X0 for the segment load/store intrinsics.
Just like we do for isel patterns, we need to call selectVLOp
to prevent 0 from being selected to X0 by the default isel.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97021
2021-02-19 10:07:12 -08:00
Craig Topper 98dff5e804 [RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64
We previously used isel patterns for this, but that used quite
a bit of space in the isel table due to OR being associative
and commutative. It also wouldn't handle shifts/ands being in
reversed order.

This generalizes the shift/and matching from GREVI to
take the expected mask table as input so we can reuse it for
SHFLI.

There is no SHFLIW instruction, but we can promote a 32-bit
SHFLI to i64 on RV64. As long as bit 4 of the control bit isn't
set, a 64-bit SHFLI will preserve 33 sign bits if the input had
at least 33 sign bits. ComputeNumSignBits has been updated to
account for that to avoid sext.w in the tests.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96661
2021-02-19 10:07:12 -08:00
Fraser Cormack d9531a3097 [RISCV] Address some clang-tidy warnings. NFCI. 2021-02-19 12:10:28 +00:00
Craig Topper cd4051ac80 [RISCV] Prune unneeded indexed load/store pseudo instructions.
We were creating more combinations of value and index lmul than
we needed.

I've copied the loop structure used here from VPseudoAMOEI with
all data sew values instead of just 32/64.

Similar can be done for segment loads/store.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D97008
2021-02-18 23:08:39 -08:00
Craig Topper 8ed3bbbcc3 [RISCV] Split zvlsseg searchable table into 4 separate tables. Index by properties rather than intrinsic ID.
Intrinsic ID is a 32-bit value which made each row of the table 4
byte aligned. The remaining fields used 5 bytes. This meant 3 bytes
of padding per row.

This patch breaks the table into 4 separate tables and indexes them
by properties we know about the intrinsic. NF, masked,
strided, ordered, etc. The indexed load/store tables have no
padding in their rows now.

All together this reduces the size of llc binary by ~28K.

I'm considering adding similar tables for isel of non-segment
load/store as well to cut down the size of the isel table and
probably improve our isel performance. Those tables would need to
indexed from intrinsics, IR loads/stores, gathers/scatters, and
RISCVISD opcodes. So having a table that can be indexed without using
intrinsic ID is more flexible.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D96894
2021-02-18 19:00:49 -08:00
Craig Topper cf34559104 [RISCV] Enable PrimaryKeyEarlyOut on RISCVVPseudosTable.
This table is queried in RISCVMCInstLower without knowing
whether the instruction is a vector pseudo. Due to the way the
binary search works, we have to do log2(tablesize) checks just
to determine a non-vector instruction isn't in the table.

Conveniently, all the vector pseudos are pretty tightly
packed within the internal instruction enum. By enabling the
PrimaryKeyEarlyOut, tablegen will emit a check against the
beginning and end of the table before doing the binary search.
This gives a quick early out on the search for the majority
of non-vector instructions.

Differential Revision: https://reviews.llvm.org/D97016
2021-02-18 18:59:32 -08:00
Craig Topper 0db938312a [RISCV] Simplify VPseudoAMOEI multiclass. NFC
lmul was already iterated in one of the loops. We don't need to recreate
it from a string.
2021-02-18 12:40:51 -08:00
Jessica Clarke 74df1ffaad [RISCV] Use XLenRI alias for RegInfoByHwMode instances
This avoids tedious repetition and matches what we do for the
ValueTypeByHwMode uses.

Reviewed By: craig.topper, luismarques

Differential Revision: https://reviews.llvm.org/D96649
2021-02-18 19:38:36 +00:00
Craig Topper 156fc07e19 [RISCV] Add support for fixed vector MULHU/MULHS.
This uses to division by constant optimization to use MULHU/MULHS.

Reviewed By: frasercrmck, arcbbb

Differential Revision: https://reviews.llvm.org/D96934
2021-02-18 09:15:08 -08:00
Craig Topper 792627be35 [RISCV] Add support for fixed vector sign/zero extend from mask types.
Due to vXi64 on RV32, I've directly emitted this using _VL ISD
opcodes. If it wasn't for that we could just use fixed vector
BUILD_VECTOR and VSELECT and let those each be legalized.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96910
2021-02-18 09:08:10 -08:00
Craig Topper c7dd92e8a5 [RISCV] Support isel of scalable vector bitcasts
These should be NOPs so we can just replace with the input. This
matches what SVE does with isel patterns for all permutations.
Custom isel saves us from having to list all permurations for
all LMULs.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96921
2021-02-18 09:01:13 -08:00
Hsiangkai Wang 065a187f33 [RISCV] Fix typo. Use ValueType instead of LLVMType. 2021-02-18 23:21:27 +08:00
Hsiangkai Wang f1efa8abaf [RISCV] Fix bugs in pseudo instructions for masked segment load.
For masked segment load, the destination register should not overlap
with mask register. It could not be V0.

In the original implementation, there is no segment load/store register
class without V0. In this patch, I added these register classes and
modify `GetVRegNoV0` to get the correct one.

Differential Revision: https://reviews.llvm.org/D96937
2021-02-18 22:17:00 +08:00
Hsiangkai Wang b97d8b32c3 [NFC][RISCV] Use concise way to describe load/store instructions.
Differential Revision: https://reviews.llvm.org/D96923
2021-02-18 22:17:00 +08:00
Benjamin Kramer ae1e6c3557 [RISCV] Rewrite assert to not give unused variable warnings in Release builds
NFCI
2021-02-18 11:42:36 +01:00
Fraser Cormack d876214990 [RISCV] Begin to support more subvector inserts/extracts
This patch adds support for INSERT_SUBVECTOR and EXTRACT_SUBVECTOR
(nominally where both operands are scalable vector types) where the
vector, subvector, and index align sufficiently to allow decomposition
to subregister manipulation:

* For extracts, the extracted subvector must correctly align with the
lower elements of a vector register.
* For inserts, the inserted subvector must be at least one full vector
register, and correctly align as above.

This approach should work for fixed-length vector insertion/extraction
too, but that will come later.

Reviewed By: craig.topper, khchen, arcbbb

Differential Revision: https://reviews.llvm.org/D96873
2021-02-18 10:18:27 +00:00
Craig Topper 016eca8f90 [RISCV] Guard LowerINSERT_VECTOR_ELT against fixed vectors.
The type legalizer can call this code based on the scalar type so
we need to verify the vector type is a scalable vector.

I think due to how type legalization visits nodes, the vector type
will have already been legalized so we don't have an issue with
using MVT here like we did for EXTRACT_VECTOR_ELT.
I've added a test just in case.
2021-02-17 19:27:08 -08:00
Craig Topper 00c4e0a8f6 [RISCV] Guard the ISD::EXTRACT_VECTOR_ELT handling in ReplaceNodeResults against fixed vectors and non-MVT types.
The type legalizer is calling this code based on the scalar type so
we need to verify the input type is a scalable vector.

The vector type has also not been legalized yet when this is called
so we need to use EVT for it.
2021-02-17 18:25:38 -08:00
Craig Topper 3bdd02735b [RISCV] Localize RISCVZvlssegTable to RISCVISelDAGToDAG.cpp, the only place it is used. 2021-02-17 11:37:28 -08:00
Craig Topper 799f7865c8 [RISCV] Use bits<7> instead of bits<11> for the EEW field size in the RISCVZvlsseg searchable table. NFCI
We only support 8, 16, 32, and 64 for EEW. These only need 7 bits
to represent.
2021-02-17 11:12:36 -08:00
Craig Topper d4353a3101 [RISCV] Merge the handlers for masked and unmasked segment loads/stores.
A lot of the code for the masked and unmasked is the same. This
patch adds a boolean to handle the differences so we can share
the code.

Differential Revision: https://reviews.llvm.org/D96841
2021-02-17 10:08:33 -08:00
Craig Topper 6f30d0035a [RISCV] Merge the vsetvli and vsetvlimax intrinsic selection
These have very similar code just with a different number of
operands and handling for vsetivl.

Differential Revision: https://reviews.llvm.org/D96834
2021-02-17 10:08:33 -08:00
luxufan 709ea8bc87 [RISCV] Simplify BP initialisation
We can re-use copyPhysReg rather than writing a specialised copy.

Differential Revision: https://reviews.llvm.org/D95227
2021-02-17 20:33:20 +08:00
Fraser Cormack d81161646a [RISCV] Add support for fixed vector vselect
This patch adds support for fixed-length vector vselect. It does so by
lowering them to a custom unmasked VSELECT_VL node with a vector length
operand.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96768
2021-02-17 10:59:00 +00:00
Hsiangkai Wang a3c783dbf2 [RISCV] Spilling for RISC-V V extension. (2nd version)
Differential Revision: https://reviews.llvm.org/D95148
2021-02-17 14:05:19 +08:00
Hsiangkai Wang 5a31a67385 [RISCV] Frame handling for RISC-V V extension.
This patch proposes how to deal with RISC-V vector frame objects. The
layout of RISC-V vector frame will look like

|---------------------------------|
| scalar callee-saved registers   |
|---------------------------------|
| scalar local variables          |
|---------------------------------|
| scalar outgoing arguments       |
|---------------------------------|
| RVV local variables &&          |
| RVV outgoing arguments          |
|---------------------------------| <- end of frame (sp)

If there is realignment or variable length array in the stack, we will use
frame pointer to access fixed objects and stack pointer to access
non-fixed objects.

|---------------------------------| <- frame pointer (fp)
| scalar callee-saved registers   |
|---------------------------------|
| scalar local variables          |
|---------------------------------|
| ///// realignment /////         |
|---------------------------------|
| scalar outgoing arguments       |
|---------------------------------|
| RVV local variables &&          |
| RVV outgoing arguments          |
|---------------------------------| <- end of frame (sp)

If there are both realignment and variable length array in the stack, we
will use frame pointer to access fixed objects and base pointer to access
non-fixed objects.

|---------------------------------| <- frame pointer (fp)
| scalar callee-saved registers   |
|---------------------------------|
| scalar local variables          |
|---------------------------------|
| ///// realignment /////         |
|---------------------------------| <- base pointer (bp)
| RVV local variables &&          |
| RVV outgoing arguments          |
|---------------------------------|
| /////////////////////////////// |
| variable length array           |
| /////////////////////////////// |
|---------------------------------| <- end of frame (sp)
| scalar outgoing arguments       |
|---------------------------------|

In this version, we do not save the addresses of RVV objects in the
stack. We access them directly through the polynomial expression
(a x VLENB + b). We do not reserve frame pointer when there is any RVV
object in the stack. So, we also access the scalar frame objects through the
polynomial expression (a x VLENB + b) if the access across RVV stack
area.

Differential Revision: https://reviews.llvm.org/D94465
2021-02-17 14:05:19 +08:00
Craig Topper 61a238e6e1 [RISCV] Add isel patterns for fixed vector fmsub/fnmadd/fnmsub. 2021-02-16 12:03:33 -08:00
Craig Topper 07ca13fe07 [RISCV] Add support for fixed vector mask logic operations.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96741
2021-02-16 09:34:00 -08:00
Fraser Cormack 04977ce5ce [RISCV] Fix a crash in fixed-length build_vector lowering
Non-splatted non-integer build_vector nodes were mistakenly being
lowered as VID expressions, which should not happen. VID can only be
used to select integer build_vector nodes.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96718
2021-02-16 10:25:15 +00:00
Fraser Cormack b870199020 [RISCV] Add patterns for scalable-vector fabs & fcopysign
The patterns mostly follow the scalar counterparts, save for some extra
optimizations to match the vector/scalar forms.

The patch adds a DAGCombine for ISD::FCOPYSIGN to try and reorder
ISD::FNEG around any ISD::FP_EXTEND or ISD::FP_TRUNC of the second
operand. This helps us achieve better codegen to match vfsgnjn.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96028
2021-02-16 10:21:09 +00:00
Craig Topper 29b894a8d3 [RISCV] Add expicit i32/i64 types to RV32 or RV64 only isel patterns. NFC
This stops tablegen from generating patterns with the opposite type
in the opposite HwMode. This just adds wasted bytes to the isel table.

This reduces the isel table by about 1800 bytes.
2021-02-15 14:36:05 -08:00
Craig Topper 7ba2e1c601 [RISCV] Add support for fixed vector floating point setcc.
This is annoying because the condition code legalization belongs
to LegalizeDAG, but our custom handler runs in Legalize vector ops
which occurs earlier.

This adds some of the mask binary operations so that we can combine
multiple compares that we need for expansion.

I've also fixed up RISCVISelDAGToDAG.cpp to handle copies of masks.

This patch contains a subset of the integer setcc patch as well.
That patch is dependent on the integer binary ops patch. I'll rebase
based on what order the patches go in.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96567
2021-02-15 12:52:25 -08:00
Fraser Cormack 4bd5bd4009 [RISCV] Convert VSLIDE(UP|DOWN) nodes to "VL" versions (NFC)
This patch prepares the RISCV VSLIDEUP and VSLIDEDOWN custom nodes to
ones carrying additional mask and vector-length operands. This is
primarily so they can be used by both systems.

This also takes the opportunity to create some helper functions to deal
with the common task of getting the default (unmasked) VL operands.

Reviewed By: craig.topper, arcbbb

Differential Revision: https://reviews.llvm.org/D96505
2021-02-15 10:32:56 +00:00
Craig Topper 3520371ddb [RISCV] Rename the RVVBaseAddr ComplexPattern to just BaseAddr and use it to merge some scalar load/store patterns too. 2021-02-13 12:01:51 -08:00
Craig Topper 532d4bf025 [RISCV] Move riscv_vfmv_v_f_vl patterns to RISCVInstrInfoVVLPatterns.td for consistency with riscv_vmv_v_x_vl. NFC 2021-02-12 16:08:27 -08:00
Craig Topper 4220a81c84 [RISCV] Add support for fixed vector fabs 2021-02-12 15:33:36 -08:00
Craig Topper 36658376d5 [RISCV] Add support for fixed vector sqrt. 2021-02-12 15:33:29 -08:00
Craig Topper d32ed9b27e [RISCV] Use a ComplexPattern to merge the PatFrags for removing unneeded masks on shift amounts.
Rather than having patterns with and without an AND, use a
ComplexPattern to handle both cases.

Reduces the isel table by about 700 bytes.
2021-02-12 14:03:23 -08:00
Craig Topper 1697cc78b1 [RISCV] Add support for integer fixed vector setcc
I believe I've covered all orderings of splat operands here. Better
canonicalization in lowering might help reduce this. I did not handle
the immediate adjustments needed for set(u)gt/set(u)lt.

Testing here is limited to byte types because the scalable vector
type used for masks for the store is calculated assuming 8 byte
elements. But for the setcc its based on the element count of the
container type for the setcc input. So they don't agree. We'll need
to enhanced D96352 to handle this I think.

Differential Revision: https://reviews.llvm.org/D96443
2021-02-12 09:29:41 -08:00
Craig Topper 875c76de2b [RISCV] Add support for matching .vx and .vi forms of binary instructions for fixed vectors.
Unlike scalable vectors, I'm only using a ComplexPattern for
the immediate itself. The vmv_v_x is matched explicitly. We igore
the VL argument when matching a binary operator, but we do check
it when matching splat directly.

I left out tests for vXi64 as they fail on rv32 right now.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96365
2021-02-12 09:18:10 -08:00
luxufan feaf1d81e3 [RISCV] Change parseVTypeI function
Change parseVTypeI function to Make the added vset instruction test cases report more concrete error message.

Differential Revision: https://reviews.llvm.org/D96218
2021-02-12 19:38:34 +08:00
Fraser Cormack e88da1d677 [RISCV] Add support for integer fixed min/max
This patch extends the initial fixed-length vector support to include
smin, smax, umin, and umax.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96491
2021-02-12 09:19:45 +00:00
Craig Topper 7a7836b4d8 [RISCV] Add a pattern for a scalable vector mask vnot.
We can use a vnand.mm with the same register for both inputs.
This avoids materializing an alls ones constant with vmset.mm.
2021-02-11 15:34:58 -08:00
ShihPo Hung 9e62c9146d [RISCV] Initial support for insert/extract subvector
This patch handles cast-like insert_subvector & extract_subvector
in which case:
1. index starts from 0.
2. inserting a fixed-width vector into a scalable vector,
   or extracting a fixed-width vector from a scalable vector.

Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D96352
2021-02-11 14:35:49 -08:00
Craig Topper 033b1bd185 [RISCV] Add support loads, stores, and splats of vXi1 fixed vectors.
This refines how we determine which masks types are legal and adds
support for loads, stores, and all ones/zeros splats.

I left a fixme in store handling where I think we need to zero
extra bits if the type isn't a multiple of a byte. If I remember
right from X86 there was some case we could have a store of a
1, 2, or 4 bit mask and have a scalar zextload that then expected the
bits to be 0. Its tricky to zero the bits with RVV. We need to do
something like round VL up, zero a register, lower the VL back down,
then do a tail undisturbed move into the zero register. Another
option might be to generate a mask of 1/2/4 bits set with a VL of 8
and use that to mask off the bits.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96468
2021-02-11 09:13:16 -08:00
Jessica Clarke ca606dc988 [RISCV] More whitespace and comment typo fixes in RISCVInstrInfoC.td 2021-02-11 02:32:36 +00:00
Jessica Clarke 0973ce8596 [RISCV] Fix whitespace in RISCVInstrInfoC.td 2021-02-11 02:23:09 +00:00
Craig Topper 350ab4e617 [RISCV] Use OperandTransform field of ImmLeaf to slightly simplify a couple bitmanip patterns. NFC
This binds the SDNodeXForm to the ImmLeaf so we only need to mention
the ImmLeaf in both the input and output pattern.
2021-02-10 17:52:07 -08:00
Craig Topper fc4d780eaf [RISCV] Remove superfluous semicolon. NFC 2021-02-10 11:20:29 -08:00
Craig Topper cb161b3a88 [RISCV] Add support for matching .vf forms of fadd/fsub/fmul/fdiv/fma for fixed vectors.
fma+neg will come in a different patch since I haven't done it for .vv
yet either.

Differential Revision: https://reviews.llvm.org/D96375
2021-02-10 10:16:27 -08:00
Craig Topper 0c254b4a69 [RISCV] Add support for selecting vrgather.vx/vi for fixed vector splat shuffles.
The test cases extract a fixed element from a vector and splat it
into a vector. This gets DAG combined into a splat shuffle.

I've used some very wide vectors in the test to make sure we have
at least a couple tests where the element doesn't fit into the
uimm5 immediate of vrgather.vi so we fall back to vrgather.vx.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96186
2021-02-10 10:01:56 -08:00
Fraser Cormack a3c74d6d53 [RISCV] Add support for selecting vid.v from build_vector
This patch optimizes a build_vector "index sequence" and lowers it to
the existing custom RISCVISD::VID node. This pattern is common in
autovectorized code.

The custom node was updated to allow it to be used by both scalable and
fixed-length vectors, thus avoiding pattern duplication.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96332
2021-02-10 10:58:40 +00:00
Craig Topper 18ff7e045a [RISCV] Make the min and max vector width command line options more consistent and check their relationship to each other. 2021-02-09 10:47:23 -08:00
Craig Topper fd5adae02c [RISCV] Remove SRO* and SLO* instructions from bitmanip.
As of the current draft these are no longer being considered
for the bitmanip spec. It wasn't clear what sub extension they
belonged in in the 0.93 spec.

So remove them. They can always be added back if something changes.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96157
2021-02-09 09:35:05 -08:00
Nemanja Ivanovic f6e4b9fc06 [RISCV] Fix shared libs build
Commit a2d19bad07 introduced a
dependency in the RISCV disassembler on two additional libraries
(MC, RISCVDesc) which wasn't added to the CMakeLists.txt. This
causes shared library builds to break. This patch just adds them
to fix failures seen on some bots, such as the PPC64LE Multistage.
2021-02-09 06:14:25 -06:00
Hsiangkai Wang a2d19bad07 [RISCV] Use whole register load/store for generic load/store.
In vector v0.10, there are whole vector register load/store
instructions. I suggest to use the whole register load/store
instructions for generic load/store for scalable vector types. It could
save up vset{i}vl{i} for these load/store.

For fractional LMUL, I keep to use vle{eew}.v/vse{eew}.v instructions to
load/store partial vector registers.

Differential Revision: https://reviews.llvm.org/D95853
2021-02-09 15:52:04 +08:00
Hsiangkai Wang a5b07a221a [RISCV] Initial support of LoopVectorizer for RISC-V Vector.
Define an option -riscv-vector-bits-max to specify the maximum vector
bits for vectorizer. Loop vectorizer will use the value to check if it
is safe to use the whole vector registers to vectorize the loop.

It is not the optimum solution for loop vectorizing for scalable vector.
It assumed the whole vector registers will be used to vectorize the code.
If it is possible, we should configure vl to do vectorize instead of
using whole vector registers.

We only consider LMUL = 1 in this patch.

This patch just an initial work for loop vectorizer for RISC-V Vector.

Differential Revision: https://reviews.llvm.org/D95659
2021-02-09 06:32:18 +08:00
Craig Topper b49aaed8c7 [RISCV] Use _COMMUTABLE fma pseudos for fixed vectors.
This matches what we do in the VLMAX SDNode patterns.
2021-02-08 11:27:23 -08:00
Craig Topper 8d8cafa32e [RISCV] Add support for splat fixed length build_vectors using RVV.
Building on the fixed vector support from D95705

I've added ISD nodes for vmv.v.x and vfmv.v.f and switched to
lowering the intrinsics to it. This allows us to share the same
isel patterns for both.

This doesn't handle splats of i64 on RV32 yet. The build_vector
gets converted to a vXi32 build_vector+bitcast during type
legalization. Not sure the best way to handle this at the moment.

Differential Revision: https://reviews.llvm.org/D96108
2021-02-08 11:12:56 -08:00
Craig Topper b8d719fbe8 [RISCV] Add support for fixed vector FMA.
Follow up to D95705. Does not include the commuting support from D95800.

Differential Revision: https://reviews.llvm.org/D96103
2021-02-08 11:12:56 -08:00
Craig Topper a719b667a9 [RISCV] Add initial support for converting fixed vectors to scalable vectors during lowering to use RVV instructions.
This is an alternative to D95563.

This is modeled after a similar feature for AArch64's SVE that uses
predicated scalable vector instructions.a

Rather than use predication, this patch uses an explicit VL operand.
I've limited it to always use LMUL=1 for now, but we can improve this
in the future.

This requires a bunch of new ISD opcodes to carry the VL operand.
I think we can probably lower intrinsics to these ISD opcodes to
cut down on the size of the isel table. Which is why I've added
patterns for all integer/float types and not just LMUL=1.

I'm only testing one vector width right now, but the width is
programmable via the command line.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95705
2021-02-08 10:41:30 -08:00
Craig Topper b7b4f4cbc3 [RISCV] Make scalable vector FMA commutable for register allocation.
This adds support for commuting operands and converting between
vfmadd and vfmacc to avoid register copies.

To avoid messing up intrinsic behavior, I've added new pseudo
instructions that have the isCommutable flag set. These pseudos also
force a tail agnostic policy. The intrinsic version still use
the tail undisturbed policy.

For best results it looks like we need to start with fmadd and only
pick fmacc if its beneficial. MachineCSE commutes without contraining
the operands and then commutes back if it didn't help with CSE. So
I've made sure that when the operand choice isn't constrained, we
will keep fmadd for MachineCSE and when it does the second commute,
we get back the original instruction.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95800
2021-02-08 10:05:33 -08:00
Craig Topper cc2c45dc54 [RISCV] Use SplatPat/SplatPat_simm5 to handle PseudoVMV_V_X_/PseudoVMV_V_I_ selection as well.
This ensures that we'll match immediates consistently regardless
of whether we match them as a standalone splat or as part of
another operation.

While I was there I added complexities to the simm5/uimm5 patterns so
we didn't have to assume that the 1 on the non-immediate was lower
than what tablegen inferred.

I had to make a minor tweak to tablegen to fix one place that
didn't expect to see a ComplexPattern that wasn't a "leaf".

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96199
2021-02-08 09:48:27 -08:00
Mikael Holmen eb8c27c60c [RISCV] Use std::make_tuple to make some toolchains happy again
My toolchain (LLVM 8.0, libstdc++ 5.4.0) complained with:

12:38:19 ../lib/Target/RISCV/RISCVISelLowering.cpp:1717:12: error: chosen constructor is explicit in copy-initialization
12:38:19     return {RISCVISD::VECREDUCE_FADD, Op.getOperand(0),
12:38:19            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
12:38:19 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here
12:38:19         constexpr tuple(_UElements&&... __elements)
12:38:19                   ^
12:38:19 ../lib/Target/RISCV/RISCVISelLowering.cpp:1720:12: error: chosen constructor is explicit in copy-initialization
12:38:19     return {RISCVISD::VECREDUCE_SEQ_FADD, Op.getOperand(1), Op.getOperand(0)};
12:38:19            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
12:38:19 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here
12:38:19         constexpr tuple(_UElements&&... __elements)
12:38:19                   ^
12:38:19 2 errors generated.

This commit adds explicit calls to std::make_tuple to work around
the problem.
2021-02-08 14:37:25 +01:00
Fraser Cormack b46aac125d [RISCV] Support the scalable-vector fadd reduction intrinsic
This patch adds support for both the fadd reduction intrinsic, in both
the ordered and unordered modes.

The fmin and fmax intrinsics are not currently supported due to a
discrepancy between the LLVM semantics and the RVV ISA behaviour with
regards to signaling NaNs. This behaviour is likely fixed in version 2.3
of the RISC-V F/D/Q extension, but until then the intrinsics can be left
unsupported.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95870
2021-02-08 09:52:27 +00:00
Craig Topper 3c767b96dc [RISCV] Correct types in tablegen multiclasses found by D95874. 2021-02-05 11:55:58 -08:00
Fraser Cormack e046c0c28b [RISCV] Support scalable-vector integer reduction intrinsics
This patch adds support for the integer reduction intrinsics supported
by RVV. This excludes "mul" which has no corresponding instruction.

The reduction instructions in RVV have slightly complicated type
constraints given they always produce a single "M1" vector register.

They are lowered to custom nodes including the second "scalar" reduction
operand to simplify the patterns and in the hope that they can be useful
for future DAG combines.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95620
2021-02-05 10:10:08 +00:00
Fraser Cormack c3eb2da6c4 [RISCV] Optimize sign-extended EXTRACT_VECTOR_ELT nodes
This patch custom-legalizes all integer EXTRACT_VECTOR_ELT nodes where
SEW < XLEN to VMV_S_X nodes to help the compiler infer sign bits from
the result. This allows us to eliminate redundant sign extensions.

For parity, all integer EXTRACT_VECTOR_ELT nodes are legalized this way
so that we don't need TableGen patterns for some and not others.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95741
2021-02-05 10:05:22 +00:00
Fraser Cormack af48d2bfc2 [RISCV] Add patterns for scalable-vector fsqrt
This patch adds support for lowering the sqrt intrinsic to the RVV
vfsqrt instruction.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96012
2021-02-05 09:39:19 +00:00
Craig Topper 6b280ce34c [RISCV] Use LLVMScalarOrSameVectorWidth to make avoid needing to mention the index type for vrgatherei16 intrinsics.
Add .vv to the intrinsic name to be consistent with D95979.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D95981
2021-02-04 20:26:45 -08:00
Craig Topper 25ff302a79 [RISCV] Split vrgather intrinsics into separate vrgather.vv and vrgather.vx intrinsics.
The vrgather.vv instruction uses a vector of indices with the same
SEW as operand 0. The vrgather.vx instructions use a scalar index
operand of XLen bits.

By splitting this into 2 intrinsics we are able to use LLVMatchType
in the definition to avoid specifying the type for the index operand
when creating the IR for the intrinsic. For .vv it will match the
operand 0 type. And for .vx it will match the type of the vl operand
we already needed to specify a type for.

I'm considering splitting more intrinsics. This was a somewhat
odd one because the .vx doesn't use the element type, it always
use XLen.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D95979
2021-02-04 19:50:12 -08:00
Hsiangkai Wang 63baeec66e [RISCV] Load/store vector mask types.
Use vle1.v/vse1.v to load/store vector mask types.

Differential Revision: https://reviews.llvm.org/D93364
2021-02-03 13:44:15 +08:00
Hsiangkai Wang c7189ba785 [RISCV] Add new vector instructions in v0.10.
* Add new vector instructions in v0.10.
 - load/store for mask value vle1.v vse1.v
 - vsetivli for 0-31 immediate vector length.
* Rename vector instructions in v0.10.
 - vfrsqrte7 -> vfrsqrt7
 - vfrece7 -> vfrec7
* Reserve memory width encodings for EEW>128b.

Differential Revision: https://reviews.llvm.org/D95781
2021-02-03 13:28:58 +08:00
Fraser Cormack b4106f9c7b [RISCV] Fix incorrect RVV sdiv/udiv lowering
Due to a clerical error, the sdiv operation was mapping to vdivu and
udiv to vdiv, when the opposite mapping is the correct one.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95869
2021-02-02 18:35:53 +00:00
Craig Topper c4fd1981a7 [RISCV] Correct types in tablegen multiclasses found by D95874. 2021-02-02 10:39:47 -08:00
Craig Topper 912306ef21 [RISCV] Use a ComplexPattern to merge isel patterns for vector load/store with GPR and FrameIndex addresses.
This reduces the isel table size by about 3000 bytes.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95844
2021-02-02 10:20:52 -08:00
Craig Topper e7f9a83499 [RISCV] Replace NoX0 SDNodeXForm with a ComplexPattern to do the selection of the VL operand.
I think this is a more standard way of doing this.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D95833
2021-02-02 00:08:58 -08:00
Craig Topper 72b31ad4b8 [RISCV] Add scalable vector support for floating point FMA instructions
A follow up patch will add support for commuting operands or
changing opcode to vfmacc and friends.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95662
2021-02-01 09:52:43 -08:00
Craig Topper 6a3ab66625 [RISCV] Update comment text from D95774. NFC 2021-02-01 09:52:43 -08:00
Craig Topper 1097ee61bf [RISCV] Optimize (srl (and X, 0xffff), C) -> (srli (slli X, 16), 16 + C).
Rather than materializing the 0xffff immediate for the AND, use
a shift left to remove the upper bits and then shift in zeros
from the right.

This pattern occurs when type legalizing an i16 right shift.

I've implemented this with custom selection code for a number of
reasons. I've limited this to the AND having a single use. We need
to compensate for SimplifyDemandedBits altering the AND mask. I'm
using *W opcodes on RV64. We may want to generlize this in the
future. For all these reason it seemed easiest to do it this way.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D95774
2021-02-01 09:37:55 -08:00
Craig Topper 44cc5abbf9 [RISCV] Custom lower fshl/fshr with Zbt extension.
We need to add a mask to the shift amount for these operations
to use the FSR/FSL instructions. We were previously doing this
in isel patterns, but custom lowering will make the mask
visible to optimizations earlier.
2021-01-31 17:49:15 -08:00
Craig Topper 3fdf2a56dd [RISCV] Use MVT instead of EVT in RISCVISelDAGToDAG.cpp
All this code runs post type legalization so we should have
exclusively legal types. The methods on MVT should be more
efficient than EVT.
2021-01-30 15:57:15 -08:00
Hsiangkai Wang 9847023660 [RISCV] Update the version number to v0.10 for vector. 2021-01-30 07:55:58 +08:00
Hsiangkai Wang 282aca10ae [RISCV] Update the version number to v0.10 for vector.
v0.10 is tagged in V specification. Update the version to v0.10.

Differential Revision: https://reviews.llvm.org/D95680
2021-01-30 07:20:05 +08:00
Hsiangkai Wang e08b67f3a8 [NFC][RISCV] Remove redundant pseudo instructions for vector load/store.
Not all combinations of SEW and LMUL we need to support. For example, we
only need to support [M1, M2, M4, M8] for SEW = 64. There is no need to
define pseudos for PseudoVLSE64MF8, PseudoVLSE64MF4, and PseudoVLSE64MF2.

Differential Revision: https://reviews.llvm.org/D95667
2021-01-30 07:20:05 +08:00
Kazu Hirata 046cfb8565 [llvm] Forward-declare formatted_raw_ostream (NFC)
Various *TargetStreamer.h need formatted_raw_ostream but rely on a
forward declaration of formatted_raw_ostream in MCStreamer.h.  This
patch adds forward declarations right in *TargetStreamer.h.

While we are at it, this patch removes the one in MCStreamer.h, where
it is unnecessary.
2021-01-28 22:21:13 -08:00
Christudasan Devadasan 892e4567e1 Support a list of CostPerUse values
This patch allows targets to define multiple cost
values for each register so that the cost model
can be more flexible and better used during the
register allocation as per the target requirements.

For AMDGPU the VGPR allocation will be more efficient
if the register cost can be associated dynamically
based on the calling convention.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D86836
2021-01-29 10:14:52 +05:30
Craig Topper c5d4b77b17 [RISCV] Remove isel patterns for Zbs *W instructions.
These instructions have been removed from the 0.94 bitmanip spec.
We should focus on optimizing the codegen without using them.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D95302
2021-01-28 09:33:56 -08:00
Craig Topper ae82a8c863 [RISCV] Add support for scalable vector fneg using vfsgnjn.vv
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D95568
2021-01-28 09:11:49 -08:00
Simon Pilgrim aa76cebab5 Fix "32-bit shift result used in 64-bit comparison" MSVC warning. NFCI. 2021-01-28 11:21:36 +00:00
Fraser Cormack fc2f27ccf3 [RISCV] Add support for RVV int<->fp & fp<->fp conversions
This patch adds support for the full range of vector int-to-float,
float-to-int, and float-to-float conversions on legal types.

Many conversions are supported natively in RVV so are lowered with
patterns. These include conversions between (element) types of the same
size, and those that are half/double the size of the input. When
conversions take place between types that are less than half or more
than double the size we must lower them using sequences of instructions
which go via intermediate types.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95447
2021-01-28 09:50:32 +00:00
Craig Topper 5d05cdf55c [RISCV] Copy isUnneededShiftMask from X86.
In d2927f786e, I added patterns
to remove (and X, 31) from sllw/srlw/sraw shift amounts.

There is code in SelectionDAGISel.cpp that knows to use
computeKnownBits to fill in bits of the mask that were removed
by SimplifyDemandedBits based on bits being known zero.

The non-W shift patterns use immbottomxlenset which allows the
mask to have more than log2(xlen) trailing ones, but doesn't
have a call to computeKnownBits to fill in bits of the mask that may
have been cleared by SimplifyDemandedBits.

This patch copies code from X86 to handle more than log2(xlen)
bottom bits set and uses computeKnownBits to fill in missing bits
before counting.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D95422
2021-01-27 20:46:10 -08:00
Craig Topper 58aa049b9b [RISCV] Move RISCVVPseudosTable from RISCVBaseInfo.h to RISCVInstrInfo.h. NFC
RISCVBaseInfo.h belongs to the MC layer, but the Pseudo instructions
are only used by the CodeGen layer. So it makes sense to keep this
table in the CodeGen layer.
2021-01-27 13:38:26 -08:00
Craig Topper ff038b316d [RISCV] Reduce field sizes in searchable tables to reduce binary size. 2021-01-27 12:24:01 -08:00
Craig Topper a40e01e442 [RISCV] Rework fault first only load isel.
-Remove the ISD opcode for READ_VL. Just emit the MachineSDNode directly.
-Move segmented fault first only load intrinsic handling completely to
 RISCVISelDAGToDAG.cpp and emit the ReadVL MachineSDNode there
 instead of lowering to ISD opcodes first.
2021-01-27 11:51:41 -08:00
Craig Topper 04570e98c8 [RISCV] Group the legal vector types into lists we can iterator over in the RISCVISelLowering constructor
Remove the RISCVVMVTs namespace because I don't think it provides
a lot of value. If we change the mappings we'd likely have to add
or remove things from the list anyway.

Add a wrapper around addRegisterClass that can determine the
register class from the fixed size of the type.

Reviewed By: frasercrmck, rogfer01

Differential Revision: https://reviews.llvm.org/D95491
2021-01-27 10:20:12 -08:00
Fraser Cormack 9a75a808c2 [RISCV] Fix a codegen crash in getSetCCResultType
This patch fixes some crashes coming from
`RISCVISelLowering::getSetCCResultType`, which would occasionally return
an EVT constructed from an invalid MVT, which has a null Type pointer.

The attached test shows this happening currently for some fixed-length
vectors, which hit this issue when the V extension was enabled, even
though they're not legal types under the V extension. The fix was also
pre-emptively extended to scalable vectors which can't be represented as
an MVT, even though a test case couldn't be found for them.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95434
2021-01-27 10:22:54 +00:00
Craig Topper f9d7f77267 [RISCV] Have customLegalizeToWOp truncate to the original type instead of i32 now that we use it for i8/i16 as well.
239cfbccb0 add support for legalizing
i8/i16 UDIV/UREM/SDIV to use *W instructions. So we need to truncate
to i8/i16 if we're legalizing one of those.
2021-01-26 10:50:03 -08:00
Craig Topper bfc60acd98 [RISCV] Adjust RISCVInstrInfoVSDPatterns.td for different pseudo instructions for different FPR.
Move the Suffix string into the VTypeInfo class so we don't need a helper class to get to it.

Adjust pseudo naming scheme for FPRs to put F16/F32/F64 in
place of F in the pseudo instruction name rather than as a suffix.
This avoids special cases like VFMERGE from the original patch.

Differential Revision: https://reviews.llvm.org/D95404
2021-01-26 01:00:50 -08:00
Hsiangkai Wang e72b22a40b [RISCV] Define different pseudo instructions for different FPR.
When spilling, the spill size will depend on the size of register class.
For .vf vector instructions, it may spill the floating point scalar
argument. In order to use the correct load/store instructions for
spilling, we need to provide the correct floating point register class
for the .vf vector pseudo instructions.

In this commit, we define the .vf pseudo instructions as three
different kinds of pseudo instructions for half/float/double. For
example, PseudoVFADD_M1 will become as PseudoVFADD_F16_M1,
PseudoVFADD_F32_M1, and PseudoVFADD_F64_M1.

Differential Revision: https://reviews.llvm.org/D95234
2021-01-26 15:48:35 +08:00
Hsiangkai Wang f19849a07b [RISCV] Update V extension to v1.0-draft 08a0b464.
Differential Revision: https://reviews.llvm.org/D94583
2021-01-26 12:02:43 +08:00
Hsiangkai Wang b69932b550 [RISCV] Implement vlsegff intrinsics.
Differential Revision: https://reviews.llvm.org/D95303
2021-01-26 12:02:43 +08:00
Craig Topper 15f66cf749 [RISCV] Add isel patterns to optimize slli.uw patterns without Zba extension.
This pattern can occur when an unsigned is used to index an array
on RV64.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D95290
2021-01-25 16:12:08 -08:00
Fraser Cormack 15141cd115 [RISCV] Add RVV insertelt/extractelt scalable-vector patterns
Original patch by @rogfer01.

This patch adds support for insertelt and extractelt operations on
scalable vectors.

Special care must be taken on RV32 when dealing with i64 vectors as
there are no straightforward ways to insert a 64-bit element without a
register of that size. To that end, both are custom-lowered to different
sequences.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94615
2021-01-25 22:03:52 +00:00
Craig Topper 239cfbccb0 [RISCV] Custom type legalize i8/i16 UDIV/UREM/SDIV on RV64 so we can use divuw/remuw/divw.
This makes our i8/i16 codegen more similar to the i32 codegen.

I've also added computeKnownBits support for DIVUW/REMUW so
that we can remove zero extending ANDs from the output. Without
this we end up turning DIVUW/REMUW back into DIVU/REMU via some
isel patterns.

Reviewed By: frasercrmck, luismarques

Differential Revision: https://reviews.llvm.org/D95322
2021-01-25 10:47:22 -08:00
Craig Topper 4eb4f8963f [RISCV] Use sign extend for i32 arguments and returns in makeLibCall on RV64.
As far as I know 32 bits arguments and returns on RV64 are always
sign extended to i64. So I think we should be taking this into
account around libcalls.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D95285
2021-01-25 09:33:48 -08:00
Fraser Cormack fde2466171 [SelectionDAG] Support scalable-vector splats in more cases
This patch adds support for scalable-vector splats in DAGCombiner's
`isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions,
which enable the SelectionDAG div/rem-by-constant optimizations for
scalable vector types.

It also fixes up one case where the UDIV optimization was generating a
SETCC without first consulting the target for its preferred SETCC result
type.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94501
2021-01-25 10:58:15 +00:00
Simon Cook a7c1239f37 [RISCV] Add attribute support for all supported extensions
This adds support for ".attribute arch" for all extensions that are
currently supported by the compiler.

Differential Revision: https://reviews.llvm.org/D94931
2021-01-25 08:58:53 +00:00
Craig Topper 12d0753aca [RISCV] Use bitsLE instead of strict == MVT::i32 in assertsexti32 and assertzexti32.
The patterns that use this really want to know if the operand has at
least 32 sign/zero bits.

This increases opportunities to use W instructions when the original
source used i8/i16. Not sure how much this matters for performance,
but it makes i8/i16 code more consistent with i32.
2021-01-24 13:58:14 -08:00
Simon Cook f3f3c9c254 [RISCV] Fix name of Zba extension (NFC) 2021-01-24 21:02:34 +00:00
Craig Topper 116177afcc [RISCV] Use SRLIWPat in the PACKUW pattern.
This makes the code more tolerant if we ever change SimplifyDemandedBits
to not remove 1s from the lsbs of a contiguous mask.
2021-01-24 10:41:58 -08:00
Craig Topper c50457f3e4 [RISCV] Make the code in MatchSLLIUW ignore the lower bits of the AND mask where the shift has guaranteed zeros.
This avoids being dependent on SimplifyDemandedBits having cleared
those bits.

It could make sense to teach SimplifyDemandedBits to keep all
lower bits 1 in an AND mask when possible. This could be
implemented with slli+srli in the general case rather than
needing to materialize the constant.
2021-01-24 00:34:45 -08:00
Craig Topper c7d5d8fa33 [RISCV] Group some Zbs isel patterns together and remove a stale comment. NFC 2021-01-23 16:45:05 -08:00
Craig Topper 998057ec06 [RISCV] Add isel patterns to remove masks on SLO/SRO shift amounts. 2021-01-23 15:57:41 -08:00
Craig Topper d2927f786e [RISCV] Add isel patterns to remove (and X, 31) from sllw/srlw/sraw shift amounts.
We try to do this during DAG combine with SimplifyDemandedBits,
but it fails if there are multiple nodes using the AND. For
example, multiple shifts using the same shift amount.
2021-01-23 15:08:18 -08:00
Hsiangkai Wang 66a49aef69 [RISCV] Implement vsoxseg/vsuxseg intrinsics.
Define vsoxseg/vsuxseg intrinsics and pseudo instructions.
Lower vsoxseg/vsuxseg intrinsics to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94940
2021-01-23 08:54:56 +08:00
Hsiangkai Wang 97e33feb08 [RISCV] Implement vloxseg/vluxseg intrinsics.
Define vloxseg/vluxseg intrinsics and pseudo instructions.
Lower vloxseg/vluxseg intrinsics to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94903
2021-01-23 08:54:56 +08:00
Craig Topper d65e8ee507 [RISCV] Add more cmov isel patterns to handle seteq/ne with a small non-zero immediate.
Similar to our free standing setcc patterns, we can use ADDI to
subtract the immediate from the other operand. Then the cmov
can check if the result is zero or non-zero.

Reviewed By: mundaym

Differential Revision: https://reviews.llvm.org/D95169
2021-01-22 14:51:22 -08:00
Craig Topper 095e245e16 [RISCV] Add isel patterns for SH*ADD(.UW)
This adds an initial set of patterns for these instructions. Its
more complicated that I would like for the sh*add.uw instructions
because there is no guaranteed canonicalization for shl/and with
constants.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D95106
2021-01-22 13:28:41 -08:00
Craig Topper 20f2e32d2c [RISCV] Update B extension version to 0.93.
Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D95002
2021-01-22 12:49:10 -08:00
Craig Topper f25f7e8ecd [RISCV] Add xperm.* instructions to Zbp extension.
Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94999
2021-01-22 12:49:10 -08:00
Craig Topper 4d5aa760a7 [RISCV] Add support for rev8 and orc.b to Zbb.
These instructions use a portion of the encodings for grevi and
gorci. The full encodings are only supported with Zbp. Note,
rev8 has a different encoding between rv32 and rv64.

Zbb is closer to being finalized that Zbp which has motivated
some decisions in this patch.

I'm treating rev8 and orc.b as separate instructions when
either Zbb or Zbp is enabled. This allows us to print to suggest
that either feature needs to be enabled to support these mnemonics.
I had tried to put HasStdExtZbbAndNotZbp on the Zbb instructions,
but that caused a diagnostic that said Zbp is required if neither
feature is enabled. We should really mention Zbb since its closer
to final.

This does require extra isel patterns for the different cases so
that bswap will always print as rev8 in assembly listing since
we can't use an InstAlias.

llvm-objdump disassembling should always pick the rev8 or orc.b
instructions. llvm-mc parsing and printing text will not convert
the grevi/gorci spellings to rev8/gorc.b. We could probably fix
this with a special case in processInstruction in the assembly
parser if it its important.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94944
2021-01-22 12:49:10 -08:00
Craig Topper 3c94cee63b [RISCV] Add zext.h instruction to Zbb.
zext.h uses the same encoding as pack rd, rs, x0 in rv32 and
packw rd, rs, x0 in rv64. Encodings without x0 as the second source
are not valid in Zbb.

I've added two new instructions with these specific encodings with
predicates that enable them when either Zbb or Zbp is enabled.

The pack spelling will only be accepted with Zbp. The disassembler
will use the zext.h instruction when either feature is enabled.

Using the pack spelling will print as pack when llvm-mc is
emitting text. We could fix this with some custom code in
processInstruction if this is important, but I'm not sure it is.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94818
2021-01-22 12:49:10 -08:00
Craig Topper 83c92fdeda [RISCV] Move pack instructions to Zbp extension only.
Zext.h will need to come back to Zbb, but that only uses specific
encodings of pack.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94742
2021-01-22 12:49:10 -08:00
Craig Topper 5ae92f1e11 [RISCV] Change zext.w to be an alias of add.uw rd, rs1, x0 instead of pack.
This didn't make it into the published 0.93 spec, but it was the
intention.

But it is in the tex source as of this commit
d172f029c0

This means zext.w now requires Zba. Not sure if we should still use
pack if Zbp is enabled and Zba isn't. I'll leave that for the future
when pack is closer to being final.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94736
2021-01-22 12:49:10 -08:00
Craig Topper 9d499e037e [RISCV] Modify add.uw patterns to put the masked operand in rs1 to match 0.93 bitmanip spec.
The 0.93 spec has this implementation for add.uw

uint_xlen_t adduw(uint_xlen_t rs1, uint_xlen_t rs2) {
  uint_xlen_t rs1u = (uint32_t)rs1;
  return rs1u + rs2;
}

The 0.92 spec had the usages of rs1 and rs2 swapped.

Reviewed By: frasercrmck, asb

Differential Revision: https://reviews.llvm.org/D95090
2021-01-22 12:49:10 -08:00
Craig Topper efbcd66861 [RISCV] Rename Zbs instructions to start with just 'b' instead of 'sb' to match 0.93 bitmanip spec.
Also renamed Zbe instructions to resolve name conflict even though
that change is in the 0.94 draft.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94653
2021-01-22 12:49:10 -08:00
Craig Topper 1355458ef6 [RISCV] Move Shift Ones instructions from Zbb to Zbp to match 0.93 bitmanip spec.
It's not really clear in the spec that these are in Zbp now, but
that's what I've gather from previous commits to the spec. I've
file an issue to get it documented properly.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94652
2021-01-22 12:49:10 -08:00
Craig Topper 83a93ae63b [RISCV] Add SH*ADD(.UW) instructions to Zba extension based on 0.93 bitmanip spec.
Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94637
2021-01-22 12:49:10 -08:00
Craig Topper 4e6ad11bc6 [RISCV] Add Zba feature and move add.uw and slli.uw to it.
Still need to add SH*ADD instructions.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94617
2021-01-22 12:49:10 -08:00
Craig Topper b825278364 [RISCV] Rename mnemonics slliu.w->slli.uw and addu.w->add.uw to match 0.93 bitmanip spec.
Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94582
2021-01-22 12:49:10 -08:00
Craig Topper d985c7321f [RISCV] Swap encodings of max and minu to match 0.93 bitmanip spec.
Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94580
2021-01-22 12:49:10 -08:00
Craig Topper b2f859500f [RISCV] Remove addiwu, addwu, subwu, subuw, clmulw, clmulrw, clmulhw to match 0.93 bitmanip spec.
Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94577
2021-01-22 12:49:10 -08:00
Craig Topper 6aced6bf39 [RISCV] Rename pcnt->cpop to match 0.93 bitmanip spec.
This is the first of multiple patches to bring our 0.92
implementation up to 0.93.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D94568
2021-01-22 12:49:10 -08:00
Kazu Hirata cfa241680f [llvm] Don't include StringSwitch.h where unnecessary (NFC) 2021-01-21 19:59:48 -08:00
Hsiangkai Wang 5d354220d4 [RISCV] Correct DWARF number for vector registers.
The DWARF numbers of vector registers are already defined in
riscv-elf-psabi. The DWARF number for vector is start from 96.
Correct the DWARF numbers of vector registers.

Differential Revision: https://reviews.llvm.org/D94749
2021-01-22 11:33:42 +08:00
Craig Topper f8f1b20e6b [RISCV] Don't create LMUL=8 pseudo instructions for ternary widening arithmetic instructions
These instructions produce 2*SEW result so the input can't have
an LMUL=8 or the result would need a non-existant LMUL=16. So
only create pseudos for LMUL up to 4.

Differential Revision: https://reviews.llvm.org/D95189
2021-01-21 19:29:02 -08:00
ShihPo Hung 9667750331 [RISCV] Add intrinsics for RVV1.0 VFRSQRTE7 & VFRECE7
Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D95113
2021-01-21 18:38:49 -08:00
ShihPo Hung 976cf53cc7 [RISCV] Add intrinsics for vector unordered indexed load in RVV 1.0
Add unordered indexed load: vluxei

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95028
2021-01-21 18:38:49 -08:00
ShihPo Hung bea661d9a5 [RISCV] Add intrinsics for RVV 1.0 vrgatherei16
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95014
2021-01-21 18:38:49 -08:00
Craig Topper 3b5430eb0d [RISCV] Add a VL output to vleff intrinsics.
The fault-only-first-load instructions can reduce VL if an element
other than element 0 triggers a memory fault. This can be used to
vectorize loops with data dependent exit conditions like strcmp or
strlen.

This patch adds a VL output to these intrinsics so that the new
VL value can be captured by software. This will be expanded to
'csrr gpr, vl' after the vleff instruction during SelectionDAG.

By doing this with one intrinsic we are able to guarantee that the
csrr reads the VL value produced by the vleff instruction. Having
it as a separate intrinsic would make it impossible to guarantee
ordering without making every other vector intrinsic have side
effects.

The intrinsics are expanded during lowering into two ISD nodes
that are glued together. These ISD nodes will go
through isel separately, but should maintain the glue so that they
get emitted adjacently by InstrEmitter.

I've only ran the chain through the vleff instruction, allowing
the READ_VL to be deleted if it is unused.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D94286
2021-01-21 17:19:58 -08:00
Hsiangkai Wang 6e360460f1 [RISCV] Use v8-v23 as argument registers to conform to the proposal.
The maximum LMUL is 8. We need 16 vector registers for two LMUL-8
arguments. The modification follows the proposal of psABI in
https://github.com/riscv/riscv-elf-psabi-doc/pull/171

Differential Revision: https://reviews.llvm.org/D95134
2021-01-22 07:55:24 +08:00
Hsiangkai Wang b7ab6726b6 [RISCV] New vector load/store in V extension v1.0
Upgrade RISC-V V extension to v1.0-08a0b46.
Indexed load/store have ordered and unordered form.
New whole vector load/store.

Differential Revision: https://reviews.llvm.org/D93614
2021-01-22 07:30:09 +08:00
Michael Munday 4ab0f51a75 Recommit "[RISCV] Legalize select when Zbt extension available"
This recommits 71ed4b6ce5 with
the polarity of some of the pattern corrected.

Original commit message:
The custom expansion of select operations in the RISC-V backend
interferes with the matching of cmov instructions. Legalizing
select when the Zbt extension is available solves that problem.

Reviewed By: luismarques, craig.topper

Differential Revision: https://reviews.llvm.org/D93767
2021-01-21 12:07:44 -08:00
Hsiangkai Wang b8921af63b [RISCV] Update V instructions constraints to conform to v1.0
Upgrade RISC-V V extension to v1.0-08a0b46.
Update instruction constraints to conform to v1.0.

Differential Revision: https://reviews.llvm.org/D93612
2021-01-22 01:15:55 +08:00
Hsiangkai Wang 266820be35 [RISCV] Add new V instructions in v1.0-08a0b46.
Add new V instructions.
vfrsqrte7.v
vfrece7.v
vrgatherei16.vv
vneg.v
vncvt.x.x.w
vfneg.v
2021-01-22 00:59:58 +08:00
Hsiangkai Wang 9dd5aea1e0 [RISCV] Make LMUL field in VTYPE continuous.
Upgrade RISC-V V extension to v1.0-08a0b46.
Update the VTYPE encoding. Make LMUL encoding in a continuous field.
2021-01-22 00:47:32 +08:00
Hsiangkai Wang a8b96eadfd [RISCV] Implement vssseg intrinsics.
Define vlsseg intrinsics and pseudo instructions. Lower vlsseg
intrinsics to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94863
2021-01-21 11:51:35 +08:00
Hsiangkai Wang e5e329023b [RISCV] Implement vlsseg intrinsics.
Define vlsseg intrinsics and pseudo instructions. Lower vlsseg intrinsics
to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94763
2021-01-21 11:51:35 +08:00
Hsiangkai Wang 47228f7854 [RISCV] Implement vsseg intrinsics.
Define vsseg intrinsics and pseudo instructions. Lower vsseg intrinsics
to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94688
2021-01-21 11:51:35 +08:00
Craig Topper e996f1d419 [RISCV] Add another isel pattern for slliu.w.
Previously we only matched (and (shl X, C1), 0xffffffff << C1)
which matches the InstCombine canonicalization order. But its
possible to see (shl (and X, 0xffffffff), C1) if the pattern
is introduced in SelectionDAG. For example, through expansion of
a GEP.
2021-01-20 14:54:40 -08:00
Craig Topper 9d792fef57 [RISCV] Remove unnecessary APInt copy. NFC
getAPIntValue returns a const APInt& so keep it as a reference.
2021-01-20 10:33:09 -08:00
Craig Topper b11b6ab3e0 [RISCV] Add way to mark CompressPats that should only be used for compressing.
There can be muliple patterns that map to the same compressed
instruction. Reversing those leads to multiple ways to uncompress
an instruction, but its not easily controllable which one will
be chosen by the tablegen backend.

This patch adds a flag to mark patterns that should only be used
for compressing. This allows us to leave one canonical pattern
for uncompressing.

The obvious benefit of this is getting c.mv to uncompress to
the addi patern that is aliased to the mv pseudoinstruction. For
the add/and/or/xor/li patterns it just removes some unreachable
code from the generated code.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D94894
2021-01-20 09:20:15 -08:00
Hsiangkai Wang 8ca4b174d7 [RISCV] Implement vlseg intrinsics.
For Zvlsseg, we need continuous vector registers for the values. We need
to define new register classes for the different combinations of (number
of fields and LMUL). For example,

when the number of fields(NF) = 3, LMUL = 2, the values will be assigned
to (V0M2, V2M2, V4M2), (V2M2, V4M2, V6M2), (V4M2, V6M2, V8M2), ...

We define the vlseg intrinsics with multiple outputs. There is no way to
describe the codegen patterns with multiple outputs in the tablegen
files. We do the codegen in RISCVISelDAGToDAG and use EXTRACT_SUBREG to
extract the values of output.

The multiple scalable vector values will be put into a struct. This
patch is depended on the support for scalable vector struct.

Differential Revision: https://reviews.llvm.org/D94229
2021-01-20 14:26:04 +08:00
ShihPo Hung 4dae2247fd [RISCV] refactor VPatBinary (NFC)
Make it easier to reuse for intrinsic vrgatherei16
which needs to encode both LMUL & EMUL in the instruction name,
like PseudoVRGATHEREI16_VV_M1_M1 and PseudoVRGATHEREI16_VV_M1_M2.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94951
2021-01-19 19:09:56 -08:00
Craig Topper e75a4b6ea9 [RISCV] Remove NotHasStdExtZbb predicate from zext.h/sext.b/sext.h InstAliases. NFC
NotHasStdExtZbb doesn't have an AssemblerPredicate associated with it
so it didn't do anything. We don't need it either because the sorting
rules in tablegen prioritize by number of predicates. So the
dedicated instructions in the B extension that have predicates
will be prioritized automatically.
2021-01-19 14:31:48 -08:00
Craig Topper ce8b3937dd [RISCV] Add DAG combine to turn (setcc X, 1, setne) -> (setcc X, 0, seteq) if we can prove X is 0/1.
If we are able to compare with 0 instead of 1, we might be able
to fold the setcc into a beqz/bnez.

Often these setccs start life as an xor that gets converted to
a setcc by DAG combiner's rebuildSetcc. I looked into a detecting
(xor X, 1) and converting to (seteq X, 0) based on boolean contents
being 0/1 in rebuildSetcc instead of using computeKnownBits. It was
very perturbing to AMDGPU tests which I didn't look closely at.
It had a few changes on a couple other targets, but didn't seem
to be much if any improvement.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D94730
2021-01-19 11:21:48 -08:00
Fraser Cormack 9c6a00fe99 [RISCV] Add ISel patterns for scalable mask exts & truncs
Original patch by @rogfer01.

This patch adds support for sign-, zero-, and any-extension from
scalable mask vector types to integer vector types, as well as
truncation in the opposite direction.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94590
2021-01-19 18:13:15 +00:00
Fraser Cormack 15fd6bae0e [RISCV] Extend RVV VType info with the type's AVL (NFC)
This patch factors out the "VLMax" operand passed to most
scalable-vector ISel patterns into a property of each VType.

This is seen as a preparatory change to allow RVV in the future to
more easily support fixed-length vector types with constrained vector
lengths, with the AVL operand set to the length of the fixed-length
vector. It has no effect on the scalable code generation path.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D94594
2021-01-19 15:46:56 +00:00
Fraser Cormack c81ea9429f [RISCV] Add scalable-vector integer extension patterns
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94694
2021-01-19 09:30:36 +00:00
ShihPo Hung 9cf511aa08 [RISCV] Add intrinsics for vector AMO operations
Add vamoswap, vamoadd, vamoxor, vamoand, vamoor,
    vamomin, vamomax, vamominu, vamomaxu intrinsics.

Reviewed By: craig.topper, khchen

Differential Revision: https://reviews.llvm.org/D94589
2021-01-18 23:11:10 -08:00
Craig Topper 1c31459153 [RISCV] Remove empty Sched instantiations from the end of InstAlias defs. NFCI
InstAliases don't need scheduling information so I'm not sure what
these lines were even doing. Especially since the records don't
have names.
2021-01-18 14:08:29 -08:00
Fraser Cormack ac603c8d38 [RISCV] Add scalable vector truncate patterns
Original patch by @rogfer01.

This patch supports vector truncates, which on RVV must be done in a
series of instructions truncating by one power-of-two at a time. This is
done through custom-lowering and a custom node to avoid LLVM
re-combining the split TRUNCATE nodes.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94796
2021-01-18 10:18:43 +00:00
Craig Topper 383b6501ff [RISCV] Use tail agnostic policy for instructions with tied defs if the use operand is IMPLICIT_DEF.
The vcompress intrinsic is defined such that it requires a tail
undisturbed policy. This patch makes it so we can use the tail
agnostic policy if the user has passed vundefined to the dest
operand.

We need to do something similar for masked policy, but we need
annotation of which instructions use the mask policy first.

Not sure if this is sufficient for scheduling or if we'll need to
select different pseudos that don't have a tied def.

Reviewed By: evandro

Differential Revision: https://reviews.llvm.org/D94566
2021-01-17 23:47:58 -08:00
Hsiangkai Wang 098dbf190a [RISCV] Correct alignment settings for vector registers.
According to "9. Vector Memory Alignment Constraints" in V
specification, the alignment of vector memory access is aligned to the
size of the element. In our current implementation, we support ELEN up
to 64. We could assume the alignment of vector registers is 64 under the
assumption.

Differential Revision: https://reviews.llvm.org/D94751
2021-01-16 23:21:29 +08:00
Craig Topper 86e604c4d6 [RISCV] Add implementation of targetShrinkDemandedConstant to optimize AND immediates.
SimplifyDemandedBits can remove set bits from immediates from instructions
like AND/OR/XOR. This can prevent them from being efficiently
codegened on RISCV.

This adds an initial version that tries to keep or form 12 bit
sign extended immediates for AND operations to enable use of ANDI.
If that doesn't work we'll try to create a 32 bit sign extended immediate
to use LUI+ADDIW.

More optimizations are possible for different size immediates or
different operations. But this is a good starting point that already
has test coverage.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D94628
2021-01-15 11:14:14 -08:00
Hsiangkai Wang 619eb14775 [NFC][RISCV] Remove useless code in RISCVRegisterInfo.td.
Differential Revision: https://reviews.llvm.org/D94750
2021-01-15 20:08:51 +08:00
Sam Elliott 141e45b99c [RISCV] Optimize Branch Comparisons
I noticed in D94450 that there were quite a few places where we generate
the sequence:
```
  xN <- comparison ...
  xN <- xor xN, 1
  bnez xN, symbol
```

Given we know the XOR will be used by BRCOND, which only looks at the lowest
bit, I think we can remove the XOR and just invert the branch condition in
these cases?

The case mostly seems to come up in floating point tests, where there is often
more logic to combine the results of multiple SETCCs, rather than a single
(BRCOND (SETCC ...) ...).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94535
2021-01-15 11:28:19 +00:00
Kazu Hirata 7dc3575ef2 [llvm] Remove redundant return and continue statements (NFC)
Identified with readability-redundant-control-flow.
2021-01-14 20:30:34 -08:00
Craig Topper b894a9fb23 [RISCV] Optimize select_cc after fp compare expansion
Some FP compares expand to a sequence ending with (xor X, 1) to invert the result. If
the consumer is a select_cc we can likely get rid of this xor by fixing
up the select_cc condition.

This patch combines (select_cc (xor X, 1), 0, setne, trueV, falseV) -
(select_cc X, 0, seteq, trueV, falseV) if we can prove X is 0/1.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D94546
2021-01-14 13:41:40 -08:00
Craig Topper 387d3c2479 [RISCV] Merge Utils library into MCTargetDesc
MCTargetDesc includes headers from Utils and Utils includes headers
from MCTargetDesc. So from a library layering perspective it makes sense
for them to be in the same library. I guess the other option might be to
move the tablegen includes from RISCVMCTargetDesc.h to RISCVBaseInfo.h
so that RISCVBaseInfo.h didn't need to include RISCVMCTargetDesc.h.
Everything else that depends on Utils also depends on MCTargetDesc so
having one library seemed simpler.

Differential Revision: https://reviews.llvm.org/D93168
2021-01-14 11:47:30 -08:00
Sam Elliott 7c9c2a2ea5 Revert "[RISCV] Legalize select when Zbt extension available"
We found issues with this patch in additional testing. Backing out while
we work on a fix.

This reverts commit 71ed4b6ce5.
2021-01-14 16:44:34 +00:00
Craig Topper dfc1901d51 [RISCV] Custom lower ISD::VSCALE.
This patch custom lowers ISD::VSCALE into a csrr vlenb followed
by a shift right by 3 followed by a multiply by the scale amount.

I've added computeKnownBits support to indicate that the csrr vlenb
always produces 3 trailng bits of 0s so the shift right is "exact".
This allows the shift and multiply sequence to be nicely optimized
into a single shift or removed completely when the scale amount is
a power of 2.

The non power of 2 case multiplying by 24 is still producing
suboptimal code. We could remove the right shift and use a
multiply by 3. Hopefully we can improve DAG combine to fix that
since it's not unique to this sequence.

This replaces D94144.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D94249
2021-01-13 17:14:49 -08:00
Craig Topper 1730b0f66a [RISCV] Remove '.mask' from vcompress intrinsic name. NFC
It has a mask argument, but isn't a masked instruction. It doesn't
use the mask policy of or the v0.t syntax.
2021-01-12 14:46:16 -08:00
Michael Munday 71ed4b6ce5 [RISCV] Legalize select when Zbt extension available
The custom expansion of select operations in the RISC-V backend
interferes with the matching of cmov instructions. Legalizing
select when the Zbt extension is available solves that problem.

Reviewed By: lenary, craig.topper

Differential Revision: https://reviews.llvm.org/D93767
2021-01-12 21:24:38 +00:00
Craig Topper a14040bd4d [RISCV] Use vmerge.vim for llvm.riscv.vfmerge with a 0.0 scalar operand.
We can use a 0 immediate to avoid needing to materialize 0 into
an FPR first.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D94459
2021-01-12 11:08:26 -08:00
Evandro Menezes 7470017f24 [RISCV] Define the vfclass RVV intrinsics
Define the `vfclass` IR intrinsics for the respective V instructions.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>

Differential Revision: https://reviews.llvm.org/D94356
2021-01-11 17:40:09 -06:00
Craig Topper 278a3ea1b2 [RISCV] Use vmv.v.i vd, 0 instead of vmv.v.x vd, x0 for llvm.riscv.vfmv.v.f with 0.0
This matches what we use for integer 0. It's also consistent with
the scalar 'mv' pseudo that uses addi rather than add with x0.
2021-01-11 15:08:05 -08:00
Fraser Cormack 9ecc991c55 [RISCV] Add scalable vector vselect ISel patterns
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94294
2021-01-11 22:41:34 +00:00
Fraser Cormack 7989684a2e [RISCV] Add scalable vector fadd/fsub/fmul/fdiv ISel patterns
Original patch by @rogfer01.

This patch adds ISel patterns for the above operations to the
corresponding vector/vector and vector/scalar RVV instructions, as well
as extra patterns to match operand-swapped scalar/vector vfrsub and
vfrdiv.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94408
2021-01-11 21:19:48 +00:00
Fraser Cormack 37b41bd087 [RISCV] Add scalable vector fcmp ISel patterns
Original patch by @rogfer01.

All ordered comparisons except ONE are supported natively, and all
unordered comparisons except UNE are expanded into sequences involving
explicit NaN checks and mask arithmetic.

Additionally, we expand GT,OGT,GE,OGE to their swapped-operand versions, and
pattern-match those back to the "original", swapping operands once more. This
way we catch both operations and both "vf" and "fv" forms with fewer patterns.

Also add support for floating-point splat_vector, with an optimization for
splatting fpimm0.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94242
2021-01-11 19:38:56 +00:00
Craig Topper 131ce834e4 [RISCV] Clear isCodeGenOnly flag on VMSGE(U) pseudo instructions. Remove InstAliases that duplicate the asm strings in the pseudos.
The Pseudo class sets isCodeGenOnly=1 which causes the asm strings
in the pseudos to be ignored. I think this is why the aliases are
needed at all.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D94024
2021-01-10 23:39:08 -08:00
Craig Topper 5cf73dca77 [RISCV] Convert most of the information about RVV Pseudos into bits in TSFlags.
This patch moves all but the BaseInstr to bits in TSFlags.

For the index fields, we can just use a bit to indicate their presence.
The locations of the operands are well defined.

This reduces the llc binary by about 32K on my build. It also
removes the binary search of the table from the custom inserter.
Instead we just check that the SEW op is present.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D94375
2021-01-10 19:15:45 -08:00
Craig Topper 6fc7a92eee [RISCV] Change ConstraintMask in RISCVII enum to be shifted left. NFC
This makes the mask align with the position of the bits in TSFlags
which is a little more logical.

I might be adding more fields to TSFlags and some might be single
bits where just ANDing with mask to test the bit would make sense.

While there rename TargetFlags in validateInstruction to reflect
that it's just the constraint bits.
2021-01-09 20:22:07 -08:00
Craig Topper 59908fc06a [RISCV] Use uint16_t instead of unsigned for opcodes in the RVV pseudo instruction table.
We currently have about 7000 opcodes in the RISCVGenInstrInfo.inc
enum. We can use uint16_t to store these values. We would need to
grow by nearly 9x before we run out of space so this should last
for a little while.

This reduces the llc binary by 32K.
2021-01-09 19:26:32 -08:00
Fraser Cormack b02eab9058 [RISCV] Add scalable vector icmp ISel patterns
Original patch by @rogfer01.

The RVV integer comparison instructions are defined in such a way that
many LLVM operations are defined by using the "opposite" comparison
instruction and swapping the operands. This is done in this patch in
most cases, except for the mappings where the immediate range must be
adjusted to accomodate:

    va < i --> vmsle{u}.vi vd, va, i-1, vm
    va >= i --> vmsgt{u}.vi vd, va, i-1, vm

That is left for future optimization; this patch supports all operations
but in the case of the missing mappings the immediate will be moved to
a scalar register first.

Since there are so many condition codes and operand cases to check, it
was decided to reduce the test burden by only testing the "vscale x 8"
vector types.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94168
2021-01-09 20:54:34 +00:00
Fraser Cormack de373ef779 [SelectionDAG] Extend immAll(Ones|Zeros)V to handle ISD::SPLAT_VECTOR
The TableGen immAllOnesV and immAllZerosV helpers implicitly wrapped the
ISD::isBuildVectorAll(Ones|Zeros) helper functions. This was inhibiting
their use for targets such as RISC-V which use ISD::SPLAT_VECTOR. In
particular, RISC-V had to define its own 'vnot' fragment.

In order to extend the scope of these nodes to include support for
ISD::SPLAT_VECTOR, two new ISD predicate functions have been introduced:
ISD::isConstantSplatVectorAll(Ones|Zeros). These effectively supersede
the older "isBuildVector" predicates, which are now simple wrappers for
the new functions. They pass a defaulted boolean toggle which preserves
the old behaviour. It is hoped that in time all call-sites can be ported
to the "isConstantSplatVector" functions.

While the use of ISD::isBuildVectorAll(Ones|Zeros) has not changed, the
behaviour of the TableGen immAll(Ones|Zeros)V **has**. To test the new
functionality, the custom RISC-V TableGen fragment has been removed and
replaced with the built-in 'vnot'. To test their use as pattern-roots, two
splat patterns have been updated accordingly.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94223
2021-01-09 17:05:31 +00:00
Roger Ferrer Ibanez 524d8fa9a5 [RISCV] Do not grow the stack a second time when we need to realign the stack
This is a first change needed to fix a crash in which the emergency
spill splot ends being out of reach. This happens when we run the
register scavenger after we have eliminated the frame indexes. The fix
for the actual crash will come in a later change.

This change removes an extra stack size increase we do in
RISCVFrameLowering::determineFrameLayout.

We don't have to change the size of the stack here as
PEI::calculateFrameObjectOffsets is already doing this with the right
size accounting the extra alignment.

Differential Revision: https://reviews.llvm.org/D89237
2021-01-09 16:51:09 +00:00
Ben Shi 55f0a1b066 [RISCV] Optimize multiplication with constant
1. Break MUL with specific constant to a SLLI and an ADD/SUB on riscv32
   with the M extension.
2. Break MUL with specific constant to two SLLI and an ADD/SUB, if the
   constant needs a pair of LUI/ADDI to construct.

Reviewed by: craig.topper

Differential Revision: https://reviews.llvm.org/D93619
2021-01-09 10:37:21 +08:00
Craig Topper 0875a9da2a [RISCV] Cleanup a few section comments in RISCVInstrInfoVPseudos.td. NFC 2021-01-08 11:36:31 -08:00
Evandro Menezes 946bc50e4c [RISCV] Define the vfsqrt RVV intrinsics
Define the `vfsqrt` IR intrinsics for the respective V instructions.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>

Differential Revision: https://reviews.llvm.org/D93745
2021-01-07 17:29:29 -06:00
Fraser Cormack c9154e8fa3 [RISCV] Add vector mask arithmetic ISel patterns
The patterns that want to use 'vnot' use a custom PatFrag. This is
because 'vnot' uses immAllOnesV which implicitly uses BUILD_VECTOR
rather than SPLAT_VECTOR.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94078
2021-01-07 09:43:25 +00:00
Craig Topper 7a8ced43d7 [RISCV] Fix a few section number comments in RISCVInstrInfoVPseudos.td to match the V extension 1.0 draft spec. NFC
The majority of the comments use the 1.0 draft spec section numbers.
2021-01-06 16:38:30 -08:00
Craig Topper c68faed041 [RISCV] Return a vXi1 vector type from getSetCCResultType if V extension is enabled.
nvxXi1 types are legal with V extension and that's the result
vmseq/vmsne/vmslt/etc instructions return.

No test cases yet because the setcc isel patterns aren't in
and we'll need more than basic tests to observe this. I locally
tested that this plus D947078, D94168, D94142, and D94149
was enough to be able to handle the overflow result from
llvm.sadd.overflow.
2021-01-06 11:50:15 -08:00
Fraser Cormack e130dea92a [RISCV] Add vector integer mul/mulh/div/rem ISel patterns
There is no test coverage for the mulhs or mulhu patterns as I can't get
the DAGCombiner to generate them for scalable vectors. There are a few
places in that still need updating for that to work. I left the patterns
in regardless as they are correct.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94073
2021-01-06 09:24:07 +00:00
Christudasan Devadasan d68458bd56 [GlobalISel] Base implementation for sret demotion.
If the return values can't be lowered to registers
SelectionDAG performs the sret demotion. This patch
contains the basic implementation for the same in
the GlobalISel pipeline.

Furthermore, targets should bring relevant changes
during lowerFormalArguments, lowerReturn and
lowerCall to make use of this feature.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D92953
2021-01-06 10:30:50 +05:30
Craig Topper 7b5a0e2f88 [RISCV] Move shift ComplexPatterns and custom isel to PatFrags with predicates
ComplexPatterns are kind of weird, they don't call any of the predicates on their operands. And their "complexity" used for tablegen ordering purposes in the matcher table is hand specified.

This started as an attempt to just use sext_inreg + SLOIPat to implement SLOIW just to have one less Select function. The matching for the or+shl is the same as long as you know the immediate is less than 32 for SLOIW. But that didn't work out because using uimm5 with SLOIPat didn't do anything if it was a ComplexPattern.

I realized I could just use a PatFrag with the opcodes I wanted to match and an immediate predicate would then evaluate correctly. This also computes the complexity just like any other pattern does. Then I just needed to check the constraints on the immediates in the predicate. Conveniently the predicate is evaluated after the fragment has been matched. So the structure has already been checked, we just need to find the constants.

I'll note that this is unusual, I didn't find any other targets looking through operands in PatFrag predicate. There is a PredicateCodeUsesOperands feature that can be used to collect the operands into an array that is used by AMDGPU/VOP3Instructions.td. I believe that feature exists to handle commuted matching, but since the nodes here use constants, they aren't ever commuted

Differential Revision: https://reviews.llvm.org/D91901
2021-01-05 11:37:48 -08:00
Craig Topper 210bc3dc0e [RISCV] Don't parse 'vmsltu.vi v0, v1, 0' as 'vmsleu.vi v0, v1, -1'
vmsltu.vi v0, v1, 0 is always false there is no unsigned number
less than 0. vmsleu.vi v0, v1, -1 on the other hand is always true
since -1 will be considered unsigned max and all numbers are <=
unsigned max.

A similar problem exists for vmsgeu.vi v0, v1, 0 which is always true,
but becomes vmsgtu.vi v0, v1, -1 which is always false.

To match the GNU assembler we'll emit vmsne.vv and vmseq.vv with
the same register for these cases instead.

I'm using AsmParserOnly pseudo instructions here because we can't
match an explicit immediate in an InstAlias. And we can't use a
AsmOperand for the zero because the output we want doesn't use an
immediate so there's nowhere to name the AsmOperand we want to use.

To keep the implementations similar I'm also handling signed with
pseudo instructions even though they don't have this issue. This
way we can avoid the special renderMethod that decremented by 1 so
the immediate we see for the pseudo instruction in processInstruction
is 0 and not -1. Another option might have been to have a different
simm5_plus1 operand for the unsigned case or just live with the
immediate being pre-decremented. I felt this way was clearer, but I'm
open to other opinions.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D94035
2021-01-05 10:59:30 -08:00
Craig Topper 249d7de119 [RISCV] Don't print zext.b alias.
This alias for andi x, 255 was recently added to the spec. If we
print it, code we output can't be compiled with -fno-integrated-as
unless the GNU assembler is also a version that supports alias.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D93826
2021-01-05 10:41:08 -08:00
Craig Topper c707716c04 [RISCV] Match vmslt(u).vx intrinsics with a small immediate to vmsle(u).vx.
There are vmsle(u).vx and vmsle(u).vi instructions, but there is
only vmslt(u).vx and no vmslt(u).vi. vmslt(u).vi can be emulated
for some immediates by decrementing the immediate and using vmsle(u).vi.

To avoid the user needing to know about this, this patch does this
conversion.

The assembler does the same thing for vmslt(u).vi and vmsge(u).vi
pseudoinstructions. There is no vmsge(u).vx intrinsic or
instruction so this patch is limited to vmslt(u).

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D94070
2021-01-05 10:20:21 -08:00
Fraser Cormack 1d4411e9ea [RISCV] Add vector integer min/max ISel patterns
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94012
2021-01-05 09:15:50 +00:00
Craig Topper fe597efc30 [RISCV] Remove unused method RISCVInstPrinter::printSImm5Plus1. NFC
simm5_plus1 is only used by InstAliases so should never be printed.
2021-01-04 12:21:35 -08:00
Craig Topper dc9ac0e820 [RISCV] Replace i32 with XLenVT in (add AddrFI, simm12) isel patterns.
With the i32 these patterns will only fire on RV32, but they
don't look RV32 specific.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D93843
2021-01-04 10:53:27 -08:00
Matt Arsenault d8938c8bb5 CodeGen: Use Register 2021-01-04 12:53:06 -05:00
Craig Topper 94257d12cb [RISCV] Remove unused method isUImm5NonZero() from RISCVAsmParser.cpp. NFC
The operand predicate that used this has been gone for a while.
2021-01-04 00:17:39 -08:00
Hsiangkai Wang e4337159e3 [NFC][RISCV] Move vmsge{u}.vx processing to RISCVAsmParser.
We could expand vmsge{u}.vx pseudo instructions in RISCVAsmParser.
It is more appropriate to expand it before encoding.

Differential Revision: https://reviews.llvm.org/D93968
2021-01-02 08:42:53 +08:00
Monk Chiang 1d04cbeb43 [RISCV] Define vector single-width type-convert intrinsic.
Define intrinsics:
  1. vfcvt.xu.f.v/vfcvt.x.f.v
  2. vfcvt.rtz.xu.f.v/vfcvt.rtz.x.f.v
  3. vfcvt.f.xu.v/vfcvt.f.x.v

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Monk Chiang <monk.chiang@sifive.com>

Differential Revision: https://reviews.llvm.org/D93933
2020-12-31 11:49:30 +08:00
Monk Chiang 2aed9bc98a [RISCV] Define vector narrowing type-convert intrinsic.
Define intrinsics:
  1. vfncvt.xu.f.w/vfncvt.x.f.w
  2. vfncvt.rtz.xu.f.w/vfncvt.rtz.x.f.w
  3. vfncvt.f.xu.w/vfncvt.f.x.w
  4. vfncvt.f.f.w/vfncvt.rod.f.f.w

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Monk Chiang <monk.chiang@sifive.com>

Differential Revision: https://reviews.llvm.org/D93932
2020-12-31 11:48:28 +08:00
Monk Chiang fdd30faae5 [RISCV] Define vector widening type-convert intrinsic.
Define intrinsics:
  1. vfwcvt.xu.f.v/vfwcvt.x.f.v
  2. vfwcvt.rtz.xu.f.v/vfwcvt.rtz.x.f.v
  3. vfwcvt.f.xu.v/vfwcvt.f.x.v
  4. vfwcvt.f.f.v

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Monk Chiang <monk.chiang@sifive.com>

Differential Revision: https://reviews.llvm.org/D93855
2020-12-31 11:48:09 +08:00
ShihPo Hung 096b02ebbf [RISCV] Add intrinsics for vcompress instruction
This patch defines vcompress intrinsics and lower to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential revision: https://reviews.llvm.org/D93809
2020-12-29 18:38:15 -08:00
Zakk Chen 6da0033624 [RISCV] Define vsext/vzext intrinsics.
Define vsext/vzext intrinsics.and lower to V instructions.
Define new fraction register class fields in LMULInfo and a
NoReg to present invalid LMUL register classes.

Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93893
2020-12-29 16:50:53 -08:00
Fraser Cormack f7f09e2b1c [RISCV] Fill out basic integer RVV ISel patterns
This complements the existing RVV ISel patterns for arithmetic, bitwise
and shifts with the remaining operations in those categories: sub, and,
xor, sra.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93852
2020-12-29 19:32:18 +00:00
Craig Topper 79cbb003c5 [RISCV] Don't use tail agnostic policy on instructions where destination is tied to source
If the destination is tied, then user has some control of the
register used for input. They would have the ability to control
the value of any tail elements. By using tail agnostic we take
this option away from them.

Its not clear that the intrinsics are defined such that this isn't
supposed to work. And undisturbed is a valid implementation for agnostic
so code wouldn't even fail to work on all systems if we always used
agnostic.

The vcompress intrinsic is defined to require tail undisturbed so
at minimum we need this for that instruction or need to redefine
the intrinsic.

I've made an exception here for vmv.s.x/fmv.s.f and reduction
instructions which only write to element 0 regardless of the tail
policy. This allows us to keep the agnostic policy on those which
should allow better redundant vsetvli removal.

An enhancement would be to check for undef input and keep the
agnostic policy, but we don't have good test coverage for that yet.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D93878
2020-12-29 10:37:58 -08:00
Craig Topper 2ae760e27e [RISCV] Add earlyclobber of destination register to vmsbf.m/vmsif.m/vmsof.m instructions
The spec for these instructions include this note. "The destination register
cannot overlap either the source register or the mask register ('v0') if the
instruction is masked." So we need earlyclobber to enforce this constraint.

I've regenerated the tests with update_llc_test_checks.py to show the
effects of the earlyclobber.

Reviewed By: khchen, frasercrmck

Differential Revision: https://reviews.llvm.org/D93867
2020-12-29 10:00:04 -08:00
Fraser Cormack aebb4a6052 [RISCV] Rewrite and simplify helper function. NFC.
Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D93851
2020-12-29 11:29:44 +00:00
Zakk Chen f3f9ce3b79 [RISCV] Define vmclr.m/vmset.m intrinsics.
Define vmclr.m/vmset.m intrinsics and lower to vmxor.mm/vmxnor.mm.

Ideally all rvv pseudo instructions could be implemented in C header,
but those two instructions don't take an input, codegen can not guarantee
that the source register becomes the same as the destination.

We expand pseduo-v-inst into corresponding v-inst in
RISCVExpandPseudoInsts pass.

Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D93849
2020-12-28 18:57:17 -08:00
Zakk Chen e673d40199 [RISCV] Define vmsbf.m/vmsif.m/vmsof.m/viota.m/vid.v intrinsics.
Define those intrinsics and lower to V instructions.

Use update_llc_test_checks.py for viota.m tests to check
earlyclobber is applied correctly.
mask viota.m tests uses the same argument as input and mask for
avoid dependency of D93364.

We work with @rogfer01 from BSC to come out this patch.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D93823
2020-12-28 05:54:18 -08:00
Fraser Cormack d85a198e85 [RISCV] Pattern-match more vector-splatted constants
This patch extends the pattern-matching capability of vector-splatted
constants. When illegally-typed constants are legalized they are
canonically sign-extended to XLenVT. This preserves the sign and allows
us to match simm5. If they were zero-extended for whatever reason we'd
lose that ability: e.g. `(i8 -1) -> (XLenVT 255)` would not be matched
under the current logic.

To address this we first manually sign-extend the splatted constant from
the vector element type to int64_t. This preserves the semantics while
removing any implicitly-truncated bits.

The corresponding logic for uimm5 was not updated, the rationale being
that neither sign- nor zero-extending a legal uimm5 immediate should
change that (unless we expect actual "garbage" upper bits).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93837
2020-12-28 07:11:10 +00:00
Craig Topper 76202f09b5 [RISCV] Improve VMConstraint checking on more unary and nullary instructions.
We weren't consistently marking unary instructions as OneInput
and vid.v is really ZeroInput but we had no way to mark that.

This patch improves this by removing the error prone OneInput constraint.
Instead we just always look for the mask in the last operand.

It appears that the "CheckReg" variable used for the check on the broken
instruction was unitialized or garbage because it was also used for
VS1/VS2 constraints. I've scoped the variable locally to each check now.

I've gone through and set NoConstraint on instructions that don't have
a real VMConstraint and don't have a mask as the last operand.

I've also removed the unused enum values in RISCVBaseInfo.h. We
never use them in C++ and we have separate versions in a td file.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D93784
2020-12-26 18:47:59 -08:00
Monk Chiang 622ea9cf74 [RISCV] Define vector widening reduction intrinsic.
Define vwredsumu/vwredsum/vfwredosum/vfwredsum

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Differential Revision: https://reviews.llvm.org/D93807
2020-12-26 21:42:30 +08:00
Zakk Chen da4a637e99 [RISCV] Define vpopc/vfirst intrinsics.
Define vpopc/vfirst intrinsics and lower to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93795
2020-12-24 19:44:34 -08:00
Kazu Hirata d6ff5cf995 [Target] Use llvm::any_of (NFC) 2020-12-24 19:43:26 -08:00
Zakk Chen 351c216f36 [RISCV] Define vector mask-register logical intrinsics.
Define vector mask-register logical intrinsics and lower them
to V instructions. Also define pseudo instructions vmmv.m
and vmnot.m.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Differential Revision: https://reviews.llvm.org/D93705
2020-12-24 18:59:05 -08:00
ShihPo Hung 912740a864 [RISCV] Add intrinsics for vrgather instruction
This patch defines vrgather intrinsics and lower to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential revision: https://reviews.llvm.org/D93797
2020-12-24 18:16:02 -08:00
Monk Chiang afd03cd335 [RISCV] Define vector single-width reduction intrinsic.
integer group:
vredsum/vredmaxu/vredmax/vredminu/vredmin/vredand/vredor/vredxor
float group:
vfredosum/vfredsum/vfredmax/vfredmin

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Differential Revision: https://reviews.llvm.org/D93746
2020-12-25 09:56:01 +08:00
Fraser Cormack 1a7ac29a89 [RISCV] Add ISel support for RVV vector/scalar forms
This patch extends the SDNode ISel support for RVV from only the
vector/vector instructions to include the vector/scalar and
vector/immediate forms.

It uses splat_vector to carry the scalar in each case, except when
XLEN<SEW (RV32 SEW=64) when a custom node `SPLAT_VECTOR_I64` is used for
type-legalization and to encode the fact that the value is sign-extended
to SEW. When the scalar is a full 64-bit value we use a sequence to
materialize the constant into the vector register.

The non-intrinsic ISel patterns have also been split into their own
file.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93312
2020-12-23 20:16:18 +00:00
Craig Topper e0110a4740 [RISCV] Add intrinsics for vfmv.v.f
Also include a special case pattern to use vmv.v.x vd, zero when
the argument is 0.0.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D93672
2020-12-23 10:50:48 -08:00
ShihPo Hung 6301871d06 [RISCV] Add intrinsics for vfwmacc, vfwnmacc, vfwmsac, vfwnmsac instructions
This patch defines vfwmacc, vfwnmacc, vfwmsc, vfwnmsac intrinsics
and lower to V instructions.
We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential Revision: https://reviews.llvm.org/D93693
2020-12-23 00:42:04 -08:00
Zakk Chen 032600b9ae [RISCV] Define vmerge/vfmerge intrinsics.
Define vmerge/vfmerge intrinsics and lower to V instructions.

Include support for vector-vector vfmerge by vmerge.vvm.

We work with @rogfer01 from BSC to come out this patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93674
2020-12-23 00:07:09 -08:00
Evandro Menezes 4d47944393 [RISCV] Define the vfmin, vfmax RVV intrinsics
Define the vfmin, vfmax IR intrinsics for the respective V instructions.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>

Differential Revision: https://reviews.llvm.org/D93673
2020-12-23 00:27:38 -06:00
ShihPo Hung ad0a7ad950 [RISCV] Add intrinsics for vf[n]macc/vf[n]msac/vf[n]madd/vf[n]msub instructions
This patch defines vfmadd/vfnmacc, vfmsac/vfnmsac, vfmadd/vfnmadd,
and vfmsub/vfnmsub lower to V instructions.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential Revision: https://reviews.llvm.org/D93691
2020-12-22 18:34:00 -08:00
ShihPo Hung 4268783998 [RISCV] Add intrinsics for vwmacc[u|su|us] instructions
This patch defines vwmacc[u|su|us] intrinsics and lower to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential Revision: https://reviews.llvm.org/D93675
2020-12-22 18:17:39 -08:00
ShihPo Hung c8874464b5 [RISCV] Add intrinsics for vslide1up/down, vfslide1up/down instruction
This patch adds intrinsics for vslide1up, vslide1down, vfslide1up, vfslide1down.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential Revision: https://reviews.llvm.org/D93608
2020-12-22 18:14:22 -08:00
Craig Topper 53deef9e0b [RISCV] Remove unneeded !eq comparing a single bit value to 0/1 in RISCVInstrInfoVPseudos.td. NFC
Instead we can either use the bit directly. If it was checking for
0 we need to swap the operands or use !not.
2020-12-22 11:57:16 -08:00
Nandor Licker 0586f048d7 [RISCV] Basic jump table lowering
This patch enables jump table lowering in the RISC-V backend.

In addition to the test case included, the new lowering was
tested by compiling the OCaml runtime and running it under qemu.

Differential Revision: https://reviews.llvm.org/D92097
2020-12-22 15:05:54 +00:00
Hsiangkai Wang 9a8ef927df [RISCV] Define vector compare intrinsics.
Define vector compare intrinsics and lower them to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>

Differential Revision: https://reviews.llvm.org/D93368
2020-12-22 14:08:18 +08:00
Zakk Chen 7a2c8be641 [RISCV] Define vleff intrinsics.
Define vleff intrinsics and lower to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93516
2020-12-21 22:05:38 -08:00
ShihPo Hung b15ba2cf6f [RISCV] Add intrinsics for vmacc/vnmsac/vmadd/vnmsub instructions
This defines vmadd, vmacc, vnmsub, and vnmsac intrinsics and
lower to V instructions.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential Revision: https://reviews.llvm.org/D93632
2020-12-21 17:37:20 -08:00
Evandro Menezes ed73a78924 [RISCV] Define the vand, vor and vxor RVV intrinsics
Define the `vand`, `vor` and `vxor` IR intrinsics for the respective V instructions.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>

Differential Revision: https://reviews.llvm.org/D93574
2020-12-21 16:20:26 -06:00
Fangrui Song d9a0c40bce [MC] Split MCContext::createTempSymbol, default AlwaysAddSuffix to true, and add comments
CanBeUnnamed is rarely false. Splitting to a createNamedTempSymbol makes the
intention clearer and matches the direction of reverted r240130 (to drop the
unneeded parameters).

No behavior change.
2020-12-21 14:04:13 -08:00
Monk Chiang 3183add534 [RISCV] Define the remaining vector fixed-point arithmetic intrinsics.
This patch base on D93366, and define vector fixed-point intrinsics.
    1. vaaddu/vaadd/vasubu/vasub
    2. vsmul
    3. vssrl/vssra
    4. vnclipu/vnclip

  We work with @rogfer01 from BSC to come out this patch.

  Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
  Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

  Differential Revision: https://reviews.llvm.org/D93508
2020-12-20 22:57:07 -08:00
Kazu Hirata 966f1431de [Target] Use llvm::erase_if (NFC) 2020-12-20 17:43:22 -08:00
ShihPo Hung d86a00d8fe [RISCV] Define vslideup/vslidedown intrinsics
Differential Revision: https://reviews.llvm.org/D93286
2020-12-20 05:08:15 -08:00
Hsiangkai Wang 41ab45d662 [RISCV] Define vector vfwmul intrinsics.
Define vector vfwmul intrinsics and lower them to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>

Differential Revision: https://reviews.llvm.org/D93584
2020-12-20 17:39:20 +08:00
Hsiangkai Wang f86e61d886 [RISCV] Define vector vfwadd/vfwsub intrinsics.
Define vector vfwadd/vfwsub intrinsics and lower them to V
instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>

Differential Revision: https://reviews.llvm.org/D93583
2020-12-20 17:39:13 +08:00
Hsiangkai Wang bd576ac8d4 [RISCV] Define vector vfsgnj/vfsgnjn/vfsgnjx intrinsics.
Define vector vfsgnj/vfsgnjn/vfsgnjx intrinsics and lower them to V
instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>

Differential Revision: https://reviews.llvm.org/D93581
2020-12-20 17:39:04 +08:00
Hsiangkai Wang 62c94f0678 [RISCV] Define vector vfmul/vfdiv/vfrdiv intrinsics.
Define vector vfmul/vfdiv/vfrdiv intrinsics and lower them to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>

Differential Revision: https://reviews.llvm.org/D93580
2020-12-20 17:38:57 +08:00
Zakk Chen 9cf3b1b666 [RISCV] Define vlxe/vsxe/vsuxe intrinsics.
Define vlxe/vsxe intrinsics and lower to vlxei<EEW>/vsxei<EEW>
instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Differential Revision: https://reviews.llvm.org/D93471
2020-12-19 06:50:20 -08:00
Fraser Cormack 7948cd11d1 [RISCV] Address clang-tidy warnings in RISCVTargetMachine. NFC. 2020-12-18 21:50:55 +00:00
Fraser Cormack d4ed253d0b [RISCV] Assume no-op addrspacecasts by default
To support OpenCL, which typically uses SPIR as an IR, non-zero address
spaces must be accounted for. This patch makes the RISC-V target assume
no-op address space casts across the board, which effectively removes
the need to support addrspacecast instructions in the backend.

For a RISC-V implementation with different configurations or specialized
address spaces where casts aren't no-ops, the function can be adjusted
as required.

Reviewed By: jrtc27

Differential Revision: https://reviews.llvm.org/D93536
2020-12-18 21:03:37 +00:00
Craig Topper 69c8d121f7 [RISCV] Add intrinsics for vsetvli instruction
This patch adds two IR intrinsics for vsetvli instruction. One to set the vector length to a user specified value and one to set it to vlmax. The vlmax uses the X0 source register encoding.

Clang builtins will follow in a separate patch

Differential Revision: https://reviews.llvm.org/D92973
2020-12-18 12:10:09 -08:00
Craig Topper 09468a9148 [RISCV] Sign extend constant arguments to V intrinsics when promoting to XLen.
The default behavior for any_extend of a constant is to zero extend.
This occurs inside of getNode rather than allowing type legalization
to promote the constant which would sign extend. By using sign extend
with getNode the constant will be sign extended. This gives a better
chance for isel to find a simm5 immediate since all xlen bits are
examined there.

For instructions that use a uimm5 immediate, this change only affects
constants >= 128 for i8 or >= 32768 for i16. Constants that large
already wouldn't have been eligible for uimm5 and would need to use a
scalar register.

If the instruction isn't able to use simm5 or the immediate is
too large, we'll need to materialize the immediate in a register.
As far as I know constants with all 1s in the upper bits should
materialize as well or better than all 0s.

Longer term we should probably have a SEW aware PatFrag to ignore
the bits above SEW before checking simm5.

I updated about half the test cases in some tests to use a negative
constant to get coverage for this.

Reviewed By: evandro

Differential Revision: https://reviews.llvm.org/D93487
2020-12-18 11:43:38 -08:00
Craig Topper 1c3a6671c2 Recommit "[RISCV] Add intrinsics for vfmv.f.s and vfmv.s.f"
This time with tests.

Original message:
Similar to D93365, but for floating point. No need for special ISD opcodes
though. We can directly isel these from intrinsics. I had to use anyfloat_ty
instead of anyvector_ty in the intrinsics to make LLVMVectorElementType not
crash when imported into the -gen-dag-isel tablegen backend.

Differential Revision: https://reviews.llvm.org/D93426
2020-12-18 11:19:05 -08:00
Craig Topper cd3e811864 Revert "[RISCV] Add intrinsics for vfmv.f.s and vfmv.s.f"
This reverts commit 46a40c4bc1.

I forgot to git add the tests.
2020-12-18 11:16:36 -08:00
Craig Topper 46a40c4bc1 [RISCV] Add intrinsics for vfmv.f.s and vfmv.s.f
Similar to D93365, but for floating point. No need for special ISD opcodes
though. We can directly isel these from intrinsics. I had to use anyfloat_ty
instead of anyvector_ty in the intrinsics to make LLVMVectorElementType not
crash when imported into the -gen-dag-isel tablegen backend.

Differential Revision: https://reviews.llvm.org/D93426
2020-12-18 11:11:15 -08:00
Craig Topper 86d282baed [RISCV] Add intrinsics for vmv.x.s and vmv.s.x
This adds intrinsics for vmv.x.s and vmv.s.x.

I've used stricter type constraints on these intrinsics than what we've been doing on the arithmetic intrinsics so far. This will allow us to not need to pass the scalar type to the Intrinsic::getDeclaration call when creating these intrinsics.

A custom ISD is used for vmv.x.s in order to implement the change in computeNumSignBitsForTargetNode which can remove sign extends on the result.

I also modified the MC layer description of these instructions to show the tied source/dest operand. This is different than what we do for masked instructions where we drop the tied source operand when converting to MC. But it is a more accurate description of the instruction. We can't do this for masked instructions since we use the same MC instruction for masked and unmasked. Tools like llvm-mca operate in the MC layer and rely on ins/outs and Uses/Defs for analysis so I don't know if we'll be able to maintain the current behavior for masked instructions. So I went with the accurate description here since it was easy.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D93365
2020-12-18 10:30:48 -08:00
Craig Topper fc7b7fc066 [RISCV] Add intrinsics for vmv.v.v, vmv.v.x, and vmv.x.i
We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Craig Topper <craig.topper@sifive.com>

Differential Revision: https://reviews.llvm.org/D93514
2020-12-18 09:49:07 -08:00
Hsiangkai Wang 7087ae7be9 [RISCV] Remove NoVReg to avoid compile warning messages. 2020-12-18 11:37:47 +08:00
Monk Chiang ee2cb90e3b [RISCV] Define vsadd/vsaddu/vssub/vssubu intrinsics.
We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>
Co-Authored-by: Monk Chiang <monk.chiang@sifive.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93366
2020-12-18 10:24:24 +08:00