Commit Graph

2266 Commits

Author SHA1 Message Date
Fraser Cormack 3e678cb772 [RISCV] Don't emit fractional VIDs with negative steps
We can't shift-right negative numbers to divide them, so avoid emitting
such sequences. Use negative numerators as a proxy for this situation, since
the indices are always non-negative.

An alternative strategy could be to add a compiler flag to emit division
instructions, which would at least allow us to test the VID sequence
matching itself.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123796
2022-04-21 07:00:34 +01:00
Craig Topper 186d5c8af5 [RISCV] Make getInstSeqCost handle other Zb* instructions.
We haven't been updating this as Zb* instructions have been used
for immediate materialization. They will hit the default case and
trigger an llvm_unreachable. Instead of trying to list them all,
assume instructions that aren't explicitly listed aren't compressible.

Spotted while looking at integer materialization for other reasons.
I haven't seen a crash from this yet.
2022-04-20 22:08:04 -07:00
Craig Topper 6db0afb44e [RISCV] Fold (xor (sllw 1, x), -1) -> (rolw ~1, x).
There's an existing generic combine that does this for legal types.
This patch adds a RISCV specific combine for W instructions.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D123983
2022-04-19 15:03:43 -07:00
Fraser Cormack c5cac48549 [RISCV] Fix lowering of BUILD_VECTORs as VID sequences
This patch fixes a bug when lowering BUILD_VECTOR via VID sequences.
After adding support for fractional steps in D106533, elements with zero
steps may be skipped if no step has yet been computed. This allowed
certain sequences to slip through the cracks, being identified as VID
sequences when in fact they are not.

The fix for this is to perform a second loop over the BUILD_VECTOR to
validate the entire sequence once the step has been computed. This isn't
the most efficient, but on balance the code is more readable and
maintainable than doing back-validation during the first loop.

Fixes the tests introduced in D123785.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123786
2022-04-19 07:43:38 +01:00
jacquesguan 25445b94db [RISCV] Add rvv codegen support for vp.fptrunc.
This patch adds rvv codegen support for vp.fptrunc. The lowering of fp_round and vp.fptrunc share most code so use a common lowering function to handle those two, similar to vp.trunc.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123841
2022-04-19 01:56:18 +00:00
Lian Wang 545d353b3c [RISCV][NFC] Refactor VL patterns for vnsrl and vnsra
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D123274
2022-04-15 07:42:59 +00:00
jacquesguan 1aa4f0bb6c [RISCV][VP] Add RVV codegen for vp.trunc.
Differential Revision: https://reviews.llvm.org/D123579
2022-04-15 02:29:53 +00:00
Lian Wang 3100893f63 [RISCV] Remove sext_inreg+riscv_grev/riscv_gorc isel patterns
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123565
2022-04-14 08:16:32 +00:00
Lian Wang 38706dd940 [RISCV][NFC] Refactor patterns for Multiply Add instructions
Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D123355
2022-04-14 08:00:00 +00:00
wangpc d0828c5af9 [RISCV][NFC] Use addExpr() instead of createExpr()
It seems to be neater.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D123675
2022-04-14 10:48:25 +08:00
Liqin Weng 8265679018 [RISCV][NFC] Refactor the type promotion of fsl/fsr/becompress/bdecompress/bfp
Reviewed By: asb, jrtc27, craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D123181
2022-04-13 08:52:04 +00:00
Craig Topper 057c063c9b [RISCV] Add a encodeLMUL function to RISCVVType. NFC
This moves the encoding handling out of the assembly parser.

Reviewed By: khchen, frasercrmck

Differential Revision: https://reviews.llvm.org/D123553
2022-04-12 13:39:47 -07:00
Craig Topper 2ce2562876 [RISCV][SelectionDAG] Add a hook to sign extend i32 ConstantInt operands of phis on RV64.
Materializing constants on RISCV is simpler if the constant is sign
extended from i32. By default i32 constant operands of phis are
zero extended.

This patch adds a hook to allow RISCV to override this for i32. We
have an existing isSExtCheaperThanZExt, but it operates on EVT which
we don't have at these places in the code.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122951
2022-04-11 14:38:39 -07:00
Craig Topper 76192182d0 [RISCV] Remove riscv-v-fixed-length-vector-elen-max command line option.
This was added before Zve extensions were defined. I think users
should use Zve32x or Zve32f now. Though we will lose support for limiting
ELEN to 16 or 8, but I hope no one was using that.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D123418
2022-04-11 10:14:48 -07:00
Craig Topper c266e50430 [RISCV] Remove ExtZvl enum from RISCVSubtarget. NFC
Having an enum with names that contain the string representation
of their value doesn't add any value. We can just use the numbers.

Reviewed By: kito-cheng, frasercrmck

Differential Revision: https://reviews.llvm.org/D123417
2022-04-11 10:01:17 -07:00
LiaoChunyu 505fce5a9e [RISCV] Add basic code modeling for llvm.experimental.stepvector intrinsic
Scalable vectors llvm.experimental.stepvector intrinsic
will crash due to an invalid cost when run the code through the loopunroll.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D122782
2022-04-11 10:19:23 +08:00
Craig Topper 4e561a581f [RISCV] Remove unnecessary cast to i8* when converting gather/scatter to strided load/store.
Not sure why I thought this necessary at the time.
2022-04-09 20:05:03 -07:00
Craig Topper 70046438d0 [RISCV] Only try LUI+SH*ADD+ADDI for int materialization if LUI+ADDI+SH*ADD failed.
There's an assert in LUI+SH*ADD+ADDI materialization that makes sure the
lower 12 bits aren't zero since that case should have been handled as
LUI+ADDI+SH*ADD. But nothing prevented the LUI+SH*ADD+ADDI checks from
running after the earlier code handled it.

The sequence would be the same length or longer so it wouldn't replace
the earlier sequence, but the assert happened before that was checked.

The vector holding the sequence also wasn't reset before the second
check so that guaranteed the sequence would never be found to be
shorter.

This patch fixes this by only trying the second expansion when the
earlier fails.

Fixes PR54812.

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D123406
2022-04-09 08:52:15 -07:00
Fraser Cormack 34e1b4774a [RISCV] Select unmasked FP setcc insts via ISel post-process
Similar to D123217 but for the floating-point patterns. No change in
generated output, while reducing the generated table size.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D123291
2022-04-08 17:13:43 +01:00
Craig Topper 1903b99154 [RISCV] Always select (and (srl X, C), Mask) as (srli (slli X, C2), C3).
SLLI is always compressible to C.SLLI as long as the source and dest
register is the same.

ANDI and SRLI are only compressible if the register is x8-x15. By
using SLLI we have a better chance of generating shorter code.

I had to exclude one exclusion for the BEXTI case so that it's
pattern match could still fire.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D123336
2022-04-08 09:04:04 -07:00
Kito Cheng 9c5aedfbf5 [RISCV] Fixing stack offset for RVV object with vararg in stack.
We found LLVM generate wrong stack offset for RVV object when stack
having variable argument, that cause by we didn't count vaarg part during
calculate RVV stack objects.

Also update the stack layout diagram for including vaarg in the diagram.

Stack layout ref:
https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/riscv.cc#L3941

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D123180
2022-04-08 12:01:16 +08:00
Kito Cheng 690085c9b7 [RISCV] Store/restore RISCVMachineFunctionInfo into MIR YAML file
RISCVMachineFunctionInfo has some fields like VarArgsFrameIndex and
VarArgsSaveSize are calculated at ISel lowering stage, those info are
not contained in MIR files, that cause test cases rely on those field
can't not reproduce correctly by MIR dump files.

This patch adding the MIR read/write for those fields.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D123178
2022-04-08 11:55:48 +08:00
jacquesguan a55c19c44b [RISCV][NFC] Use defvar to simplify pattern definations.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D123292
2022-04-08 02:51:30 +00:00
Craig Topper d98bea87ef [RISCV] Add more .vx patterns for VLMax integer setccs.
This patch synchronizes the structure of the templates with those
in RISCVInstrInfoVVLPatterns.td so that we get patterns with .vx
on the left hand side.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D123255
2022-04-07 09:17:43 -07:00
Craig Topper 82662b753d [RISCV] Add swapped patterns to VPatIntegerSetCCVL_VIPlus1.
This matches VPatIntegerSetCCVL_VI_Swappable. But as noted in the
FIXME this may only be needed due to lack of canonicalization on
VP_SETCC.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D123239
2022-04-07 09:17:08 -07:00
Luís Marques d09d297c5d [RISCV] Fix crash for section alignment with .option norvc
The existing code wasn't getting the subtarget info from the fragment,
so the current status of RVC would be ignored. This would cause a crash
for the new test case when the target then reported it couldn't write
the requested number of code alignment bytes.

Differential Revision: https://reviews.llvm.org/D122236
2022-04-07 12:02:27 +01:00
Fraser Cormack 8ebc9b1560 [RISCV] Select unmasked integer setcc insts via ISel post-process
This patch has no effect on the generated code, whilst mitigating the
increase in ISel table size caused by the recent addition of masked
patterns.

I aim to do the same for floating-point patterns once D123051 lands,
giving us a reason to use masked floating-point patterns.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D123217
2022-04-07 09:30:19 +01:00
Fraser Cormack 8216255c9f [RISCV][VP] Add basic RVV codegen for vp.fcmp
This patch adds the necessary infrastructure to lower vp.fcmp via
ISD::VP_SETCC to RVV instructions.

Most notably this patch adds cond-code legalization for VP_SETCC,
reusing the existing TargetLowering::LegalizeSetCCCondCode by passing in
additional SDValue parameters for the Mask and EVL. This method then
uses VP operations to legalize the condcode.

There is still a general lack of canonicalization on VP_SETCC as opposed
to SETCC which results in worse code than is theoretically possible.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123051
2022-04-07 09:16:07 +01:00
Liqin Weng f891123556 [RISCV] Add CMOV isel pattern for (select (setgt X, Imm), Y, Z)
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122644
2022-04-07 05:55:53 +00:00
Lian Wang 1b547799c5 [RISCV] Supplement patterns for vnsrl.wx/vnsra.wx when splat shift is sext or zext
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122786
2022-04-07 02:21:41 +00:00
Craig Topper e13a44b460 [RISCV] Add lowering for vp.sext and vp.zext.
Including mask vector inputs.

Reviewed By: frasercrmck, rogfer01

Differential Revision: https://reviews.llvm.org/D123150
2022-04-06 09:59:49 -07:00
Fraser Cormack 6be5e875be [RISCV][VP] Add basic RVV codegen for vp.icmp
This patch adds the minimum required to successfully lower vp.icmp via
the new ISD::VP_SETCC node to RVV instructions.

Regular ISD::SETCC goes through a lot of canonicalization which targets
may rely on which has not hereto been ported to VP_SETCC. It also
supports expansion of individual condition codes and a non-boolean
return type. Support for all of that will follow in later patches.

In the case of RVV this largely isn't a problem as the vector integer
comparison instructions are plentiful enough that it can lower all
VP_SETCC nodes on legal integer vectors except for boolean vectors,
which regular SETCC folds away immediately into logical operations.

Floating-point VP_SETCC operations aren't as well supported in RVV and
the backend relies on condition code expansion, so support for those
operations will come in later patches.

Portions of this code were taken from the VP reference patches.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122743
2022-04-06 16:51:22 +01:00
Craig Topper 3c831c9b28 [RISCV] Add support for vp.fptosi where the result is a mask type.
We can do this conversion by converting the same sized integer type, then compare the result with 0. The conversion is undefined if the converted FP value doesn't fit in an i1.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D122678
2022-04-05 09:48:04 -07:00
Craig Topper d970e96c53 [RISCV] Add lowering for vp.fptoui and vp.uitofp.
This is a straightforward extension of D122512 to unsigned integers.
2022-04-01 18:28:46 -07:00
Craig Topper fa630e7594 [RISCV][AMDGPU][TargetLowering] Special case overflow expansion for (uaddo X, 1).
If we expand (uaddo X, 1) we previously expanded the overflow calculation
as (X + 1) <u X. This potentially increases the live range of X and
can prevent X+1 from reusing the register that previously held X.

Since we're adding 1, overflow only occurs if X was UINT_MAX in which
case (X+1) would be 0. So this patch adds a special case to expand
the overflow calculation to (X+1) == 0.

This seems to help with uaddo intrinsics that get introduced by
CodeGenPrepare after LSR. Alternatively, we could block the uaddo
transform in CodeGenPrepare for this case.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122933
2022-04-01 13:14:10 -07:00
Lian Wang 62dd3674bc [RISCV] Supplement SDNode patterns for vfwmul/vfwadd/vfwsub
Reviewed By: jacquesguan

Differential Revision: https://reviews.llvm.org/D122720
2022-04-01 03:09:50 +00:00
Fraser Cormack ee51aefba0 [RISCV][NFC] Minor formatting fix 2022-03-31 16:15:22 +01:00
Fraser Cormack a276d1f44b [RISCV][NFC] Fix formatting on one line 2022-03-31 13:17:37 +01:00
ShihPo Hung 2f1261abe4 [RISCV][RVV] Add Uses = [FRM] and mayRaiseFPException = true to RVV instructions
This patch adds Uses = [FRM] and mayRaiseFPException = true to following
instructions:

VFADD, VFSUB, VFRSUB, VFMUL, VFDIV, VFRDIV
VFWADD, VFWSUB, VFWMUL
VFMADD, VFMACC, VFMSAC, VFMSUB
VFNMADD, VFNMACC, VFNMSAC, VVFNMSUB
VFWMACC, VFWMSAC,
VFWNMACC, VFWNMSAC
VFSQRT, VFREC7
VFREDOSUM, VFREDUSUM,
VFWREDOSUM, VFWREDUSUM
and only adds mayRaiseFPException = true to following instructions:

VFRSQRT7,
VFMIN, VFMAX, VFREDMIN, VFREDMAX
VMFEQ, VMFNE, VMFLT,VMFLE, VMFGT, VMFGE

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D121087
2022-03-31 01:33:17 -07:00
Fraser Cormack 893d63fbdc [RISCV][NFC] Fix comment to refer to correct file 2022-03-31 08:59:10 +01:00
Lian Wang b3851e9931 [RISCV] Add VL patterns for vfwmul/vfwadd/vfwsub
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D122369
2022-03-31 07:08:58 +00:00
Craig Topper 4477500533 [RISCV] ISel (and (shift X, C1), C2)) to shift pair in more cases
Previously, these isel optimizations were disabled if the AND could
be selected as a ANDI instruction. This patch disables the optimizations
only if the immediate is valid for C.ANDI. If we can't use C.ANDI,
we might be able to compress the shift instructions instead.

I'm not checking the C extension since we have relatively poor test
coverage of the C extension. Without C extension the code size
should be equal. My only concern would be if the shift+andi had
better latency/throughput on a particular CPU.

I did have to add a peephole to match SRLIW if the input is zexti32
to prevent a regression in rv64zbp.ll.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D122701
2022-03-30 11:46:42 -07:00
Craig Topper 7417eb29ce [RISCV] Use getSplatBuildVector instead of getSplatVector for fixed vectors.
The splat_vector will be legalized to build_vector eventually
anyway. This patch makes it take fewer steps.

Unfortunately, this results in some codegen changes. It looks
like it comes down to how the nodes were ordered in the topological
sort for isel. Because the build_vector is created earlier we end up
with a different ordering of nodes.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D122185
2022-03-30 11:36:34 -07:00
Liqin Weng 4cb85da811 [RISCV] Add CMIX isel pattern for (xor (and (xor rs1, rs3), rs2), rs3)
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122702
2022-03-30 16:51:09 +08:00
Fraser Cormack 75047577d6 [RISCV] Trim RVV isel pats matchable via DAG post-process
In D122512, several masked patterns were added to support lowering of
vector-predicated float-to-int and int-to-float conversions. With the
introduction of these patterns, all of the old "unmasked" patterns are
matchable via the DAG post-process introduced in D118810, once the relevant
opcode entries are set up in the helper table.

Locally this reduces the generated isel table by 4%.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D122637
2022-03-30 08:56:38 +01:00
Liqin Weng 7f81765898 [RISCV][NFC] Add immediate tests for the icmp instruction
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122651
2022-03-30 02:51:26 +00:00
Zakk Chen b578330754 [RISCV] Use maskedoff to decide mask policy for masked compare and vmsbf/vmsif/vmsof.
masked compare and vmsbf/vmsif/vmsof are always tail agnostic, we could
check maskedoff value to decide mask policy rather than have a addtional
policy operand.

Reviewed By: craig.topper, arcbbb

Differential Revision: https://reviews.llvm.org/D122456
2022-03-29 18:05:33 -07:00
Zakk Chen 10b2760da0 Revert "[RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR"
This reverts commit 10fd2822b7.

I have a better implementation for those operations without the
additional policy operand.
masked compare and vmsbf/vmsif/vmsof are always tail agnostic so we could
assume undef maskedoff is mask agnostic.

Differential Revision: https://reviews.llvm.org/D122455
2022-03-29 18:05:33 -07:00
Liqin Weng d660c0d793 [RISCV] Optimize LI+SLT to SLTI+XORI for immediates in specific range
This transform will reduce one GPR.

Reviewed By: craig.topper, benshi001

Differential Revision: https://reviews.llvm.org/D122051
2022-03-29 14:46:49 +08:00
Craig Topper 45e85feba6 [RISCV] Pull APInt/computeKnonwbits specifics out of computeGREVOrGORC. NFC
This function now takes a uint64_t instead of an APInt. The caller
is responsible for masking the shift amount, extracting and inserting
into the KnownBits APInts, and inverting to compute zeros.

This is less code and cleaner division of responsibilities.
2022-03-28 20:53:54 -07:00
Shao-Ce SUN 662b9fa02c [NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122557
2022-03-29 09:53:24 +08:00
Craig Topper 01203918d1 [RISCV] Add computeKnownBits support for RISCVISD::GORC.
Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D121575
2022-03-28 16:56:33 -07:00
Craig Topper e68257fcee [RISCV][SelectionDAG] Enable TargetLowering::hasBitTest for masks that fit in ANDI.
Modified DAGCombiner to pass the shift the bittest input and the shift amount
to hasBitTest. This matches the other call to hasBitTest in TargetLowering.h

This is an alternative to D122454.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D122458
2022-03-28 12:46:36 -07:00
Craig Topper cfe533da05 [RISCV] Add lowering for vp.fptosi and vp.sitofp.
This as an alternative version of D120641. Starting from the code here
https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/-/raw/EPI/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
but with some modifications to how the interim types are calculated,
and adding support for f16.

Still need to add fptosi for mask vectors.

Lots of masked isel patterns added so we can pass the mask through
the type changes.

Reviewed By: frasercrmck, arcbbb

Differential Revision: https://reviews.llvm.org/D122512
2022-03-28 11:06:41 -07:00
Kazu Hirata 6212871968 [Target] Apply clang-tidy fixes for readability-redundant-member-init (NFC) 2022-03-27 22:22:37 -07:00
Maksim Panchenko 4ae9745af1 [Disassember][NFCI] Use strong type for instruction decoder
All LLVM backends use MCDisassembler as a base class for their
instruction decoders. Use "const MCDisassembler *" for the decoder
instead of "const void *". Remove unnecessary static casts.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D122245
2022-03-25 18:53:59 -07:00
Dávid Bolvanský 9a738c477e [NFCI] Fix set-but-unused warning in RISCVAsmParser.cpp 2022-03-24 08:33:40 +01:00
jacquesguan 8910ac400c [RISCV] Add patterns for vector widening integer multiply
Add patterns for vector widening integer multiply instructions

Differential Revision: https://reviews.llvm.org/D117385
2022-03-24 15:26:08 +08:00
Vasileios Porpodas 39aa202aff Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit e6ead19b77.
2022-03-23 18:32:17 -07:00
Craig Topper 6c90a654bb [RISCV] Simplify some code in lowering vector int<->fp conversions. NFC
Don't call EltVT.getSizeInBits() or SrcEltVT.getSizeInBits() a second
time. They are already in EltSize or SrcEltSize variables.

Refactor some comparisons to use multiply instead of division.
2022-03-23 12:09:05 -07:00
Arthur Eubanks e6ead19b77 Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash."
This reverts commit 27bd8f9492.

Causes crashes, see comments in D121973
2022-03-23 10:57:45 -07:00
luxufan 5800fb41a6 [RISCV] Remove check and update test file in D121183
Differential Revision: https://reviews.llvm.org/D122290
2022-03-24 00:48:52 +08:00
luxufan 227496dc09 [RISCV] Generate correct ELF EFlags when .ll file has target-abi attribute
In the past, when construct RISCVAsmBackend, MCTargetOptions.ABIName would be passed and stored in RISCVAsmBackend.
But MCTargetOptions.ABIName can only be specified by -target-abi xxx in command line, if the .ll file has target-abi attribute, the codegen module will ignore it. And the generated object file would have incorrect EFlags value.

https://github.com/llvm/llvm-project/issues/50591 also caused by this problem.

This patch override the AsmPrinter::emitFunctionEntryLabel function and use it to set the target abi value that get from .ll file's target-abi attribute. And storing the target-abi in RISCVTargetStreamer instead of RISCVAsmBackend.

Differential Revision: https://reviews.llvm.org/D121183
2022-03-24 00:48:52 +08:00
Vasileios Porpodas 27bd8f9492 Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit f7d7d2a08d.
2022-03-22 16:41:55 -07:00
Arthur Eubanks f7d7d2a08d Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads.""
This reverts commit 79613185d3.

Causes crashes, see comments in https://reviews.llvm.org/D121973.
2022-03-22 13:33:49 -07:00
Craig Topper 51940d69cb [RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32.
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.

We already special cased sign extended constants, this extends it
to any sign extended value.

I've only added tests for one case of vadd. Most intrinsics go
through the same check.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D122186
2022-03-22 10:29:06 -07:00
Craig Topper 9b0f227d7b [TableGen][RISCV] Add InstAliases with zero_reg to cover unmasked vnot.v, vncvt.x.x.w, vneg.v, etc.
The mask being NoRegister prevented the existing aliases from matching
since NoRegister isn't in the VMV0 register class.

To workaround this I've added new aliases that look for zero_reg.
I had to motify tablegen to generate matching code for zero_reg.
And as a consequence, I had to change the EmitPriority for an ARM
alias that used zero_reg that started printing.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D121496
2022-03-22 10:14:43 -07:00
Zakk Chen 10fd2822b7 [RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR
intrinsics.

Those operations are updated under a tail agnostic policy, but they
could have mask agnostic or undisturbed.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120228
2022-03-22 07:47:21 -07:00
Zakk Chen 9ab18cc535 [RISCV] Add policy operand for masked vid and viota IR intrinsics.
Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120227
2022-03-22 02:32:31 -07:00
Zakk Chen abb5a985e9 [RISCV] Support mask policy for RVV IR intrinsics.
Add the UsesMaskPolicy flag to indicate the operations result
would be effected by the mask policy. (ex. mask operations).

It means RISCVInsertVSETVLI should decide the mask policy according
by mask policy operand or passthru operand.
If UsesMaskPolicy is false (ex. unmasked, store, and reduction operations),
the mask policy could be either mask undisturbed or agnostic.
Currently, RISCVInsertVSETVLI sets UsesMaskPolicy operations default to
MA, otherwise to MU to keep the current mask policy would not be changed
for unmasked operations.

Add masked-tama, masked-tamu, masked-tuma and masked-tumu test cases.
I didn't add all operations because most of implementations are using
the same pseudo multiclass. Some tests maybe be duplicated in different
tests. (ex. masked vmacc with tumu shows in vmacc-rv32.ll and masked-tumu)
I think having different tests only for policy would make the testing
clear.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120226
2022-03-22 01:19:16 -07:00
Yeting Kuo ecd7a0132a [RISCV] Add basic cost model for vector casting
To perform the cost model of vector casting, the patch consider most vector
casts as their scalar form and consider those vector form of free scalr castings
as 1.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121771
2022-03-22 14:17:08 +08:00
Vasileios Porpodas 79613185d3 Recommit "[SLP] Fix lookahead operand reordering for splat loads."
Original review: https://reviews.llvm.org/D121354

The original commit 9136145eb0 broke the build on several targets.

Differential Revision: https://reviews.llvm.org/D121973
2022-03-21 15:57:32 -07:00
Craig Topper cc5b0868ff Revert "[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32."
This reverts commit 8c4937b33f.

Committed by mistake.
2022-03-21 14:58:11 -07:00
Craig Topper d4aeb5000f [RISCV] Simplify some code. NFC 2022-03-21 14:50:56 -07:00
Craig Topper 19de2e8db6 [RISCV] Remove stray slash from comment. NFC 2022-03-21 14:50:56 -07:00
Craig Topper 8c4937b33f [RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32.
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.

We already special cased sign extended constants, this extends it
to any sign extended value.

I've only added tests for one case of vadd. Most intrinsics go
through the same check. I can add more tests if we're concerned.

Differential Revision: https://reviews.llvm.org/D122186
2022-03-21 14:50:55 -07:00
Mohammed Nurul Hoque 7afa44f5f5 [RISCV] Add more sign-extending ops to MIR sext.w pass.
This patch adds single-bit and bit-counting ops to list of sign-extending ops.

A single-bit write propagates sign-extendedness if it's not in the sign-bits.

Bit extraction and bit counting always outputs a small number, so sign-extended.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121152
2022-03-18 18:21:17 +08:00
Jessica Clarke 63ea7797dd [RISCV] Fix buildbot breakage by explicitly instantiating templates
RISCVISelDAGToDAG's selectImm uses RISCVTargetLowering::getAddr
(specifically the ConstantPoolSDNode) as of 41454ab256 ("[RISCV] Use
constant pool for large integers"), but nothing explicitly instantiates
any of the templates, the only reason they exist is because of the
various lowering methods in RISCVISelLowering.cpp that themselves use
the methods. However, with inlining, those can end up not existing as
real functions and thus not be exported, leading to link errors. Up
until now this hasn't happened, but for whatever reason D121654 has
triggered this on the sanitizer-ppc64be-linux buildbot, giving:

  ../../../../lib/libLLVMRISCVCodeGen.a(RISCVISelDAGToDAG.cpp.o): In function `selectImm(llvm::SelectionDAG*, llvm::SDLoc const&, llvm::MVT, long, llvm::RISCVSubtarget const&)':
  RISCVISelDAGToDAG.cpp:(.text._ZL9selectImmPN4llvm12SelectionDAGERKNS_5SDLocENS_3MVTElRKNS_14RISCVSubtargetE+0x3d8): undefined reference to `llvm::SDValue llvm::RISCVTargetLowering::getAddr<llvm::ConstantPoolSDNode>(llvm::ConstantPoolSDNode*, llvm::SelectionDAG&, bool) const'
  collect2: error: ld returned 1 exit status

Fix this by explicitly instantiating getAddr in its four different forms
so separate translation units can reliably use it.

Fixes: 41454ab256 ("[RISCV] Use constant pool for large integers")
2022-03-18 02:22:17 +00:00
Craig Topper bbd2ecf9f0 [RISCV] Add +experimental-zvfh extension to cover half types in vectors.
Currently we allow half types in vectors if the scalar Zfh extension
is enabled. This behavior is not inline with the vector spec. For f32
and f64 types, the Zve32f, Zve64f, Zve64d, and V explicitly control
the availablity of floating point types in vectors.

In order to make our compiler compliant, we either need to remove all support
for half in vectors or we need an extension to control it.

Draft spec here https://github.com/riscv/riscv-v-spec/pull/780

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D121345
2022-03-17 10:04:02 -07:00
Craig Topper 7e15303062 [RISCV] Simplify scalable vector case in lowerVectorMaskExt.
Since we have SPLAT_VECTOR_PARTS these days, I don't think we need
to go through extra lengths to avoid introducing an illegal scalar type.
We can just call getConstant using the scalable vector type and let
it create either a SPLAT_VECTOR or a SPLAT_VECTOR_PARTS.

Reviewed By: frasercrmck, rogfer01

Differential Revision: https://reviews.llvm.org/D121645
2022-03-17 09:43:13 -07:00
Lian Wang 214afc7116 [RISCV] Add patterns for vnsrl.wi and vnsra.wi instructions
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121675
2022-03-17 07:22:32 +00:00
Lian Wang b26abcad81 [RISCV][NFC] Replace redundant code with VLOpFrag
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121783
2022-03-17 02:05:21 +00:00
Craig Topper 2e10671ec7 [RISCV] Improve detection of when to skip (and (srl x, c2) c1) -> (srli (slli x, c3-c2), c3) isel.
We have a special case to skip this transform if c1 is 0xffffffff
and x is sext_inreg in order to use sraiw+zext.w. But we were only
checking that we have a sext_inreg opcode, not how many bits are
being sign extended.

This commit adds a check that it is a sext_inreg from i32 so we know for
sure that an sraiw can be created.
2022-03-16 14:54:34 -07:00
Jessica Clarke 659363c0cc [RISCV] Ensure PseudoLA* can be hoisted
Since we mark the pseudos as mayLoad but do not provide any MMOs,
isSafeToMove conservatively returns false, stopping MachineLICM from
hoisting the instructions. PseudoLA_TLS_GD does not actually expand to a
load, so stop marking that as mayLoad to allow it to be hoisted, and for
the others make sure to add MMOs during lowering to indicate they're GOT
loads and thus can be freely moved.

Fixes https://github.com/llvm/llvm-project/issues/54372

Reviewed By: MaskRay, arichardson

Differential Revision: https://reviews.llvm.org/D121654
2022-03-16 18:45:36 +00:00
Shengchen Kan 37b378386e [NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments 2022-03-16 20:25:42 +08:00
serge-sans-paille 989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Haocong.Lu 6a54776fe0 [RISCV] Select SRLI+SLLI for AND with leading ones mask
Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask.
It's useful in RV64 when the mask exceeds simm32 (cannot be generated by LUI).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121598
2022-03-16 02:10:57 +00:00
Craig Topper 06c5d74090 [RISCV] Remove lowerSPLAT_VECTOR
This code handles fixed vector SPLAT_VECTOR, but is never called in
any tests.

We only form fixed vector splat vectors for vXi64 on RV32 as part
of DAGCombine. This will be type legalized to SPLAT_VECTOR_PARTS.
So the Custom handling for SPLAT_VECTOR is never needed.

This patch makes SPLAT_VECTOR for vXi64 'Legal' on RV32 so that
DAGCombine will create it, but there's no need for Custom handler.
It will still be type legalized to SPLAT_VECTOR_PARTS.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D121673
2022-03-15 08:22:13 -07:00
Yeting Kuo ae7c6647f3 [RISCV] Add basic code modeling for fixed length vector reduction.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121447
2022-03-14 11:04:31 +08:00
Craig Topper eeb3bfd74a [RISCV] Merge ReplaceNodeResults code for SHFL and GREV/GORC. NFC 2022-03-13 18:42:26 -07:00
Lehua Ding 1648852c98 [RISCV][RVV] Fix vslide1up/down intrinsics overflow bug for SEW=64 on RV32
Reviewed By: craig.topper, kito-cheng

Differential Revision: https://reviews.llvm.org/D120899
2022-03-13 18:06:09 +08:00
Craig Topper fd4d584d6b [RISCV] Add DAGCombine to fold (bitreverse (bswap X)) to brev8 with Zbkb.
If the type is less than XLenVT, type legalization will turn this
into (srl (bitreverse (bswap (srl (bswap X), C))), C). We can't
completely recover from these shifts. They introduce zeros into
the upper bits of the result and we can't easily tell if they are
needed. By doing a DAG combine early, we avoid introducing these
shifts.
2022-03-12 16:39:39 -08:00
Craig Topper 43f668b98e [RISCV] Move GORCIW/GREVIW formation to isel patterns.
Type legalize narrow RISCVISD::GREV/GORC with constant to a larger
type without switching to W. Detect sext_inreg+gorci/grevi with a
uimm5 immediate during isel to emit GREVIW/GORCIW.

This allows us to better propagate known bits information through
extended bits after type legalization. It will also simplify a
change I'm considering for BREV8 with Zbkb.

A future patch will add computeKnownBits support for GORC.

A further improvement here would be to use hasAllWUsers and
doPeepholeSExtW like we do for SLLIW, but I don't think we have
the test coverage for that yet.
2022-03-11 18:02:47 -08:00
Craig Topper d0969e485c [RISCV] Optimize vfmv.s.f intrinsic with scalar 0.0 to vmv.s.x with x0.
We already do this for RISCVISD::VFMV_S_F_VL and the vfmv.v.f
intrinsic.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D121429
2022-03-11 10:05:43 -08:00
Craig Topper e9d4922543 [RISCV] Add tablegen helper classes to create PatFrag to check for one use. NFC
Reduces code and the class can be instantiated in isel patterns to
avoid creating more *_oneuse classes.
2022-03-10 23:14:21 -08:00
Craig Topper 337d49da84 [RISCV] Fix typo in comment. NFC 2022-03-10 22:00:18 -08:00
Eric Tang 336c92d5e8 [RISCV] Add alias for HFENCE.VVMA
Signed-off-by: Eric Tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D120878
2022-03-11 13:32:52 +08:00
Craig Topper 1f3a8d58a6 [RISCV] Use ZERO_EXTEND instead of ANY_EXTEND when promoting i32 RISCVISD::SHFL. NFC
We know the shift amount is a constant with bit 31 clear. anyext
of constant will be either zext or sext which will produce the
same result here. But we really shouldn't rely on that. It would
be valid to put a random number in the upper bits. Our isel patterns
expect the upper bits to be 0 so we should ask for it explicitly.
2022-03-10 20:57:04 -08:00
Craig Topper 9ce6b1ca86 [RISCV] Remove performANY_EXTENDCombine.
This doesn't appear to be needed any more. I did some inspecting
of the gcc torture suite and SPEC2006 with this removed and didn't
find any meaningful changes.

I think we're more aggressive about forming ADDIW now using
sign_extend_inreg during type legalization and hasAllWUsers in isel.
This probably helps catch the cases this helped with before.
2022-03-10 11:29:31 -08:00
Craig Topper e0e8edf823 [RISCV] Add isel patterns for masked RISCVISD::FMA_VL with RISCVISD::FNEG_VL.
This helps us form vfnmsub, vfnmadd, and vfmusb from masked VP
intrinsics.

I've used "srcvalue" for the mask parameter in the fneg nodes. We
can't match "V0" because that doesn't ensure the mask the is the same.
Instead it matches two different nodes and generates two copies to
V0 of those separate values.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120287
2022-03-10 10:05:42 -08:00
Nico Weber a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeea.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille 7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Luke 0803dba7dd [RISCV] Add fixed-length vector instrinsics for segment load
Inspired by reviews.llvm.org/D107790.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119834
2022-03-10 16:23:40 +08:00
Craig Topper d53707508a [RISCV] Remove RISCVISD::VLE_VL/VSE_VL. Use intrinsics instead.
Similar to what we do for other loads/stores, use the intrinsic
version that we already have custom isel for.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D121166
2022-03-09 22:44:28 -08:00
Craig Topper edd6632127 [RISCV] Support 'generic' as a valid CPU name.
Most other targets support 'generic', but RISCV issues an error.
This can require a special case in tools that use LLVM that aren't
clang.

This patch treats "generic" the same as an empty string and remaps
it to generic-rv/rv64 based on the triple. Unfortunately, it has to
be added to RISCV.td because MCSubtargetInfo is constructed and
parses the CPU before RISCVSubtarget's constructor gets a chance
to remap it. The CPU will then reparsed and the state in the
MCSubtargetInfo subclass will be updated again.

Fixes PR54146.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D121149
2022-03-09 16:43:22 -08:00
Shao-Ce SUN 365c858a5d [RISCV] Share PatFprFpr classes for F, D, and Zfh
Inspired by D115469

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121066
2022-03-08 13:02:04 +08:00
jacquesguan e55b9b0d0a [RISCV] Add patterns for vector widening floating-point reduction instructions.
Add patterns for vector widening floating-point reduction instructions.

Differential Revision: https://reviews.llvm.org/D120390
2022-03-08 10:53:49 +08:00
Craig Topper 845bfcede1 [RISCV] Rename 'SplatOperand' to 'ScalarOperand'. NFC
vslide1up/down have this flag set, but the value isn't a splat.
Rename for clarity.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D121037
2022-03-07 11:28:32 -08:00
Zakk Chen 3be907621f [RISCV] Fix incorrect optimization for masked vmsgeu.vi with 0 immediate.
vmsgeu.vi with 0 is always true, but in the masked with mask undisturbed
policy, we still need to keep inactive elelemt which come from maskedoff.

We could return mask directly if it's mask agnostic policy in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121080
2022-03-06 19:22:35 -08:00
Benjamin Kramer fbce4a7803 Drop some more global std::maps. NFCI. 2022-03-06 13:28:29 +01:00
Craig Topper bd5f124716 [RISCV] Add SimplifyDemandedBits support for FSR/FSL/FSRW/FSLW. 2022-03-05 21:26:51 -08:00
Zakk Chen 33b61c5678 [RISCV] Fix incorrect codegen introduced by D119688.
We should not emit a tail agnostic vlse for a tail undisturbed vmv.s.x

In D119688:
-    if (IsScalarMove && !Node->getOperand(0).isUndef())
+    bool HasPassthruOperand = Node->getOpcode() != ISD::SPLAT_VECTOR;
+    if (HasPassthruOperand && !IsScalarMove &&
!Node->getOperand(0).isUndef())
       break;

The IsScalarMove check in the if statement had been changed.

Differential Revision: https://reviews.llvm.org/D120963
2022-03-05 06:10:26 -08:00
Craig Topper 1e569e3b7b [RISCV] Add CMOV isel pattern for (select (setgt X, -1), Y, Z)
setgt X, -1 is the canonical form of setge X, 0. We can swap the
select operands and use setlt X, X0 when selecting CMOV. This
avoid materializing the -1 in a register.
2022-03-04 22:35:13 -08:00
Craig Topper 232f57319d [RISCV] Move vslide1up/down intrinsics into lowerVectorIntrinsicSplats. NFC
Rename to lowerVectorIntrinsicScalars.

This allows us to share the code that checks if the scalar needs
to be type legalized.
2022-03-04 18:21:53 -08:00
Craig Topper 3d4e83f17d [RISCV] With Zbb, fold (sext_inreg (abs X)) -> (max X, (negw X))
With Zbb, abs is expanded to (max X, neg) by default. If X has 33 or
more sign bits, we can expand it a little early using negw instead of
neg to save a sext_inreg. If X started as a 32 bit value, type
legalization would have inserted a sext before the abs so X having
33 sign bits should always be true.

Note: I've used ISD::FREEZE here since we increase the number of uses.
Our default expansion for ABS doesn't do that, but I think that's a bug.

We can't do this with custom type legalization because ISD::FREEZE
doesn't propagate sign bits so later DAG combine won't expand be
able to see optmize it.

Alives2 https://alive2.llvm.org/ce/z/Gx3RNe

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D120597
2022-03-03 15:42:29 -08:00
Alex Tsao 89f15fc687 [RISCV] Add cost modelling for masked memory op
The patch adds very basic cost model for masked memory op on scalable vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117884
2022-03-03 20:47:58 +08:00
jacquesguan 44a430354d [RISCV] Fold store of vmv.f.s to a vse with VL=1.
This patch support the FP part of D109482.

Differential Revision: https://reviews.llvm.org/D120235
2022-03-03 16:35:19 +08:00
Craig Topper 6cb42cd666 [RISCV] More correctly ignore Zfinx register classes in getRegForInlineAsmConstraint.
Until Zfinx is supported in CodeGen we need to convert all Zfinx
register classes to GPR.

Remove the zfinx-types.ll test which didn't test anything meaningful
since -mattr=zfinx isn't implemented completely in llc.

Follow up to D93298.
2022-03-02 11:22:46 -08:00
Craig Topper a1f8349d77 [RISCV] Don't combine ROTR ((GREV x, 24), 16)->(GREV x, 8) on RV64.
This miscompile was introduced in D119527.

This was a special pattern for rotate+bswap on RV32. It doesn't
work for RV64 since the rotate needs to be half the bitwidth. The
equivalent pattern for RV64 is ROTR ((GREV x, 56), 32) so match
that instead.

This could be generalized further as noted in the new FIXME.

Reviewed By: Chenbing.Zheng

Differential Revision: https://reviews.llvm.org/D120686
2022-03-02 09:47:06 -08:00
Nikita Popov 98cfcae4e9 Revert "[RISCV] Add cost modelling for masked memory op"
This reverts commit 76f243b53b.

The newly added test fails.
2022-03-02 17:32:10 +01:00
Alex Tsao 76f243b53b [RISCV] Add cost modelling for masked memory op
The patch adds very basic cost model for masked memory op on scalable vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117884
2022-03-02 22:48:41 +08:00
Shao-Ce SUN 0e38b29543 [RISCV] add the MC layer support of Zfinx extension
This patch added the MC layer support of Zfinx extension.

Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D93298
2022-03-02 14:25:19 +08:00
Mircea Trofin cb2160760e [nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.

Differential Revision: https://reviews.llvm.org/D119876
2022-03-01 21:53:25 -08:00
Craig Topper b9d6e8c441 [RISCV] Lower VECTOR_SPLICE to RVV instructions.
This lowers VECTOR_SPLICE of scalable vectors to a slidedown follow by a slideup.
Fixed vectors are encouraged to use shufflevector instruction. The equivalent patch
for fixed vectors is D119039.

I've used a tail agnostic slidedown and limited the VL to only the
elements that will not be overwritten by the slideup. The slideup
uses VLMax for its VL. It unfortunately uses tail undisturbed policy
but it isn't required as there is no tail. We just need the merge
operand to carry the bits for the lower portion of the result.

Care was taken to ensure that either the slideup or slidedown will
be able to use a .vi instruction when the immediate is small. Which
one uses the immediate depends on the sign of the immediate.

Reviewed By: frasercrmck, ABataev

Differential Revision: https://reviews.llvm.org/D119303
2022-03-01 10:10:13 -08:00
Lian Wang db85cd729a [RISCV] Add FMV_W_X and FMV_H_X instrutions to hasAllNBitUsers
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120699
2022-03-01 08:13:59 +00:00
lian wang 5d91a8a707 [RISCV] Add schedule class for Zbp extension and Zbr extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120012
2022-03-01 07:35:59 +00:00
Lian Wang e2c150ab52 [RISCV][NFC] Move defined non_imm12 to proper place in RISCVInstrInfoZb.td
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120656
2022-03-01 01:45:30 +00:00
Craig Topper e83db8c001 [RISCV] Only enable combineROTR_ROTL_RORW_ROLW with Zbp.
I think the immediate values we check for on the GREV nodes already
protect this, but better to be explicit.
2022-02-28 12:47:36 -08:00
Craig Topper b083157b7b [RISCV] Don't call combineROTR_ROTL_RORW_ROLW for SLLW/SRLW/SRAW nodes. NFC
I think the function does the correct thing internally, but it's
confusing to read.
2022-02-28 11:05:10 -08:00
Craig Topper f46890711f [RISCV] Custom type legalize i32 ISD::ABS on RV64 without Zbb.
Default type legalization will create sext_inreg+abs, but we may
not be able to remove the sext_inreg.

Instead this patch expands abs during type legalization to
Y = sraiw X, 31; subw(xor X, Y), Y) which doesn't require the input
to be sign extended.

This gives a big improvement for some neg-abs tests where the
abs is used more than the the neg. Previously the abs was expanded
a different way before and after type legalization. Now they are
expanded in a similar way enabling more CSE.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D120636
2022-02-28 09:30:27 -08:00
eric.tang b496a172e4 [RISCV] Support hypervisor extention instructions
According to privileged spec version-20211203

    Add the following hypervisor instructions:
        - HLV.B HLV.BU
        - HLV.H HLV.HU HLVX.HU
        - HLV.W HLV.WU HLVX.WU
        - HLV.D
        - HSV.B HSV.H HSV.W HSV.D

Signed-off-by: eric.tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D117733
2022-02-28 14:02:43 +08:00
eric.tang 386c5be92a [RISCV] Support Sinval extension and hypervisor memory management fence instructions
According to Privileged spec version-20211203

    Add Supervisor Memory-Management Instructions:
        - SINVAL.VMA, SFENCE.W.INVAL, SFENCE.INVAL.IR
    Add Hypervisor Memory-Management Instructions:
        - HFENCE.VVMA, HFENCE.GVMA, HINVAL.VVMA, HINVAL.GVMA

Signed-off-by: eric.tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D117654
2022-02-28 14:02:43 +08:00
Eric Tang cf80ef1393 [RISCV] Change GPRMemAtomic to GPRMemZeroOffset for general usage
Not only some AMO instructions but also other instructions need to
    process (${gpr}) or 0(${gpr}), where the 0 is be silently ignored.

    This patch does some changes for general usage.

Signed-off-by: Eric Tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D120017
2022-02-28 14:02:43 +08:00
Chenbing Zheng 7f811ce127 [RISCV] Optimize (sext.w, srli) to sraiw with Zba.
In this patch, we add a more narrower exclusion for
zeroext (srl x) -> srli (slli x), so that it provides an opportunity
for the selection of sraiw.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120467
2022-02-28 10:34:35 +08:00
Jessica Clarke 6aa8521fdb [RISCV] Fix parseBareSymbol to not double-parse top-level operators
By failing to lex the token we end up both parsing it as a binary
operator ourselves and parsing it as a unary operator when calling
parseExpression on the RHS. For plus this is harmless but for minus this
parses "foo - 4" as "foo - -4", effectively treating a top-level minus
as a plus.

Fixes https://github.com/llvm/llvm-project/issues/54105

Reviewed By: asb, MaskRay

Differential Revision: https://reviews.llvm.org/D120635
2022-02-27 20:48:52 +00:00
Jameson Nash c4b1a63a1b mark getTargetTransformInfo and getTargetIRAnalysis as const
Seems like this can be const, since Passes shouldn't modify it.

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D120518
2022-02-25 14:30:44 -05:00
Haocong.Lu 865fe131f8 [RISCV] Fix a mistake in PostprocessISelDAG
With the condition N->use_empty(), the root node of DAG always
misses peephole optimization. So a dummy node is needed.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119934
2022-02-25 12:38:31 +00:00
Chenbing Zheng b20e80aa59 [RISCV] DAG Combine vcpop and vfirst with VL=0 to li imm
vcpop and vfirst are still useful when VL=0.
vcpop equivalents to li 0 and vfirst equivalents to li -1,
since no mask elements are active.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120302
2022-02-25 14:44:25 +08:00
Zakk Chen 4e115b7d88 [RISCV] Update computeTargetABI from llc as well as clang
Clang computes the default ABI if -mabi is empty
and encode it in LLVM IR module flag since D105555.
For correctness, llc need to give the same target-abi
(Options.MCOptions.ABIName) with ABI encoded in IR.
The getSubtargetImpl already has a check for them only if
Options.MCOptions.ABIName is not empty.

In order to get more robustness we could have a check for
explicit ABI, but now we have two different logic to
compute the default ABI.

The front-end ABI is defautl to the ilp32/ilp32e/lp64, and
ilp32d/lp64d when hardware support for extension D.
The backend ABI is default to the ilp32/ilp32e/lp64.

Reviewed by: asb, jrtc27

Differential Revision: https://reviews.llvm.org/D118333
2022-02-24 21:55:44 -08:00
lian wang f37d21ed20 [RISCV] Add schedule class for Zbt extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119808
2022-02-25 01:57:20 +00:00
Qihan Cai 0d058ed3d6 [RISCV] Change rvv version to 1.0 and remove ratify notice
This patch changes the version of V extension from 0.1 to 1.0 in RISCVInstrInfoVPseudos.td, RISCVInstrInfoVSDPatterns.td,  RISCVInstrInfoVVLPatterns.td, RISCVInstrInfoV.td

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120525
2022-02-25 11:38:20 +11:00
Craig Topper 506ac29632 [RISCV] Add 'i64' to some isel so tablegen will remove them for RV32. NFC
Saves a 100 bytes or so from the isel table.
2022-02-24 15:10:05 -08:00
Craig Topper a975ca97c3 [RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X).
Add a new ISD opcode to represent the sign extending behavior of
vmv.x.h. Keep the previous anyext opcode to allow the existing
(fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing
to generate a sign extend.

For fmv.x.w we are able to match the sext_inreg in an isel pattern,
but a 16-bit sext_inreg is lowered to a shift pair before isel. This
seemed like a larger match than we should do in isel.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118974
2022-02-24 09:19:01 -08:00
Shao-Ce SUN 78b5f0fb05 [NFC][RISCV] Reuse ISD::NodeType in float extension
Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D120412
2022-02-24 19:57:55 +08:00
Nikita Popov c7fe6f9c92 Revert "[RISCV] add the MC layer support of Zfinx extension"
This reverts commit 7798ecca9c.

As reported in https://reviews.llvm.org/D93298#3331641 and
following, this causes assertion failures with inline assembly.
2022-02-24 12:14:31 +01:00
lian wang e1d4d1c242 [RISCV] Add schedule class for Zbm and Zbe extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119805
2022-02-24 08:49:25 +00:00
Chenbing.Zheng 2ae92e19eb [RISCV][NFC] Add helper function isVectorConfigInstr to reduce Repeated code.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119924
2022-02-24 05:59:12 +00:00
Craig Topper 5b7ac107b1 [RISCV] Use SelectionDAG::getFreeze to simplify some code. NFC 2022-02-23 21:13:01 -08:00
Craig Topper c7d6448d03 [DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable.
Internally to DAGCombiner the SDValues were passed by non-const
reference despite not being modified. They were then passed by
const reference to TLI.

This patch passes them by value which is consistent with the vast
majority of code.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120420
2022-02-23 12:40:45 -08:00
Alex Bradbury c5bcfb983e [RISCV] Avoid infinite loop between DAGCombiner::visitMUL and RISCVISelLowering::transformAddImmMulImm
See https://github.com/llvm/llvm-project/issues/53831 for a full discussion.

The basic issue is that DAGCombiner::visitMUL and
RISCVISelLowering;:transformAddImmMullImm get stuck in a loop, as the
current checks in transformAddImmMulImm aren't sufficient to avoid all
cases where DAGCombiner::isMulAddWithConstProfitable might trigger a
transformation. This patch makes transformAddImmMulImm bail out if C0
(the constant used for multiplication) has more than one use.

Differential Revision: https://reviews.llvm.org/D120332
2022-02-23 11:05:46 +00:00
jacquesguan 5acd9c49a8 [RISCV] Add patterns for vector widening integer reduction instructions
Add patterns for vector widening integer reduction instructions.

Differential Revision: https://reviews.llvm.org/D117643
2022-02-22 14:14:05 +08:00
Zakk Chen f7dfc5d1af [RISCV] Optimize tail agnostic vmv.s.x which don't need to select tail value.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120250
2022-02-21 14:53:37 -08:00
Craig Topper 90d240553d [RISCV] Teach shouldSinkOperands to sink splat operands of vp.fma intrinsics.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D120167
2022-02-21 11:52:59 -08:00
Lian Wang 4abe484525 [RISCV][NFC] Add sched for some instructions in Zb extension
Add sched to brev8, zip and unzip instruction.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120009
2022-02-21 09:58:08 +08:00
Zakk Chen 17d5ba5bc7 [RISCV][NFC] Remove unused multiclass def. 2022-02-18 23:58:56 -08:00
Craig Topper 5489969550 [RISCV] Add IsRV32 to the isel pattern for ZIP_RV32/UNZIP_RV32. NFC
I think the i32 in the pattern prevents this from matching on RV64,
but using IsRV32 is safer.

Add tests for RV64 to make sure we don't print zip or unzip
because we incorrectly picked ZIP_RV32/UNZIP_RV32.
2022-02-18 22:38:14 -08:00
Zakk Chen ca78312407 [RISCV] Add the policy operand for nomask vector Multiply-Add IR intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.

The nomask vector Multiply-Add need a policy operand
because merge value could not be undef.

Reviewed By: monkchiang

Differential Revision: https://reviews.llvm.org/D119727
2022-02-17 09:12:46 -08:00
Craig Topper bbee9e77f3 [RISCV] Match shufflevector corresponding to slideup.
This generalizes isElementRotate to work when there's only a single
slide needed. I've removed matchShuffleAsSlideDown which is now
redundant.

Reviewed By: frasercrmck, khchen

Differential Revision: https://reviews.llvm.org/D119759
2022-02-17 08:19:10 -08:00
Craig Topper 954fe404ab [RISCV] Fix incorrect MemOperand copy converting splat+load to vlse.
Due to an incorrect copy/paste from load intrinsic handling we
checked if the splat node was a MemSDNode which of course it isn't.

Instead get the MemOperand from the LoadSDNode for the source of
the splat.

This enables LICM to see the load is loop invariant and hoist it
out of the loop.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D120014
2022-02-17 08:15:50 -08:00
Zakk Chen eeb7754f68 [RISCV] Add the passthru operand for vmv.vv/vmv.vx/vfmv.vf IR intrinsics.
Add the passthru operand for
VMV_V_X_VL, VFMV_V_F_VL and SPLAT_VECTOR_SPLIT_I64_VL also.

The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D119688
2022-02-17 06:38:14 -08:00
Shao-Ce SUN 7798ecca9c [RISCV] add the MC layer support of Zfinx extension
This patch added the MC layer support of Zfinx extension.

Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D93298
2022-02-17 21:54:13 +08:00
Zakk Chen 093ecccdab [RISCV] Add the passthru operand for vadc/vsbc/vmerge/vfmerge IR intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D119686
2022-02-17 02:21:39 -08:00
Ben Shi 0b93e90971 Revert "[RISCV] LUI used for address computation should not isAsCheapAsAMove"
This reverts commit 23a5073600.

Although this patch achieved better codegen in most cases, it is really
important to accurately describe the cost of instructions. So I revert it.
2022-02-17 17:27:37 +08:00
Jessica Paquette 67ab4c010b [MachineOutliner] NFC: Update LRU stuff for RISCV
I missed it in my grep. Fixes broken buildbot.`
2022-02-16 12:01:59 -08:00
Jessica Paquette 6d58f4ab07 [MachineOutliner] NFC: Hide LRU-related stuff behind helper functions
It's not particularly user-friendly to have to call `initLRU` everywhere. Also,
it wasn't particularly great that the LRU for registers used in a sequence was
also initialized by `initLRU`.

This patch hides this stuff behind some helper functions:

* `isAvailableAcrossAndOutOfSeq`
* `isAnyUnavailableAcrossOrOutOfSeq`
* `isAvailableInsideSeq`

This allows the user to avoid calling `initLRU` explicitly. Also, it allows
us to separate initializing the used-in-sequence LRU from the main LRU.

Since both ARM and AArch64 check LR liveness in `insertOutlinedCall`, this
refactor requires that we de-const the Candidate there.

Some other quality-of-code improvements:

* LRUs in outliner::Candidate now have more descriptive names
* Use `Register` instead of `unsigned` in some places
* Improve readability in some places by using ranges rather than `std::for_each`

This is a preparatory commit for a larger compile time related change for the
AArch64 outliner.
2022-02-16 11:39:07 -08:00
Craig Topper cfbbcc544c [RISCV] Improve lowering of SHL_PARTS/SRL_PARTS/SRA_PARTS.
Part of the shift lowering creates a (sub XLEN-1, ShAmt). When this
value is used we know that ShAmt is [0..XLEN-1]. Since XLEN is a power
of 2 we can replace the sub with an xor. This allows us to use XORI
instead of LI+SUB.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D119411
2022-02-16 09:22:11 -08:00
Zakk Chen e8973dd389 [RISCV] Add the passthru operand for some RVV nomask unary and nullary intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

My plan is to handle more complex operations in follow-up patches.

Reviewers: frasercrmck

Differential Revision: https://reviews.llvm.org/D118253
2022-02-15 22:34:06 -08:00
Shao-Ce SUN 2aed07e96c [NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`
Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846
2022-02-16 13:10:09 +08:00
Shao-Ce SUN 9cc49c1951 Revert "[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`"
This reverts commit fe25c06cc5.
2022-02-16 11:57:49 +08:00
Shao-Ce SUN fe25c06cc5 [NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`
For ten years, it seems that `MCRegisterInfo` is not used by any target.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846
2022-02-16 11:47:17 +08:00
Zakk Chen b784719904 [RISCV] Add the passthru operand for RVV nomask binary intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

Add passthru operand for VSLIDE1UP_VL and VSLIDE1DOWN_VL to support
i64 scalar in rv32.

The masked VSLIDE1 would only emit mask undisturbed policy regardless
of giving mask agnostic policy until InsertVSETVLI supports mask agnostic.

Reviewed by: craig.topper, rogfer01

Differential Revision: https://reviews.llvm.org/D117989
2022-02-15 18:36:18 -08:00
Craig Topper ab6e02dded [RISCV] Match vwmulsu_vx with scalar splat input.
This is a more generic version of D119110 that uses MaskedValueIsZero
to do the matching and SimplifyDemandedBits to remove any unneeded
AND instructions.

Tests were taken from D119110.

Reviewed By: Chenbing.Zheng

Differential Revision: https://reviews.llvm.org/D119622
2022-02-15 08:45:21 -08:00
Craig Topper d132b47bb9 [RISCV] Replace llvm_unreachable with report_fatal_error.
Parsing errors aren't handled earlier in all cases. A simple
example is llc -mtriple=riscv64 -mattr=+zve32f. If F or Finx is
not also specified, this will hit a parse error.

Use a fatal_error so that the error is conveyed to the user.
2022-02-15 08:40:37 -08:00
jacquesguan bfb4c0c370 [RISCV] Recover the implication between Zve* extensions and the V extension.
This revision recover the implication between Zve* extensions and the V extension.

Differential Revision: https://reviews.llvm.org/D119210
2022-02-14 15:52:07 +08:00
Craig Topper 478c237e21 [RISCV] Fix incorrect extend type in vwmulsu combine.
While matching widening multiply, if we matched an extend from i8->i32,
i16->i64 or i8->i64, we need to reintroduce a narrower extend. If we're
matching a vwmulsu we need to use a sext for op0 and a zext for op1.

This bug exists in LLVM 14 and will need to be backported.

Differential Revision: https://reviews.llvm.org/D119618
2022-02-12 12:47:20 -08:00
Dimitry Andric 7af3d4ab3d Revert "[RISCV] Enable shrink wrap by default"
This reverts commit 5ebdb07e7e.

Enabling shrink wrap by default can cause assertions or crashes, and
these should first be investigated and fixed. For now, reverting the
change so it can be cherry-picked into 14.0.0 is the safest choice.
2022-02-12 19:04:12 +01:00
Haocong.Lu 23a5073600 [RISCV] LUI used for address computation should not isAsCheapAsAMove
A LUI instruction with flag RISCVII::MO_HI is usually used in conjunction
with ADDI, and jointly complete address computation. To bind the cost
evaluation of address computation, the LUI should not be regarded as a cheap
 move separately, which is consistent with ADDI.

In this test case, it improves the unroll-loop code that the rematerialization
of array's base address miss MachineCSE with Heuristics #1 at isProfitableToCSE.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D118216
2022-02-12 07:14:38 +00:00
Chenbing.Zheng 9e975e558b [RISCV][NFC] Move some combine patterns to DAG combine.
Move some combine patterns to DAG combine,and
it dealt with fixme left in RISCVInstrInfoZb.td.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119527
2022-02-12 02:52:21 +00:00
Craig Topper 541c9ba842 [RISCV] Insert VSETVLI at the end of a basic block if we didn't produce BlockInfo.Exit.
This is an alternative to D118667 that instead of fixing the store
to match phase 1, it tries to detect the mismatch with the expected
value at the end of the block. This inserts a vsetvli after the vse
to satisfy the requirement of the other basic block.

We still have serious design issues in the pass, that is going to
require some rethinking.

Differential Revision: https://reviews.llvm.org/D119518
2022-02-11 09:34:16 -08:00
Craig Topper f35ac872b8 Revert "[RISCV] Fix a vsetvli insertion bug involving loads/stores." and "[RISCC] Add missing words to comment. NFC"
This reverts commit f943c58cae.
and commit 7eb7810727.

This introduced a new bug that appears to be easier to hit.

Differential Revision: https://reviews.llvm.org/D119517
2022-02-11 09:34:16 -08:00
Zakk Chen d224be3b99 [RISCV] Add the policy operand for some masked RVV ternary IR intrinsics.
Masked reduction intrinsics are specical cases which don't need to have policy
operand. The mask only affects which elements are read. It doesn't effect the
destination register.
The reduction intrinsics have a dedicated destination operand. If it
is undef, we use tail agnostic. If it not undef we use tail
undisturbed.

Co-Authored-by: Craig Topper <craig.topper@sifive.com>

Differential Revision: https://reviews.llvm.org/D117681
2022-02-11 05:02:03 -08:00
serge-sans-paille 06943537d9 Cleanup MCParser headers
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:

llvm/MC/MCParser/MCAsmParser.h no longer includes llvm/MC/MCParser/MCAsmLexer.h

Preprocessed lines to build llvm on my setup:
after:  1068185081
before: 1068324320

So no compile time benefit to expect, but we still get the looser coupling
between files which is great.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119359
2022-02-11 10:39:29 +01:00
Fangrui Song 8eb750189c [RISCV] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds 2022-02-10 20:10:12 -08:00
Craig Topper b0e77d5e48 [RISCV] Lower the shufflevector equivalent of vector.splice
We can lower a vector splice to a vslidedown and a vslideup.

The majority of the matching code here came from X86's code for matching
PALIGNR and VPALIGND/Q.

The slidedown and slideup lowering don't really require it to be concatenation,
but it happened to be an interesting pattern with existing analysis code I
could use.

This helps with cases where the scalar loop optimizer forwarded a load
result from a previous loop iteration. For example, this happens if the
loop uses x[i] and x[i+1] on the same iteration. The scalar optimizer
will forward x[i+1] load from the previous loop to satisfy x[i] on this
loop. When this get vectorized it results in one element of a vector
being forwarded from the previous loop to be concatenated with elements
loaded on this iteration.

Whether that's more efficient than doing a shifted loaded or reloading
the single scalar and using vslide1up is an interesting question.
But that's not something the backend can help with.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D119039
2022-02-10 09:39:35 -08:00
Craig Topper b861ddf365 [RISCV] Move the creation of VLMaxSentinel to isel. Use X0 during lowering.
The VLMaxSentinel is represented as TargetConstant, but that's included
in isa<ConstantSDNode>. To keep constant VLs and VLMax separate as long
as possible, use the X0 register during lowering and only convert to
VLMaxSentinel during isel.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118845
2022-02-10 09:28:44 -08:00
Craig Topper 727cd5205f [RISCV] Remove stale comment. NFC
Now that we pre-process SPLAT_VECTOR to VFMV_V_F_VL, these patterns
handled scalable vectors and vectors converted from fixed. These
are also used by vp.fma lowering.
2022-02-10 09:04:32 -08:00
Fraser Cormack fd43d99c93 [RISCV] Pre-process FP SPLAT_VECTOR to RISCVISD::VFMV_V_F_VL
This patch builds on top of D119197 to canonicalize floating-point
SPLAT_VECTOR as RISCVISD::VFMV_V_F_VL as a pre-process ISel step.

This primarily benefits scalable-vector VP code, where our VP patterns
only match VFMV_V_F_VL to reduce the burden on our ISel patterns, but
where at the same time, scalable-vector code doesn't custom-legalize
SPLAT_VECTOR.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117670
2022-02-10 09:56:00 +00:00
Chenbing.Zheng c5d3b231e0 [RISCV] Add support for matching vwmaccsu/vwmaccus from fixed vectors
Add pattern to match add and widening mul to vwmacc, and
two multipliers are sext and zext.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119314
2022-02-10 01:59:31 +00:00
Craig Topper c45c1b130b [RISCV] Teach RISCVDAGToDAGISel::selectShiftMask to replace sub from constant with neg.
If the shift amount is (sub C, X) where C is 0 modulo the size of
the shift, we can replace it with neg or negw.

Similar is is done for AArch64 and X86.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D119089
2022-02-09 12:33:01 -08:00
Craig Topper 09629215c2 [RISCV] Add a really basic cost model for SK_Splice.
While testing scalable vectors I found that if we generate a
vector splice intrinsic and run the code through the loop unroller,
we'll crash due to an invalid cost.

This adds a basic cost based on the 2 slide instructions used by the
lowering in D119303.

We probably need to factor LMUL into this, but that's true for
arithmetic instructions too. So I've ignored for the moment.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D119316
2022-02-09 11:43:31 -08:00
Craig Topper 279b3b8179 [RISCV][VP] Lower VP_FMA to RVV instructions.
We already had FMA_VL node, but we didn't have masked patterns.
I have not added the fneg variations. I'll do those after I add
llvm.vp.fneg.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D119196
2022-02-09 11:33:12 -08:00
Craig Topper 63e711549c [RISCV] Lower VP_FNEG to RVV instructions
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D119269
2022-02-09 10:56:39 -08:00
Craig Topper e305b1de7e [RISCV] Pre-process integer ISD::SPLAT_VECTOR to RISCISD::VMV_V_X_VL before isel.
This allows us to remove some isel patterns that exist for both
operations. Saving nearly 3000 bytes from the isel table.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D119197
2022-02-09 08:10:21 -08:00
Lian Wang af2cd94555 [RISCV][NFC] Remove useless code
Reviewed By: craig.topper, asb

Differential Revision: https://reviews.llvm.org/D119317
2022-02-09 19:17:25 +08:00
serge-sans-paille ef736a1c39 Cleanup LLVMMC headers
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:

llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h

Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after:  1049293745

Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
2022-02-09 11:09:17 +01:00
Fraser Cormack 6449bea508 [RISCV] Select unmasked RVV pseudos in a DAG post-process
This patch drops TableGen patterns matching all-ones masked RVV pseudos
in the case where there are fallback patterns matching the generic
masked forms to "_MASK" pseudos. This optimization is now performed with
a SelectionDAG post-processing step which peephole-optimizes these same
pseudos with all-ones masks and swaps them out to their unmasked
pseudos.

This cuts our generated ISel table down by around ~5% (~110kB) in lieu
of a far smaller auto-generated table to help with the peephole.

This only targets our custom RISCVISD::*_VL binary operator nodes, which
use the one form for both masked and unmasked variants. A similar
approach could be used for our intrinsics but we'd need to do some work,
e.g., to represent unmasked intrinsics as true-masked intrinsics at the
IR or ISel level. At a rough estimate, this could save us a further 9%
on the size of our ISel table for the binary intrinsic patterns alone.

There is no observable impact on our tests.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D118810
2022-02-09 07:50:15 +00:00
Zakk Chen cfe7f69036 [RISCV][NFC] Refactor RISCVISAInfo.
1. Remove computeDefaultABIFromArch and add computeDefaultABI in
RISCVISAInfo.
2. Add parseFeatureBits which may used in D118333.

Differential Revision: https://reviews.llvm.org/D119250
2022-02-08 18:37:43 -08:00
jacquesguan 5e71bbfb6c [RISCV] Add patterns for vector widening floating-point fused multiply-add instructions
Add patterns for vector widening floating-point fused multiply-add instructions.

Differential Revision: https://reviews.llvm.org/D117546
2022-02-09 10:34:39 +08:00
Fraser Cormack 62c4ac764b [RISCV] Optimize splats of extracted vector elements
This patch adds an optimization to splat-like operations where the
splatted value is extracted from a identically-sized vector. On RVV we
can splat that via vrgather.vx/vrgather.vi without dropping to scalar
beforehand.

We do have a similar VECTOR_SHUFFLE-specific optimization but that only
works on fixed-length vector types and for those with a constant splat
lane. This patch extends this optimization to make it work on
scalable-vector types and on unknown extract indices.

It is performed during fixed-vector BUILD_VECTOR lowering and during a
new DAGCombine on SPLAT_VECTOR for scalable vectors.

Reviewed By: craig.topper, khchen

Differential Revision: https://reviews.llvm.org/D118456
2022-02-08 10:35:25 +00:00
wangpc c53d99c37d [RISCV] Split f64 undef into two i32 undefs
So that no store instruction will be generated.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118222
2022-02-08 13:42:15 +08:00
Craig Topper 2c26cfdef7 [RISCV] Use splat_vector instead of SplatPat in widening FP instruction patterns. NFCI
We use splat_vector for FP nodes without VL, not SplatPat which handles
splat_vector and integer VMV_V_X_VL.

Reduces isel table size by a few hundred bytes.
2022-02-07 15:53:27 -08:00
Kazu Hirata 3a3cb929ab [llvm] Use = default (NFC) 2022-02-06 22:18:35 -08:00
Craig Topper c1cef111a3 Revert "[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X)."
This reverts commit 673d68cd92.

This hadn't been reviewed yet.
2022-02-05 12:51:01 -08:00
Craig Topper 673d68cd92 [RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X).
Add a new ISD opcode to represent the sign extending behavior of
vmv.x.h. Keep the previous anyext opcode to allow the existing
(fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing
to generate a sign extend.

For fmv.x.w we are able to match the sext_inreg in an isel pattern,
but a 16-bit sext_inreg is lowered to a shift pair before isel. This
seemed like a larger match than we should do in isel.

Differential Revision: https://reviews.llvm.org/D118974
2022-02-05 12:42:12 -08:00
Craig Topper 5f35009996 [RISCV] Remove a ComputeNumSignBits call from an isel special case.
Only isel (and (srl (sexti32 Y), c2), c1) -> (srliw (sraiw Y, 31), c3 - 32)
when there is a sext_inreg present. Don't both checking for Y
having 32 sign bits.
2022-02-04 23:26:53 -08:00
Craig Topper d752ea9a72 [RISCV] Remove exclusions for zext.h/zext.w from our (and (srl X, C1), C2) selection code.
This code tries to replace the pattern with a pair of shifts, but
we were excluding if the And could be a zext.h or zext.w. The SLLI/SRL
pair is more compressible and doesn't come with much down side.

We do regress one test case in rv64i-exhaustive-w-insts.ll but we
can probably add a narrower exclusion for that case.
2022-02-04 17:10:48 -08:00
Craig Topper 1d8bbe3d25 [RISCV] Implement a basic version of AArch64RedundantCopyElimination pass.
Using AArch64's original implementation for reference, this patch
implements a pass to remove unneeded copies of X0. This pass runs
after register allocation and looks to see if a register is implied
to be 0 by a branch in the predecessor basic block.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118160
2022-02-04 10:43:46 -08:00
Craig Topper 234e54bdd8 [RISCV] Add more types of shuffles isShuffleMaskLegal.
Add the vslidedown and interleave patterns that I recently implemented.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118952
2022-02-04 09:13:13 -08:00
Craig Topper c83905a308 [RISCV] Add inline expansion for vector fround.
This avoids a crash for scalable vectors and or scalarization for
fixed vectors.

The algorithm is different enough that I don't think it makes sense
to merge with ceil/floor/trunc. Algorithm is adapted from gcc's X86
SSE2 output.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117247
2022-02-04 09:12:09 -08:00
serge-sans-paille ffe8720aa0 Reduce dependencies on llvm/BinaryFormat/Dwarf.h
This header is very large (3M Lines once expended) and was included in location
where dwarf-specific information were not needed.

More specifically, this commit suppresses the dependencies on
llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and
llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used,
this has a decent impact on number of preprocessed lines generated during
compilation of LLVM, as showcased below.

This is achieved by moving some definitions back to the .cpp file, no
performance impact implied[0].

As a consequence of that patch, downstream user may need to manually some extra
files:

llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h
llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h

In some situations, codes maybe relying on the fact that
llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden
dependency now needs to be explicit.

$ clang++ -E  -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after:   10978519
before:  11245451

Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
[0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions

Differential Revision: https://reviews.llvm.org/D118781
2022-02-04 11:44:03 +01:00
Craig Topper 237eb37260 [RISCV] Add FMV_X_W and FMV_X_H to RISCVSExtWRemoval.
Add -target-abi to sextw-removal.ll RUN lines to show benefit on
new test case.
2022-02-03 09:40:47 -08:00
Craig Topper 997a86b99c [RISCV] Remove createVirtualRegister from RISCVInstrInfo::movImm.
Based on the discussion in D61884, this was done to enable compressed
instructions by giving freedom to pick a compressible register.

Integer materializing can generate LUI, ADDI, ADDIW, SLLI and some
Zb* instructions. C.LI, C.LUI, C.ADDI, C.ADDIW, and C.SLLI all have a 5-bit
register encoding. The Zb* instructions aren't compressible. Based on
that I don't think compressibility of the register is a concern.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118741
2022-02-03 08:34:26 -08:00
Craig Topper 2349fb0312 [RISCV] Remove RISCVISD::SPLAT_VECTOR_I64 in favor of RISCVISD::VMV_V_X_VL.
SPLAT_VECTOR_I64 has the same semantics as RISCVISD::VMV_V_X_VL, it
just assumed VLMax instead of carrying a VL operand.

Include order of RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td
has been swapped to avoid moving riscv_vmv_v_x_vl into
RISCVInstrInfoVSDPatterns.td and to allow moving other "_vl" SDNodes back to
RISCVInstrInfoVVLPatterns.td

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118841
2022-02-03 08:30:25 -08:00
Shao-Ce SUN 005fd8aa70 [RISCV] Add support for Zihintpause extention
Add support for the 'pause' hint instruction as an alias for
'fence w, 0'. To do this allow the 'fence' operands pred and succ
to be set to 0 (the empty set). This will also allow future hints
to be encoded as 'fence 0, <x>' and 'fence <x>, 0'.

This patch revised from @mundaym's D93019.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D117789
2022-02-03 20:55:47 +08:00
Craig Topper abc6716038 [RISCV] Remove unused variables. NFC 2022-02-02 19:23:16 -08:00
Craig Topper f1720abb54 [RISCV] Cleanup some places that assumed VLMaxSentinel and -1 constant mean the same thing. NFCI
VLMaxSentintel happens to be represented as -1 TargetConstant. A user
provided -1 would be an ISD::Constant. We shouldn't assume that they
are the same thing. I'm still not entirely convinced that we should be
treating -1 from the user as VLMAX.

Also fix one place that failed to use XLenVT for the VLMaxSentinel,
using MVT::i64 in code that only executes on RV32.
2022-02-02 12:23:12 -08:00
Craig Topper b73d151a11 [RISCV] Add DAG combines to transform ADD_VL/SUB_VL into widening add/sub.
This adds or reuses ISD opcodes for vadd.wv, vaddu.wv, vadd.vv, vaddu.vv
and a similar set for sub.

I've included support for narrowing scalar splats that have known
sign/zero bits similar to what was done for MUL_VL.

The conversion to vwadd.vv proceeds in two phases. First we'll form
a vwadd.wv by narrowing one of the operands. Then we'll visit the
vwadd.wv to try to narrow the other operand. This turned out to be
simpler than catching all the cases in one step. The forming of of
vwadd.wv can happen for either operand for add, but only the right
hand side for sub since sub isn't commutable.

An interesting quirk is that ADD_VL and VZEXT_VL/VSEXT_VL are formed
during vector op legalization, but VMV_V_X_VL isn't usually formed
until op legalization when BUILD_VECTORS are handled. This leads to
VWADD_W_VL forming in one DAG combine round, and then a later DAG combine
round sees the VMV_V_X_VL and needs to commute the operands to get the
splat in position. This alone necessitated a VWADD_W_VL combine function
which made forming vwadd.vv in two stages an easy choice.

I've left out trying hard to form vwadd.wx instructions for now. It would
only save an extend in the scalar domain which isn't as interesting.

Might need to review the test coverage a bit. Most of the vwadd.wv
instructions are coming from vXi64 tests on rv64. The tests were
copy pasted from the existing multiply tests.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D117954
2022-02-02 10:03:08 -08:00
Craig Topper 5a5037c602 [RISCV] Fix some 80 column violations in ComputeNumSignBitsForTargetNode. NFC 2022-02-01 21:43:11 -08:00
Craig Topper f943c58cae [RISCC] Add missing words to comment. NFC 2022-02-01 07:39:51 -08:00
Craig Topper 7eb7810727 [RISCV] Fix a vsetvli insertion bug involving loads/stores.
The first phase of the analysis can avoid a vsetvli if an earlier
instruction in the block used an SEW and LMUL that when combined with
the EEW of the load/store would produce the desired EMUL. If we
avoided a vsetvli this will affect the global analysis we do in the
second phase.

The third phase where we really insert the vsetvlis needs to agree
with the first phase. If it doesn't we can insert vsetvlis that
invalidate the global analysis.

In the test case there is a VSETVLI in the preheader that sets
SEW=64 and LMUL=1. Inside the loop there is a VADD with SEW=64 and LMUL=1.
This VADD is followed by a store that wants wants SEW=32 LMUL=1/2.
Because it has EEW=32 as part of the opcode the SEW=64 LMUL=1 from the
VADD can be become EMUL=1 for the store. So the first phase determines no
vsetvli is needed.

The third phase manages CurInfo differently than BBInfo.Change from the
first phase. CurInfo is only updated when we see a vsetvli or insert
a vsetvli. This was done to allow predecessor block information from
the global analysis to be applied to multiple instructions. Since the
loop body has no vsetvli we won't update CurInfo for either the VADD
or the VSE. This prevented us from checking the store vsetvli elision
for the VSE resulting in a vsetvli SEW=32 LMUL=1/2 being emitted which
invalidated the global analysis.

To mitigate this, I've added a BBLocalInfo variable that more closely
matches the first phase propagation. This gets updated based on the
VADD and prevents emitting a vsetvli for the store like we did in the
first phase.

I wonder if we should do an earlier phase to handle the load/store case
by adding more pseudo opcodes and changing the SEW/LMUL for those
instructions before the insertion analysis. That might be more robust
than trying to guarantee two phases make the same decision.

Fixes the test from D118629.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118667
2022-02-01 07:29:01 -08:00
Shao-Ce SUN a2a7fc7ea5 [RISCV] Adjust some comments. 2022-02-01 22:53:54 +08:00
Craig Topper 2e45e8abb1 [RISCV] Add a fatal error if ISD::VSCALE is used with Zvl32b.
We convert VLEN to vscale by dividing by RVVBitsPerBlock which is
currently 64. This is only correct if VLEN is evenly divisible by
64. With only Zvl32b we can't assume that.

This patch adds a fatal_error to prevent generating code that may
be broken.

We probably need to look at how we size stack frame objects too.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118583
2022-01-31 09:13:14 -08:00
Craig Topper 09606d6a63 [RISCV] Update the computeKnownBitsForTargetNode for RISCVISD::READ_VLENB to consider Zve/Zvl.
We had previously hardcoded this to assume that vector registers
are 128 bits. This was true when only V existed, but after Zve
extensions were added this became incorrect.

This patch adjusts it to support 128, 64, or 32 bit vectors depending
on Zvl. The 128-bit limit is artificial, but we don't have any test
coverage showing that we larger values so I was being conservative.

None of our lit tests depend on this code today due to the custom
lowering of ISD::VSCALE that inserts the appropriate left or right
shift to convert from VLENB to VSCALE. That code was added after
this code in computeKnownBitsForTargetNode.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118582
2022-01-31 09:13:14 -08:00
Craig Topper aae947e860 [RISCV] Separate the Zfhmin and Zfh extensions.
The spec doesn't seem to be written as if Zfh implies Zfhmin. They
seem to be separate extensions.

This patch moves the instructions from Zfhmin to be enabled with
either the Zfh or Zfhmin extensions.

Reviewed By: achieveartificialintelligence

Differential Revision: https://reviews.llvm.org/D118581
2022-01-31 09:06:43 -08:00
Nikita Popov 0801940c17 [RISCV] Avoid pointer element type access for masked atomicrmw intrinsics
masked.atomicrmw.*.i32 intrinsics access an i32 (and then possibly
mask it), so hardcode MVT::i32 as the access type here, rather than
determining it from the pointer element type.

Differential Revision: https://reviews.llvm.org/D118336
2022-01-31 09:28:39 +01:00
Craig Topper 5fbc3cda9e [RISCV] Use existing variable intead of calling getOperand again. NFCI
This is a slight change because I'm using the ANY_EXTEND result
instead of the original operand, but getNode should constant fold.

While there, add a comment about why the code specifically checks
for a ConstantSDNode.
2022-01-30 18:42:19 -08:00
Craig Topper 744be8c502 [RISCV] Lower riscv_zip/unzip intrinsic to RISCVISD::SHFL/UNSHFL.
These are special versions of the more general shfli/unshfli
instructions. We can use the general ISD opcodes with the correct
immediates.
2022-01-30 13:27:41 -08:00
Craig Topper e1075186a6 [RISCV] Custom lower brev8 intrinsic to RISCVISD::GREV.
We can use the RISCVISD::GREV encoding that swaps the bits in
each byte.  This allows it to use the existing computeKnownBits
support for RISCVISD::GREV.
2022-01-30 12:41:09 -08:00
Craig Topper 524545317c [RISCV] Remove RISCVISD::BREV8 and use RISCVISD::GREV instead.
We already have an ISD opcode for the more general GREV/GREVI
instructon. We can just use it with the encoding that corresponds
to the behavior of brev8. This is similar to what we do for orc.b
where we use the GORC ISD opcode.
2022-01-29 22:45:43 -08:00
Craig Topper 0405ac0150 [RISCV] Rerrange RISCVInstrInfoZB.td to better group related wthings. NFC
Especially placing W instructions/patterns near their non-W versions.
2022-01-29 21:16:15 -08:00
Craig Topper 815786eb67 [RISCV] Use RVBUnary to simplify ZEXT_H_RV32/ZEXT_H_RV64 definitions. NFC 2022-01-29 18:28:14 -08:00
Craig Topper 8faf2a0638 [RISCV] Correct predicate orc.b pattern to not include Zbkb.
This was incorrectly lumped in when the predicate was changed for
the rotate instructions.
2022-01-29 00:10:54 -08:00
Craig Topper d8f929a567 [RISCV] Custom legalize BITREVERSE with Zbkb.
With Zbkb, a bitreverse can be split into a rev8 and a brev8.

Reviewed By: VincentWu

Differential Revision: https://reviews.llvm.org/D118430
2022-01-28 23:11:12 -08:00
jacquesguan 1276678982 [RISCV] Improve extract_vector_elt for fixed mask registers.
Now the backend promotes mask vector to an i8 vector and extract element from that. We could bitcast to a widen element vector, and extract from it to GPR, then use I instruction to extract the certain bit.

Differential Revision: https://reviews.llvm.org/D117389
2022-01-29 11:07:53 +08:00
Craig Topper 06bd56d47d [RISCV] Update comments about getInstSizeInBytes hard-coding the number of bytes.
After D118175, we get the information from the tablegen definition.

Differential Revision: https://reviews.llvm.org/D118488
2022-01-28 09:51:49 -08:00
Craig Topper ea05ee9059 [RISCV] Preserve VL when truncating i64 gather/scatter indices on RV32.
We were creating a truncate with the default for the type, but for
VP intrinsics we have a VL that we should use.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118406
2022-01-28 09:25:30 -08:00
Craig Topper de0c2d75bf [RISCV] Use tablegen size for getInstSizeInBytes.
Fix the pseudos to have the correct size in the MCInstrDesc description.

Inspired by D118009 and D117970.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118175
2022-01-28 09:21:28 -08:00
Kito Cheng a9d5bb926d [RISCV] Use __extendhfsf2/__truncsfhf2 for fp16 <-> fp32
`__gnu_h2f_ieee` and `__gnu_f2h_ieee` are introduce by ARM and set that as
default name for fp16 and fp32 conversion in LLVM.

However RISC-V GCC using default naming scheme for that, which is
`__extendhfsf2` and `__truncsfhf2` for that, that cause runtime ABI
incompatible issue.

Although we didn't have formal runtime ABI spec to specify those naming
convention yet, but I think it would be great to fix the incompatible
issue first.

And I've plan to create a runtime ABI spec undere psABI spec this year.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118207
2022-01-29 00:01:00 +08:00
Alex Bradbury 588f121ada [RISCV][NFC] Make Zb* instruction naming match the convention used elsewhere in the RISC-V backend
Where the instruction mnemonic contains a dot, we name the corresponding
instruction in the .td file using a _ in the place of the dot. e.g. LR_W
rather than LRW. This commit updates RISCVInstrInfoZb.td to follow that
convention.
2022-01-28 15:20:37 +00:00
Chenbing.Zheng 6d6c44a3f3 [RISCV] Add support for matching vwmulsu from fixed vectors
According to riscv-v-spec-1.0, widening signed(vs2)-unsigned integer multiply
vwmulsu.vv vd, vs2, vs1, vm # vector-vector
vwmulsu.vx vd, vs2, rs1, vm # vector-scalar

It is worth noting that signed op is only for vs2.
For vwmulsu.vv, we can swap two ops, and don't care which is sign extension,
but for vwmulsu.vx signExt can not be a vector extended from scalar (rs1).
I specifically added two functions ending with _swap in the test case.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D118215
2022-01-28 02:33:30 +00:00
Craig Topper 70e1cc6792 [RISCV] Prefer vmslt.vx v0, v8, zero over vmsle.vi v0, v8, -1.
At least when starting from a vmslt.vx intrinsic or ISD::SETLT. We
don't handle the case where the user used vmsle.vx intrinsic with -1.
2022-01-27 11:48:27 -08:00
Fraser Cormack 84e85e025e [SelectionDAG][VP] Provide expansion for VP_MERGE
This patch adds support for expanding VP_MERGE through a sequence of
vector operations producing a full-length mask setting up the elements
past EVL/pivot to be false, combining this with the original mask, and
culminating in a full-length vector select.

This expansion should work for any data type, though the only use for
RVV is for boolean vectors, which themselves rely on an expansion for
the VSELECT.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D118058
2022-01-27 09:00:41 +00:00
Wu Xinlong 6a4d3f37b5 [RISCV] fix dead code
fix dead code mentioned on https://reviews.llvm.org/D98136

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D118323
2022-01-27 16:00:01 +08:00
Wu Xinlong 615d71d9a3 [RISCV][CodeGen] Implement IR Intrinsic support for K extension
This revision implements IR Intrinsic support for RISCV Scalar Crypto extension according to the specification of version [[ https://github.com/riscv/riscv-crypto/releases/tag/v1.0.0-scalar | 1.0]]
Co-author:@ksyx & @VincentWu & @lihongliang & @achieveartificialintelligence

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D102310
2022-01-27 15:53:35 +08:00
Craig Topper b3bec6e453 [RISCV] Use vnsrl.wx with x0 instead of vnsrl.vi for truncate.
This matches what the spec uses for the vncvt.x.x.w assembly
pseudoinstruction.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D118295
2022-01-26 18:38:13 -08:00
Craig Topper f487a76430 [RISCV] Add hasStdExtZbp() to hasAndNotCompare. 2022-01-26 13:54:05 -08:00
Craig Topper b3d94b199c [RISCV] Remove references to 'B' extension from AssemblerPredicate and SubtargetFeature strings.
For Zba/Zbb/Zbc/Zbs I've removed the 'B' completely and used the
extension names as presented at the start of Chapter 1 of the
1.0.0 Bitmanipulation spec.

For the unratified extensions, I've replaced 'B' with 'Zb' and
otherwise left them unchanged.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D117822
2022-01-26 11:08:29 -08:00
Benjamin Kramer f15014ff54 Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17"
This reverts commit ef82063207.

- It conflicts with the existing llvm::size in STLExtras, which will now
  never be called.
- Calling it without llvm:: breaks C++17 compat
2022-01-26 16:55:53 +01:00
serge-sans-paille ef82063207 Rename llvm::array_lengthof into llvm::size to match std::size from C++17
As a conquence move llvm::array_lengthof from STLExtras.h to
STLForwardCompat.h (which is included by STLExtras.h so no build
breakage expected).
2022-01-26 16:17:45 +01:00
jacquesguan 267711e38b [RISCV] Fix support of vlen = 64.
In the Zve* extensions, the vlen could be 64. This patch change the vlen constraint of low bound to 64.

Differential Revision: https://reviews.llvm.org/D118217
2022-01-26 16:31:21 +08:00
Zakk Chen 510710d037 [RISCV][NFC] Add getVLOperand for RVV intrinsics.
Use the VLOperand information to get the VL.

Differential Revision: https://reviews.llvm.org/D118156
2022-01-25 17:37:58 -08:00
Zakk Chen 9273378b85 [RISCV] Add the passthru operand for RVV nomask load intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

Co-Authored-by: Hsiangkai Wang <Hsiangkai@gmail.com>

Reviewers: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D117647
2022-01-25 17:31:36 -08:00
eopXD b089e4072a [RISCV] Don't allow i64 vector div by constant to use mulh with Zve64x
EEW=64 of mulh and its vairants requires V extension.

Authored by: Craig Topper <craig.topper@sifive.com> @craig.topper

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117947
2022-01-25 09:55:05 -08:00
Nikita Popov aa97bc116d [NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().

This is part of D117885, in preparation for deprecating the API.
2022-01-25 09:44:52 +01:00
Craig Topper fd0a4bc76b [RISCV] Add missing space to 'clang-format on' directive. NFC
Without a space after the comment characters it seems to be ignored.
2022-01-24 17:00:37 -08:00
Craig Topper cd2a9ff397 [RISCV] Select int_riscv_vsll with shift of 1 to vadd.vv.
Add might be faster than shift. We can't do this earlier without
using a Freeze instruction.

This is the intrinsic version of D106689.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118013
2022-01-24 08:04:53 -08:00
Fraser Cormack d42678b453 [RISCV] Add side-effect-free vsetvli intrinsics
This patch introduces new intrinsics that enable the use of vsetvli in
contexts where only the returned vector length is of interest. The
pre-existing intrinsics are marked with side-effects, which prevents
even trivial optimizations on/across them.

These intrinsics are intended to be used in situations where the vector
length is fed in turn to RVV intrinsics or to vector-predication
intrinsics during loop vectorization, for example. Those codegen paths
ensure that instructions are generated with their own implicit vsetvli,
so the vector length and vtype can be relied upon to be correct.

No corresponding C builtins are planned at this stage, though that is a
possibility for the future if the need arises.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117910
2022-01-24 13:52:08 +00:00
SForeKeeper 70f83f3084 [RISCV] add support for zbkx subextension in MC layer.
This patch adds support for zbkx extension from K extension(v1.0.0) in MC layer.
Instructions with same functionality and same encoding is defined in the bitmanip extension.
It defines {Xperm8, Xperm4} as instruction aliases for xperm.* in Zbp extension. When Zbkx is enabled while Zbp is not, xperm.h will not be available. When Zbkx and Zbp are both enabled, the instructions will be decoded in Zbp format.

[[ https://reviews.llvm.org/D94999 | D94999 ]] this is the patch that introduces xperm.* instructions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117889
2022-01-24 20:38:46 +08:00
Fraser Cormack af773a1818 [RISCV][VP] Lower VP_MERGE to RVV instructions
This patch adds lowering of the llvm.vp.merge.* intrinsic
(ISD::VP_MERGE) to RVV vmerge/vfmerge instructions. It introduces a
special pseudo form of vmerge which allows a tied merge operand,
allowing us to specify the tail elements as being equal to the "on
false" operand, using a tied-def constraint and a "tail undisturbed"
policy.

While this strategy allows us to often lower the intrinsic to just one
instruction, it may be less efficient in fixed-vector types as the
number of tail elements may extend far beyond the length of the fixed
vector. Another strategy could be to use a vmerge/vfmerge instruction
with an AVL equal to the length of the vector type, and manipulate the
condition operand such that mask elements greater than the operation's
EVL are false.

I've also observed inefficient codegen in which our 'VF' patterns don't
match raw floating-point SPLAT_VECTORs, which occur in scalable-vector
code.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117561
2022-01-24 11:05:05 +00:00
Fraser Cormack e7926e8d97 [RISCV] Match VF variants for masked VFRDIV/VFRSUB
This patch follows up on D117697 to help the simple binary operations
behave similarly in the presence of masks.

It also enables CGP sinking support for vp.fdiv and vp.fsub intrinsics,
now that VFRDIV and VFRSUB are consistently matched with a LHS splat for
masked and unmasked variants.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117783
2022-01-24 10:59:43 +00:00
Chenbing.Zheng 9aaa74aeef [RISCV] Add patterns of SET[U]LT_VI for STECC forms
This patch optmizes "li a0, 5
                     vmsgt[u].vx v10, v8, a0"
                 -> "vmsgt[u].vi v10, v8, 5"

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D118014
2022-01-24 08:50:49 +00:00
jacquesguan ba16e3c31f [RISCV] Decouple Zve* extensions and the V extension.
According to the spec, there are some difference between V and Zve64d. For example, the vmulh integer multiply variants that return the high word of the product (vmulh.vv, vmulh.vx, vmulhu.vv, vmulhu.vx, vmulhsu.vv, vmulhsu.vx) are not included for EEW=64 in Zve64*, but V extension does support these instructions. So we should decouple Zve* extensions and the V extension.

Differential Revision: https://reviews.llvm.org/D117854
2022-01-24 14:55:21 +08:00
Wu Xinlong e29d8fb169 [RISCV] Initially support the K-extension instructions on the LLVM MC layer
This commit is currently implementing supports for scalar cryptography extension for LLVM according to version v1.0.0 of [K Ext specification](https://github.com/riscv/riscv-crypto/releases)(scala crypto has been ratified already). Currently, we are implementing the MC (Machine Code) layer of his extension and the majority of work is done under `llvm/lib/Target/RISCV` directory. There are also some test files in `llvm/test/MC/RISCV` directory.

Remove the subfeature of Zbk* which conflict with b extensions to reduce the size of the patch.
(Zbk* will be resubmit after this patch has been merged)

**Co-author:**@ksyx & @VincentWu & @lihongliang & @achieveartificialintelligence

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98136
2022-01-24 14:45:35 +08:00
Jim Lin 3f24cdec25 [RISCV][NFC] Remove tailing whitespaces in RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td 2022-01-24 10:49:43 +08:00
Craig Topper 413684313d [RISCV] Adjust the header comment in RISCVInstrInfoZb.td to better integrate Zbk* extensions.
The Zbk* extensions have some overlap with Zb so have been placed in this file.

Reviewed By: VincentWu

Differential Revision: https://reviews.llvm.org/D117958
2022-01-23 11:42:52 -08:00
eopXD 3cf15af2da [RISCV] Remove experimental prefix from rvv-related extensions.
Extensions affected: +v, +zve*, +zvl*

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117860
2022-01-22 20:18:40 -08:00
Craig Topper d44b6be6ea [RISCV] Don't Custom legalize f16/f32/f64 bitcasts if those types aren't Legal. 2022-01-22 11:55:18 -08:00
Alex Fan e796eaf2af [RISCV][RFC] add MC support for zbkc subextension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117874
2022-01-22 10:23:01 +08:00
Craig Topper 0379459fc5 [RISCV] Strengthen a SDTypeProfile. Fix formatting. 2022-01-21 13:01:53 -08:00
Craig Topper 48132bb1e4 [RISCV] Simplify interface to combineMUL_VLToVWMUL. NFC
Instead of passing the both the SDNode* and 2 of the operands
in two different orders, just pass the SDNode * and a bool to
indicate which operand order to test.

While there rename to combineMUL_VLToVWMUL_VL.
2022-01-21 11:43:06 -08:00
Craig Topper 11754a4dbb [RISCV] Use RVBUnary in more places to simplify some tablegen declarations. NFCI 2022-01-21 10:55:35 -08:00
Fraser Cormack 4d268dc94a [RISCV] Enable CGP to sink splat operands of VP intrinsics
This patch brings better splat-matching to our VP support, by sinking
splat operands of VP intrinsics back into the same block as the VP
operation. The list of VP intrinsics we are interested in matches that
of the regular instructions.

Some optimization is still lacking. For instance, our VL nodes aren't
recognized as commutative, so splats must be on the RHS. Because of
this, we limit our sinking of splats to just the RHS operand for now.
Improvement in this regard can come in another patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117703
2022-01-21 11:30:37 +00:00
wangpc 8def89b5dc [RISCV] Set CostPerUse to 1 iff RVC is enabled
After D86836, we can define multiple cost values for
different cost models. So here we set CostPerUse to
1 iff RVC is enabled to avoid potential impact on RA.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117741
2022-01-21 14:44:26 +08:00
Craig Topper 7b3d307288 [RISCV] Add isel patterns for grevi, shfli, and unshfli to brev8/zip/unzip instructions.
Zbkb supports some encodings of the general grevi, shfli, and
unshfli instructions legal, so we added separate instructions for
those encodings to improve the diagnostics for assembler and
disassembler. To be consistent we should always use these separate
instructions whenever those specific encodings of grevi/shfli/unshfli
occur. So this patch adds specific isel patterns to override the generic
isel patterns for these cases. Similar was done for rev8 and zext.h
for Zbb previously.
2022-01-20 20:43:52 -08:00
Wu Xinlong 7ee1c162cc [RISCV][RFC] add inst support of zbkb
This commit add instructions supports of `zbkb` which defined in scalar cryptography extension version v1.0.0 (has been ratified already).

Most of the zbkb directives reuse parts of the zbp and zbb directives, so this patch just modified some of the inst aliases and predicates.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117640
2022-01-21 11:49:36 +08:00
Hsiangkai Wang ad06e65dc4 [RISCV] Fix the bug in the register allocator caused by reserved BP.
Originally, hasRVVFrameObject() will scan all the stack objects to check
whether if there is any scalable vector object on the stack or not.
However, it causes errors in the register allocator. In issue 53016, it
returns false before RA because there is no RVV stack objects. After RA,
it returns true because there are spilling slots for RVV values during RA.
The compiler will not reserve BP during register allocation and generate BP
access in the PEI pass due to the inconsistent behavior of the function.

The function is changed to use hasStdExtV() as the return value. It is
not precise, but it can make the register allocation correct.

Refer to https://github.com/llvm/llvm-project/issues/53016.

Differential Revision: https://reviews.llvm.org/D117663
2022-01-21 01:23:01 +00:00
Craig Topper cfae2c65db [RISCV] Factor Zve32 support into RISCVSubtarget::getMaxELENForFixedLengthVectors.
This is needed to properly limit fractional LMULs for Zve32.

Add new RUN Zve32 RUN lines to the existing tests for the
-riscv-v-fixed-length-vector-elen-max command line option.
2022-01-20 16:31:12 -08:00
Craig Topper 5e88f527da [RISCV] Remove RISCVSubtarget::hasStdExtV() and hasStdExtZve*(). NFC
All code should use one of the cleaner named hasVInstructions*
functions. Fix the two uses that weren't and delete the methods
so no new uses can be created.
2022-01-20 15:05:09 -08:00
Craig Topper fa8bb22466 [RISCV] Optimize vector_shuffles that are interleaving the lowest elements of two vectors.
RISCV only has a unary shuffle that requires places indices in a
register. For interleaving two vectors this means we need at least
two vrgathers and a vmerge to do a shuffle of two vectors.

This patch teaches shuffle lowering to use a widening addu followed
by a widening vmaccu to implement the interleave. First we extract
the low half of both V1 and V2. Then we implement
(zext(V1) + zext(V2)) + (zext(V2) * zext(2^eltbits - 1)) which
simplifies to (zext(V1) + zext(V2) * 2^eltbits). This further
simplifies to (zext(V1) + zext(V2) << eltbits). Then we bitcast the
result back to the original type splitting the wide elements in half.

We can only do this if we have a type with wider elements available.
Because we're using extends we also have to be careful with fractional
lmuls. Floating point types are supported by bitcasting to/from integer.

The tests test a varied combination of LMULs split across VLEN>=128 and
VLEN>=512 tests. There a few tests with shuffle indices commuted as well
as tests for undef indices. There's one test for a vXi64/vXf64 vector which
we can't optimize, but verifies we don't crash.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D117743
2022-01-20 14:44:47 -08:00
Craig Topper dd7b69a61f [RISCV] Remove HadStdExtV and HasStdZve* Predicates from tablegen.
No instructions should be using these. Everything should use
HasVInstructions* Predicates. Remove them so that they can't be
used by accident.
2022-01-20 12:54:20 -08:00
Craig Topper 7a275dc354 [RISCV] Remove Zvlsseg extension.
This string no longer appears in the Vector Extension specification.
The segment load/store instructions are just part of the vector
instruction set.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D117724
2022-01-20 12:40:07 -08:00
Craig Topper 94e69fbb4f [RISCV] Add DAG combine to fold (fp_to_int_sat (ffloor X)) -> (select X == nan, 0, (fcvt X, rdn))
Similar for ceil, trunc, round, and roundeven. This allows us to use
static rounding modes to avoid a libcall.

This is similar to D116771, but for the saturating conversions.

This optimization is done for AArch64 as isel patterns.
RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven
so the operations don't stick around until isel to enable a pattern
match. Thus I've implemented a DAG combine.

I'm only handling saturating to i64 or i32. This could be extended
to other sizes in the future.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116864
2022-01-20 11:35:37 -08:00
Fraser Cormack ca36cc56ac [RISCV] Match RVV VF variants also through masked operations
This brings floating-point RVV vector/scalar support more in line with
the integer vector patterns, which can already match '.vx' instructions
with masked operations.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117697
2022-01-20 12:08:02 +00:00
Fraser Cormack 5a12024b95 [RISCV] Optimize lowering of floating-point -0.0
This idea has come up in several reviews -- D115978 and D105902 -- so I
can't take any credit for the idea. Instead of using a constant pool to
lower -0.0, we can emit a sequence of two instructions:

    fmv.[hwd].x freg, zero
    fsgnjn.[hsd] freg, freg, freg

This is only done when the floating-point type is legal.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117687
2022-01-20 11:46:28 +00:00
Chenbing.Zheng 0be3da1fab [RISCV] Add intrinsic for Zbt extension
RV32: fsl, fsr, fsri
RV64: fsl, fsr, fsri, fslw, fsrw, fsriw

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117468
2022-01-20 08:27:05 +00:00
eopXD 8eae99dfe5 [RISCV] Add the zve extension according to the v1.0 spec
`zve` is the new standard vector extension to specify varying degrees of
vector support for embedding processors. The `zve` extension is related
to the `zvl` extension and other updates that are added in v1.0.

According to https://github.com/riscv-non-isa/riscv-c-api-doc/pull/21,
Clang defines macro `__riscv_v_max_elen`,  `__riscv_v_max_elen_fp` for
`zve` and it can be used by applications that uses the vector extension.

Authored by: Zakk Chen <zakk.chen@sifive.com> @khchen
Co-Authored by: Eop Chen <eop.chen@sifive.com> @eopXD

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D112408
2022-01-19 23:48:28 -08:00
Mohammed Nurul Hoque 21c79be5d7 [RISCV] Add patterns to MIR sign-extension removal pass.
This patch adds a few instruction patterns that generate sign-extended values or propagate them, adding to the pass introduced in https://reviews.llvm.org/D116397

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117465
2022-01-19 17:33:58 -08:00
Luís Marques a767ae2c5c [RISCV] Fix incomplete asm statement parsing
For instructions without operands, the final `AsmToken::EndOfStatement`
wasn't being consumed. In the context of inline assembly, the resulting
empty statements would cause extraneous empty lines to be emitted. Fix
the issue by consuming the `EndOfStatement` token.

Differential Revision: https://reviews.llvm.org/D117565
2022-01-19 21:56:21 +00:00
Craig Topper 4060b81e76 [RISCV] Obey -riscv-v-fixed-length-vector-elen-max when lowering mask BUILD_VECTORs.
We may not be allowed to use vXiXLen vectors. Consult ELEN to
determine what is allowed. This will become even more important
when Zve32 is added.

Reviewed By: frasercrmck, arcbbb

Differential Revision: https://reviews.llvm.org/D117518
2022-01-19 10:47:37 -08:00
Jim Lin d6b0734837 [NFC] Use Register instead of unsigned 2022-01-19 20:17:04 +08:00
eopXD 9f27941c2f [RISCV] Add patterns for vector narrowing integer right shift instructions
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117454
2022-01-18 22:30:13 -08:00
Craig Topper 5a6c622afd [RISCV] Remove special case for constant shift amount in FSHL/FSHR lowering to FSL/FSR.
Remove fshl/fshr with constant shift amount isel patterns. Replace
with fsr/fsl with constant isel patterns.

This hack was trying to preserve as much optimization opportunity
for fshl/fshr by constant as possible, but the conversion to
RISCVISD::FSR/FSL happens so late it probably isn't worth much.

The new isel patterns are needed by D117468 anyway.
2022-01-18 11:47:50 -08:00
Craig Topper aa7fc02feb Recommit "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering."
This reverts the revert commit e328385739.

Accidental demanded bits change has been removed. The demanded bits
code itself was remove in a pre-commit since it isn't tested.

Original commit message:
Previous we used the fshl/fshr operand ordering for simplicity. This
made things confusing when D117468 proposed adding intrinsics for
the instructions. We can't just use the generic funnel shifting
intrinsics because fsl/fsr have different functionality that should
be exposed to software.

Now we use rs1, rs3, rs2/shamt order which matches the instruction
printing order and the order used in this intrinsic header
https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h
2022-01-18 10:52:43 -08:00
Craig Topper b3a0ec7645 [RISCV] Remove DemandedBits handling for FSR/FSL until we have test cases for it.
Testing may be easier after D117468. Right now we get demanded bits
optimizations done on ISD::FSHL/FSHR before they become FSR/FSL. This
makes it hard to test.
2022-01-18 10:52:43 -08:00
Craig Topper e328385739 Revert "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering."
This reverts commit b634f8a663.

I broke the SimplifyDemandedBits code, but we don't have tests.
2022-01-18 10:36:03 -08:00
Craig Topper b634f8a663 [RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering.
Previous we used the fshl/fshr operand ordering for simplicity. This
made things confusing when D117468 proposed adding intrinsics for
the instructions. We can't just use the generic funnel shifting
intrinsics because fsl/fsr have different functionality that should
be exposed to software.

Now we use rs1, rs3, rs2/shamt order which matches the instruction
printing order and the order used in this intrinsic header
https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h
2022-01-18 09:47:28 -08:00
David Sherwood f4515ab858 Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"
This reverts commit 197f3c0deb.

Reverting after miscompilation errors discovered with ffmpeg.
2022-01-18 08:40:20 +00:00
Lian Wang 5ceb4f5446 [RISCV] Add instruction schedule for Zbc extension and Zbs extension
Zbc extension:
CLMUL/CLMULR/CLMULH are grouped together, defined one schedule class.

Zbs extension:
BCLR/BSET/BINV/BEXT are grouped together, defined one schedule class.
BCLRI/BSETI/BINVI/BEXTI are grouped together, defined one schedule class.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117538
2022-01-18 07:31:50 +00:00
jacquesguan 1090000b63 [RISCV] Add patterns for vector widening floating-point multiply
Add patterns for vector widening floating-point multiply

Differential Revision: https://reviews.llvm.org/D117530
2022-01-18 14:52:43 +08:00
Han-Kuan Chen ec9cb3a79c [RISCV] Provide VLOperand in td.
Currently, users expected VL is the last operand. However, since some
intrinsics has tail policy in the last operand, this rule cannot be used
anymore.

Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D117452
2022-01-17 20:25:47 -08:00
Han-Kuan Chen 3fc4b5896a [RISCV] Make SplatOperand start from 0.
Current SplatOperand starts from 1 because operand 0 (or 1) is intrinsic
id in SelectionDAG.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117453
2022-01-17 20:14:59 -08:00
jacquesguan c29d6c410e [RISCV] Add patterns for vector widening floating-point add/subtract instructions
Add patterns for Vector Widening Floating-Point Add/Subtract Instructions

Differential Revision: https://reviews.llvm.org/D117466
2022-01-18 10:33:56 +08:00
Craig Topper 116af698e2 [RISCV] When expanding CONCAT_VECTORS, don't create INSERT_SUBVECTORS for undef subvectors.
For fixed vectors, the undef will get expanded to an all zeros
build_vector. We don't want that so suppress creating the
insert_subvector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117379
2022-01-17 14:40:59 -08:00
Craig Topper 9c410838d2 [RISCV] Legalize fixed length (insert_subvector undef, X, 0) to a scalable insert.
We were considering this legal, but later the undef would become an all
zeros vector. This would cause us to need to re-legalize the insert later
into a vslideup with zero vector.

This patch catches the case and directly legalizes it to a scalable
insert.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117377
2022-01-17 14:31:30 -08:00
David Sherwood 197f3c0deb [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants
When we know the value we're extending is a negative constant then it
makes sense to use SIGN_EXTEND because this may improve code quality in
some cases, particularly when doing a constant splat of an unpacked vector
type. For example, for SVE when splatting the value -1 into all elements
of a vector of type <vscale x 2 x i32> the element type will get promoted
from i32 -> i64. In this case we want the splat value to sign-extend from
(i32 -1) -> (i64 -1), whereas currently it zero-extends from
(i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use
a single mov immediate instruction.

New tests added here:

  CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing
tests too:

  CodeGen/AArch64/reduce-and.ll
  CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only
occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2022-01-17 11:08:57 +00:00
Lian Wang 85def34f5e [RISCV] Add scheduler for bfp instruction in Zbf extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117290
2022-01-17 09:17:18 +00:00
Kito Cheng cc35161dc7 [RISCV] Add initial support for getRegUsageForType and getNumberOfRegisters
Those two TTI hooks are used during vectorization for calculating
register pressure, the default implementation isn't consider for LMUL,
and that's also definitly wrong value for register number (all register class
are 8 registers).

So in this patch we tried to:

1. Calculate right register usage for vector type and scalar type.
2. Return right number of register for general purpose register and
   vector register.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116890
2022-01-17 15:27:54 +08:00
eopXD 5a457782a2 [RISCV] Add patterns for vector widening integer multiply-add instructions
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117404
2022-01-16 18:37:13 -08:00
Craig Topper 4c1e1e05cb [RISCV] Add RISCVISD::BFPW to ComputeNumSignBitsForTargetNode. 2022-01-15 15:23:49 -08:00
Fraser Cormack 877d1b3d07 [SelectionDAG][VP] Add splitting/widening for VP_LOAD and VP_STORE
Original patch by @hussainjk.

This patch was split off from D109377 to keep vector legalization
(widening/splitting) separate from vector element legalization
(promoting).

While the original patch added a third overload of
SelectionDAG::getVPStore, this patch takes the liberty of collapsing
those all down to 1, as three overloads seems excessive for a
little-used node.

The original patch also used ModifyToType in places, but that method
still crashes on scalable vector types. Seeing as the other VP
legalization methods only work when all operands need identical
widening, this patch follows in that vein.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117235
2022-01-15 11:41:29 +00:00
Alex Bradbury 0ee679e22c [RISCV] Add CSRs defined in the recently ratified Sstc extension
The 'RISC-V "stimecmp / vstimecmp" Extension' was ratified at the end of
last year though hasn't yet been integrated in the main specification
documents (see
<https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>).

RISC-V "stimecmp / vstimecmp" Extension
<https://github.com/riscv/riscv-time-compare/releases/download/v0.5.4/Sstc.pdf>.

Differential Revision: https://reviews.llvm.org/D117311
2022-01-15 08:36:04 +00:00
Alex Bradbury 1ca79823e0 [RISCV] Add CSRs defined in the recently ratified Smstateen extension
The "RISC-V State Enable Extension" was ratified at the end of at the
end of last year though hasn't yet been integrated in the main
specification documents (see
<https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>).

This commit adds the CSRs defined by this extension in
<https://github.com/riscv/riscv-state-enable/releases/download/v0.6.3/Smstateen.pdf>.

Differential Revision: https://reviews.llvm.org/D117310
2022-01-15 08:35:47 +00:00
Alex Bradbury f00a98a0a9 [RISCV] Add CSRs defined in the recently ratified Sscofpmf extension
The "RISC-V Count Overflow and Mode-Based Filtering Extension" was
ratified at the end of last year though hasn't yet been integrated in
the main specification documents (see
<https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>).

This commit adds the CSRs defined by this extension in
<https://github.com/riscv/riscv-count-overflow/releases/download/v0.5.2/Sscofpmf.pdf>.

Differential Revision: https://reviews.llvm.org/D117308
2022-01-15 08:35:13 +00:00
Chenbing.Zheng fdd33a0c75 [RISCV][NFC] Add a function to customLegalizeToWOp by Intrinsic
These cases follow the same pattern, so they can be combined
to a unqiue function.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117378
2022-01-15 08:28:08 +00:00
eopXD 26bb1b1dab [RISCV] Add the zvl extension according to the v1.0 spec
`zvl` is the new standard vector extension that specifies the minimum vector length of the vector extension.
The `zvl` extension is related to the `zve` extension and other updates that are added in v1.0.

According to https://github.com/riscv-non-isa/riscv-c-api-doc/pull/21,
Clang defines macro `__riscv_v_min_vlen` for `zvl` and it can be used for applications that uses the vector extension.
LLVM checks whether the option `riscv-v-vector-bits-min` (if specified) matches the `zvl*` extension specified.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D108694
2022-01-14 23:01:48 -08:00
Lian Wang 21dad9a522 [RISCV][NFC] Add IsRV64 predicate in xperm.w pattern
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117191
2022-01-15 04:22:16 +00:00
jacquesguan b148348ad4 [RISCV] Add patterns for vector widening integer add/subtract
Add patterns for vector widening integer add/subtract instructions

Differential Revision: https://reviews.llvm.org/D117188
2022-01-15 09:41:07 +08:00
Shao-Ce SUN a0a76fee0c [RISCV] update zfh and zfhmin extention to v1.0
`zfh` and `zfhmin` have been ratified, with version 1.0.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117098
2022-01-15 09:21:24 +08:00
Craig Topper 2baa1dffd1 [RISCV] Add basic support for matching shuffles to vslidedown.vi.
Specifically the unary shuffle case where the elements being
shifted in are undef. This handles the shuffles produce by expanding
llvm.reduce.mul.

I did not reduce the VL which would increase the number of vsetvlis,
but may improve the execution speed. We'd also want to narrow the
multiplies so we could share vsetvlis between the vslidedown.vi and
the next multiply.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117239
2022-01-14 09:04:54 -08:00
Craig Topper ac6b4896ea [RISCV] Honor the VT when converting float point register names to register class for inline assembly.
It appears the code here was written for the inline asm clobbering
a specific register, but it also gets used for named input and
output registers.

For the input and output case, we should honor the VT so we
don't insert conversion instructions around the inline assembly.

For the clobber, case we need to pick the largest register class.

Reviewed By: asb, jrtc27

Differential Revision: https://reviews.llvm.org/D117279
2022-01-14 09:04:00 -08:00
Alex Bradbury 4a4a652f34 [RISCV][NFC] Use TableGen 'foreach' to simplify repetitive CSR definitions
Make the definitions of hpmcounter3-hpmcounter31,
hpmcounter3h-hpmcounter31h, mhpmcounter3-mhpmcounter31,
mhpmcounter3h-mhpmcounter31h, pmpaddr0-pmpaddr63, mhpmevent3-31, and
pmpcfg0-15 substantially less repetitive using a foreach loop.

Differential Revision: https://reviews.llvm.org/D117227
2022-01-14 11:59:39 +00:00
jacquesguan 88c0e0806b [RISCV] Improve i64 splat vector lowering in RV32.
We could use vmv.v.i/vmv.v.x whose eew is 32 to lower the i64 splat vector if the i64 constant scalar could be splitted into two same i32 scalar.

Differential Revision: https://reviews.llvm.org/D117079
2022-01-14 14:06:01 +08:00
David Sherwood ba471ba8d2 Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"
This reverts commit 31009f0b5a.

It seems to be causing SVE VLA buildbot failures and has introduced a
genuine regression. Reverting for now.
2022-01-13 15:59:43 +00:00
David Sherwood 31009f0b5a [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants
When we know the value we're extending is a negative constant then it
makes sense to use SIGN_EXTEND because this may improve code quality in
some cases, particularly when doing a constant splat of an unpacked vector
type. For example, for SVE when splatting the value -1 into all elements
of a vector of type <vscale x 2 x i32> the element type will get promoted
from i32 -> i64. In this case we want the splat value to sign-extend from
(i32 -1) -> (i64 -1), whereas currently it zero-extends from
(i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use
a single mov immediate instruction.

New tests added here:

  CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing
tests too:

  CodeGen/AArch64/dag-numsignbits.ll
  CodeGen/AArch64/reduce-and.ll
  CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only
occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2022-01-13 09:43:07 +00:00
Lian Wang 16877c5d2c [RISCV] Add bfp and bfpw intrinsic in zbf extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116994
2022-01-13 02:53:00 +00:00
Alex Bradbury 33d008b169 [RISCV] Update recently ratified Zb{a,b,c,s} extensions to no longer be experimental
Agreed policy is that RISC-V extensions that have not yet been ratified
should be marked as experimental, and enabling them requires the use of
the -menable-experimental-extensions flag when using clang alongside the
version number. These extensions have now been ratified, so this is no
longer necessary, and the target feature names can be renamed to no
longer be prefixed with "experimental-".

Differential Revision: https://reviews.llvm.org/D117131
2022-01-12 19:33:44 +00:00
Craig Topper 632c263eb3 [RISCV] Add RISCVProcFamilyEnum and add SiFive7.
Use it to remove explicit string compares from unrolling preferences.

I'm of two minds on this. Ideally, we would define things in terms
of architectural or microarchitectural features, but it's hard to
do that with things like unrolling preferences without just ending up
with FeatureSiFive7UnrollingPreferences.

Having a proc enum is consistent with ARM and AArch64. X86 only has
a few and is trying to move away from it.

Reviewed By: asb, mcberg2021

Differential Revision: https://reviews.llvm.org/D117060
2022-01-12 09:34:02 -08:00
Shao-Ce SUN edb9175de6 [RISCV][llvm] Update CSRs
According the newest RISC-V Privileged Spec, updated CSRs.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116645
2022-01-12 20:14:04 +08:00
Craig Topper 63b17eb9ec [RISCV] Add strictfp support for compares.
This adds support for STRICT_FSETCC(quiet) and STRICT_FSETCCS(signaling).

FEQ matches well to STRICT_FSETCC oeq.
FLT/FLE matches well to STRICT_FSETCCS olt/ole.

Others require commuting operands or multiple instructions.

STRICT_FSETCC olt/ole/ogt/oge/ult/ule/ugt/uge uses FLT/FLE,
but we need to save/restore FFLAGS around them to avoid spurious
exceptions. I've implemented pseudo instructions with a
CustomInserter to insert the save/restore CSR instructions.
Unfortunately, this doesn't honor exceptions for signaling NANs
but I'm not sure if signaling nans are really supported by the
constrained intrinsics.

STRICT_FSETCC one and ueq expand to a pair of FLT instructions
with a save/restore of fflags around each. This could be improved
in the future.

There may be some opportunities to generate better code for strict
comparisons mixed with nonans fast math flags. I've left FIXMEs in
the .td files for that.

Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D116694
2022-01-11 20:01:41 -08:00
Craig Topper be1cc64cc1 [RISCV] Add DAG combine to fold (fp_to_int (ffloor X)) -> (fcvt X, rdn)
Similar for ceil, trunc, round, and roundeven. This allows us to use
static rounding modes to avoid a libcall.

This optimization is done for AArch64 as isel patterns.
RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven
so the operations don't stick around until isel to enable a pattern
match. Thus I've implemented a DAG combine.

We only handle XLen types except i32 on RV64. i32 will be type
legalized to a RISCVISD node. All other types will be type legalized
to XLen and maintain the FP_TO_SINT/UINT ISD opcode.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116771
2022-01-11 09:05:57 -08:00
wangpc c6430fade3 [RISCV] Generate 32 bits jumptable entries when code model is small
The code can only address the whole RV32 address space or the lower 2 GiB
of the RV64 address space in small code model, so 32 bits entry is enough.
Cache hit ratio and code size have some improvements.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116435
2022-01-11 18:20:37 +08:00
wangpc 98d51c2542 [RISCV] Override TargetLowering::BuildSDIVPow2 to generate SELECT
When `Zbt` is enabled, we can generate SELECT for division by power
of 2, so that there is no data dependency.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D114856
2022-01-11 15:54:35 +08:00
Chenbing.Zheng 9ea772ff81 [RISCV] Block vmsgeu.vi with 0 immediate in Isel
For vmsgeu.vi with 0, we know this is always true. So we can replace
it with vmset.m (unmasked) or vmset.m+vmand.mm (masked).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116584
2022-01-11 03:04:44 +00:00
jacquesguan d0554ae4cf [RISCV] Select vl op to X0 when it is equal to ~0.
Now the backend will select ~0 vl to a register and load instruction, we could use X0 to replace it.

Differential Revision: https://reviews.llvm.org/D116798
2022-01-11 10:56:25 +08:00
Haocong.Lu bd653f6406 [RISCV] Use shift for zero extension when Zbb and Zbp are not enabled
Now AND is used for zero extension when both Zbb and Zbp are not enabled.
It may be better to use shift operation if the trailing ones mask exceeds simm12.

This patch optimzes LUI+ADDI+AND to SLLI+SRLI.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116720
2022-01-11 02:37:03 +00:00
jacquesguan b607cd3928 [RISCV] Use vmv.s.x to build one element splat vector.
When we want to create an splat vector that only the first element is initialized, we could use vmv.s.x or vfmv.s.f to build it.

Differential Revision: https://reviews.llvm.org/D116277
2022-01-11 10:21:18 +08:00
Craig Topper b645bcd98a [RISCV] Generalize (srl (and X, 0xffff), C) -> (srli (slli X, (XLen-16), (XLen-16) + C) optimization.
This can be generalized to (srl (and X, C2), C) ->
(srli (slli X, (XLen-C3), (XLen-C3) + C). Where C2 is a mask with
C3 trailing ones.

This can avoid constant materialization for C2. This is beneficial
even when C2 can be selected to ANDI because the SLLI can become
C.SLLI, but C.ANDI cannot cover all the immediates of ANDI.

This also enables CSE in some cases of i8 sdiv by constant codegen.
2022-01-09 23:37:10 -08:00
Craig Topper 296e8cae5c [RISCV] Isel (sra (sext_inreg X, i16), C) -> (srai (slli X, (XLen-16), (XLen-16) + C).
Similar for (sra (sext_inreg X, i8), C).

With Zbb, sext_inreg of i8 and i16 are legal for sext.b and sext.h.
This transform makes the Zbb codegen the same as without Zbb. The
shifts are more compressible. This also exposes an opportunity for
CSE with another slli in the i16 sdiv by constant codegen.
2022-01-09 21:23:43 -08:00
jacquesguan 6b8362eb8d [RISCV] Disable EEW=64 for index values when XLEN=32.
Disable EEW=64 for vector index load/store when XLEN=32.

Differential Revision: https://reviews.llvm.org/D106518
2022-01-10 10:51:27 +08:00
Craig Topper 2dd52f840b [RISCV] Fold (srl (and X, 0xffff), C)->(srli (slli X, (XLen-16), (XLen-16) + C) even with Zbb/Zbp.
We can use zext.h with Zbb, but srli/slli may offer more opportunities
for compression.
2022-01-09 18:42:03 -08:00
Kazu Hirata 435a5a3652 [llvm] Fix bugprone argument comments (NFC)
Identified with bugprone-argument-comment.
2022-01-08 11:56:38 -08:00
Craig Topper 042394b69e [RISCV] Add a command line option to control the LMUL used by TTI's getRegisterBitWidth.
By default we return the width of an LMUL=1 register. We can enable
testing with larger LMUL values by returning a larger bit width.

This patch adds a RISCV specific option to provide a LMUL which will be
multiplied by the LMUL=1 bit width.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D116339
2022-01-07 20:02:10 -08:00
Kito Cheng f142c45f1e [RISCV] Set getMinVectorRegisterBitWidth to 16 if enable fixed length vector code gen for RVV
getMinVectorRegisterBitWidth means what vector types is supported in
this target, and actually RISC-V support all fixed length vector types with
vector length less than `getMinRVVVectorSizeInBits`, so set it to 16,
means 2 x i8, that is minimal fixed length vector size in theory.

That also fixed one issue, some testcase migth become non-vectorizable
when `-riscv-v-vector-bits-min` set to larger value, because the vector size is
smaller than `-riscv-v-vector-bits-min`.

For example, following code can vectorize by SLP with
`-riscv-v-vector-bits-min=128` or `-riscv-v-vector-bits-min=256`, but
can't vectorize `-riscv-v-vector-bits-min=512` or larger:

```
void foo(double *da) {
  da[0] = 0;
  da[1] = 1;
  da[2] = 2;
  da[3] = 3;
}
```

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116534
2022-01-08 11:16:21 +08:00
Baoshan Pang af931a51b9 [RISCV] Materializing constants with 'rori'
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116574
2022-01-07 15:39:22 -08:00
Lian Wang e8f1dfe923 [RISCV] Supplement PACKH instruction pattern
Optimize (rs1 & 255) | ((rs2 & 255) << 8) -> (PACKH rs1, rs2).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116791
2022-01-07 17:59:19 +08:00
Kazu Hirata f3a344d212 [Target] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-06 22:01:44 -08:00
wangpc 91cf2a9b6c [RISCV][NFC] Use sub operator to generate register list
There are several duplicated lines for generating GPRXXX's
register list that can be eliminated by using `sub` operator.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116729
2022-01-07 12:29:58 +08:00
Shao-Ce SUN 808c0987c3 [NFC][RISCV] Make the macro names more uniform
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116719
2022-01-07 11:09:41 +08:00
Liqin Weng 92153a9aa7 [RISCV] Support immediate vtype of VSETVLI/VSETIVLI in asm parser
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D115133
2022-01-07 02:26:41 +00:00
Craig Topper ec4dd862bf [RISCV] Use simm5_plus1_nonzero in isel patterns for vmsgeu.vi/vmsltu.vi intrinsics.
The 0 immediate can't be selected to vmsgtu.vi/vmsleu.vi by decrementing
the immediate. To prevent his we had special patterns that provided
alternate lowering for the 0 cases. This relied on tablegen prioritizing
the 0 pattern over the sim5_plus1 range.

This patch introduces simm5_plus1_nonzero that excludes 0. It also
excludes the special case for vmsltu.vi since we can just use
vmsltu.vx and let the 0 be selected to X0.

This is an alternative to some of the changes in D116584.

Reviewed By: Chenbing.Zheng, asb

Differential Revision: https://reviews.llvm.org/D116723
2022-01-06 08:27:27 -08:00
Craig Topper 56ca11e31e [RISCV] Add an MIR pass to replace redundant sext.w instructions with copies.
Function calls and compare instructions tend to cause sext.w
instructions to be inserted. If we make good use of W instructions,
these operations can often end up being redundant. We don't always
detect these during SelectionDAG due to things like phis. There also
some cases caused by failure to turn extload into sextload in
SelectionDAG. extload selects to LW allowing later sext.ws to become
redundant.

This patch adds a pass that examines the input of sext.w instructions trying
to determine if it is already sign extended. Either by finding a
W instruction, other instructions that produce a sign extended result,
or looking through instructions that propagate sign bits. It uses
a worklist and visited set to search as far back as necessary.

Reviewed By: asb, kito-cheng

Differential Revision: https://reviews.llvm.org/D116397
2022-01-06 08:23:42 -08:00
Craig Topper 75117fb340 [RISCV] Don't advertise i32->i64 zextload as free for RV64.
The zextload hook is only used to determine whether to insert a
zero_extend or any_extend for narrow types leaving a basic block.
Returning true from this hook tends to cause any load whose output
leaves the basic block to become an LWU instead of an LW.

Since we tend to prefer sexts for i32 compares on RV64, this can
cause extra sext.w instructions to be created in other basic blocks.

If we use LW instead of LWU this gives the MIR pass from D116397
a better chance of removing them.

Another option might be to teach getPreferredExtendForValue in
FunctionLoweringInfo.cpp about our preference for sign_extend of
i32 compares. That would cause SIGN_EXTEND to be chosen for any
value used by a compare instead of using the isZExtFree heuristic.
That will require code to convert from the llvm::Type* to EVT/MVT
as well as querying the type legalization actions to get the
promoted type in order to call TargetLowering::isSExtCheaperThanZExt.
That seemed like many extra steps when no other target wants it.
Though it would avoid us needing to lean on the MIR pass in some cases.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116567
2022-01-06 08:13:42 -08:00
Craig Topper 808c662665 [RISCV] Change RISCVISD::FCVT*RTZ opcodes to take rounding mode as an operand.
Pre-work for a future change that will use these opcodes with other
rounding modes.

Differential Revision: https://reviews.llvm.org/D116724
2022-01-06 08:12:12 -08:00
Craig Topper fd992aac19 [RISCV] Use macros to reduce repetive switch cases. NFC
These 3 switches map LMUL enum to instruction names. These follow
a regular pattern. Use a macro to reduce the number of source code
lines.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D116631
2022-01-05 09:00:48 -08:00
Craig Topper df2e728b77 [RISCV] Teach RISCVGatherScatterLowering to handle more complex recurrence start values.
Previously we only recognized strided loads/store when the initial
value for the phi was a strided constant vector.

This patch extends the support to a strided_constant added to a
splatted value. The rewritten loop will add the splat value to the
first element of the strided constant vector to use as the scalar
start value. The stride is unaffected.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D115958
2022-01-04 10:13:34 -08:00
Kazu Hirata e5947760c2 Revert "[llvm] Remove redundant member initialization (NFC)"
This reverts commit fd4808887e.

This patch causes gcc to issue a lot of warnings like:

  warning: base class ‘class llvm::MCParsedAsmOperand’ should be
  explicitly initialized in the copy constructor [-Wextra]
2022-01-03 11:28:47 -08:00
Jim Lin d38637a0e6 [RISCV] Fix the code alignment for GroupFloatVectors. NFC
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116520
2022-01-03 17:08:53 +08:00
Victor Perez 5527139302 [RISCV][VP] Add RVV codegen for [nX]vXi1 vp.select
Expand [nX]vXi1 vp.select the same way as [nX]vXi1 vselect.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D115546
2022-01-02 23:12:32 -08:00
Craig Topper bc091e0862 [RISCV] Prune more unnecessary vector pseudo instructions. NFC
For floating point specific vector instructions, we don't need
pseudos for mf8.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D116460
2022-01-02 23:00:42 -08:00
Kazu Hirata 41bfac6aed [Target] Remove unused forward declarations (NFC) 2022-01-02 10:20:15 -08:00
Craig Topper 4602f4169a [RISCV] Prune unnecessary vector pseudo instructions. NFC
For .vf instructions, we don't need MF8 pseudos for f16. We don't
need MF8 or MF4 pseudos for f32. Or MF8, MF4, MF2 for f64.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D116437
2022-01-01 19:53:53 -08:00
Kazu Hirata fd4808887e [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-01 16:18:18 -08:00
Craig Topper 6f45fe9851 [RISCV] Use MxListW instead of MxList[0-5]. NFC
Better to use the named list instead of assuming the size of MxList.
2021-12-31 00:22:55 -08:00
Craig Topper 8811a87e8c [RISCV] Use defvar to simplify some code. NFC
Rather than wrapping a def around a list, we can just make a defvar
of the list.
2021-12-30 23:48:39 -08:00
wangpc 41454ab256 [RISCV] Use constant pool for large integers
For large integers (for example, magic numbers generated by
TargetLowering::BuildSDIV when dividing by constant), we may
need about 4~8 instructions to build them.
In the same time, it just takes two instructions to load
constants (with extra cycles to access memory), so it may be
profitable to put these integers into constant pool.

Reviewed By: asb, craig.topper

Differential Revision: https://reviews.llvm.org/D114950
2021-12-31 14:48:48 +08:00
jacquesguan 05f82dc877 [RISCV] Fix incorrect cases of vmv.s.f in the VSETVLI insert pass.
Fix incorrect cases of vmv.s.f and add test cases for it.

Differential Revision: https://reviews.llvm.org/D116432
2021-12-31 14:17:03 +08:00
Craig Topper 15787ccd45 [RISCV] Add support for STRICT_LRINT/LLRINT/LROUND/LLROUND. Tests for other strict intrinsics.
This patch adds isel support for STRICT_LRINT/LLRINT/LROUND/LLROUND.

It also adds test cases for f32 and f64 constrained intrinsics that
correspond to the intrinsics in float-intrinsics.ll and
double-intrinsics.ll. Support for promoting the integer argument of
STRICT_FPOWI was added.

I've skipped adding tests for f16 intrinsics, since we don't have libcalls
for them and we have inconsistent support for promoting them in LegalizeDAG.
This will need to be examined more closely.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116323
2021-12-30 11:54:32 -08:00
jacquesguan 128c6ed73b [RISCV] Teach VSETVLInsert to eliminate redundant vsetvli for vmv.s.x and vfmv.s.f.
Differential Revision: https://reviews.llvm.org/D116307
2021-12-30 17:16:18 +08:00
jacquesguan 1dd5e6fed5 [RISCV] Use vmv.s.x instead of vfmv.s.f when the floating point scalar is 0.
Use integer vector scalar move instruction when move 0 to avoid add a integer-float move instruction.

Differential Revision: https://reviews.llvm.org/D116365
2021-12-30 10:16:54 +08:00
Chenbing.Zheng 43c8296cda [RISCV] Refactor immediate comparison instructions patterns
The patterns of the immediate comparison instruction is rewrite here, and put similar code to a class.
Do not change any function of the original code, making the code more concise.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116215
2021-12-30 09:31:01 +08:00
Craig Topper 015ff729cb [RISCV] Add a few more instructions to hasAllNBitUsers. 2021-12-29 09:17:47 -08:00
Kazu Hirata 5a667c0e74 [llvm] Use nullptr instead of 0 (NFC)
Identified with modernize-use-nullptr.
2021-12-28 08:52:25 -08:00
Hsiangkai Wang a1c7ddf926 [RISCV] Support passing scalable vectur values through the stack.
After consuming all vector registers, the scalable vector values will be
passed indirectly. The pointer values will be saved in general
registers. If all general registers are used up, we will report an error to
notify users the compiler does not support passing scalable vector
values through the stack. In this patch, we remove the restriction. After
all general registers are used up, we use the stack to save the
pointers which point to the indirect passed scalable vector values.

Differential Revision: https://reviews.llvm.org/D116310
2021-12-28 09:26:36 +08:00
Hsiangkai Wang 5d47e7d768 [RISCV] Convert whole register copies as the source defined explicitly.
The implicit defines may come from a partial define in an instruction.
It does not mean the defining instruction and the COPY instruction have
the same vl and vtype. When the source comes from the implicit defines,
do not convert the whole register copies to vmv.v.v.

Differential Revision: https://reviews.llvm.org/D115866
2021-12-27 13:59:49 +08:00
Shao-Ce SUN 70a98008ea [RISCV] Reduce repetitive codes in flw, fsw
Trying to improve code reuse in F,D,Zfh *.td files.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116089
2021-12-27 09:29:35 +08:00
Kazu Hirata e7774f499b Use static_assert instead of assert (NFC)
Identified with misc-static-assert.
2021-12-26 14:26:44 -08:00
Jim Lin 02478a26f2 [RISCV] Use DAG variable directly instead of DCI.DAG
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116087
2021-12-24 13:06:55 +08:00
Craig Topper 0a35211b34 [RISCV] Don't allow vector types to be used with inline asm 'r' constraint
The 'r' constraint uses the GPR class. There is generic support
for bitcasting and extending/truncating non-integer VTs to the
required integer VT. This doesn't work for scalable vectors and
instead crashes.

To prevent this, explicitly reject vectors. Fixed vectors might
work without crashing, but it doesn't seem worthwhile to allow.

While there remove an unnecessary level of indentation in the
"vr" and "vm" constraint handling.

Differential Revision: https://reviews.llvm.org/D115810
2021-12-23 20:32:36 -06:00
Victor Perez 10b3675aa9 [RISCV][VP] Lower mask vector VP AND/OR/XOR to RVV instructions
For fixed and scalable vectors, each intrinsic x is lowered to vmx.mm,
dropping the mask, which is safe to do as masked-off elements are
undef anyway.

Differential Revision: https://reviews.llvm.org/D115339
2021-12-23 15:02:32 -06:00
Craig Topper 7704c503ec [RISCV] Use positive 0.0 for the neutral element in fadd reductions if nsz is present.
-0.0 requires a constant pool. +0.0 can be made with vmv.v.x x0.

Not doing this in getNeutralElement for fear of changing other targets.

Differential Revision: https://reviews.llvm.org/D115978
2021-12-23 10:38:00 -06:00
Craig Topper b7b260e19a [RISCV] Support strict FP conversion operations.
This adds support for strict conversions between fp types and between
integer and fp.

NOTE: RISCV has static rounding mode instructions, but the constrainted
intrinsic metadata is not used to select static rounding modes. Dynamic
rounding mode is always used.

Differential Revision: https://reviews.llvm.org/D115997
2021-12-23 09:40:58 -06:00
Craig Topper a9486a40f7 [RISCV] Disable interleaving scalar loops in the loop vectorizer.
The loop vectorizer can interleave scalar loops even if it doesn't
vectorize them. I don't believe we intended to enable this when
we enabled interleaving for vector instructions.

Disable interleaving for VF=1 like X86 and AMDGPU already do. Test
lifted from AMDGPU.

Differential Revision: https://reviews.llvm.org/D115975
2021-12-23 08:37:24 -06:00
jacquesguan 28a3e7dea2 [RISCV] Override hasAndNotCompare to use more andn when have Zbb extension.
Enable transform (X & Y) == Y ---> (~X & Y) == 0 and (X & Y) != Y ---> (~X & Y) != 0 when have Zbb extension to use more andn instruction.

Differential Revision: https://reviews.llvm.org/D115922
2021-12-23 10:42:20 +08:00
Shao-Ce SUN 68bc6d7cae [RISCV] Remove Zvamo Extention
Based on D111692. Zvamo is not part of the 1.0 V spec. Remove it.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D115709
2021-12-20 10:28:39 +08:00
Michael Berg f95ee6074a [RISCV] Add target specific loop unrolling and peeling preferences
Both these preference helper functions have initial support with
this change. The loop unrolling preferences are set with initial
settings to control thresholds, size and attributes of loops to
unroll with some tuning done.  The peeling preferences may need
some tuning as well as the initial support looks much like what
other architectures utilize.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D113798
2021-12-18 12:54:50 -08:00
Craig Topper be41996f4f [RISCV} Add FSGNJ_H to isAsCheapAsAMove and isCopyInstrImpl.
This matches FSGNJ_S and FSGNJ_D.
2021-12-17 09:14:20 -08:00
Craig Topper 66bbefeb13 [RISCV] Revert Zfhmin related changes that aren't tested and depend on f16 being a legal type.
Our Zfhmin support is only MC layer, but these are CodeGen layer
interfaces. If f16 isn't a Legal type for CodeGen with Zfhmin, then
these interfaces should keep their non-Zfh behavior.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D115822
2021-12-16 08:55:28 -08:00
jacquesguan 7dfbf0b60f [RISCV] Fold (and (not (srl X, C)), 1) to (xor (bexti X, C), 1) when have Zbs extension.
When have Zbs extension, we could use bexti to fold (and (not (srl X, C)), 1) to (xor (bexti X, C), 1).

Differential Revision: https://reviews.llvm.org/D115629
2021-12-16 15:01:05 +08:00
jacquesguan d3c2ad154e [RISCV] Fix whole vector register move instruction's vector register constraint.
According to the v-spec, the source and destination VR of vmv<nr>r.v should be aligned for the VR group size.

Differential Revision: https://reviews.llvm.org/D115720
2021-12-16 10:58:55 +08:00
Craig Topper 3926893439 [RISCV] Add isel support for scalar STRICT_FADD/FSUB/FMUL/FDIV/FSQRT.
Test that STRICT_FMINNUM/FMAXNUM are lowered to libcalls for f32/f64.
The RISC-V instructions don't match the behavior of fmin/fmax libcalls
with respect to SNaN.

Promoting FMINNUM/FMAXNUM for f16 needs more work outside of the
RISC-V backend.

Reviewed By: asb, arcbbb

Differential Revision: https://reviews.llvm.org/D115680
2021-12-14 10:50:55 -08:00
Craig Topper 3f1c403a2b [RISCV] Use AdjustInstrPostInstrSelection to insert a FRM dependency for scalar FP instructions with dynamic rounding mode.
In order to support constrained FP intrinsics we need to model FRM
dependency. Whether or not a instruction uses FRM is based on a 3
bit field in the instruction. Because of this we can't add
'Uses = [FRM]' to the tablegen descriptions.

This patch examines the immediate after isel and adds an implicit
use of FRM. This idea came from Roger Ferrer Ibanez.

Other ideas:
We could be overly conservative and just pretend all instructions with
frm field read the FRM register. Or we could have pseudoinstructions
for CodeGen with rounding mode.

Reviewed By: asb, frasercrmck, arcbbb

Differential Revision: https://reviews.llvm.org/D115555
2021-12-14 10:17:57 -08:00
Craig Topper d4d76409d1 [RISCV] Add mayRaiseFPException to RISCV scalar FP instructions.
FRM dependency will be added in a future patch.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D115540
2021-12-14 09:53:30 -08:00
Craig Topper 7598ac5ec5 [RISCV] Convert (splat_vector (load)) to vlse with 0 stride.
We already do this for splat nodes that carry a VL, but not for
splats that use VLMAX.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D115483
2021-12-14 09:14:03 -08:00
Craig Topper 3cda38796c [RISCV] Add rs2 encoding to the FPUnaryOp_r and FPUnaryOp_r_frm template arguments.
Instead of having unary instruction include a 'let' in their class
body, add rs2val as a template parameter. Then we can use a let
in FPUnaryOp_r and FPUnaryOp_r_frm. This reduces the overall
verbosity of the FP files.

Reviewed By: achieveartificialintelligence

Differential Revision: https://reviews.llvm.org/D115537
2021-12-13 21:38:42 -08:00
Nelson Chu 10a71981e9 [RISCV] Support named opcodes in .insn directive.
This patch is one of the TODO of commit, 283879793d

We build the GenericTable for these opcodes, and also extend class RISCVOpcode, to store the names of opcodes.  Then we call the parseInsnDirectiveOpcode to parse the opcode filed in .insn directive.  We only allow users to write the recognized opcode names, or just write the immediate values in the 7 bits range.

Documentation: https://sourceware.org/binutils/docs-2.37/as/RISC_002dV_002dFormats.html

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D115224
2021-12-13 20:59:33 -08:00
Fangrui Song a6a07a514b [MachineOutliner] Don't outline functions starting with PATCHABLE_FUNCTION_ENTER/FENTRL_CALL
MachineOutliner may outline a "patchable-function-entry" function whose body has
a TargetOpcode::PATCHABLE_FUNCTION_ENTER MachineInstr. This is incorrect because
the special code sequence must stay unchanged to be used at run-time.
Avoid outlining PATCHABLE_FUNCTION_ENTER. While here, avoid outlining FENTRY_CALL too
(which doesn't reproduce currently) to allow phase ordering flexibility.

Fixes #52635

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D115614
2021-12-13 13:24:29 -08:00
Craig Topper b18b2a01ef [RISCV] Don't use VLMAX for start value splat in reduction lowering.
The reduction instructions only reads the first element. The
execution time for a splat may take longer with a larger VL.
We should use the smallest VL we can.

Reviewed By: frasercrmck, HsiangKai

Differential Revision: https://reviews.llvm.org/D115536
2021-12-13 09:06:42 -08:00
Craig Topper 80ed2f6b36 [RISCV] Share tablegen classes for F, D, and Zfh. Other simplifications. NFC
By adding the register class and funct as template parameters we
can share the classes with all 3 extensions.

I've used "let SchedRW =" to avoid repeating scheduler classes on
multiple lines where we previously inherited from the Sched class.

A subsequent patch will add mayRaiseFPException and FRM dependencies.
Reducing the number of classes means less repeating for those changes.

This of course conflicts with the Zfinx patch D93298.

Reviewed By: achieveartificialintelligence

Differential Revision: https://reviews.llvm.org/D115469
2021-12-10 09:35:51 -08:00
Craig Topper 5861cf77da [RISCV] Remove FCSR from RISCVRegisterInfo.
We only used this to mark it as a reserved register. But that's not
important if we don't do anything else with it.

I think if we were ever to do anything with it, we would need to
model it as a super register of FRM and FFLAGS. But it might be
easier to reference both FRM and FFLAGS in implicit defs/uses
for anything we were to do with "fcsr".

Reviewed By: sepavloff

Differential Revision: https://reviews.llvm.org/D115455
2021-12-10 09:24:13 -08:00
eopXD a4bf1b449d [RISCV] Unify depedency check and extension implication parsing logics
Originially there are two places that does parsing - `parseArchString` and
`parseFeatures`, each with its code on dependency check and implication.
This patch extracts common parts of the two  as functions of `RISCVISAInfo`
and let them 2 use it.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D112359
2021-12-09 21:16:04 -08:00
Craig Topper 6f7de819b9 [RISCV] Use MULHU for more division by constant cases.
D113805 improved handling of i32 divu/remu on RV64. The basic idea
from that can be extended to (mul (and X, C2), C1) where C2 is any
mask constant.

We can replace the and with an SLLI by shifting by the number of
leading zeros in C2 if we also shift C1 left by XLen - lzcnt(C1)
bits. This will give the full product XLen additional trailing zeros,
putting the result in the output of MULHU. If we can't use ANDI,
ZEXT.H, or ZEXT.W, this will avoid materializing C2 in a register.

The downside is it make take 1 additional instruction to create C1.
But since that's not on the critical path, it can hopefully be
interleaved with other operations.

The previous tablegen pattern is replaced by custom isel code.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D115310
2021-12-09 09:10:14 -08:00
Kito Cheng 39c861719b [RISCV] Fix vm operand constraint to fit GCC's behavior
- `vm` constraint is used for masking operand, which always v0.

- Update testcase, only masking operand should use `vm`, vector mask operations
  should just use `vr` for any vector register.

 - Revise the description of `vm` constraint.

- This patch also fix issue on RISCVRegisterInfo.td and RISCVISelLowering.cpp.

  RISCVRegisterInfo.td:
  - The first VT in the list must be the largest total size since the
    SelectionDAGBuilder uses the first register in the list as the canonical
    type for the register.

  RISCVISelLowering.cpp:
  - Fix RISCVTargetLowering::splitValueIntoRegisterParts and
    RISCVTargetLowering::joinRegisterPartsIntoValue for handling vectors
    with different total size, that will happened on fractional LMUL since
    fractional LMUL is always occupy one vector register.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D112599
2021-12-09 14:46:49 +08:00
Craig Topper de8d26ac02 [RISCV] Improve tracking of EndLoc in the assembly parser.
The SMLoc::getFromPointer(S.getPointer() - 1) pattern used in
several place didn't make sense since that puts the End before the
Start location.

This patch corrects this to properly calculate the end location as
we parse. Unsure how much this matters, a lot of these are for custom
operand parsing. If the custom parsing succeeds, the instruction
matching probably won't fail, so the end loc won't be used to build
a range for a diagnostic.

I've also fixed the creation functions for the CSR and VType operands
to assign the EndLoc of the RISCVOperand class using the StartLoc
instead of an empty SMLoc. I don't think these will ever be accessed,
but it makes the code look less like we forgot to assign it. This is
consistent with some operands on the ARM target.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D115192
2021-12-08 15:32:44 -08:00
Michael Berg 3e363f14e1 Revert "[RISCV] Add target specific loop unrolling and peeling preferences"
This reverts commit 8487981a72.
2021-12-07 15:13:42 -08:00
Michael Berg 8487981a72 [RISCV] Add target specific loop unrolling and peeling preferences
Both these preference helper functions have initial support with
this change. The loop unrolling preferences are set with initial
settings to control thresholds, size and attributes of loops to
unroll with some tuning done.  The peeling preferences may need
some tuning as well as the initial support looks much like what
other architectures utilize.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D113798
2021-12-07 15:06:42 -08:00
Craig Topper 622d689480 [RISCV] Revise RISCVInstPrinter::printVTypeI to not assume there are 3 invalid vtype bits.
Instead of checking [10:8]. Check for non-zero in 8 and above.

Addresses a post-commit comment from @jrtc27 in D114581.
2021-12-07 09:40:50 -08:00
Craig Topper 2a9b2444d9 [RISCV] Replace uses of RISCVOpcode<0b0010011> and RISCVOpcode<0b0011011> with existing named objects. NFC
These are already instantiated with names as OPC_OP_IMM and
OPC_OP_IMM_32.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D115172
2021-12-07 08:07:14 -08:00
Evandro Menezes 2ba7698423 [RISCV] Add scheduling resources for Vector pseudo instructions.
Add the scheduling resources for the V extension pseudo instructions.

Authored-by: Evandro Menezes <evandro.menezes@sifive.com>

Differential Revision: https://reviews.llvm.org/D113353
2021-12-07 09:14:28 +08:00
Craig Topper acdbd34cfb [RISCV] Loosen some restrictions on lowering constant BUILD_VECTORs using vid.v.
The immediate size check on StepNumerator did not take into account
that vmul.vi does not exist. It also did not account for power of 2
constants that can be done with vshl.vi.

This patch fixes this by moving the conversion from mul to shift
further up. Then we can consider the immediates separately for MUL
vs SHL. For MUL I've allowed simm12 which requires a single addi
before a vmul.vx. For SHL I've allowed any uimm5 which works with
vshl.vi. We could relax these further in the future. This is a
starting point that allows us to emit the same number of instructions
we were already using for smaller numerators.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D115081
2021-12-06 09:34:40 -08:00
Victor Perez 9eb7322748 [RISCV][VP] Add RVV codegen for vp.select
Lower vp.select instrinsic to VSELECT_VL.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D114629
2021-12-03 11:02:20 +00:00
Craig Topper 2f6beb7b0e [RISCV] Add inline expansion for vector ftrunc/fceil/ffloor.
This prevents scalarization of fixed vector operations or crashes
on scalable vectors.

We don't have direct support for these operations. To emulate
ftrunc we can convert to the same sized integer and back to fp using
round to zero. We don't need to do a convert if the value is large
enough to have no fractional bits or is a nan.

The ceil and floor lowering would be better if we changed FRM, but
we don't model FRM correctly yet. So I've used the trunc lowering
with a conditional add or subtract with 1.0 if the truncate rounded
in the wrong direction.

There are also missed opportunities to use masked instructions.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113543
2021-12-01 11:25:28 -08:00
Craig Topper d8f9eaad89 [RISCV] Teach RISCVTargetLowering::shouldSinkOperands to handle udiv/sdiv/urem/srem.
The V extension supports .vx instructions for integer division and
remainder so we should sink splats for that operand.
2021-11-30 18:47:51 -08:00
David Green 9e8a71caf0 [DAG] Create fptosi.sat from clamped fptosi
This adds a fold in DAGCombine to create fptosi_sat from sequences for
smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
to be handled similarly.

A shouldConvertFpToSat method was added to control when converting may
be profitable. The original fptosi will have a less strict semantics
than the fptosisat, with less values that need to produce defined
behaviour.

This especially helps on ARM/AArch64 where the vcvt instructions
naturally saturate the result.

Differential Revision: https://reviews.llvm.org/D111976
2021-11-30 15:29:14 +00:00
Hans Wennborg a87782c34d Revert "[DAG] Create fptosi.sat from clamped fptosi"
It causes builds to fail with this assert:

llvm/include/llvm/ADT/APInt.h:990:
bool llvm::APInt::operator==(const llvm::APInt &) const:
Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"' failed.

See comment on the code review.

> This adds a fold in DAGCombine to create fptosi_sat from sequences for
> smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
> the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
> it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
> ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
> to be handled similarly.
>
> A shouldConvertFpToSat method was added to control when converting may
> be profitable. The original fptosi will have a less strict semantics
> than the fptosisat, with less values that need to produce defined
> behaviour.
>
> This especially helps on ARM/AArch64 where the vcvt instructions
> naturally saturate the result.
>
> Differential Revision: https://reviews.llvm.org/D111976

This reverts commit 52ff3b0093.
2021-11-30 15:36:56 +01:00
David Green 52ff3b0093 [DAG] Create fptosi.sat from clamped fptosi
This adds a fold in DAGCombine to create fptosi_sat from sequences for
smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
to be handled similarly.

A shouldConvertFpToSat method was added to control when converting may
be profitable. The original fptosi will have a less strict semantics
than the fptosisat, with less values that need to produce defined
behaviour.

This especially helps on ARM/AArch64 where the vcvt instructions
naturally saturate the result.

Differential Revision: https://reviews.llvm.org/D111976
2021-11-30 11:05:32 +00:00
Ben Shi 29d4230d6b [RISCV] Decode vtype with reserved fields to raw immediate
This patch fixes a crash when doing "llvm-objdump -D --mattr=+experimental-v"
against an object file which happens to keep a word that can be decoded to
VSETVLI & VSETIVLI with reserved vlmul[2:0]=4. All vtype values with
reserved fields (vlmul[2:0]=4, vsew[2:0]=0b1xx, non-zero bits 8/9/10) are
printed to raw immediate.

Reviewed By: jhenderson, jrtc27, craig.topper

Differential Revision: https://reviews.llvm.org/D114581
2021-11-30 08:31:20 +00:00
Craig Topper b121d23a9c [RISCV] Promote f16 log/pow/exp/sin/cos/etc. to f32 libcalls.
Prevents crashes or cannot select errors.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113822
2021-11-29 18:49:11 -08:00
Hsiangkai Wang 9a88566537 [RISCV] Fix a bug in RISCVFrameLowering.
When we have out-going arguments passing through stack and we do not
reserve the stack space in the prologue. Use BP to access stack objects
after adjusting the stack pointer before function calls.

callseq_start  ->  sp = sp - reserved_space
//
// Use FP to access fixed stack objects.
// Use BP to access non-fixed stack objects.
//
call @foo
callseq_end    ->  sp = sp + reserved_space

Differential Revision: https://reviews.llvm.org/D114246
2021-11-30 10:39:35 +08:00
Hsiangkai Wang b0c7421524 [RISCV] Emit DWARF location expression for RVV stack objects.
VLENB is the length of a vector register in bytes. We use
<vscale x 64 bits> to represent one vector register. The dwarf offset is
VLENB * scalable_offset / 8.

For the mask vector, it occupies one vector register.

Differential Revision: https://reviews.llvm.org/D107432
2021-11-27 15:13:10 +08:00
Hsiangkai Wang 137d3474ca [RISCV] Reverse the order of loading/storing callee-saved registers.
Currently, we restore the return address register as the last restoring
instruction in the epilog. The next instruction is `ret` usually. It is
a use of return address register. In some microarchitectures, there is
load-to-use data hazard. To avoid the load-to-use data hazard, we could
separate the load instruction from its use as far as possible. In this
patch, we reverse the order of restoring callee-saved registers to
increase the distance of `load ra` and `ret` in the epilog.

Differential Revision: https://reviews.llvm.org/D113967
2021-11-22 23:02:11 +08:00
wangpc af0ecfccae [RISCV] Generate pseudo instruction li
Add an alias of `addi [x], zero, imm` to generate pseudo
instruction li, which makes assembly mush more readable.
For existed tests, users can update them by running script
`llvm/utils/update_llc_test_checks.py`.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D112692
2021-11-22 14:01:37 +08:00
Philipp Tomsich af57a71d18 [RISCV] Don't call setHasMultipleConditionRegisters(), so icmp is sunk
On RISC-V, icmp is not sunk (as the following snippet shows) which
generates the following suboptimal branch pattern:
```
  core_list_find:
	lh	a2, 2(a1)
	seqz	a3, a0         <<
	bltz	a2, .LBB0_5
	bnez	a3, .LBB0_9    << should sink the seqz
        [...]
	j	.LBB0_9
  .LBB0_5:
	bnez	a3, .LBB0_9    << should sink the seqz
	lh	a1, 0(a1)
        [...]
```
due to an icmp not being sunk.

The blocks after `codegenprepare` look as follows:
```
  define dso_local %struct.list_head_s* @core_list_find(%struct.list_head_s* readonly %list, %struct.list_data_s* nocapture readonly %info) local_unnamed_addr #0 {
  entry:
    %idx = getelementptr inbounds %struct.list_data_s, %struct.list_data_s* %info, i64 0, i32 1
    %0 = load i16, i16* %idx, align 2, !tbaa !4
    %cmp = icmp sgt i16 %0, -1
    %tobool.not37 = icmp eq %struct.list_head_s* %list, null
    br i1 %cmp, label %while.cond.preheader, label %while.cond9.preheader

  while.cond9.preheader:                            ; preds = %entry
    br i1 %tobool.not37, label %return, label %land.rhs11.lr.ph
```
where the `%tobool.not37` is the result of the icmp that is not sunk.
Note that it is computed in the basic-block up until what becomes the
`bltz` instruction and the `bnez` is a basic-block of its own.

Compare this to what happens on AArch64 (where the icmp is correctly sunk):
```
  define dso_local %struct.list_head_s* @core_list_find(%struct.list_head_s* readonly %list, %struct.list_data_s* nocapture readonly %info) local_unnamed_addr #0 {
  entry:
    %idx = getelementptr inbounds %struct.list_data_s, %struct.list_data_s* %info, i64 0, i32 1
    %0 = load i16, i16* %idx, align 2, !tbaa !6
    %cmp = icmp sgt i16 %0, -1
    br i1 %cmp, label %while.cond.preheader, label %while.cond9.preheader

  while.cond9.preheader:                            ; preds = %entry
    %1 = icmp eq %struct.list_head_s* %list, null
    br i1 %1, label %return, label %land.rhs11.lr.ph
```

This is caused by sinkCmpExpression() being skipped, if multiple
condition registers are supported.

Given that the check for multiple condition registers affect only
sinkCmpExpression() and shouldNormalizeToSelectSequence(), this change
adjusts the RISC-V target as follows:
 * we no longer signal multiple condition registers (thus changing
   the behaviour of sinkCmpExpression() back to sinking the icmp)
 * we override shouldNormalizeToSelectSequence() to let always select
   the preferred normalisation strategy for our backend

With both changes, the test results remain unchanged.  Note that without
the target-specific override to shouldNormalizeToSelectSequence(), there
is worse code (more branches) generated for select-and.ll and select-or.ll.

The original test case changes as expected:
```
  core_list_find:
	lh	a2, 2(a1)
	bltz	a2, .LBB0_5
	beqz	a0, .LBB0_9    <<
        [...]
	j	.LBB0_9
.LBB0_5:
	beqz	a0, .LBB0_9    <<
	lh	a1, 0(a1)
        [...]
```

Differential Revision: https://reviews.llvm.org/D98932
2021-11-19 08:32:59 -08:00
Zi Xuan Wu 24d1673c8b [llvm-tblgen][RISCV] Make llvm-tblgen RISCVCompressInstEmitter to be common infra across different targets
Not only RISCV but also other target such as CSKY, there are compressed instructions mixed with normal instructions.
To reuse the basic infra to compress/uncompress and predict instruction, we need reconstruct the RISCVCompressInstEmitter
and make it more general and suitable for other target.

Differential Revision: https://reviews.llvm.org/D113475
2021-11-18 11:14:27 +08:00
Zarko Todorovski 5b8bbbecfa [NFC][llvm] Inclusive language: reword and remove uses of sanity in llvm/lib/Target
Reworded removed code comments that contain `sanity check` and `sanity
test`.
2021-11-17 21:59:00 -05:00
Craig Topper 0274be28d7 [RISCV] Lower vector CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF by converting to FP and extracting the exponent.
If we have a large enough floating point type that can exactly
represent the integer value, we can convert the value to FP and
use the exponent to calculate the leading/trailing zeros.

The exponent will contain log2 of the value plus the exponent bias.
We can then remove the bias and convert from log2 to leading/trailing
zeros.

This doesn't work for zero since the exponent of zero is zero so we
can only do this for CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF. If we need
a value for zero we can use a vmseq and a vmerge to handle it.

We need to be careful to make sure the floating point type is legal.
If it isn't we'll continue using the integer expansion. We could split the vector
and concatenate the results but that needs some additional work and evaluation.

Differential Revision: https://reviews.llvm.org/D111904
2021-11-17 10:29:41 -08:00
Jay Foad 3264e95938 [CodeGen] Update LiveIntervals in TargetInstrInfo::convertToThreeAddress
Delegate updating of LiveIntervals to each target's
convertToThreeAddress implementation, instead of repairing LiveIntervals
after the fact in TwoAddressInstruction::convertInstTo3Addr.

Differential Revision: https://reviews.llvm.org/D113493
2021-11-17 10:16:47 +00:00
jacquesguan 6405e8b584 [RISCV] Refactor some rvv instructions' definition with foreach.
Simplify rvv instructions that use eew in their mnemonic and encoding with foreach. And fix a scheduling bug.

Differential Revision: https://reviews.llvm.org/D113453
2021-11-16 15:20:45 +08:00
Craig Topper 391b0ba603 [RISCV] Override TargetLowering::hasAndNot for Zbb.
Differential Revision: https://reviews.llvm.org/D113937
2021-11-15 18:44:07 -08:00
Ben Shi 4c3d916c4b [RISCV] Optimize immediate materialisation with SH*ADD
Use LUI+SH*ADD+ADDI to compose specific immediates.

Reviewed By: craig.topper, luismarques

Differential Revision: https://reviews.llvm.org/D113568
2021-11-15 23:34:28 +00:00
Craig Topper f59307bfdc [RISCV] Teach needVSETVLIPHI to handle mask register instructions.
This handles the case where the mask register instruction input
comes from a Phi of vsetvlis. If the VLMAX is the same as the VLMAX
required by the mask register instruction, we can avoid a vsetvli.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113204
2021-11-15 09:57:28 -08:00
Craig Topper 02bed66cd5 [RISCV] Improve codegen for i32 udiv/urem by constant on RV64.
The division by constant optimization often produces constants that
are uimm32, but not simm32. These constants require 3 or 4 instructions
to materialize without Zba.

Since these instructions are often used by a multiply with a LHS
that needs to be zero extended with an AND, we can switch the MUL
to a MULHU by shifting both inputs left by 32. Once we shift the
constant left, the upper 32 bits no longer need to be 0 so constant
materialization is free to use LUI+ADDIW. This reduces the constant
materialization from 4 instructions to 3 in some cases while also
reducing the zero extend of the LHS from 2 shifts to 1.

Differential Revision: https://reviews.llvm.org/D113805
2021-11-12 14:49:10 -08:00
Craig Topper ee7a006ce4 [RISCV] Promote f16 ceil/floor/round/roundeven/nearbyint/rint/trunc intrinsics to f32 libcalls.
Previously these would crash. I don't think these can be generated
directly from C. Not sure if any optimizations can introduce them.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D113527
2021-11-11 08:28:41 -08:00
Craig Topper 4183522e80 [RISCV] Promote f16 frem with Zfh.
Add riscv64 coverage for f32 and f64 frem.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D113531
2021-11-10 17:35:07 -08:00
Craig Topper 9ee5cec688 [RISCV] Prevent bad legalizer behavior when bitcasting fixed vectors to i64 on RV32 with Zve32.
Similar to D113219, we need to make sure we don't create a vXi64
vector when it isn't legal. This fixes an error found by an
expensive checks build.
2021-11-10 11:58:49 -08:00
Craig Topper 57bc7b1089 [RISCV] Prevent crashes when bitcasting between fixed vectors and scalars.
Not all scalar element types are allowed in vectors so we may not
be able to bitcast to a 1 element vector to use insert/extract.

This will become a bigger issue when the Zve extensions are commited.
For now, I'm using the ELEN limit to limit the element types.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113219
2021-11-10 09:21:52 -08:00
Shao-Ce SUN 1c81941f19 [NFC][RISCV] Fix wrong predicates of vfwredsum 2021-11-09 17:19:50 +08:00
Craig Topper 376233113e [RISCV] Use TargetConstant for CSR number for READ_CSR/WRITE_CSR.
This is consistent with what we do for other operands that are required
to be constants.

I don't think this results in any real changes. The pattern match
code for isel treats ConstantSDNode and TargetConstantSDNode the same.
2021-11-08 15:10:24 -08:00
Craig Topper 304edbb553 [RISCV] SMUL_LOHI/UMUL_LOHI should expand for RVV.
These and MULHS/MULHU both default to Legal. Targets need to set
the ones they don't support to Expand.

I think MULHS/MULHU likely has priority in most places so this
change probably isn't directly testable. I found it while looking
at disabling MULHS/MULHU for nxvXi64 as required for Zve64x.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113325
2021-11-08 09:38:36 -08:00
Ben Shi e32cf690df [RISCV] Optimize (add (mul r, c0), c1)
Optimize (add (mul x, c0), c1) ->
         (add (mul (add x, c1/c0+1), c0), c1%c0-c0),
if c1/c0+1 and c1%c0-c0 are simm12, while c1 is not.

Optimize (add (mul x, c0), c1) ->
         (add (mul (add x, c1/c0-1), c0), c1%c0+c0),
if c1/c0-1 and c1%c0+c0 are simm12, while c1 is not.

Reviewed By: craig.topper, asb

Differential Revision: https://reviews.llvm.org/D111141
2021-11-08 02:58:25 +00:00
Bin Cheng 54d891a7d5 [RISCV]: Fix typo by abstracting VWholeLoad* classes
This patch abstracts VWholeLoad* classes into VWholeLoadN, simplifies
existing code as well as fixes a typo.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D109319
2021-11-06 10:48:03 +08:00
Bin Cheng d488f1fff2 [RISCV][NFC]: Refactor classes for load/store instructions of RVV
This patch refactors classes for load/store of V extension by:
- Introduce new class for VUnitStrideLoadFF and VUnitStrideSegmentLoadFF
  so that uses of L/SUMOP* are not spread around different places.
- Reorder classes for Unit-Stride load/store in line with table
  describing lumop/sumop in riscv-v-spec.pdf.

Reviewed By: HsiangKai, craig.topper

Differential Revision: https://reviews.llvm.org/D109318
2021-11-06 10:48:03 +08:00
Shao-Ce SUN 5c3d7184b4 [RISCV] Support Zfhmin extension
According to RISC-V Unprivileged ISA 15.6.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D111866
2021-11-06 01:41:02 +08:00
Zakk Chen 0649dfebba [RISCV] Rename some assembler mnemonic and intrinsic functions for RVV 1.0.
Rename vpopc/vmandnot/vmornot to vcpop/vmandn/vmorn assembler mnemonic.

Reviewed By: frasercrmck, jrtc27, craig.topper

Differential Revision: https://reviews.llvm.org/D111062
2021-11-04 10:08:01 -07:00
Craig Topper 5022ac0771 [RISCV] Use HasVInstructions and HasVInstructionsAnyF in more place in TableGen. NFC
Change RISCVSubtarget.hasVInstructionAnyF() to call hasVInstructionsF32
so that any changes to hasVInstructionsF32 are reflected.

The files were missed in D112496.
2021-11-03 14:32:45 -07:00
Fraser Cormack d065b03801 [RISCV] Optimize vp.load with an all-ones mask
Similar to D110206, this patch optimizes unmasked vp.load intrinsics to
avoid the need of a vmset instruction to set the mask. It does so by
selecting a riscv_vle intrinsic rather than a riscv_vle_mask intrinsic.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D113022
2021-11-02 17:23:39 +00:00
David Callahan 4ec1b8eeac [RISCV] Fix invalid kill on callee save
A callee save may be live (specifically X1) on entry and so a spill
should not mark it killed.

Differential Revision: https://reviews.llvm.org/D111285
2021-11-02 11:56:54 +00:00
Craig Topper ada5458521 [RISCV] Expand scalable vector bswap. Fix crash for bitreverse.
Fix LegalizeVectorOps to not try shuffle or unrolling expansions for
scalable vectors.

Differential Revision: https://reviews.llvm.org/D112236
2021-10-31 10:01:27 -07:00
Craig Topper aefcd59895 [RISCV] Teach RISCVInsertVSETVLI::needVSETVLI to handle mask register instructions better.
If the VL operand of a mask register instruction comes from an
explicit vsetvli with a different VTYPE, we can still avoid needing
a vsetvli as long as the SEW/LMUL ratio is the same and policy bits
match.

Differential Revision: https://reviews.llvm.org/D112762
2021-10-29 09:49:36 -07:00
Hsiangkai Wang 7051f73d69 [RISCV] Sync Zvlsseg register order as the same as vector registers.
Sync the order of Zvlsseg registers with vector registers to avoid
unnecessary register copies between vector instructions and zvlsseg
instructions.

Differential Revision: https://reviews.llvm.org/D110250
2021-10-28 13:34:53 +08:00
Hsiangkai Wang 0a9b82960c [RISCV] Use vmv.v.[v|i] if we know COPY is under the same vl and vtype.
If we know the source operand of COPY is defined by a vector instruction
with tail agnostic and the same LMUL and there is no vsetvli between
COPY and the define instruction to change the vl and vtype, we could use
vmv.v.v or vmv.v.i to copy vector registers to get better performance than
the whole vector register move instructions.

If the source of COPY is from vmv.v.i, we could use vmv.v.i for the
COPY.

This patch only considers all these instructions within one basic block.

Case 1:
```
bb.0:
  ...
  VSETVLI          # The first VSETVLI before COPY and VOP.
  ...              # Use this VSETVLI to check LMUL and tail agnostic.
  ...
  vy = VOP va, vb  # Define vy.
  ...              # There is no vsetvli between VOP and COPY.
  vx = COPY vy
```

Case 2:
```
bb.0:
  ...
  VSETVLI          # The first VSETVLI before VOP.
  ...              # Use this VSETVLI to check LMUL and tail agnostic.
  ...
  vy = VOP va, vb  # Define vy.
  ...              # There is no vsetvli to change vl between VOP and COPY.
  ...
  VSETVLI          # The first VSETVLI before COPY.
  ...              # This VSETVLI does not change vl and vtype.
  ...
  vx = COPY vy
```

Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Co-Authored-by: Kito Cheng <kito.cheng@sifive.com>

Differential Revision: https://reviews.llvm.org/D103510
2021-10-28 11:39:04 +08:00