Commit Graph

2266 Commits

Author SHA1 Message Date
Shao-Ce SUN 662b9fa02c [NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122557
2022-03-29 09:53:24 +08:00
Craig Topper 01203918d1 [RISCV] Add computeKnownBits support for RISCVISD::GORC.
Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D121575
2022-03-28 16:56:33 -07:00
Craig Topper e68257fcee [RISCV][SelectionDAG] Enable TargetLowering::hasBitTest for masks that fit in ANDI.
Modified DAGCombiner to pass the shift the bittest input and the shift amount
to hasBitTest. This matches the other call to hasBitTest in TargetLowering.h

This is an alternative to D122454.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D122458
2022-03-28 12:46:36 -07:00
Craig Topper cfe533da05 [RISCV] Add lowering for vp.fptosi and vp.sitofp.
This as an alternative version of D120641. Starting from the code here
https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/-/raw/EPI/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
but with some modifications to how the interim types are calculated,
and adding support for f16.

Still need to add fptosi for mask vectors.

Lots of masked isel patterns added so we can pass the mask through
the type changes.

Reviewed By: frasercrmck, arcbbb

Differential Revision: https://reviews.llvm.org/D122512
2022-03-28 11:06:41 -07:00
Kazu Hirata 6212871968 [Target] Apply clang-tidy fixes for readability-redundant-member-init (NFC) 2022-03-27 22:22:37 -07:00
Maksim Panchenko 4ae9745af1 [Disassember][NFCI] Use strong type for instruction decoder
All LLVM backends use MCDisassembler as a base class for their
instruction decoders. Use "const MCDisassembler *" for the decoder
instead of "const void *". Remove unnecessary static casts.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D122245
2022-03-25 18:53:59 -07:00
Dávid Bolvanský 9a738c477e [NFCI] Fix set-but-unused warning in RISCVAsmParser.cpp 2022-03-24 08:33:40 +01:00
jacquesguan 8910ac400c [RISCV] Add patterns for vector widening integer multiply
Add patterns for vector widening integer multiply instructions

Differential Revision: https://reviews.llvm.org/D117385
2022-03-24 15:26:08 +08:00
Vasileios Porpodas 39aa202aff Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit e6ead19b77.
2022-03-23 18:32:17 -07:00
Craig Topper 6c90a654bb [RISCV] Simplify some code in lowering vector int<->fp conversions. NFC
Don't call EltVT.getSizeInBits() or SrcEltVT.getSizeInBits() a second
time. They are already in EltSize or SrcEltSize variables.

Refactor some comparisons to use multiply instead of division.
2022-03-23 12:09:05 -07:00
Arthur Eubanks e6ead19b77 Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash."
This reverts commit 27bd8f9492.

Causes crashes, see comments in D121973
2022-03-23 10:57:45 -07:00
luxufan 5800fb41a6 [RISCV] Remove check and update test file in D121183
Differential Revision: https://reviews.llvm.org/D122290
2022-03-24 00:48:52 +08:00
luxufan 227496dc09 [RISCV] Generate correct ELF EFlags when .ll file has target-abi attribute
In the past, when construct RISCVAsmBackend, MCTargetOptions.ABIName would be passed and stored in RISCVAsmBackend.
But MCTargetOptions.ABIName can only be specified by -target-abi xxx in command line, if the .ll file has target-abi attribute, the codegen module will ignore it. And the generated object file would have incorrect EFlags value.

https://github.com/llvm/llvm-project/issues/50591 also caused by this problem.

This patch override the AsmPrinter::emitFunctionEntryLabel function and use it to set the target abi value that get from .ll file's target-abi attribute. And storing the target-abi in RISCVTargetStreamer instead of RISCVAsmBackend.

Differential Revision: https://reviews.llvm.org/D121183
2022-03-24 00:48:52 +08:00
Vasileios Porpodas 27bd8f9492 Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit f7d7d2a08d.
2022-03-22 16:41:55 -07:00
Arthur Eubanks f7d7d2a08d Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads.""
This reverts commit 79613185d3.

Causes crashes, see comments in https://reviews.llvm.org/D121973.
2022-03-22 13:33:49 -07:00
Craig Topper 51940d69cb [RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32.
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.

We already special cased sign extended constants, this extends it
to any sign extended value.

I've only added tests for one case of vadd. Most intrinsics go
through the same check.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D122186
2022-03-22 10:29:06 -07:00
Craig Topper 9b0f227d7b [TableGen][RISCV] Add InstAliases with zero_reg to cover unmasked vnot.v, vncvt.x.x.w, vneg.v, etc.
The mask being NoRegister prevented the existing aliases from matching
since NoRegister isn't in the VMV0 register class.

To workaround this I've added new aliases that look for zero_reg.
I had to motify tablegen to generate matching code for zero_reg.
And as a consequence, I had to change the EmitPriority for an ARM
alias that used zero_reg that started printing.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D121496
2022-03-22 10:14:43 -07:00
Zakk Chen 10fd2822b7 [RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR
intrinsics.

Those operations are updated under a tail agnostic policy, but they
could have mask agnostic or undisturbed.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120228
2022-03-22 07:47:21 -07:00
Zakk Chen 9ab18cc535 [RISCV] Add policy operand for masked vid and viota IR intrinsics.
Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120227
2022-03-22 02:32:31 -07:00
Zakk Chen abb5a985e9 [RISCV] Support mask policy for RVV IR intrinsics.
Add the UsesMaskPolicy flag to indicate the operations result
would be effected by the mask policy. (ex. mask operations).

It means RISCVInsertVSETVLI should decide the mask policy according
by mask policy operand or passthru operand.
If UsesMaskPolicy is false (ex. unmasked, store, and reduction operations),
the mask policy could be either mask undisturbed or agnostic.
Currently, RISCVInsertVSETVLI sets UsesMaskPolicy operations default to
MA, otherwise to MU to keep the current mask policy would not be changed
for unmasked operations.

Add masked-tama, masked-tamu, masked-tuma and masked-tumu test cases.
I didn't add all operations because most of implementations are using
the same pseudo multiclass. Some tests maybe be duplicated in different
tests. (ex. masked vmacc with tumu shows in vmacc-rv32.ll and masked-tumu)
I think having different tests only for policy would make the testing
clear.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120226
2022-03-22 01:19:16 -07:00
Yeting Kuo ecd7a0132a [RISCV] Add basic cost model for vector casting
To perform the cost model of vector casting, the patch consider most vector
casts as their scalar form and consider those vector form of free scalr castings
as 1.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121771
2022-03-22 14:17:08 +08:00
Vasileios Porpodas 79613185d3 Recommit "[SLP] Fix lookahead operand reordering for splat loads."
Original review: https://reviews.llvm.org/D121354

The original commit 9136145eb0 broke the build on several targets.

Differential Revision: https://reviews.llvm.org/D121973
2022-03-21 15:57:32 -07:00
Craig Topper cc5b0868ff Revert "[RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32."
This reverts commit 8c4937b33f.

Committed by mistake.
2022-03-21 14:58:11 -07:00
Craig Topper d4aeb5000f [RISCV] Simplify some code. NFC 2022-03-21 14:50:56 -07:00
Craig Topper 19de2e8db6 [RISCV] Remove stray slash from comment. NFC 2022-03-21 14:50:56 -07:00
Craig Topper 8c4937b33f [RISCV] Special case sign extended scalars when type legalizing nxvXi64 .vx instrinsics on RV32.
On RV32, we need to type legalize i64 scalar arguments to intrinsics.
We usually do this by splatting the value into a vector separately.
If the scalar happens to be sign extended, we can continue using a .vx
intrinsic.

We already special cased sign extended constants, this extends it
to any sign extended value.

I've only added tests for one case of vadd. Most intrinsics go
through the same check. I can add more tests if we're concerned.

Differential Revision: https://reviews.llvm.org/D122186
2022-03-21 14:50:55 -07:00
Mohammed Nurul Hoque 7afa44f5f5 [RISCV] Add more sign-extending ops to MIR sext.w pass.
This patch adds single-bit and bit-counting ops to list of sign-extending ops.

A single-bit write propagates sign-extendedness if it's not in the sign-bits.

Bit extraction and bit counting always outputs a small number, so sign-extended.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121152
2022-03-18 18:21:17 +08:00
Jessica Clarke 63ea7797dd [RISCV] Fix buildbot breakage by explicitly instantiating templates
RISCVISelDAGToDAG's selectImm uses RISCVTargetLowering::getAddr
(specifically the ConstantPoolSDNode) as of 41454ab256 ("[RISCV] Use
constant pool for large integers"), but nothing explicitly instantiates
any of the templates, the only reason they exist is because of the
various lowering methods in RISCVISelLowering.cpp that themselves use
the methods. However, with inlining, those can end up not existing as
real functions and thus not be exported, leading to link errors. Up
until now this hasn't happened, but for whatever reason D121654 has
triggered this on the sanitizer-ppc64be-linux buildbot, giving:

  ../../../../lib/libLLVMRISCVCodeGen.a(RISCVISelDAGToDAG.cpp.o): In function `selectImm(llvm::SelectionDAG*, llvm::SDLoc const&, llvm::MVT, long, llvm::RISCVSubtarget const&)':
  RISCVISelDAGToDAG.cpp:(.text._ZL9selectImmPN4llvm12SelectionDAGERKNS_5SDLocENS_3MVTElRKNS_14RISCVSubtargetE+0x3d8): undefined reference to `llvm::SDValue llvm::RISCVTargetLowering::getAddr<llvm::ConstantPoolSDNode>(llvm::ConstantPoolSDNode*, llvm::SelectionDAG&, bool) const'
  collect2: error: ld returned 1 exit status

Fix this by explicitly instantiating getAddr in its four different forms
so separate translation units can reliably use it.

Fixes: 41454ab256 ("[RISCV] Use constant pool for large integers")
2022-03-18 02:22:17 +00:00
Craig Topper bbd2ecf9f0 [RISCV] Add +experimental-zvfh extension to cover half types in vectors.
Currently we allow half types in vectors if the scalar Zfh extension
is enabled. This behavior is not inline with the vector spec. For f32
and f64 types, the Zve32f, Zve64f, Zve64d, and V explicitly control
the availablity of floating point types in vectors.

In order to make our compiler compliant, we either need to remove all support
for half in vectors or we need an extension to control it.

Draft spec here https://github.com/riscv/riscv-v-spec/pull/780

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D121345
2022-03-17 10:04:02 -07:00
Craig Topper 7e15303062 [RISCV] Simplify scalable vector case in lowerVectorMaskExt.
Since we have SPLAT_VECTOR_PARTS these days, I don't think we need
to go through extra lengths to avoid introducing an illegal scalar type.
We can just call getConstant using the scalable vector type and let
it create either a SPLAT_VECTOR or a SPLAT_VECTOR_PARTS.

Reviewed By: frasercrmck, rogfer01

Differential Revision: https://reviews.llvm.org/D121645
2022-03-17 09:43:13 -07:00
Lian Wang 214afc7116 [RISCV] Add patterns for vnsrl.wi and vnsra.wi instructions
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121675
2022-03-17 07:22:32 +00:00
Lian Wang b26abcad81 [RISCV][NFC] Replace redundant code with VLOpFrag
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121783
2022-03-17 02:05:21 +00:00
Craig Topper 2e10671ec7 [RISCV] Improve detection of when to skip (and (srl x, c2) c1) -> (srli (slli x, c3-c2), c3) isel.
We have a special case to skip this transform if c1 is 0xffffffff
and x is sext_inreg in order to use sraiw+zext.w. But we were only
checking that we have a sext_inreg opcode, not how many bits are
being sign extended.

This commit adds a check that it is a sext_inreg from i32 so we know for
sure that an sraiw can be created.
2022-03-16 14:54:34 -07:00
Jessica Clarke 659363c0cc [RISCV] Ensure PseudoLA* can be hoisted
Since we mark the pseudos as mayLoad but do not provide any MMOs,
isSafeToMove conservatively returns false, stopping MachineLICM from
hoisting the instructions. PseudoLA_TLS_GD does not actually expand to a
load, so stop marking that as mayLoad to allow it to be hoisted, and for
the others make sure to add MMOs during lowering to indicate they're GOT
loads and thus can be freely moved.

Fixes https://github.com/llvm/llvm-project/issues/54372

Reviewed By: MaskRay, arichardson

Differential Revision: https://reviews.llvm.org/D121654
2022-03-16 18:45:36 +00:00
Shengchen Kan 37b378386e [NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments 2022-03-16 20:25:42 +08:00
serge-sans-paille 989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Haocong.Lu 6a54776fe0 [RISCV] Select SRLI+SLLI for AND with leading ones mask
Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask.
It's useful in RV64 when the mask exceeds simm32 (cannot be generated by LUI).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121598
2022-03-16 02:10:57 +00:00
Craig Topper 06c5d74090 [RISCV] Remove lowerSPLAT_VECTOR
This code handles fixed vector SPLAT_VECTOR, but is never called in
any tests.

We only form fixed vector splat vectors for vXi64 on RV32 as part
of DAGCombine. This will be type legalized to SPLAT_VECTOR_PARTS.
So the Custom handling for SPLAT_VECTOR is never needed.

This patch makes SPLAT_VECTOR for vXi64 'Legal' on RV32 so that
DAGCombine will create it, but there's no need for Custom handler.
It will still be type legalized to SPLAT_VECTOR_PARTS.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D121673
2022-03-15 08:22:13 -07:00
Yeting Kuo ae7c6647f3 [RISCV] Add basic code modeling for fixed length vector reduction.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121447
2022-03-14 11:04:31 +08:00
Craig Topper eeb3bfd74a [RISCV] Merge ReplaceNodeResults code for SHFL and GREV/GORC. NFC 2022-03-13 18:42:26 -07:00
Lehua Ding 1648852c98 [RISCV][RVV] Fix vslide1up/down intrinsics overflow bug for SEW=64 on RV32
Reviewed By: craig.topper, kito-cheng

Differential Revision: https://reviews.llvm.org/D120899
2022-03-13 18:06:09 +08:00
Craig Topper fd4d584d6b [RISCV] Add DAGCombine to fold (bitreverse (bswap X)) to brev8 with Zbkb.
If the type is less than XLenVT, type legalization will turn this
into (srl (bitreverse (bswap (srl (bswap X), C))), C). We can't
completely recover from these shifts. They introduce zeros into
the upper bits of the result and we can't easily tell if they are
needed. By doing a DAG combine early, we avoid introducing these
shifts.
2022-03-12 16:39:39 -08:00
Craig Topper 43f668b98e [RISCV] Move GORCIW/GREVIW formation to isel patterns.
Type legalize narrow RISCVISD::GREV/GORC with constant to a larger
type without switching to W. Detect sext_inreg+gorci/grevi with a
uimm5 immediate during isel to emit GREVIW/GORCIW.

This allows us to better propagate known bits information through
extended bits after type legalization. It will also simplify a
change I'm considering for BREV8 with Zbkb.

A future patch will add computeKnownBits support for GORC.

A further improvement here would be to use hasAllWUsers and
doPeepholeSExtW like we do for SLLIW, but I don't think we have
the test coverage for that yet.
2022-03-11 18:02:47 -08:00
Craig Topper d0969e485c [RISCV] Optimize vfmv.s.f intrinsic with scalar 0.0 to vmv.s.x with x0.
We already do this for RISCVISD::VFMV_S_F_VL and the vfmv.v.f
intrinsic.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D121429
2022-03-11 10:05:43 -08:00
Craig Topper e9d4922543 [RISCV] Add tablegen helper classes to create PatFrag to check for one use. NFC
Reduces code and the class can be instantiated in isel patterns to
avoid creating more *_oneuse classes.
2022-03-10 23:14:21 -08:00
Craig Topper 337d49da84 [RISCV] Fix typo in comment. NFC 2022-03-10 22:00:18 -08:00
Eric Tang 336c92d5e8 [RISCV] Add alias for HFENCE.VVMA
Signed-off-by: Eric Tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D120878
2022-03-11 13:32:52 +08:00
Craig Topper 1f3a8d58a6 [RISCV] Use ZERO_EXTEND instead of ANY_EXTEND when promoting i32 RISCVISD::SHFL. NFC
We know the shift amount is a constant with bit 31 clear. anyext
of constant will be either zext or sext which will produce the
same result here. But we really shouldn't rely on that. It would
be valid to put a random number in the upper bits. Our isel patterns
expect the upper bits to be 0 so we should ask for it explicitly.
2022-03-10 20:57:04 -08:00
Craig Topper 9ce6b1ca86 [RISCV] Remove performANY_EXTENDCombine.
This doesn't appear to be needed any more. I did some inspecting
of the gcc torture suite and SPEC2006 with this removed and didn't
find any meaningful changes.

I think we're more aggressive about forming ADDIW now using
sign_extend_inreg during type legalization and hasAllWUsers in isel.
This probably helps catch the cases this helped with before.
2022-03-10 11:29:31 -08:00
Craig Topper e0e8edf823 [RISCV] Add isel patterns for masked RISCVISD::FMA_VL with RISCVISD::FNEG_VL.
This helps us form vfnmsub, vfnmadd, and vfmusb from masked VP
intrinsics.

I've used "srcvalue" for the mask parameter in the fneg nodes. We
can't match "V0" because that doesn't ensure the mask the is the same.
Instead it matches two different nodes and generates two copies to
V0 of those separate values.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D120287
2022-03-10 10:05:42 -08:00
Nico Weber a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeea.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille 7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Luke 0803dba7dd [RISCV] Add fixed-length vector instrinsics for segment load
Inspired by reviews.llvm.org/D107790.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119834
2022-03-10 16:23:40 +08:00
Craig Topper d53707508a [RISCV] Remove RISCVISD::VLE_VL/VSE_VL. Use intrinsics instead.
Similar to what we do for other loads/stores, use the intrinsic
version that we already have custom isel for.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D121166
2022-03-09 22:44:28 -08:00
Craig Topper edd6632127 [RISCV] Support 'generic' as a valid CPU name.
Most other targets support 'generic', but RISCV issues an error.
This can require a special case in tools that use LLVM that aren't
clang.

This patch treats "generic" the same as an empty string and remaps
it to generic-rv/rv64 based on the triple. Unfortunately, it has to
be added to RISCV.td because MCSubtargetInfo is constructed and
parses the CPU before RISCVSubtarget's constructor gets a chance
to remap it. The CPU will then reparsed and the state in the
MCSubtargetInfo subclass will be updated again.

Fixes PR54146.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D121149
2022-03-09 16:43:22 -08:00
Shao-Ce SUN 365c858a5d [RISCV] Share PatFprFpr classes for F, D, and Zfh
Inspired by D115469

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121066
2022-03-08 13:02:04 +08:00
jacquesguan e55b9b0d0a [RISCV] Add patterns for vector widening floating-point reduction instructions.
Add patterns for vector widening floating-point reduction instructions.

Differential Revision: https://reviews.llvm.org/D120390
2022-03-08 10:53:49 +08:00
Craig Topper 845bfcede1 [RISCV] Rename 'SplatOperand' to 'ScalarOperand'. NFC
vslide1up/down have this flag set, but the value isn't a splat.
Rename for clarity.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D121037
2022-03-07 11:28:32 -08:00
Zakk Chen 3be907621f [RISCV] Fix incorrect optimization for masked vmsgeu.vi with 0 immediate.
vmsgeu.vi with 0 is always true, but in the masked with mask undisturbed
policy, we still need to keep inactive elelemt which come from maskedoff.

We could return mask directly if it's mask agnostic policy in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121080
2022-03-06 19:22:35 -08:00
Benjamin Kramer fbce4a7803 Drop some more global std::maps. NFCI. 2022-03-06 13:28:29 +01:00
Craig Topper bd5f124716 [RISCV] Add SimplifyDemandedBits support for FSR/FSL/FSRW/FSLW. 2022-03-05 21:26:51 -08:00
Zakk Chen 33b61c5678 [RISCV] Fix incorrect codegen introduced by D119688.
We should not emit a tail agnostic vlse for a tail undisturbed vmv.s.x

In D119688:
-    if (IsScalarMove && !Node->getOperand(0).isUndef())
+    bool HasPassthruOperand = Node->getOpcode() != ISD::SPLAT_VECTOR;
+    if (HasPassthruOperand && !IsScalarMove &&
!Node->getOperand(0).isUndef())
       break;

The IsScalarMove check in the if statement had been changed.

Differential Revision: https://reviews.llvm.org/D120963
2022-03-05 06:10:26 -08:00
Craig Topper 1e569e3b7b [RISCV] Add CMOV isel pattern for (select (setgt X, -1), Y, Z)
setgt X, -1 is the canonical form of setge X, 0. We can swap the
select operands and use setlt X, X0 when selecting CMOV. This
avoid materializing the -1 in a register.
2022-03-04 22:35:13 -08:00
Craig Topper 232f57319d [RISCV] Move vslide1up/down intrinsics into lowerVectorIntrinsicSplats. NFC
Rename to lowerVectorIntrinsicScalars.

This allows us to share the code that checks if the scalar needs
to be type legalized.
2022-03-04 18:21:53 -08:00
Craig Topper 3d4e83f17d [RISCV] With Zbb, fold (sext_inreg (abs X)) -> (max X, (negw X))
With Zbb, abs is expanded to (max X, neg) by default. If X has 33 or
more sign bits, we can expand it a little early using negw instead of
neg to save a sext_inreg. If X started as a 32 bit value, type
legalization would have inserted a sext before the abs so X having
33 sign bits should always be true.

Note: I've used ISD::FREEZE here since we increase the number of uses.
Our default expansion for ABS doesn't do that, but I think that's a bug.

We can't do this with custom type legalization because ISD::FREEZE
doesn't propagate sign bits so later DAG combine won't expand be
able to see optmize it.

Alives2 https://alive2.llvm.org/ce/z/Gx3RNe

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D120597
2022-03-03 15:42:29 -08:00
Alex Tsao 89f15fc687 [RISCV] Add cost modelling for masked memory op
The patch adds very basic cost model for masked memory op on scalable vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117884
2022-03-03 20:47:58 +08:00
jacquesguan 44a430354d [RISCV] Fold store of vmv.f.s to a vse with VL=1.
This patch support the FP part of D109482.

Differential Revision: https://reviews.llvm.org/D120235
2022-03-03 16:35:19 +08:00
Craig Topper 6cb42cd666 [RISCV] More correctly ignore Zfinx register classes in getRegForInlineAsmConstraint.
Until Zfinx is supported in CodeGen we need to convert all Zfinx
register classes to GPR.

Remove the zfinx-types.ll test which didn't test anything meaningful
since -mattr=zfinx isn't implemented completely in llc.

Follow up to D93298.
2022-03-02 11:22:46 -08:00
Craig Topper a1f8349d77 [RISCV] Don't combine ROTR ((GREV x, 24), 16)->(GREV x, 8) on RV64.
This miscompile was introduced in D119527.

This was a special pattern for rotate+bswap on RV32. It doesn't
work for RV64 since the rotate needs to be half the bitwidth. The
equivalent pattern for RV64 is ROTR ((GREV x, 56), 32) so match
that instead.

This could be generalized further as noted in the new FIXME.

Reviewed By: Chenbing.Zheng

Differential Revision: https://reviews.llvm.org/D120686
2022-03-02 09:47:06 -08:00
Nikita Popov 98cfcae4e9 Revert "[RISCV] Add cost modelling for masked memory op"
This reverts commit 76f243b53b.

The newly added test fails.
2022-03-02 17:32:10 +01:00
Alex Tsao 76f243b53b [RISCV] Add cost modelling for masked memory op
The patch adds very basic cost model for masked memory op on scalable vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117884
2022-03-02 22:48:41 +08:00
Shao-Ce SUN 0e38b29543 [RISCV] add the MC layer support of Zfinx extension
This patch added the MC layer support of Zfinx extension.

Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D93298
2022-03-02 14:25:19 +08:00
Mircea Trofin cb2160760e [nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.

Differential Revision: https://reviews.llvm.org/D119876
2022-03-01 21:53:25 -08:00
Craig Topper b9d6e8c441 [RISCV] Lower VECTOR_SPLICE to RVV instructions.
This lowers VECTOR_SPLICE of scalable vectors to a slidedown follow by a slideup.
Fixed vectors are encouraged to use shufflevector instruction. The equivalent patch
for fixed vectors is D119039.

I've used a tail agnostic slidedown and limited the VL to only the
elements that will not be overwritten by the slideup. The slideup
uses VLMax for its VL. It unfortunately uses tail undisturbed policy
but it isn't required as there is no tail. We just need the merge
operand to carry the bits for the lower portion of the result.

Care was taken to ensure that either the slideup or slidedown will
be able to use a .vi instruction when the immediate is small. Which
one uses the immediate depends on the sign of the immediate.

Reviewed By: frasercrmck, ABataev

Differential Revision: https://reviews.llvm.org/D119303
2022-03-01 10:10:13 -08:00
Lian Wang db85cd729a [RISCV] Add FMV_W_X and FMV_H_X instrutions to hasAllNBitUsers
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120699
2022-03-01 08:13:59 +00:00
lian wang 5d91a8a707 [RISCV] Add schedule class for Zbp extension and Zbr extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120012
2022-03-01 07:35:59 +00:00
Lian Wang e2c150ab52 [RISCV][NFC] Move defined non_imm12 to proper place in RISCVInstrInfoZb.td
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120656
2022-03-01 01:45:30 +00:00
Craig Topper e83db8c001 [RISCV] Only enable combineROTR_ROTL_RORW_ROLW with Zbp.
I think the immediate values we check for on the GREV nodes already
protect this, but better to be explicit.
2022-02-28 12:47:36 -08:00
Craig Topper b083157b7b [RISCV] Don't call combineROTR_ROTL_RORW_ROLW for SLLW/SRLW/SRAW nodes. NFC
I think the function does the correct thing internally, but it's
confusing to read.
2022-02-28 11:05:10 -08:00
Craig Topper f46890711f [RISCV] Custom type legalize i32 ISD::ABS on RV64 without Zbb.
Default type legalization will create sext_inreg+abs, but we may
not be able to remove the sext_inreg.

Instead this patch expands abs during type legalization to
Y = sraiw X, 31; subw(xor X, Y), Y) which doesn't require the input
to be sign extended.

This gives a big improvement for some neg-abs tests where the
abs is used more than the the neg. Previously the abs was expanded
a different way before and after type legalization. Now they are
expanded in a similar way enabling more CSE.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D120636
2022-02-28 09:30:27 -08:00
eric.tang b496a172e4 [RISCV] Support hypervisor extention instructions
According to privileged spec version-20211203

    Add the following hypervisor instructions:
        - HLV.B HLV.BU
        - HLV.H HLV.HU HLVX.HU
        - HLV.W HLV.WU HLVX.WU
        - HLV.D
        - HSV.B HSV.H HSV.W HSV.D

Signed-off-by: eric.tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D117733
2022-02-28 14:02:43 +08:00
eric.tang 386c5be92a [RISCV] Support Sinval extension and hypervisor memory management fence instructions
According to Privileged spec version-20211203

    Add Supervisor Memory-Management Instructions:
        - SINVAL.VMA, SFENCE.W.INVAL, SFENCE.INVAL.IR
    Add Hypervisor Memory-Management Instructions:
        - HFENCE.VVMA, HFENCE.GVMA, HINVAL.VVMA, HINVAL.GVMA

Signed-off-by: eric.tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D117654
2022-02-28 14:02:43 +08:00
Eric Tang cf80ef1393 [RISCV] Change GPRMemAtomic to GPRMemZeroOffset for general usage
Not only some AMO instructions but also other instructions need to
    process (${gpr}) or 0(${gpr}), where the 0 is be silently ignored.

    This patch does some changes for general usage.

Signed-off-by: Eric Tang <eric.tang@starfivetech.com>

Differential Revision: https://reviews.llvm.org/D120017
2022-02-28 14:02:43 +08:00
Chenbing Zheng 7f811ce127 [RISCV] Optimize (sext.w, srli) to sraiw with Zba.
In this patch, we add a more narrower exclusion for
zeroext (srl x) -> srli (slli x), so that it provides an opportunity
for the selection of sraiw.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120467
2022-02-28 10:34:35 +08:00
Jessica Clarke 6aa8521fdb [RISCV] Fix parseBareSymbol to not double-parse top-level operators
By failing to lex the token we end up both parsing it as a binary
operator ourselves and parsing it as a unary operator when calling
parseExpression on the RHS. For plus this is harmless but for minus this
parses "foo - 4" as "foo - -4", effectively treating a top-level minus
as a plus.

Fixes https://github.com/llvm/llvm-project/issues/54105

Reviewed By: asb, MaskRay

Differential Revision: https://reviews.llvm.org/D120635
2022-02-27 20:48:52 +00:00
Jameson Nash c4b1a63a1b mark getTargetTransformInfo and getTargetIRAnalysis as const
Seems like this can be const, since Passes shouldn't modify it.

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D120518
2022-02-25 14:30:44 -05:00
Haocong.Lu 865fe131f8 [RISCV] Fix a mistake in PostprocessISelDAG
With the condition N->use_empty(), the root node of DAG always
misses peephole optimization. So a dummy node is needed.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119934
2022-02-25 12:38:31 +00:00
Chenbing Zheng b20e80aa59 [RISCV] DAG Combine vcpop and vfirst with VL=0 to li imm
vcpop and vfirst are still useful when VL=0.
vcpop equivalents to li 0 and vfirst equivalents to li -1,
since no mask elements are active.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120302
2022-02-25 14:44:25 +08:00
Zakk Chen 4e115b7d88 [RISCV] Update computeTargetABI from llc as well as clang
Clang computes the default ABI if -mabi is empty
and encode it in LLVM IR module flag since D105555.
For correctness, llc need to give the same target-abi
(Options.MCOptions.ABIName) with ABI encoded in IR.
The getSubtargetImpl already has a check for them only if
Options.MCOptions.ABIName is not empty.

In order to get more robustness we could have a check for
explicit ABI, but now we have two different logic to
compute the default ABI.

The front-end ABI is defautl to the ilp32/ilp32e/lp64, and
ilp32d/lp64d when hardware support for extension D.
The backend ABI is default to the ilp32/ilp32e/lp64.

Reviewed by: asb, jrtc27

Differential Revision: https://reviews.llvm.org/D118333
2022-02-24 21:55:44 -08:00
lian wang f37d21ed20 [RISCV] Add schedule class for Zbt extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119808
2022-02-25 01:57:20 +00:00
Qihan Cai 0d058ed3d6 [RISCV] Change rvv version to 1.0 and remove ratify notice
This patch changes the version of V extension from 0.1 to 1.0 in RISCVInstrInfoVPseudos.td, RISCVInstrInfoVSDPatterns.td,  RISCVInstrInfoVVLPatterns.td, RISCVInstrInfoV.td

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120525
2022-02-25 11:38:20 +11:00
Craig Topper 506ac29632 [RISCV] Add 'i64' to some isel so tablegen will remove them for RV32. NFC
Saves a 100 bytes or so from the isel table.
2022-02-24 15:10:05 -08:00
Craig Topper a975ca97c3 [RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X).
Add a new ISD opcode to represent the sign extending behavior of
vmv.x.h. Keep the previous anyext opcode to allow the existing
(fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing
to generate a sign extend.

For fmv.x.w we are able to match the sext_inreg in an isel pattern,
but a 16-bit sext_inreg is lowered to a shift pair before isel. This
seemed like a larger match than we should do in isel.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118974
2022-02-24 09:19:01 -08:00
Shao-Ce SUN 78b5f0fb05 [NFC][RISCV] Reuse ISD::NodeType in float extension
Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D120412
2022-02-24 19:57:55 +08:00
Nikita Popov c7fe6f9c92 Revert "[RISCV] add the MC layer support of Zfinx extension"
This reverts commit 7798ecca9c.

As reported in https://reviews.llvm.org/D93298#3331641 and
following, this causes assertion failures with inline assembly.
2022-02-24 12:14:31 +01:00
lian wang e1d4d1c242 [RISCV] Add schedule class for Zbm and Zbe extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119805
2022-02-24 08:49:25 +00:00
Chenbing.Zheng 2ae92e19eb [RISCV][NFC] Add helper function isVectorConfigInstr to reduce Repeated code.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119924
2022-02-24 05:59:12 +00:00
Craig Topper 5b7ac107b1 [RISCV] Use SelectionDAG::getFreeze to simplify some code. NFC 2022-02-23 21:13:01 -08:00
Craig Topper c7d6448d03 [DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable.
Internally to DAGCombiner the SDValues were passed by non-const
reference despite not being modified. They were then passed by
const reference to TLI.

This patch passes them by value which is consistent with the vast
majority of code.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120420
2022-02-23 12:40:45 -08:00
Alex Bradbury c5bcfb983e [RISCV] Avoid infinite loop between DAGCombiner::visitMUL and RISCVISelLowering::transformAddImmMulImm
See https://github.com/llvm/llvm-project/issues/53831 for a full discussion.

The basic issue is that DAGCombiner::visitMUL and
RISCVISelLowering;:transformAddImmMullImm get stuck in a loop, as the
current checks in transformAddImmMulImm aren't sufficient to avoid all
cases where DAGCombiner::isMulAddWithConstProfitable might trigger a
transformation. This patch makes transformAddImmMulImm bail out if C0
(the constant used for multiplication) has more than one use.

Differential Revision: https://reviews.llvm.org/D120332
2022-02-23 11:05:46 +00:00
jacquesguan 5acd9c49a8 [RISCV] Add patterns for vector widening integer reduction instructions
Add patterns for vector widening integer reduction instructions.

Differential Revision: https://reviews.llvm.org/D117643
2022-02-22 14:14:05 +08:00
Zakk Chen f7dfc5d1af [RISCV] Optimize tail agnostic vmv.s.x which don't need to select tail value.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120250
2022-02-21 14:53:37 -08:00
Craig Topper 90d240553d [RISCV] Teach shouldSinkOperands to sink splat operands of vp.fma intrinsics.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D120167
2022-02-21 11:52:59 -08:00
Lian Wang 4abe484525 [RISCV][NFC] Add sched for some instructions in Zb extension
Add sched to brev8, zip and unzip instruction.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D120009
2022-02-21 09:58:08 +08:00
Zakk Chen 17d5ba5bc7 [RISCV][NFC] Remove unused multiclass def. 2022-02-18 23:58:56 -08:00
Craig Topper 5489969550 [RISCV] Add IsRV32 to the isel pattern for ZIP_RV32/UNZIP_RV32. NFC
I think the i32 in the pattern prevents this from matching on RV64,
but using IsRV32 is safer.

Add tests for RV64 to make sure we don't print zip or unzip
because we incorrectly picked ZIP_RV32/UNZIP_RV32.
2022-02-18 22:38:14 -08:00
Zakk Chen ca78312407 [RISCV] Add the policy operand for nomask vector Multiply-Add IR intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.

The nomask vector Multiply-Add need a policy operand
because merge value could not be undef.

Reviewed By: monkchiang

Differential Revision: https://reviews.llvm.org/D119727
2022-02-17 09:12:46 -08:00
Craig Topper bbee9e77f3 [RISCV] Match shufflevector corresponding to slideup.
This generalizes isElementRotate to work when there's only a single
slide needed. I've removed matchShuffleAsSlideDown which is now
redundant.

Reviewed By: frasercrmck, khchen

Differential Revision: https://reviews.llvm.org/D119759
2022-02-17 08:19:10 -08:00
Craig Topper 954fe404ab [RISCV] Fix incorrect MemOperand copy converting splat+load to vlse.
Due to an incorrect copy/paste from load intrinsic handling we
checked if the splat node was a MemSDNode which of course it isn't.

Instead get the MemOperand from the LoadSDNode for the source of
the splat.

This enables LICM to see the load is loop invariant and hoist it
out of the loop.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D120014
2022-02-17 08:15:50 -08:00
Zakk Chen eeb7754f68 [RISCV] Add the passthru operand for vmv.vv/vmv.vx/vfmv.vf IR intrinsics.
Add the passthru operand for
VMV_V_X_VL, VFMV_V_F_VL and SPLAT_VECTOR_SPLIT_I64_VL also.

The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D119688
2022-02-17 06:38:14 -08:00
Shao-Ce SUN 7798ecca9c [RISCV] add the MC layer support of Zfinx extension
This patch added the MC layer support of Zfinx extension.

Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D93298
2022-02-17 21:54:13 +08:00
Zakk Chen 093ecccdab [RISCV] Add the passthru operand for vadc/vsbc/vmerge/vfmerge IR intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D119686
2022-02-17 02:21:39 -08:00
Ben Shi 0b93e90971 Revert "[RISCV] LUI used for address computation should not isAsCheapAsAMove"
This reverts commit 23a5073600.

Although this patch achieved better codegen in most cases, it is really
important to accurately describe the cost of instructions. So I revert it.
2022-02-17 17:27:37 +08:00
Jessica Paquette 67ab4c010b [MachineOutliner] NFC: Update LRU stuff for RISCV
I missed it in my grep. Fixes broken buildbot.`
2022-02-16 12:01:59 -08:00
Jessica Paquette 6d58f4ab07 [MachineOutliner] NFC: Hide LRU-related stuff behind helper functions
It's not particularly user-friendly to have to call `initLRU` everywhere. Also,
it wasn't particularly great that the LRU for registers used in a sequence was
also initialized by `initLRU`.

This patch hides this stuff behind some helper functions:

* `isAvailableAcrossAndOutOfSeq`
* `isAnyUnavailableAcrossOrOutOfSeq`
* `isAvailableInsideSeq`

This allows the user to avoid calling `initLRU` explicitly. Also, it allows
us to separate initializing the used-in-sequence LRU from the main LRU.

Since both ARM and AArch64 check LR liveness in `insertOutlinedCall`, this
refactor requires that we de-const the Candidate there.

Some other quality-of-code improvements:

* LRUs in outliner::Candidate now have more descriptive names
* Use `Register` instead of `unsigned` in some places
* Improve readability in some places by using ranges rather than `std::for_each`

This is a preparatory commit for a larger compile time related change for the
AArch64 outliner.
2022-02-16 11:39:07 -08:00
Craig Topper cfbbcc544c [RISCV] Improve lowering of SHL_PARTS/SRL_PARTS/SRA_PARTS.
Part of the shift lowering creates a (sub XLEN-1, ShAmt). When this
value is used we know that ShAmt is [0..XLEN-1]. Since XLEN is a power
of 2 we can replace the sub with an xor. This allows us to use XORI
instead of LI+SUB.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D119411
2022-02-16 09:22:11 -08:00
Zakk Chen e8973dd389 [RISCV] Add the passthru operand for some RVV nomask unary and nullary intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

My plan is to handle more complex operations in follow-up patches.

Reviewers: frasercrmck

Differential Revision: https://reviews.llvm.org/D118253
2022-02-15 22:34:06 -08:00
Shao-Ce SUN 2aed07e96c [NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`
Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846
2022-02-16 13:10:09 +08:00
Shao-Ce SUN 9cc49c1951 Revert "[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`"
This reverts commit fe25c06cc5.
2022-02-16 11:57:49 +08:00
Shao-Ce SUN fe25c06cc5 [NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`
For ten years, it seems that `MCRegisterInfo` is not used by any target.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846
2022-02-16 11:47:17 +08:00
Zakk Chen b784719904 [RISCV] Add the passthru operand for RVV nomask binary intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

Add passthru operand for VSLIDE1UP_VL and VSLIDE1DOWN_VL to support
i64 scalar in rv32.

The masked VSLIDE1 would only emit mask undisturbed policy regardless
of giving mask agnostic policy until InsertVSETVLI supports mask agnostic.

Reviewed by: craig.topper, rogfer01

Differential Revision: https://reviews.llvm.org/D117989
2022-02-15 18:36:18 -08:00
Craig Topper ab6e02dded [RISCV] Match vwmulsu_vx with scalar splat input.
This is a more generic version of D119110 that uses MaskedValueIsZero
to do the matching and SimplifyDemandedBits to remove any unneeded
AND instructions.

Tests were taken from D119110.

Reviewed By: Chenbing.Zheng

Differential Revision: https://reviews.llvm.org/D119622
2022-02-15 08:45:21 -08:00
Craig Topper d132b47bb9 [RISCV] Replace llvm_unreachable with report_fatal_error.
Parsing errors aren't handled earlier in all cases. A simple
example is llc -mtriple=riscv64 -mattr=+zve32f. If F or Finx is
not also specified, this will hit a parse error.

Use a fatal_error so that the error is conveyed to the user.
2022-02-15 08:40:37 -08:00
jacquesguan bfb4c0c370 [RISCV] Recover the implication between Zve* extensions and the V extension.
This revision recover the implication between Zve* extensions and the V extension.

Differential Revision: https://reviews.llvm.org/D119210
2022-02-14 15:52:07 +08:00
Craig Topper 478c237e21 [RISCV] Fix incorrect extend type in vwmulsu combine.
While matching widening multiply, if we matched an extend from i8->i32,
i16->i64 or i8->i64, we need to reintroduce a narrower extend. If we're
matching a vwmulsu we need to use a sext for op0 and a zext for op1.

This bug exists in LLVM 14 and will need to be backported.

Differential Revision: https://reviews.llvm.org/D119618
2022-02-12 12:47:20 -08:00
Dimitry Andric 7af3d4ab3d Revert "[RISCV] Enable shrink wrap by default"
This reverts commit 5ebdb07e7e.

Enabling shrink wrap by default can cause assertions or crashes, and
these should first be investigated and fixed. For now, reverting the
change so it can be cherry-picked into 14.0.0 is the safest choice.
2022-02-12 19:04:12 +01:00
Haocong.Lu 23a5073600 [RISCV] LUI used for address computation should not isAsCheapAsAMove
A LUI instruction with flag RISCVII::MO_HI is usually used in conjunction
with ADDI, and jointly complete address computation. To bind the cost
evaluation of address computation, the LUI should not be regarded as a cheap
 move separately, which is consistent with ADDI.

In this test case, it improves the unroll-loop code that the rematerialization
of array's base address miss MachineCSE with Heuristics #1 at isProfitableToCSE.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D118216
2022-02-12 07:14:38 +00:00
Chenbing.Zheng 9e975e558b [RISCV][NFC] Move some combine patterns to DAG combine.
Move some combine patterns to DAG combine,and
it dealt with fixme left in RISCVInstrInfoZb.td.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119527
2022-02-12 02:52:21 +00:00
Craig Topper 541c9ba842 [RISCV] Insert VSETVLI at the end of a basic block if we didn't produce BlockInfo.Exit.
This is an alternative to D118667 that instead of fixing the store
to match phase 1, it tries to detect the mismatch with the expected
value at the end of the block. This inserts a vsetvli after the vse
to satisfy the requirement of the other basic block.

We still have serious design issues in the pass, that is going to
require some rethinking.

Differential Revision: https://reviews.llvm.org/D119518
2022-02-11 09:34:16 -08:00
Craig Topper f35ac872b8 Revert "[RISCV] Fix a vsetvli insertion bug involving loads/stores." and "[RISCC] Add missing words to comment. NFC"
This reverts commit f943c58cae.
and commit 7eb7810727.

This introduced a new bug that appears to be easier to hit.

Differential Revision: https://reviews.llvm.org/D119517
2022-02-11 09:34:16 -08:00
Zakk Chen d224be3b99 [RISCV] Add the policy operand for some masked RVV ternary IR intrinsics.
Masked reduction intrinsics are specical cases which don't need to have policy
operand. The mask only affects which elements are read. It doesn't effect the
destination register.
The reduction intrinsics have a dedicated destination operand. If it
is undef, we use tail agnostic. If it not undef we use tail
undisturbed.

Co-Authored-by: Craig Topper <craig.topper@sifive.com>

Differential Revision: https://reviews.llvm.org/D117681
2022-02-11 05:02:03 -08:00
serge-sans-paille 06943537d9 Cleanup MCParser headers
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:

llvm/MC/MCParser/MCAsmParser.h no longer includes llvm/MC/MCParser/MCAsmLexer.h

Preprocessed lines to build llvm on my setup:
after:  1068185081
before: 1068324320

So no compile time benefit to expect, but we still get the looser coupling
between files which is great.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119359
2022-02-11 10:39:29 +01:00
Fangrui Song 8eb750189c [RISCV] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds 2022-02-10 20:10:12 -08:00
Craig Topper b0e77d5e48 [RISCV] Lower the shufflevector equivalent of vector.splice
We can lower a vector splice to a vslidedown and a vslideup.

The majority of the matching code here came from X86's code for matching
PALIGNR and VPALIGND/Q.

The slidedown and slideup lowering don't really require it to be concatenation,
but it happened to be an interesting pattern with existing analysis code I
could use.

This helps with cases where the scalar loop optimizer forwarded a load
result from a previous loop iteration. For example, this happens if the
loop uses x[i] and x[i+1] on the same iteration. The scalar optimizer
will forward x[i+1] load from the previous loop to satisfy x[i] on this
loop. When this get vectorized it results in one element of a vector
being forwarded from the previous loop to be concatenated with elements
loaded on this iteration.

Whether that's more efficient than doing a shifted loaded or reloading
the single scalar and using vslide1up is an interesting question.
But that's not something the backend can help with.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D119039
2022-02-10 09:39:35 -08:00
Craig Topper b861ddf365 [RISCV] Move the creation of VLMaxSentinel to isel. Use X0 during lowering.
The VLMaxSentinel is represented as TargetConstant, but that's included
in isa<ConstantSDNode>. To keep constant VLs and VLMax separate as long
as possible, use the X0 register during lowering and only convert to
VLMaxSentinel during isel.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118845
2022-02-10 09:28:44 -08:00
Craig Topper 727cd5205f [RISCV] Remove stale comment. NFC
Now that we pre-process SPLAT_VECTOR to VFMV_V_F_VL, these patterns
handled scalable vectors and vectors converted from fixed. These
are also used by vp.fma lowering.
2022-02-10 09:04:32 -08:00
Fraser Cormack fd43d99c93 [RISCV] Pre-process FP SPLAT_VECTOR to RISCVISD::VFMV_V_F_VL
This patch builds on top of D119197 to canonicalize floating-point
SPLAT_VECTOR as RISCVISD::VFMV_V_F_VL as a pre-process ISel step.

This primarily benefits scalable-vector VP code, where our VP patterns
only match VFMV_V_F_VL to reduce the burden on our ISel patterns, but
where at the same time, scalable-vector code doesn't custom-legalize
SPLAT_VECTOR.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117670
2022-02-10 09:56:00 +00:00
Chenbing.Zheng c5d3b231e0 [RISCV] Add support for matching vwmaccsu/vwmaccus from fixed vectors
Add pattern to match add and widening mul to vwmacc, and
two multipliers are sext and zext.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119314
2022-02-10 01:59:31 +00:00
Craig Topper c45c1b130b [RISCV] Teach RISCVDAGToDAGISel::selectShiftMask to replace sub from constant with neg.
If the shift amount is (sub C, X) where C is 0 modulo the size of
the shift, we can replace it with neg or negw.

Similar is is done for AArch64 and X86.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D119089
2022-02-09 12:33:01 -08:00
Craig Topper 09629215c2 [RISCV] Add a really basic cost model for SK_Splice.
While testing scalable vectors I found that if we generate a
vector splice intrinsic and run the code through the loop unroller,
we'll crash due to an invalid cost.

This adds a basic cost based on the 2 slide instructions used by the
lowering in D119303.

We probably need to factor LMUL into this, but that's true for
arithmetic instructions too. So I've ignored for the moment.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D119316
2022-02-09 11:43:31 -08:00
Craig Topper 279b3b8179 [RISCV][VP] Lower VP_FMA to RVV instructions.
We already had FMA_VL node, but we didn't have masked patterns.
I have not added the fneg variations. I'll do those after I add
llvm.vp.fneg.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D119196
2022-02-09 11:33:12 -08:00
Craig Topper 63e711549c [RISCV] Lower VP_FNEG to RVV instructions
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D119269
2022-02-09 10:56:39 -08:00
Craig Topper e305b1de7e [RISCV] Pre-process integer ISD::SPLAT_VECTOR to RISCISD::VMV_V_X_VL before isel.
This allows us to remove some isel patterns that exist for both
operations. Saving nearly 3000 bytes from the isel table.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D119197
2022-02-09 08:10:21 -08:00
Lian Wang af2cd94555 [RISCV][NFC] Remove useless code
Reviewed By: craig.topper, asb

Differential Revision: https://reviews.llvm.org/D119317
2022-02-09 19:17:25 +08:00
serge-sans-paille ef736a1c39 Cleanup LLVMMC headers
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:

llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h

Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after:  1049293745

Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
2022-02-09 11:09:17 +01:00
Fraser Cormack 6449bea508 [RISCV] Select unmasked RVV pseudos in a DAG post-process
This patch drops TableGen patterns matching all-ones masked RVV pseudos
in the case where there are fallback patterns matching the generic
masked forms to "_MASK" pseudos. This optimization is now performed with
a SelectionDAG post-processing step which peephole-optimizes these same
pseudos with all-ones masks and swaps them out to their unmasked
pseudos.

This cuts our generated ISel table down by around ~5% (~110kB) in lieu
of a far smaller auto-generated table to help with the peephole.

This only targets our custom RISCVISD::*_VL binary operator nodes, which
use the one form for both masked and unmasked variants. A similar
approach could be used for our intrinsics but we'd need to do some work,
e.g., to represent unmasked intrinsics as true-masked intrinsics at the
IR or ISel level. At a rough estimate, this could save us a further 9%
on the size of our ISel table for the binary intrinsic patterns alone.

There is no observable impact on our tests.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D118810
2022-02-09 07:50:15 +00:00
Zakk Chen cfe7f69036 [RISCV][NFC] Refactor RISCVISAInfo.
1. Remove computeDefaultABIFromArch and add computeDefaultABI in
RISCVISAInfo.
2. Add parseFeatureBits which may used in D118333.

Differential Revision: https://reviews.llvm.org/D119250
2022-02-08 18:37:43 -08:00
jacquesguan 5e71bbfb6c [RISCV] Add patterns for vector widening floating-point fused multiply-add instructions
Add patterns for vector widening floating-point fused multiply-add instructions.

Differential Revision: https://reviews.llvm.org/D117546
2022-02-09 10:34:39 +08:00
Fraser Cormack 62c4ac764b [RISCV] Optimize splats of extracted vector elements
This patch adds an optimization to splat-like operations where the
splatted value is extracted from a identically-sized vector. On RVV we
can splat that via vrgather.vx/vrgather.vi without dropping to scalar
beforehand.

We do have a similar VECTOR_SHUFFLE-specific optimization but that only
works on fixed-length vector types and for those with a constant splat
lane. This patch extends this optimization to make it work on
scalable-vector types and on unknown extract indices.

It is performed during fixed-vector BUILD_VECTOR lowering and during a
new DAGCombine on SPLAT_VECTOR for scalable vectors.

Reviewed By: craig.topper, khchen

Differential Revision: https://reviews.llvm.org/D118456
2022-02-08 10:35:25 +00:00
wangpc c53d99c37d [RISCV] Split f64 undef into two i32 undefs
So that no store instruction will be generated.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118222
2022-02-08 13:42:15 +08:00
Craig Topper 2c26cfdef7 [RISCV] Use splat_vector instead of SplatPat in widening FP instruction patterns. NFCI
We use splat_vector for FP nodes without VL, not SplatPat which handles
splat_vector and integer VMV_V_X_VL.

Reduces isel table size by a few hundred bytes.
2022-02-07 15:53:27 -08:00
Kazu Hirata 3a3cb929ab [llvm] Use = default (NFC) 2022-02-06 22:18:35 -08:00
Craig Topper c1cef111a3 Revert "[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X)."
This reverts commit 673d68cd92.

This hadn't been reviewed yet.
2022-02-05 12:51:01 -08:00
Craig Topper 673d68cd92 [RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X).
Add a new ISD opcode to represent the sign extending behavior of
vmv.x.h. Keep the previous anyext opcode to allow the existing
(fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing
to generate a sign extend.

For fmv.x.w we are able to match the sext_inreg in an isel pattern,
but a 16-bit sext_inreg is lowered to a shift pair before isel. This
seemed like a larger match than we should do in isel.

Differential Revision: https://reviews.llvm.org/D118974
2022-02-05 12:42:12 -08:00
Craig Topper 5f35009996 [RISCV] Remove a ComputeNumSignBits call from an isel special case.
Only isel (and (srl (sexti32 Y), c2), c1) -> (srliw (sraiw Y, 31), c3 - 32)
when there is a sext_inreg present. Don't both checking for Y
having 32 sign bits.
2022-02-04 23:26:53 -08:00
Craig Topper d752ea9a72 [RISCV] Remove exclusions for zext.h/zext.w from our (and (srl X, C1), C2) selection code.
This code tries to replace the pattern with a pair of shifts, but
we were excluding if the And could be a zext.h or zext.w. The SLLI/SRL
pair is more compressible and doesn't come with much down side.

We do regress one test case in rv64i-exhaustive-w-insts.ll but we
can probably add a narrower exclusion for that case.
2022-02-04 17:10:48 -08:00
Craig Topper 1d8bbe3d25 [RISCV] Implement a basic version of AArch64RedundantCopyElimination pass.
Using AArch64's original implementation for reference, this patch
implements a pass to remove unneeded copies of X0. This pass runs
after register allocation and looks to see if a register is implied
to be 0 by a branch in the predecessor basic block.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118160
2022-02-04 10:43:46 -08:00
Craig Topper 234e54bdd8 [RISCV] Add more types of shuffles isShuffleMaskLegal.
Add the vslidedown and interleave patterns that I recently implemented.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118952
2022-02-04 09:13:13 -08:00
Craig Topper c83905a308 [RISCV] Add inline expansion for vector fround.
This avoids a crash for scalable vectors and or scalarization for
fixed vectors.

The algorithm is different enough that I don't think it makes sense
to merge with ceil/floor/trunc. Algorithm is adapted from gcc's X86
SSE2 output.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117247
2022-02-04 09:12:09 -08:00
serge-sans-paille ffe8720aa0 Reduce dependencies on llvm/BinaryFormat/Dwarf.h
This header is very large (3M Lines once expended) and was included in location
where dwarf-specific information were not needed.

More specifically, this commit suppresses the dependencies on
llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and
llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used,
this has a decent impact on number of preprocessed lines generated during
compilation of LLVM, as showcased below.

This is achieved by moving some definitions back to the .cpp file, no
performance impact implied[0].

As a consequence of that patch, downstream user may need to manually some extra
files:

llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h
llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h

In some situations, codes maybe relying on the fact that
llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden
dependency now needs to be explicit.

$ clang++ -E  -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after:   10978519
before:  11245451

Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
[0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions

Differential Revision: https://reviews.llvm.org/D118781
2022-02-04 11:44:03 +01:00
Craig Topper 237eb37260 [RISCV] Add FMV_X_W and FMV_X_H to RISCVSExtWRemoval.
Add -target-abi to sextw-removal.ll RUN lines to show benefit on
new test case.
2022-02-03 09:40:47 -08:00
Craig Topper 997a86b99c [RISCV] Remove createVirtualRegister from RISCVInstrInfo::movImm.
Based on the discussion in D61884, this was done to enable compressed
instructions by giving freedom to pick a compressible register.

Integer materializing can generate LUI, ADDI, ADDIW, SLLI and some
Zb* instructions. C.LI, C.LUI, C.ADDI, C.ADDIW, and C.SLLI all have a 5-bit
register encoding. The Zb* instructions aren't compressible. Based on
that I don't think compressibility of the register is a concern.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118741
2022-02-03 08:34:26 -08:00
Craig Topper 2349fb0312 [RISCV] Remove RISCVISD::SPLAT_VECTOR_I64 in favor of RISCVISD::VMV_V_X_VL.
SPLAT_VECTOR_I64 has the same semantics as RISCVISD::VMV_V_X_VL, it
just assumed VLMax instead of carrying a VL operand.

Include order of RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td
has been swapped to avoid moving riscv_vmv_v_x_vl into
RISCVInstrInfoVSDPatterns.td and to allow moving other "_vl" SDNodes back to
RISCVInstrInfoVVLPatterns.td

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118841
2022-02-03 08:30:25 -08:00
Shao-Ce SUN 005fd8aa70 [RISCV] Add support for Zihintpause extention
Add support for the 'pause' hint instruction as an alias for
'fence w, 0'. To do this allow the 'fence' operands pred and succ
to be set to 0 (the empty set). This will also allow future hints
to be encoded as 'fence 0, <x>' and 'fence <x>, 0'.

This patch revised from @mundaym's D93019.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D117789
2022-02-03 20:55:47 +08:00
Craig Topper abc6716038 [RISCV] Remove unused variables. NFC 2022-02-02 19:23:16 -08:00
Craig Topper f1720abb54 [RISCV] Cleanup some places that assumed VLMaxSentinel and -1 constant mean the same thing. NFCI
VLMaxSentintel happens to be represented as -1 TargetConstant. A user
provided -1 would be an ISD::Constant. We shouldn't assume that they
are the same thing. I'm still not entirely convinced that we should be
treating -1 from the user as VLMAX.

Also fix one place that failed to use XLenVT for the VLMaxSentinel,
using MVT::i64 in code that only executes on RV32.
2022-02-02 12:23:12 -08:00
Craig Topper b73d151a11 [RISCV] Add DAG combines to transform ADD_VL/SUB_VL into widening add/sub.
This adds or reuses ISD opcodes for vadd.wv, vaddu.wv, vadd.vv, vaddu.vv
and a similar set for sub.

I've included support for narrowing scalar splats that have known
sign/zero bits similar to what was done for MUL_VL.

The conversion to vwadd.vv proceeds in two phases. First we'll form
a vwadd.wv by narrowing one of the operands. Then we'll visit the
vwadd.wv to try to narrow the other operand. This turned out to be
simpler than catching all the cases in one step. The forming of of
vwadd.wv can happen for either operand for add, but only the right
hand side for sub since sub isn't commutable.

An interesting quirk is that ADD_VL and VZEXT_VL/VSEXT_VL are formed
during vector op legalization, but VMV_V_X_VL isn't usually formed
until op legalization when BUILD_VECTORS are handled. This leads to
VWADD_W_VL forming in one DAG combine round, and then a later DAG combine
round sees the VMV_V_X_VL and needs to commute the operands to get the
splat in position. This alone necessitated a VWADD_W_VL combine function
which made forming vwadd.vv in two stages an easy choice.

I've left out trying hard to form vwadd.wx instructions for now. It would
only save an extend in the scalar domain which isn't as interesting.

Might need to review the test coverage a bit. Most of the vwadd.wv
instructions are coming from vXi64 tests on rv64. The tests were
copy pasted from the existing multiply tests.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D117954
2022-02-02 10:03:08 -08:00
Craig Topper 5a5037c602 [RISCV] Fix some 80 column violations in ComputeNumSignBitsForTargetNode. NFC 2022-02-01 21:43:11 -08:00
Craig Topper f943c58cae [RISCC] Add missing words to comment. NFC 2022-02-01 07:39:51 -08:00
Craig Topper 7eb7810727 [RISCV] Fix a vsetvli insertion bug involving loads/stores.
The first phase of the analysis can avoid a vsetvli if an earlier
instruction in the block used an SEW and LMUL that when combined with
the EEW of the load/store would produce the desired EMUL. If we
avoided a vsetvli this will affect the global analysis we do in the
second phase.

The third phase where we really insert the vsetvlis needs to agree
with the first phase. If it doesn't we can insert vsetvlis that
invalidate the global analysis.

In the test case there is a VSETVLI in the preheader that sets
SEW=64 and LMUL=1. Inside the loop there is a VADD with SEW=64 and LMUL=1.
This VADD is followed by a store that wants wants SEW=32 LMUL=1/2.
Because it has EEW=32 as part of the opcode the SEW=64 LMUL=1 from the
VADD can be become EMUL=1 for the store. So the first phase determines no
vsetvli is needed.

The third phase manages CurInfo differently than BBInfo.Change from the
first phase. CurInfo is only updated when we see a vsetvli or insert
a vsetvli. This was done to allow predecessor block information from
the global analysis to be applied to multiple instructions. Since the
loop body has no vsetvli we won't update CurInfo for either the VADD
or the VSE. This prevented us from checking the store vsetvli elision
for the VSE resulting in a vsetvli SEW=32 LMUL=1/2 being emitted which
invalidated the global analysis.

To mitigate this, I've added a BBLocalInfo variable that more closely
matches the first phase propagation. This gets updated based on the
VADD and prevents emitting a vsetvli for the store like we did in the
first phase.

I wonder if we should do an earlier phase to handle the load/store case
by adding more pseudo opcodes and changing the SEW/LMUL for those
instructions before the insertion analysis. That might be more robust
than trying to guarantee two phases make the same decision.

Fixes the test from D118629.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118667
2022-02-01 07:29:01 -08:00
Shao-Ce SUN a2a7fc7ea5 [RISCV] Adjust some comments. 2022-02-01 22:53:54 +08:00
Craig Topper 2e45e8abb1 [RISCV] Add a fatal error if ISD::VSCALE is used with Zvl32b.
We convert VLEN to vscale by dividing by RVVBitsPerBlock which is
currently 64. This is only correct if VLEN is evenly divisible by
64. With only Zvl32b we can't assume that.

This patch adds a fatal_error to prevent generating code that may
be broken.

We probably need to look at how we size stack frame objects too.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118583
2022-01-31 09:13:14 -08:00
Craig Topper 09606d6a63 [RISCV] Update the computeKnownBitsForTargetNode for RISCVISD::READ_VLENB to consider Zve/Zvl.
We had previously hardcoded this to assume that vector registers
are 128 bits. This was true when only V existed, but after Zve
extensions were added this became incorrect.

This patch adjusts it to support 128, 64, or 32 bit vectors depending
on Zvl. The 128-bit limit is artificial, but we don't have any test
coverage showing that we larger values so I was being conservative.

None of our lit tests depend on this code today due to the custom
lowering of ISD::VSCALE that inserts the appropriate left or right
shift to convert from VLENB to VSCALE. That code was added after
this code in computeKnownBitsForTargetNode.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118582
2022-01-31 09:13:14 -08:00
Craig Topper aae947e860 [RISCV] Separate the Zfhmin and Zfh extensions.
The spec doesn't seem to be written as if Zfh implies Zfhmin. They
seem to be separate extensions.

This patch moves the instructions from Zfhmin to be enabled with
either the Zfh or Zfhmin extensions.

Reviewed By: achieveartificialintelligence

Differential Revision: https://reviews.llvm.org/D118581
2022-01-31 09:06:43 -08:00
Nikita Popov 0801940c17 [RISCV] Avoid pointer element type access for masked atomicrmw intrinsics
masked.atomicrmw.*.i32 intrinsics access an i32 (and then possibly
mask it), so hardcode MVT::i32 as the access type here, rather than
determining it from the pointer element type.

Differential Revision: https://reviews.llvm.org/D118336
2022-01-31 09:28:39 +01:00
Craig Topper 5fbc3cda9e [RISCV] Use existing variable intead of calling getOperand again. NFCI
This is a slight change because I'm using the ANY_EXTEND result
instead of the original operand, but getNode should constant fold.

While there, add a comment about why the code specifically checks
for a ConstantSDNode.
2022-01-30 18:42:19 -08:00
Craig Topper 744be8c502 [RISCV] Lower riscv_zip/unzip intrinsic to RISCVISD::SHFL/UNSHFL.
These are special versions of the more general shfli/unshfli
instructions. We can use the general ISD opcodes with the correct
immediates.
2022-01-30 13:27:41 -08:00
Craig Topper e1075186a6 [RISCV] Custom lower brev8 intrinsic to RISCVISD::GREV.
We can use the RISCVISD::GREV encoding that swaps the bits in
each byte.  This allows it to use the existing computeKnownBits
support for RISCVISD::GREV.
2022-01-30 12:41:09 -08:00
Craig Topper 524545317c [RISCV] Remove RISCVISD::BREV8 and use RISCVISD::GREV instead.
We already have an ISD opcode for the more general GREV/GREVI
instructon. We can just use it with the encoding that corresponds
to the behavior of brev8. This is similar to what we do for orc.b
where we use the GORC ISD opcode.
2022-01-29 22:45:43 -08:00
Craig Topper 0405ac0150 [RISCV] Rerrange RISCVInstrInfoZB.td to better group related wthings. NFC
Especially placing W instructions/patterns near their non-W versions.
2022-01-29 21:16:15 -08:00
Craig Topper 815786eb67 [RISCV] Use RVBUnary to simplify ZEXT_H_RV32/ZEXT_H_RV64 definitions. NFC 2022-01-29 18:28:14 -08:00
Craig Topper 8faf2a0638 [RISCV] Correct predicate orc.b pattern to not include Zbkb.
This was incorrectly lumped in when the predicate was changed for
the rotate instructions.
2022-01-29 00:10:54 -08:00
Craig Topper d8f929a567 [RISCV] Custom legalize BITREVERSE with Zbkb.
With Zbkb, a bitreverse can be split into a rev8 and a brev8.

Reviewed By: VincentWu

Differential Revision: https://reviews.llvm.org/D118430
2022-01-28 23:11:12 -08:00
jacquesguan 1276678982 [RISCV] Improve extract_vector_elt for fixed mask registers.
Now the backend promotes mask vector to an i8 vector and extract element from that. We could bitcast to a widen element vector, and extract from it to GPR, then use I instruction to extract the certain bit.

Differential Revision: https://reviews.llvm.org/D117389
2022-01-29 11:07:53 +08:00
Craig Topper 06bd56d47d [RISCV] Update comments about getInstSizeInBytes hard-coding the number of bytes.
After D118175, we get the information from the tablegen definition.

Differential Revision: https://reviews.llvm.org/D118488
2022-01-28 09:51:49 -08:00
Craig Topper ea05ee9059 [RISCV] Preserve VL when truncating i64 gather/scatter indices on RV32.
We were creating a truncate with the default for the type, but for
VP intrinsics we have a VL that we should use.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118406
2022-01-28 09:25:30 -08:00
Craig Topper de0c2d75bf [RISCV] Use tablegen size for getInstSizeInBytes.
Fix the pseudos to have the correct size in the MCInstrDesc description.

Inspired by D118009 and D117970.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118175
2022-01-28 09:21:28 -08:00
Kito Cheng a9d5bb926d [RISCV] Use __extendhfsf2/__truncsfhf2 for fp16 <-> fp32
`__gnu_h2f_ieee` and `__gnu_f2h_ieee` are introduce by ARM and set that as
default name for fp16 and fp32 conversion in LLVM.

However RISC-V GCC using default naming scheme for that, which is
`__extendhfsf2` and `__truncsfhf2` for that, that cause runtime ABI
incompatible issue.

Although we didn't have formal runtime ABI spec to specify those naming
convention yet, but I think it would be great to fix the incompatible
issue first.

And I've plan to create a runtime ABI spec undere psABI spec this year.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118207
2022-01-29 00:01:00 +08:00
Alex Bradbury 588f121ada [RISCV][NFC] Make Zb* instruction naming match the convention used elsewhere in the RISC-V backend
Where the instruction mnemonic contains a dot, we name the corresponding
instruction in the .td file using a _ in the place of the dot. e.g. LR_W
rather than LRW. This commit updates RISCVInstrInfoZb.td to follow that
convention.
2022-01-28 15:20:37 +00:00
Chenbing.Zheng 6d6c44a3f3 [RISCV] Add support for matching vwmulsu from fixed vectors
According to riscv-v-spec-1.0, widening signed(vs2)-unsigned integer multiply
vwmulsu.vv vd, vs2, vs1, vm # vector-vector
vwmulsu.vx vd, vs2, rs1, vm # vector-scalar

It is worth noting that signed op is only for vs2.
For vwmulsu.vv, we can swap two ops, and don't care which is sign extension,
but for vwmulsu.vx signExt can not be a vector extended from scalar (rs1).
I specifically added two functions ending with _swap in the test case.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D118215
2022-01-28 02:33:30 +00:00
Craig Topper 70e1cc6792 [RISCV] Prefer vmslt.vx v0, v8, zero over vmsle.vi v0, v8, -1.
At least when starting from a vmslt.vx intrinsic or ISD::SETLT. We
don't handle the case where the user used vmsle.vx intrinsic with -1.
2022-01-27 11:48:27 -08:00
Fraser Cormack 84e85e025e [SelectionDAG][VP] Provide expansion for VP_MERGE
This patch adds support for expanding VP_MERGE through a sequence of
vector operations producing a full-length mask setting up the elements
past EVL/pivot to be false, combining this with the original mask, and
culminating in a full-length vector select.

This expansion should work for any data type, though the only use for
RVV is for boolean vectors, which themselves rely on an expansion for
the VSELECT.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D118058
2022-01-27 09:00:41 +00:00
Wu Xinlong 6a4d3f37b5 [RISCV] fix dead code
fix dead code mentioned on https://reviews.llvm.org/D98136

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D118323
2022-01-27 16:00:01 +08:00
Wu Xinlong 615d71d9a3 [RISCV][CodeGen] Implement IR Intrinsic support for K extension
This revision implements IR Intrinsic support for RISCV Scalar Crypto extension according to the specification of version [[ https://github.com/riscv/riscv-crypto/releases/tag/v1.0.0-scalar | 1.0]]
Co-author:@ksyx & @VincentWu & @lihongliang & @achieveartificialintelligence

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D102310
2022-01-27 15:53:35 +08:00
Craig Topper b3bec6e453 [RISCV] Use vnsrl.wx with x0 instead of vnsrl.vi for truncate.
This matches what the spec uses for the vncvt.x.x.w assembly
pseudoinstruction.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D118295
2022-01-26 18:38:13 -08:00
Craig Topper f487a76430 [RISCV] Add hasStdExtZbp() to hasAndNotCompare. 2022-01-26 13:54:05 -08:00
Craig Topper b3d94b199c [RISCV] Remove references to 'B' extension from AssemblerPredicate and SubtargetFeature strings.
For Zba/Zbb/Zbc/Zbs I've removed the 'B' completely and used the
extension names as presented at the start of Chapter 1 of the
1.0.0 Bitmanipulation spec.

For the unratified extensions, I've replaced 'B' with 'Zb' and
otherwise left them unchanged.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D117822
2022-01-26 11:08:29 -08:00
Benjamin Kramer f15014ff54 Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17"
This reverts commit ef82063207.

- It conflicts with the existing llvm::size in STLExtras, which will now
  never be called.
- Calling it without llvm:: breaks C++17 compat
2022-01-26 16:55:53 +01:00
serge-sans-paille ef82063207 Rename llvm::array_lengthof into llvm::size to match std::size from C++17
As a conquence move llvm::array_lengthof from STLExtras.h to
STLForwardCompat.h (which is included by STLExtras.h so no build
breakage expected).
2022-01-26 16:17:45 +01:00
jacquesguan 267711e38b [RISCV] Fix support of vlen = 64.
In the Zve* extensions, the vlen could be 64. This patch change the vlen constraint of low bound to 64.

Differential Revision: https://reviews.llvm.org/D118217
2022-01-26 16:31:21 +08:00
Zakk Chen 510710d037 [RISCV][NFC] Add getVLOperand for RVV intrinsics.
Use the VLOperand information to get the VL.

Differential Revision: https://reviews.llvm.org/D118156
2022-01-25 17:37:58 -08:00
Zakk Chen 9273378b85 [RISCV] Add the passthru operand for RVV nomask load intrinsics.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.

Co-Authored-by: Hsiangkai Wang <Hsiangkai@gmail.com>

Reviewers: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D117647
2022-01-25 17:31:36 -08:00
eopXD b089e4072a [RISCV] Don't allow i64 vector div by constant to use mulh with Zve64x
EEW=64 of mulh and its vairants requires V extension.

Authored by: Craig Topper <craig.topper@sifive.com> @craig.topper

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117947
2022-01-25 09:55:05 -08:00
Nikita Popov aa97bc116d [NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().

This is part of D117885, in preparation for deprecating the API.
2022-01-25 09:44:52 +01:00
Craig Topper fd0a4bc76b [RISCV] Add missing space to 'clang-format on' directive. NFC
Without a space after the comment characters it seems to be ignored.
2022-01-24 17:00:37 -08:00
Craig Topper cd2a9ff397 [RISCV] Select int_riscv_vsll with shift of 1 to vadd.vv.
Add might be faster than shift. We can't do this earlier without
using a Freeze instruction.

This is the intrinsic version of D106689.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D118013
2022-01-24 08:04:53 -08:00
Fraser Cormack d42678b453 [RISCV] Add side-effect-free vsetvli intrinsics
This patch introduces new intrinsics that enable the use of vsetvli in
contexts where only the returned vector length is of interest. The
pre-existing intrinsics are marked with side-effects, which prevents
even trivial optimizations on/across them.

These intrinsics are intended to be used in situations where the vector
length is fed in turn to RVV intrinsics or to vector-predication
intrinsics during loop vectorization, for example. Those codegen paths
ensure that instructions are generated with their own implicit vsetvli,
so the vector length and vtype can be relied upon to be correct.

No corresponding C builtins are planned at this stage, though that is a
possibility for the future if the need arises.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117910
2022-01-24 13:52:08 +00:00
SForeKeeper 70f83f3084 [RISCV] add support for zbkx subextension in MC layer.
This patch adds support for zbkx extension from K extension(v1.0.0) in MC layer.
Instructions with same functionality and same encoding is defined in the bitmanip extension.
It defines {Xperm8, Xperm4} as instruction aliases for xperm.* in Zbp extension. When Zbkx is enabled while Zbp is not, xperm.h will not be available. When Zbkx and Zbp are both enabled, the instructions will be decoded in Zbp format.

[[ https://reviews.llvm.org/D94999 | D94999 ]] this is the patch that introduces xperm.* instructions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117889
2022-01-24 20:38:46 +08:00
Fraser Cormack af773a1818 [RISCV][VP] Lower VP_MERGE to RVV instructions
This patch adds lowering of the llvm.vp.merge.* intrinsic
(ISD::VP_MERGE) to RVV vmerge/vfmerge instructions. It introduces a
special pseudo form of vmerge which allows a tied merge operand,
allowing us to specify the tail elements as being equal to the "on
false" operand, using a tied-def constraint and a "tail undisturbed"
policy.

While this strategy allows us to often lower the intrinsic to just one
instruction, it may be less efficient in fixed-vector types as the
number of tail elements may extend far beyond the length of the fixed
vector. Another strategy could be to use a vmerge/vfmerge instruction
with an AVL equal to the length of the vector type, and manipulate the
condition operand such that mask elements greater than the operation's
EVL are false.

I've also observed inefficient codegen in which our 'VF' patterns don't
match raw floating-point SPLAT_VECTORs, which occur in scalable-vector
code.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117561
2022-01-24 11:05:05 +00:00
Fraser Cormack e7926e8d97 [RISCV] Match VF variants for masked VFRDIV/VFRSUB
This patch follows up on D117697 to help the simple binary operations
behave similarly in the presence of masks.

It also enables CGP sinking support for vp.fdiv and vp.fsub intrinsics,
now that VFRDIV and VFRSUB are consistently matched with a LHS splat for
masked and unmasked variants.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117783
2022-01-24 10:59:43 +00:00
Chenbing.Zheng 9aaa74aeef [RISCV] Add patterns of SET[U]LT_VI for STECC forms
This patch optmizes "li a0, 5
                     vmsgt[u].vx v10, v8, a0"
                 -> "vmsgt[u].vi v10, v8, 5"

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D118014
2022-01-24 08:50:49 +00:00
jacquesguan ba16e3c31f [RISCV] Decouple Zve* extensions and the V extension.
According to the spec, there are some difference between V and Zve64d. For example, the vmulh integer multiply variants that return the high word of the product (vmulh.vv, vmulh.vx, vmulhu.vv, vmulhu.vx, vmulhsu.vv, vmulhsu.vx) are not included for EEW=64 in Zve64*, but V extension does support these instructions. So we should decouple Zve* extensions and the V extension.

Differential Revision: https://reviews.llvm.org/D117854
2022-01-24 14:55:21 +08:00
Wu Xinlong e29d8fb169 [RISCV] Initially support the K-extension instructions on the LLVM MC layer
This commit is currently implementing supports for scalar cryptography extension for LLVM according to version v1.0.0 of [K Ext specification](https://github.com/riscv/riscv-crypto/releases)(scala crypto has been ratified already). Currently, we are implementing the MC (Machine Code) layer of his extension and the majority of work is done under `llvm/lib/Target/RISCV` directory. There are also some test files in `llvm/test/MC/RISCV` directory.

Remove the subfeature of Zbk* which conflict with b extensions to reduce the size of the patch.
(Zbk* will be resubmit after this patch has been merged)

**Co-author:**@ksyx & @VincentWu & @lihongliang & @achieveartificialintelligence

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98136
2022-01-24 14:45:35 +08:00
Jim Lin 3f24cdec25 [RISCV][NFC] Remove tailing whitespaces in RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td 2022-01-24 10:49:43 +08:00
Craig Topper 413684313d [RISCV] Adjust the header comment in RISCVInstrInfoZb.td to better integrate Zbk* extensions.
The Zbk* extensions have some overlap with Zb so have been placed in this file.

Reviewed By: VincentWu

Differential Revision: https://reviews.llvm.org/D117958
2022-01-23 11:42:52 -08:00
eopXD 3cf15af2da [RISCV] Remove experimental prefix from rvv-related extensions.
Extensions affected: +v, +zve*, +zvl*

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117860
2022-01-22 20:18:40 -08:00
Craig Topper d44b6be6ea [RISCV] Don't Custom legalize f16/f32/f64 bitcasts if those types aren't Legal. 2022-01-22 11:55:18 -08:00
Alex Fan e796eaf2af [RISCV][RFC] add MC support for zbkc subextension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117874
2022-01-22 10:23:01 +08:00
Craig Topper 0379459fc5 [RISCV] Strengthen a SDTypeProfile. Fix formatting. 2022-01-21 13:01:53 -08:00
Craig Topper 48132bb1e4 [RISCV] Simplify interface to combineMUL_VLToVWMUL. NFC
Instead of passing the both the SDNode* and 2 of the operands
in two different orders, just pass the SDNode * and a bool to
indicate which operand order to test.

While there rename to combineMUL_VLToVWMUL_VL.
2022-01-21 11:43:06 -08:00
Craig Topper 11754a4dbb [RISCV] Use RVBUnary in more places to simplify some tablegen declarations. NFCI 2022-01-21 10:55:35 -08:00
Fraser Cormack 4d268dc94a [RISCV] Enable CGP to sink splat operands of VP intrinsics
This patch brings better splat-matching to our VP support, by sinking
splat operands of VP intrinsics back into the same block as the VP
operation. The list of VP intrinsics we are interested in matches that
of the regular instructions.

Some optimization is still lacking. For instance, our VL nodes aren't
recognized as commutative, so splats must be on the RHS. Because of
this, we limit our sinking of splats to just the RHS operand for now.
Improvement in this regard can come in another patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117703
2022-01-21 11:30:37 +00:00
wangpc 8def89b5dc [RISCV] Set CostPerUse to 1 iff RVC is enabled
After D86836, we can define multiple cost values for
different cost models. So here we set CostPerUse to
1 iff RVC is enabled to avoid potential impact on RA.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117741
2022-01-21 14:44:26 +08:00
Craig Topper 7b3d307288 [RISCV] Add isel patterns for grevi, shfli, and unshfli to brev8/zip/unzip instructions.
Zbkb supports some encodings of the general grevi, shfli, and
unshfli instructions legal, so we added separate instructions for
those encodings to improve the diagnostics for assembler and
disassembler. To be consistent we should always use these separate
instructions whenever those specific encodings of grevi/shfli/unshfli
occur. So this patch adds specific isel patterns to override the generic
isel patterns for these cases. Similar was done for rev8 and zext.h
for Zbb previously.
2022-01-20 20:43:52 -08:00
Wu Xinlong 7ee1c162cc [RISCV][RFC] add inst support of zbkb
This commit add instructions supports of `zbkb` which defined in scalar cryptography extension version v1.0.0 (has been ratified already).

Most of the zbkb directives reuse parts of the zbp and zbb directives, so this patch just modified some of the inst aliases and predicates.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117640
2022-01-21 11:49:36 +08:00
Hsiangkai Wang ad06e65dc4 [RISCV] Fix the bug in the register allocator caused by reserved BP.
Originally, hasRVVFrameObject() will scan all the stack objects to check
whether if there is any scalable vector object on the stack or not.
However, it causes errors in the register allocator. In issue 53016, it
returns false before RA because there is no RVV stack objects. After RA,
it returns true because there are spilling slots for RVV values during RA.
The compiler will not reserve BP during register allocation and generate BP
access in the PEI pass due to the inconsistent behavior of the function.

The function is changed to use hasStdExtV() as the return value. It is
not precise, but it can make the register allocation correct.

Refer to https://github.com/llvm/llvm-project/issues/53016.

Differential Revision: https://reviews.llvm.org/D117663
2022-01-21 01:23:01 +00:00
Craig Topper cfae2c65db [RISCV] Factor Zve32 support into RISCVSubtarget::getMaxELENForFixedLengthVectors.
This is needed to properly limit fractional LMULs for Zve32.

Add new RUN Zve32 RUN lines to the existing tests for the
-riscv-v-fixed-length-vector-elen-max command line option.
2022-01-20 16:31:12 -08:00
Craig Topper 5e88f527da [RISCV] Remove RISCVSubtarget::hasStdExtV() and hasStdExtZve*(). NFC
All code should use one of the cleaner named hasVInstructions*
functions. Fix the two uses that weren't and delete the methods
so no new uses can be created.
2022-01-20 15:05:09 -08:00
Craig Topper fa8bb22466 [RISCV] Optimize vector_shuffles that are interleaving the lowest elements of two vectors.
RISCV only has a unary shuffle that requires places indices in a
register. For interleaving two vectors this means we need at least
two vrgathers and a vmerge to do a shuffle of two vectors.

This patch teaches shuffle lowering to use a widening addu followed
by a widening vmaccu to implement the interleave. First we extract
the low half of both V1 and V2. Then we implement
(zext(V1) + zext(V2)) + (zext(V2) * zext(2^eltbits - 1)) which
simplifies to (zext(V1) + zext(V2) * 2^eltbits). This further
simplifies to (zext(V1) + zext(V2) << eltbits). Then we bitcast the
result back to the original type splitting the wide elements in half.

We can only do this if we have a type with wider elements available.
Because we're using extends we also have to be careful with fractional
lmuls. Floating point types are supported by bitcasting to/from integer.

The tests test a varied combination of LMULs split across VLEN>=128 and
VLEN>=512 tests. There a few tests with shuffle indices commuted as well
as tests for undef indices. There's one test for a vXi64/vXf64 vector which
we can't optimize, but verifies we don't crash.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D117743
2022-01-20 14:44:47 -08:00
Craig Topper dd7b69a61f [RISCV] Remove HadStdExtV and HasStdZve* Predicates from tablegen.
No instructions should be using these. Everything should use
HasVInstructions* Predicates. Remove them so that they can't be
used by accident.
2022-01-20 12:54:20 -08:00
Craig Topper 7a275dc354 [RISCV] Remove Zvlsseg extension.
This string no longer appears in the Vector Extension specification.
The segment load/store instructions are just part of the vector
instruction set.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D117724
2022-01-20 12:40:07 -08:00
Craig Topper 94e69fbb4f [RISCV] Add DAG combine to fold (fp_to_int_sat (ffloor X)) -> (select X == nan, 0, (fcvt X, rdn))
Similar for ceil, trunc, round, and roundeven. This allows us to use
static rounding modes to avoid a libcall.

This is similar to D116771, but for the saturating conversions.

This optimization is done for AArch64 as isel patterns.
RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven
so the operations don't stick around until isel to enable a pattern
match. Thus I've implemented a DAG combine.

I'm only handling saturating to i64 or i32. This could be extended
to other sizes in the future.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116864
2022-01-20 11:35:37 -08:00
Fraser Cormack ca36cc56ac [RISCV] Match RVV VF variants also through masked operations
This brings floating-point RVV vector/scalar support more in line with
the integer vector patterns, which can already match '.vx' instructions
with masked operations.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117697
2022-01-20 12:08:02 +00:00
Fraser Cormack 5a12024b95 [RISCV] Optimize lowering of floating-point -0.0
This idea has come up in several reviews -- D115978 and D105902 -- so I
can't take any credit for the idea. Instead of using a constant pool to
lower -0.0, we can emit a sequence of two instructions:

    fmv.[hwd].x freg, zero
    fsgnjn.[hsd] freg, freg, freg

This is only done when the floating-point type is legal.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117687
2022-01-20 11:46:28 +00:00
Chenbing.Zheng 0be3da1fab [RISCV] Add intrinsic for Zbt extension
RV32: fsl, fsr, fsri
RV64: fsl, fsr, fsri, fslw, fsrw, fsriw

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117468
2022-01-20 08:27:05 +00:00
eopXD 8eae99dfe5 [RISCV] Add the zve extension according to the v1.0 spec
`zve` is the new standard vector extension to specify varying degrees of
vector support for embedding processors. The `zve` extension is related
to the `zvl` extension and other updates that are added in v1.0.

According to https://github.com/riscv-non-isa/riscv-c-api-doc/pull/21,
Clang defines macro `__riscv_v_max_elen`,  `__riscv_v_max_elen_fp` for
`zve` and it can be used by applications that uses the vector extension.

Authored by: Zakk Chen <zakk.chen@sifive.com> @khchen
Co-Authored by: Eop Chen <eop.chen@sifive.com> @eopXD

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D112408
2022-01-19 23:48:28 -08:00
Mohammed Nurul Hoque 21c79be5d7 [RISCV] Add patterns to MIR sign-extension removal pass.
This patch adds a few instruction patterns that generate sign-extended values or propagate them, adding to the pass introduced in https://reviews.llvm.org/D116397

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117465
2022-01-19 17:33:58 -08:00
Luís Marques a767ae2c5c [RISCV] Fix incomplete asm statement parsing
For instructions without operands, the final `AsmToken::EndOfStatement`
wasn't being consumed. In the context of inline assembly, the resulting
empty statements would cause extraneous empty lines to be emitted. Fix
the issue by consuming the `EndOfStatement` token.

Differential Revision: https://reviews.llvm.org/D117565
2022-01-19 21:56:21 +00:00
Craig Topper 4060b81e76 [RISCV] Obey -riscv-v-fixed-length-vector-elen-max when lowering mask BUILD_VECTORs.
We may not be allowed to use vXiXLen vectors. Consult ELEN to
determine what is allowed. This will become even more important
when Zve32 is added.

Reviewed By: frasercrmck, arcbbb

Differential Revision: https://reviews.llvm.org/D117518
2022-01-19 10:47:37 -08:00
Jim Lin d6b0734837 [NFC] Use Register instead of unsigned 2022-01-19 20:17:04 +08:00
eopXD 9f27941c2f [RISCV] Add patterns for vector narrowing integer right shift instructions
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117454
2022-01-18 22:30:13 -08:00
Craig Topper 5a6c622afd [RISCV] Remove special case for constant shift amount in FSHL/FSHR lowering to FSL/FSR.
Remove fshl/fshr with constant shift amount isel patterns. Replace
with fsr/fsl with constant isel patterns.

This hack was trying to preserve as much optimization opportunity
for fshl/fshr by constant as possible, but the conversion to
RISCVISD::FSR/FSL happens so late it probably isn't worth much.

The new isel patterns are needed by D117468 anyway.
2022-01-18 11:47:50 -08:00
Craig Topper aa7fc02feb Recommit "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering."
This reverts the revert commit e328385739.

Accidental demanded bits change has been removed. The demanded bits
code itself was remove in a pre-commit since it isn't tested.

Original commit message:
Previous we used the fshl/fshr operand ordering for simplicity. This
made things confusing when D117468 proposed adding intrinsics for
the instructions. We can't just use the generic funnel shifting
intrinsics because fsl/fsr have different functionality that should
be exposed to software.

Now we use rs1, rs3, rs2/shamt order which matches the instruction
printing order and the order used in this intrinsic header
https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h
2022-01-18 10:52:43 -08:00
Craig Topper b3a0ec7645 [RISCV] Remove DemandedBits handling for FSR/FSL until we have test cases for it.
Testing may be easier after D117468. Right now we get demanded bits
optimizations done on ISD::FSHL/FSHR before they become FSR/FSL. This
makes it hard to test.
2022-01-18 10:52:43 -08:00
Craig Topper e328385739 Revert "[RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering."
This reverts commit b634f8a663.

I broke the SimplifyDemandedBits code, but we don't have tests.
2022-01-18 10:36:03 -08:00
Craig Topper b634f8a663 [RISCV] Make the operand order for RISCVISD::FSL(W)/FSR(W) match the instruction register numbering.
Previous we used the fshl/fshr operand ordering for simplicity. This
made things confusing when D117468 proposed adding intrinsics for
the instructions. We can't just use the generic funnel shifting
intrinsics because fsl/fsr have different functionality that should
be exposed to software.

Now we use rs1, rs3, rs2/shamt order which matches the instruction
printing order and the order used in this intrinsic header
https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h
2022-01-18 09:47:28 -08:00
David Sherwood f4515ab858 Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"
This reverts commit 197f3c0deb.

Reverting after miscompilation errors discovered with ffmpeg.
2022-01-18 08:40:20 +00:00
Lian Wang 5ceb4f5446 [RISCV] Add instruction schedule for Zbc extension and Zbs extension
Zbc extension:
CLMUL/CLMULR/CLMULH are grouped together, defined one schedule class.

Zbs extension:
BCLR/BSET/BINV/BEXT are grouped together, defined one schedule class.
BCLRI/BSETI/BINVI/BEXTI are grouped together, defined one schedule class.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117538
2022-01-18 07:31:50 +00:00
jacquesguan 1090000b63 [RISCV] Add patterns for vector widening floating-point multiply
Add patterns for vector widening floating-point multiply

Differential Revision: https://reviews.llvm.org/D117530
2022-01-18 14:52:43 +08:00
Han-Kuan Chen ec9cb3a79c [RISCV] Provide VLOperand in td.
Currently, users expected VL is the last operand. However, since some
intrinsics has tail policy in the last operand, this rule cannot be used
anymore.

Reviewed By: craig.topper, frasercrmck

Differential Revision: https://reviews.llvm.org/D117452
2022-01-17 20:25:47 -08:00
Han-Kuan Chen 3fc4b5896a [RISCV] Make SplatOperand start from 0.
Current SplatOperand starts from 1 because operand 0 (or 1) is intrinsic
id in SelectionDAG.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117453
2022-01-17 20:14:59 -08:00
jacquesguan c29d6c410e [RISCV] Add patterns for vector widening floating-point add/subtract instructions
Add patterns for Vector Widening Floating-Point Add/Subtract Instructions

Differential Revision: https://reviews.llvm.org/D117466
2022-01-18 10:33:56 +08:00
Craig Topper 116af698e2 [RISCV] When expanding CONCAT_VECTORS, don't create INSERT_SUBVECTORS for undef subvectors.
For fixed vectors, the undef will get expanded to an all zeros
build_vector. We don't want that so suppress creating the
insert_subvector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117379
2022-01-17 14:40:59 -08:00
Craig Topper 9c410838d2 [RISCV] Legalize fixed length (insert_subvector undef, X, 0) to a scalable insert.
We were considering this legal, but later the undef would become an all
zeros vector. This would cause us to need to re-legalize the insert later
into a vslideup with zero vector.

This patch catches the case and directly legalizes it to a scalable
insert.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117377
2022-01-17 14:31:30 -08:00
David Sherwood 197f3c0deb [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants
When we know the value we're extending is a negative constant then it
makes sense to use SIGN_EXTEND because this may improve code quality in
some cases, particularly when doing a constant splat of an unpacked vector
type. For example, for SVE when splatting the value -1 into all elements
of a vector of type <vscale x 2 x i32> the element type will get promoted
from i32 -> i64. In this case we want the splat value to sign-extend from
(i32 -1) -> (i64 -1), whereas currently it zero-extends from
(i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use
a single mov immediate instruction.

New tests added here:

  CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing
tests too:

  CodeGen/AArch64/reduce-and.ll
  CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only
occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2022-01-17 11:08:57 +00:00
Lian Wang 85def34f5e [RISCV] Add scheduler for bfp instruction in Zbf extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117290
2022-01-17 09:17:18 +00:00
Kito Cheng cc35161dc7 [RISCV] Add initial support for getRegUsageForType and getNumberOfRegisters
Those two TTI hooks are used during vectorization for calculating
register pressure, the default implementation isn't consider for LMUL,
and that's also definitly wrong value for register number (all register class
are 8 registers).

So in this patch we tried to:

1. Calculate right register usage for vector type and scalar type.
2. Return right number of register for general purpose register and
   vector register.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116890
2022-01-17 15:27:54 +08:00
eopXD 5a457782a2 [RISCV] Add patterns for vector widening integer multiply-add instructions
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117404
2022-01-16 18:37:13 -08:00
Craig Topper 4c1e1e05cb [RISCV] Add RISCVISD::BFPW to ComputeNumSignBitsForTargetNode. 2022-01-15 15:23:49 -08:00
Fraser Cormack 877d1b3d07 [SelectionDAG][VP] Add splitting/widening for VP_LOAD and VP_STORE
Original patch by @hussainjk.

This patch was split off from D109377 to keep vector legalization
(widening/splitting) separate from vector element legalization
(promoting).

While the original patch added a third overload of
SelectionDAG::getVPStore, this patch takes the liberty of collapsing
those all down to 1, as three overloads seems excessive for a
little-used node.

The original patch also used ModifyToType in places, but that method
still crashes on scalable vector types. Seeing as the other VP
legalization methods only work when all operands need identical
widening, this patch follows in that vein.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117235
2022-01-15 11:41:29 +00:00
Alex Bradbury 0ee679e22c [RISCV] Add CSRs defined in the recently ratified Sstc extension
The 'RISC-V "stimecmp / vstimecmp" Extension' was ratified at the end of
last year though hasn't yet been integrated in the main specification
documents (see
<https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>).

RISC-V "stimecmp / vstimecmp" Extension
<https://github.com/riscv/riscv-time-compare/releases/download/v0.5.4/Sstc.pdf>.

Differential Revision: https://reviews.llvm.org/D117311
2022-01-15 08:36:04 +00:00
Alex Bradbury 1ca79823e0 [RISCV] Add CSRs defined in the recently ratified Smstateen extension
The "RISC-V State Enable Extension" was ratified at the end of at the
end of last year though hasn't yet been integrated in the main
specification documents (see
<https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>).

This commit adds the CSRs defined by this extension in
<https://github.com/riscv/riscv-state-enable/releases/download/v0.6.3/Smstateen.pdf>.

Differential Revision: https://reviews.llvm.org/D117310
2022-01-15 08:35:47 +00:00
Alex Bradbury f00a98a0a9 [RISCV] Add CSRs defined in the recently ratified Sscofpmf extension
The "RISC-V Count Overflow and Mode-Based Filtering Extension" was
ratified at the end of last year though hasn't yet been integrated in
the main specification documents (see
<https://wiki.riscv.org/display/TECH/Recently+Ratified+Extensions>).

This commit adds the CSRs defined by this extension in
<https://github.com/riscv/riscv-count-overflow/releases/download/v0.5.2/Sscofpmf.pdf>.

Differential Revision: https://reviews.llvm.org/D117308
2022-01-15 08:35:13 +00:00
Chenbing.Zheng fdd33a0c75 [RISCV][NFC] Add a function to customLegalizeToWOp by Intrinsic
These cases follow the same pattern, so they can be combined
to a unqiue function.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117378
2022-01-15 08:28:08 +00:00
eopXD 26bb1b1dab [RISCV] Add the zvl extension according to the v1.0 spec
`zvl` is the new standard vector extension that specifies the minimum vector length of the vector extension.
The `zvl` extension is related to the `zve` extension and other updates that are added in v1.0.

According to https://github.com/riscv-non-isa/riscv-c-api-doc/pull/21,
Clang defines macro `__riscv_v_min_vlen` for `zvl` and it can be used for applications that uses the vector extension.
LLVM checks whether the option `riscv-v-vector-bits-min` (if specified) matches the `zvl*` extension specified.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D108694
2022-01-14 23:01:48 -08:00
Lian Wang 21dad9a522 [RISCV][NFC] Add IsRV64 predicate in xperm.w pattern
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117191
2022-01-15 04:22:16 +00:00
jacquesguan b148348ad4 [RISCV] Add patterns for vector widening integer add/subtract
Add patterns for vector widening integer add/subtract instructions

Differential Revision: https://reviews.llvm.org/D117188
2022-01-15 09:41:07 +08:00
Shao-Ce SUN a0a76fee0c [RISCV] update zfh and zfhmin extention to v1.0
`zfh` and `zfhmin` have been ratified, with version 1.0.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117098
2022-01-15 09:21:24 +08:00
Craig Topper 2baa1dffd1 [RISCV] Add basic support for matching shuffles to vslidedown.vi.
Specifically the unary shuffle case where the elements being
shifted in are undef. This handles the shuffles produce by expanding
llvm.reduce.mul.

I did not reduce the VL which would increase the number of vsetvlis,
but may improve the execution speed. We'd also want to narrow the
multiplies so we could share vsetvlis between the vslidedown.vi and
the next multiply.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D117239
2022-01-14 09:04:54 -08:00
Craig Topper ac6b4896ea [RISCV] Honor the VT when converting float point register names to register class for inline assembly.
It appears the code here was written for the inline asm clobbering
a specific register, but it also gets used for named input and
output registers.

For the input and output case, we should honor the VT so we
don't insert conversion instructions around the inline assembly.

For the clobber, case we need to pick the largest register class.

Reviewed By: asb, jrtc27

Differential Revision: https://reviews.llvm.org/D117279
2022-01-14 09:04:00 -08:00
Alex Bradbury 4a4a652f34 [RISCV][NFC] Use TableGen 'foreach' to simplify repetitive CSR definitions
Make the definitions of hpmcounter3-hpmcounter31,
hpmcounter3h-hpmcounter31h, mhpmcounter3-mhpmcounter31,
mhpmcounter3h-mhpmcounter31h, pmpaddr0-pmpaddr63, mhpmevent3-31, and
pmpcfg0-15 substantially less repetitive using a foreach loop.

Differential Revision: https://reviews.llvm.org/D117227
2022-01-14 11:59:39 +00:00
jacquesguan 88c0e0806b [RISCV] Improve i64 splat vector lowering in RV32.
We could use vmv.v.i/vmv.v.x whose eew is 32 to lower the i64 splat vector if the i64 constant scalar could be splitted into two same i32 scalar.

Differential Revision: https://reviews.llvm.org/D117079
2022-01-14 14:06:01 +08:00
David Sherwood ba471ba8d2 Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"
This reverts commit 31009f0b5a.

It seems to be causing SVE VLA buildbot failures and has introduced a
genuine regression. Reverting for now.
2022-01-13 15:59:43 +00:00
David Sherwood 31009f0b5a [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants
When we know the value we're extending is a negative constant then it
makes sense to use SIGN_EXTEND because this may improve code quality in
some cases, particularly when doing a constant splat of an unpacked vector
type. For example, for SVE when splatting the value -1 into all elements
of a vector of type <vscale x 2 x i32> the element type will get promoted
from i32 -> i64. In this case we want the splat value to sign-extend from
(i32 -1) -> (i64 -1), whereas currently it zero-extends from
(i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use
a single mov immediate instruction.

New tests added here:

  CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing
tests too:

  CodeGen/AArch64/dag-numsignbits.ll
  CodeGen/AArch64/reduce-and.ll
  CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only
occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2022-01-13 09:43:07 +00:00
Lian Wang 16877c5d2c [RISCV] Add bfp and bfpw intrinsic in zbf extension
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116994
2022-01-13 02:53:00 +00:00
Alex Bradbury 33d008b169 [RISCV] Update recently ratified Zb{a,b,c,s} extensions to no longer be experimental
Agreed policy is that RISC-V extensions that have not yet been ratified
should be marked as experimental, and enabling them requires the use of
the -menable-experimental-extensions flag when using clang alongside the
version number. These extensions have now been ratified, so this is no
longer necessary, and the target feature names can be renamed to no
longer be prefixed with "experimental-".

Differential Revision: https://reviews.llvm.org/D117131
2022-01-12 19:33:44 +00:00
Craig Topper 632c263eb3 [RISCV] Add RISCVProcFamilyEnum and add SiFive7.
Use it to remove explicit string compares from unrolling preferences.

I'm of two minds on this. Ideally, we would define things in terms
of architectural or microarchitectural features, but it's hard to
do that with things like unrolling preferences without just ending up
with FeatureSiFive7UnrollingPreferences.

Having a proc enum is consistent with ARM and AArch64. X86 only has
a few and is trying to move away from it.

Reviewed By: asb, mcberg2021

Differential Revision: https://reviews.llvm.org/D117060
2022-01-12 09:34:02 -08:00
Shao-Ce SUN edb9175de6 [RISCV][llvm] Update CSRs
According the newest RISC-V Privileged Spec, updated CSRs.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116645
2022-01-12 20:14:04 +08:00
Craig Topper 63b17eb9ec [RISCV] Add strictfp support for compares.
This adds support for STRICT_FSETCC(quiet) and STRICT_FSETCCS(signaling).

FEQ matches well to STRICT_FSETCC oeq.
FLT/FLE matches well to STRICT_FSETCCS olt/ole.

Others require commuting operands or multiple instructions.

STRICT_FSETCC olt/ole/ogt/oge/ult/ule/ugt/uge uses FLT/FLE,
but we need to save/restore FFLAGS around them to avoid spurious
exceptions. I've implemented pseudo instructions with a
CustomInserter to insert the save/restore CSR instructions.
Unfortunately, this doesn't honor exceptions for signaling NANs
but I'm not sure if signaling nans are really supported by the
constrained intrinsics.

STRICT_FSETCC one and ueq expand to a pair of FLT instructions
with a save/restore of fflags around each. This could be improved
in the future.

There may be some opportunities to generate better code for strict
comparisons mixed with nonans fast math flags. I've left FIXMEs in
the .td files for that.

Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D116694
2022-01-11 20:01:41 -08:00
Craig Topper be1cc64cc1 [RISCV] Add DAG combine to fold (fp_to_int (ffloor X)) -> (fcvt X, rdn)
Similar for ceil, trunc, round, and roundeven. This allows us to use
static rounding modes to avoid a libcall.

This optimization is done for AArch64 as isel patterns.
RISCV doesn't have instructions for ceil/floor/trunc/round/roundeven
so the operations don't stick around until isel to enable a pattern
match. Thus I've implemented a DAG combine.

We only handle XLen types except i32 on RV64. i32 will be type
legalized to a RISCVISD node. All other types will be type legalized
to XLen and maintain the FP_TO_SINT/UINT ISD opcode.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116771
2022-01-11 09:05:57 -08:00
wangpc c6430fade3 [RISCV] Generate 32 bits jumptable entries when code model is small
The code can only address the whole RV32 address space or the lower 2 GiB
of the RV64 address space in small code model, so 32 bits entry is enough.
Cache hit ratio and code size have some improvements.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116435
2022-01-11 18:20:37 +08:00
wangpc 98d51c2542 [RISCV] Override TargetLowering::BuildSDIVPow2 to generate SELECT
When `Zbt` is enabled, we can generate SELECT for division by power
of 2, so that there is no data dependency.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D114856
2022-01-11 15:54:35 +08:00
Chenbing.Zheng 9ea772ff81 [RISCV] Block vmsgeu.vi with 0 immediate in Isel
For vmsgeu.vi with 0, we know this is always true. So we can replace
it with vmset.m (unmasked) or vmset.m+vmand.mm (masked).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116584
2022-01-11 03:04:44 +00:00
jacquesguan d0554ae4cf [RISCV] Select vl op to X0 when it is equal to ~0.
Now the backend will select ~0 vl to a register and load instruction, we could use X0 to replace it.

Differential Revision: https://reviews.llvm.org/D116798
2022-01-11 10:56:25 +08:00
Haocong.Lu bd653f6406 [RISCV] Use shift for zero extension when Zbb and Zbp are not enabled
Now AND is used for zero extension when both Zbb and Zbp are not enabled.
It may be better to use shift operation if the trailing ones mask exceeds simm12.

This patch optimzes LUI+ADDI+AND to SLLI+SRLI.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116720
2022-01-11 02:37:03 +00:00
jacquesguan b607cd3928 [RISCV] Use vmv.s.x to build one element splat vector.
When we want to create an splat vector that only the first element is initialized, we could use vmv.s.x or vfmv.s.f to build it.

Differential Revision: https://reviews.llvm.org/D116277
2022-01-11 10:21:18 +08:00
Craig Topper b645bcd98a [RISCV] Generalize (srl (and X, 0xffff), C) -> (srli (slli X, (XLen-16), (XLen-16) + C) optimization.
This can be generalized to (srl (and X, C2), C) ->
(srli (slli X, (XLen-C3), (XLen-C3) + C). Where C2 is a mask with
C3 trailing ones.

This can avoid constant materialization for C2. This is beneficial
even when C2 can be selected to ANDI because the SLLI can become
C.SLLI, but C.ANDI cannot cover all the immediates of ANDI.

This also enables CSE in some cases of i8 sdiv by constant codegen.
2022-01-09 23:37:10 -08:00
Craig Topper 296e8cae5c [RISCV] Isel (sra (sext_inreg X, i16), C) -> (srai (slli X, (XLen-16), (XLen-16) + C).
Similar for (sra (sext_inreg X, i8), C).

With Zbb, sext_inreg of i8 and i16 are legal for sext.b and sext.h.
This transform makes the Zbb codegen the same as without Zbb. The
shifts are more compressible. This also exposes an opportunity for
CSE with another slli in the i16 sdiv by constant codegen.
2022-01-09 21:23:43 -08:00
jacquesguan 6b8362eb8d [RISCV] Disable EEW=64 for index values when XLEN=32.
Disable EEW=64 for vector index load/store when XLEN=32.

Differential Revision: https://reviews.llvm.org/D106518
2022-01-10 10:51:27 +08:00
Craig Topper 2dd52f840b [RISCV] Fold (srl (and X, 0xffff), C)->(srli (slli X, (XLen-16), (XLen-16) + C) even with Zbb/Zbp.
We can use zext.h with Zbb, but srli/slli may offer more opportunities
for compression.
2022-01-09 18:42:03 -08:00
Kazu Hirata 435a5a3652 [llvm] Fix bugprone argument comments (NFC)
Identified with bugprone-argument-comment.
2022-01-08 11:56:38 -08:00
Craig Topper 042394b69e [RISCV] Add a command line option to control the LMUL used by TTI's getRegisterBitWidth.
By default we return the width of an LMUL=1 register. We can enable
testing with larger LMUL values by returning a larger bit width.

This patch adds a RISCV specific option to provide a LMUL which will be
multiplied by the LMUL=1 bit width.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D116339
2022-01-07 20:02:10 -08:00
Kito Cheng f142c45f1e [RISCV] Set getMinVectorRegisterBitWidth to 16 if enable fixed length vector code gen for RVV
getMinVectorRegisterBitWidth means what vector types is supported in
this target, and actually RISC-V support all fixed length vector types with
vector length less than `getMinRVVVectorSizeInBits`, so set it to 16,
means 2 x i8, that is minimal fixed length vector size in theory.

That also fixed one issue, some testcase migth become non-vectorizable
when `-riscv-v-vector-bits-min` set to larger value, because the vector size is
smaller than `-riscv-v-vector-bits-min`.

For example, following code can vectorize by SLP with
`-riscv-v-vector-bits-min=128` or `-riscv-v-vector-bits-min=256`, but
can't vectorize `-riscv-v-vector-bits-min=512` or larger:

```
void foo(double *da) {
  da[0] = 0;
  da[1] = 1;
  da[2] = 2;
  da[3] = 3;
}
```

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116534
2022-01-08 11:16:21 +08:00
Baoshan Pang af931a51b9 [RISCV] Materializing constants with 'rori'
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116574
2022-01-07 15:39:22 -08:00
Lian Wang e8f1dfe923 [RISCV] Supplement PACKH instruction pattern
Optimize (rs1 & 255) | ((rs2 & 255) << 8) -> (PACKH rs1, rs2).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116791
2022-01-07 17:59:19 +08:00
Kazu Hirata f3a344d212 [Target] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-06 22:01:44 -08:00
wangpc 91cf2a9b6c [RISCV][NFC] Use sub operator to generate register list
There are several duplicated lines for generating GPRXXX's
register list that can be eliminated by using `sub` operator.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116729
2022-01-07 12:29:58 +08:00
Shao-Ce SUN 808c0987c3 [NFC][RISCV] Make the macro names more uniform
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116719
2022-01-07 11:09:41 +08:00
Liqin Weng 92153a9aa7 [RISCV] Support immediate vtype of VSETVLI/VSETIVLI in asm parser
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D115133
2022-01-07 02:26:41 +00:00
Craig Topper ec4dd862bf [RISCV] Use simm5_plus1_nonzero in isel patterns for vmsgeu.vi/vmsltu.vi intrinsics.
The 0 immediate can't be selected to vmsgtu.vi/vmsleu.vi by decrementing
the immediate. To prevent his we had special patterns that provided
alternate lowering for the 0 cases. This relied on tablegen prioritizing
the 0 pattern over the sim5_plus1 range.

This patch introduces simm5_plus1_nonzero that excludes 0. It also
excludes the special case for vmsltu.vi since we can just use
vmsltu.vx and let the 0 be selected to X0.

This is an alternative to some of the changes in D116584.

Reviewed By: Chenbing.Zheng, asb

Differential Revision: https://reviews.llvm.org/D116723
2022-01-06 08:27:27 -08:00
Craig Topper 56ca11e31e [RISCV] Add an MIR pass to replace redundant sext.w instructions with copies.
Function calls and compare instructions tend to cause sext.w
instructions to be inserted. If we make good use of W instructions,
these operations can often end up being redundant. We don't always
detect these during SelectionDAG due to things like phis. There also
some cases caused by failure to turn extload into sextload in
SelectionDAG. extload selects to LW allowing later sext.ws to become
redundant.

This patch adds a pass that examines the input of sext.w instructions trying
to determine if it is already sign extended. Either by finding a
W instruction, other instructions that produce a sign extended result,
or looking through instructions that propagate sign bits. It uses
a worklist and visited set to search as far back as necessary.

Reviewed By: asb, kito-cheng

Differential Revision: https://reviews.llvm.org/D116397
2022-01-06 08:23:42 -08:00
Craig Topper 75117fb340 [RISCV] Don't advertise i32->i64 zextload as free for RV64.
The zextload hook is only used to determine whether to insert a
zero_extend or any_extend for narrow types leaving a basic block.
Returning true from this hook tends to cause any load whose output
leaves the basic block to become an LWU instead of an LW.

Since we tend to prefer sexts for i32 compares on RV64, this can
cause extra sext.w instructions to be created in other basic blocks.

If we use LW instead of LWU this gives the MIR pass from D116397
a better chance of removing them.

Another option might be to teach getPreferredExtendForValue in
FunctionLoweringInfo.cpp about our preference for sign_extend of
i32 compares. That would cause SIGN_EXTEND to be chosen for any
value used by a compare instead of using the isZExtFree heuristic.
That will require code to convert from the llvm::Type* to EVT/MVT
as well as querying the type legalization actions to get the
promoted type in order to call TargetLowering::isSExtCheaperThanZExt.
That seemed like many extra steps when no other target wants it.
Though it would avoid us needing to lean on the MIR pass in some cases.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116567
2022-01-06 08:13:42 -08:00
Craig Topper 808c662665 [RISCV] Change RISCVISD::FCVT*RTZ opcodes to take rounding mode as an operand.
Pre-work for a future change that will use these opcodes with other
rounding modes.

Differential Revision: https://reviews.llvm.org/D116724
2022-01-06 08:12:12 -08:00
Craig Topper fd992aac19 [RISCV] Use macros to reduce repetive switch cases. NFC
These 3 switches map LMUL enum to instruction names. These follow
a regular pattern. Use a macro to reduce the number of source code
lines.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D116631
2022-01-05 09:00:48 -08:00
Craig Topper df2e728b77 [RISCV] Teach RISCVGatherScatterLowering to handle more complex recurrence start values.
Previously we only recognized strided loads/store when the initial
value for the phi was a strided constant vector.

This patch extends the support to a strided_constant added to a
splatted value. The rewritten loop will add the splat value to the
first element of the strided constant vector to use as the scalar
start value. The stride is unaffected.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D115958
2022-01-04 10:13:34 -08:00
Kazu Hirata e5947760c2 Revert "[llvm] Remove redundant member initialization (NFC)"
This reverts commit fd4808887e.

This patch causes gcc to issue a lot of warnings like:

  warning: base class ‘class llvm::MCParsedAsmOperand’ should be
  explicitly initialized in the copy constructor [-Wextra]
2022-01-03 11:28:47 -08:00
Jim Lin d38637a0e6 [RISCV] Fix the code alignment for GroupFloatVectors. NFC
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116520
2022-01-03 17:08:53 +08:00
Victor Perez 5527139302 [RISCV][VP] Add RVV codegen for [nX]vXi1 vp.select
Expand [nX]vXi1 vp.select the same way as [nX]vXi1 vselect.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D115546
2022-01-02 23:12:32 -08:00
Craig Topper bc091e0862 [RISCV] Prune more unnecessary vector pseudo instructions. NFC
For floating point specific vector instructions, we don't need
pseudos for mf8.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D116460
2022-01-02 23:00:42 -08:00
Kazu Hirata 41bfac6aed [Target] Remove unused forward declarations (NFC) 2022-01-02 10:20:15 -08:00
Craig Topper 4602f4169a [RISCV] Prune unnecessary vector pseudo instructions. NFC
For .vf instructions, we don't need MF8 pseudos for f16. We don't
need MF8 or MF4 pseudos for f32. Or MF8, MF4, MF2 for f64.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D116437
2022-01-01 19:53:53 -08:00
Kazu Hirata fd4808887e [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-01 16:18:18 -08:00
Craig Topper 6f45fe9851 [RISCV] Use MxListW instead of MxList[0-5]. NFC
Better to use the named list instead of assuming the size of MxList.
2021-12-31 00:22:55 -08:00
Craig Topper 8811a87e8c [RISCV] Use defvar to simplify some code. NFC
Rather than wrapping a def around a list, we can just make a defvar
of the list.
2021-12-30 23:48:39 -08:00
wangpc 41454ab256 [RISCV] Use constant pool for large integers
For large integers (for example, magic numbers generated by
TargetLowering::BuildSDIV when dividing by constant), we may
need about 4~8 instructions to build them.
In the same time, it just takes two instructions to load
constants (with extra cycles to access memory), so it may be
profitable to put these integers into constant pool.

Reviewed By: asb, craig.topper

Differential Revision: https://reviews.llvm.org/D114950
2021-12-31 14:48:48 +08:00
jacquesguan 05f82dc877 [RISCV] Fix incorrect cases of vmv.s.f in the VSETVLI insert pass.
Fix incorrect cases of vmv.s.f and add test cases for it.

Differential Revision: https://reviews.llvm.org/D116432
2021-12-31 14:17:03 +08:00
Craig Topper 15787ccd45 [RISCV] Add support for STRICT_LRINT/LLRINT/LROUND/LLROUND. Tests for other strict intrinsics.
This patch adds isel support for STRICT_LRINT/LLRINT/LROUND/LLROUND.

It also adds test cases for f32 and f64 constrained intrinsics that
correspond to the intrinsics in float-intrinsics.ll and
double-intrinsics.ll. Support for promoting the integer argument of
STRICT_FPOWI was added.

I've skipped adding tests for f16 intrinsics, since we don't have libcalls
for them and we have inconsistent support for promoting them in LegalizeDAG.
This will need to be examined more closely.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116323
2021-12-30 11:54:32 -08:00
jacquesguan 128c6ed73b [RISCV] Teach VSETVLInsert to eliminate redundant vsetvli for vmv.s.x and vfmv.s.f.
Differential Revision: https://reviews.llvm.org/D116307
2021-12-30 17:16:18 +08:00
jacquesguan 1dd5e6fed5 [RISCV] Use vmv.s.x instead of vfmv.s.f when the floating point scalar is 0.
Use integer vector scalar move instruction when move 0 to avoid add a integer-float move instruction.

Differential Revision: https://reviews.llvm.org/D116365
2021-12-30 10:16:54 +08:00
Chenbing.Zheng 43c8296cda [RISCV] Refactor immediate comparison instructions patterns
The patterns of the immediate comparison instruction is rewrite here, and put similar code to a class.
Do not change any function of the original code, making the code more concise.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116215
2021-12-30 09:31:01 +08:00
Craig Topper 015ff729cb [RISCV] Add a few more instructions to hasAllNBitUsers. 2021-12-29 09:17:47 -08:00
Kazu Hirata 5a667c0e74 [llvm] Use nullptr instead of 0 (NFC)
Identified with modernize-use-nullptr.
2021-12-28 08:52:25 -08:00
Hsiangkai Wang a1c7ddf926 [RISCV] Support passing scalable vectur values through the stack.
After consuming all vector registers, the scalable vector values will be
passed indirectly. The pointer values will be saved in general
registers. If all general registers are used up, we will report an error to
notify users the compiler does not support passing scalable vector
values through the stack. In this patch, we remove the restriction. After
all general registers are used up, we use the stack to save the
pointers which point to the indirect passed scalable vector values.

Differential Revision: https://reviews.llvm.org/D116310
2021-12-28 09:26:36 +08:00
Hsiangkai Wang 5d47e7d768 [RISCV] Convert whole register copies as the source defined explicitly.
The implicit defines may come from a partial define in an instruction.
It does not mean the defining instruction and the COPY instruction have
the same vl and vtype. When the source comes from the implicit defines,
do not convert the whole register copies to vmv.v.v.

Differential Revision: https://reviews.llvm.org/D115866
2021-12-27 13:59:49 +08:00
Shao-Ce SUN 70a98008ea [RISCV] Reduce repetitive codes in flw, fsw
Trying to improve code reuse in F,D,Zfh *.td files.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116089
2021-12-27 09:29:35 +08:00
Kazu Hirata e7774f499b Use static_assert instead of assert (NFC)
Identified with misc-static-assert.
2021-12-26 14:26:44 -08:00
Jim Lin 02478a26f2 [RISCV] Use DAG variable directly instead of DCI.DAG
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116087
2021-12-24 13:06:55 +08:00
Craig Topper 0a35211b34 [RISCV] Don't allow vector types to be used with inline asm 'r' constraint
The 'r' constraint uses the GPR class. There is generic support
for bitcasting and extending/truncating non-integer VTs to the
required integer VT. This doesn't work for scalable vectors and
instead crashes.

To prevent this, explicitly reject vectors. Fixed vectors might
work without crashing, but it doesn't seem worthwhile to allow.

While there remove an unnecessary level of indentation in the
"vr" and "vm" constraint handling.

Differential Revision: https://reviews.llvm.org/D115810
2021-12-23 20:32:36 -06:00
Victor Perez 10b3675aa9 [RISCV][VP] Lower mask vector VP AND/OR/XOR to RVV instructions
For fixed and scalable vectors, each intrinsic x is lowered to vmx.mm,
dropping the mask, which is safe to do as masked-off elements are
undef anyway.

Differential Revision: https://reviews.llvm.org/D115339
2021-12-23 15:02:32 -06:00
Craig Topper 7704c503ec [RISCV] Use positive 0.0 for the neutral element in fadd reductions if nsz is present.
-0.0 requires a constant pool. +0.0 can be made with vmv.v.x x0.

Not doing this in getNeutralElement for fear of changing other targets.

Differential Revision: https://reviews.llvm.org/D115978
2021-12-23 10:38:00 -06:00
Craig Topper b7b260e19a [RISCV] Support strict FP conversion operations.
This adds support for strict conversions between fp types and between
integer and fp.

NOTE: RISCV has static rounding mode instructions, but the constrainted
intrinsic metadata is not used to select static rounding modes. Dynamic
rounding mode is always used.

Differential Revision: https://reviews.llvm.org/D115997
2021-12-23 09:40:58 -06:00
Craig Topper a9486a40f7 [RISCV] Disable interleaving scalar loops in the loop vectorizer.
The loop vectorizer can interleave scalar loops even if it doesn't
vectorize them. I don't believe we intended to enable this when
we enabled interleaving for vector instructions.

Disable interleaving for VF=1 like X86 and AMDGPU already do. Test
lifted from AMDGPU.

Differential Revision: https://reviews.llvm.org/D115975
2021-12-23 08:37:24 -06:00
jacquesguan 28a3e7dea2 [RISCV] Override hasAndNotCompare to use more andn when have Zbb extension.
Enable transform (X & Y) == Y ---> (~X & Y) == 0 and (X & Y) != Y ---> (~X & Y) != 0 when have Zbb extension to use more andn instruction.

Differential Revision: https://reviews.llvm.org/D115922
2021-12-23 10:42:20 +08:00
Shao-Ce SUN 68bc6d7cae [RISCV] Remove Zvamo Extention
Based on D111692. Zvamo is not part of the 1.0 V spec. Remove it.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D115709
2021-12-20 10:28:39 +08:00
Michael Berg f95ee6074a [RISCV] Add target specific loop unrolling and peeling preferences
Both these preference helper functions have initial support with
this change. The loop unrolling preferences are set with initial
settings to control thresholds, size and attributes of loops to
unroll with some tuning done.  The peeling preferences may need
some tuning as well as the initial support looks much like what
other architectures utilize.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D113798
2021-12-18 12:54:50 -08:00
Craig Topper be41996f4f [RISCV} Add FSGNJ_H to isAsCheapAsAMove and isCopyInstrImpl.
This matches FSGNJ_S and FSGNJ_D.
2021-12-17 09:14:20 -08:00
Craig Topper 66bbefeb13 [RISCV] Revert Zfhmin related changes that aren't tested and depend on f16 being a legal type.
Our Zfhmin support is only MC layer, but these are CodeGen layer
interfaces. If f16 isn't a Legal type for CodeGen with Zfhmin, then
these interfaces should keep their non-Zfh behavior.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D115822
2021-12-16 08:55:28 -08:00
jacquesguan 7dfbf0b60f [RISCV] Fold (and (not (srl X, C)), 1) to (xor (bexti X, C), 1) when have Zbs extension.
When have Zbs extension, we could use bexti to fold (and (not (srl X, C)), 1) to (xor (bexti X, C), 1).

Differential Revision: https://reviews.llvm.org/D115629
2021-12-16 15:01:05 +08:00
jacquesguan d3c2ad154e [RISCV] Fix whole vector register move instruction's vector register constraint.
According to the v-spec, the source and destination VR of vmv<nr>r.v should be aligned for the VR group size.

Differential Revision: https://reviews.llvm.org/D115720
2021-12-16 10:58:55 +08:00
Craig Topper 3926893439 [RISCV] Add isel support for scalar STRICT_FADD/FSUB/FMUL/FDIV/FSQRT.
Test that STRICT_FMINNUM/FMAXNUM are lowered to libcalls for f32/f64.
The RISC-V instructions don't match the behavior of fmin/fmax libcalls
with respect to SNaN.

Promoting FMINNUM/FMAXNUM for f16 needs more work outside of the
RISC-V backend.

Reviewed By: asb, arcbbb

Differential Revision: https://reviews.llvm.org/D115680
2021-12-14 10:50:55 -08:00
Craig Topper 3f1c403a2b [RISCV] Use AdjustInstrPostInstrSelection to insert a FRM dependency for scalar FP instructions with dynamic rounding mode.
In order to support constrained FP intrinsics we need to model FRM
dependency. Whether or not a instruction uses FRM is based on a 3
bit field in the instruction. Because of this we can't add
'Uses = [FRM]' to the tablegen descriptions.

This patch examines the immediate after isel and adds an implicit
use of FRM. This idea came from Roger Ferrer Ibanez.

Other ideas:
We could be overly conservative and just pretend all instructions with
frm field read the FRM register. Or we could have pseudoinstructions
for CodeGen with rounding mode.

Reviewed By: asb, frasercrmck, arcbbb

Differential Revision: https://reviews.llvm.org/D115555
2021-12-14 10:17:57 -08:00
Craig Topper d4d76409d1 [RISCV] Add mayRaiseFPException to RISCV scalar FP instructions.
FRM dependency will be added in a future patch.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D115540
2021-12-14 09:53:30 -08:00
Craig Topper 7598ac5ec5 [RISCV] Convert (splat_vector (load)) to vlse with 0 stride.
We already do this for splat nodes that carry a VL, but not for
splats that use VLMAX.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D115483
2021-12-14 09:14:03 -08:00
Craig Topper 3cda38796c [RISCV] Add rs2 encoding to the FPUnaryOp_r and FPUnaryOp_r_frm template arguments.
Instead of having unary instruction include a 'let' in their class
body, add rs2val as a template parameter. Then we can use a let
in FPUnaryOp_r and FPUnaryOp_r_frm. This reduces the overall
verbosity of the FP files.

Reviewed By: achieveartificialintelligence

Differential Revision: https://reviews.llvm.org/D115537
2021-12-13 21:38:42 -08:00
Nelson Chu 10a71981e9 [RISCV] Support named opcodes in .insn directive.
This patch is one of the TODO of commit, 283879793d

We build the GenericTable for these opcodes, and also extend class RISCVOpcode, to store the names of opcodes.  Then we call the parseInsnDirectiveOpcode to parse the opcode filed in .insn directive.  We only allow users to write the recognized opcode names, or just write the immediate values in the 7 bits range.

Documentation: https://sourceware.org/binutils/docs-2.37/as/RISC_002dV_002dFormats.html

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D115224
2021-12-13 20:59:33 -08:00
Fangrui Song a6a07a514b [MachineOutliner] Don't outline functions starting with PATCHABLE_FUNCTION_ENTER/FENTRL_CALL
MachineOutliner may outline a "patchable-function-entry" function whose body has
a TargetOpcode::PATCHABLE_FUNCTION_ENTER MachineInstr. This is incorrect because
the special code sequence must stay unchanged to be used at run-time.
Avoid outlining PATCHABLE_FUNCTION_ENTER. While here, avoid outlining FENTRY_CALL too
(which doesn't reproduce currently) to allow phase ordering flexibility.

Fixes #52635

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D115614
2021-12-13 13:24:29 -08:00
Craig Topper b18b2a01ef [RISCV] Don't use VLMAX for start value splat in reduction lowering.
The reduction instructions only reads the first element. The
execution time for a splat may take longer with a larger VL.
We should use the smallest VL we can.

Reviewed By: frasercrmck, HsiangKai

Differential Revision: https://reviews.llvm.org/D115536
2021-12-13 09:06:42 -08:00
Craig Topper 80ed2f6b36 [RISCV] Share tablegen classes for F, D, and Zfh. Other simplifications. NFC
By adding the register class and funct as template parameters we
can share the classes with all 3 extensions.

I've used "let SchedRW =" to avoid repeating scheduler classes on
multiple lines where we previously inherited from the Sched class.

A subsequent patch will add mayRaiseFPException and FRM dependencies.
Reducing the number of classes means less repeating for those changes.

This of course conflicts with the Zfinx patch D93298.

Reviewed By: achieveartificialintelligence

Differential Revision: https://reviews.llvm.org/D115469
2021-12-10 09:35:51 -08:00
Craig Topper 5861cf77da [RISCV] Remove FCSR from RISCVRegisterInfo.
We only used this to mark it as a reserved register. But that's not
important if we don't do anything else with it.

I think if we were ever to do anything with it, we would need to
model it as a super register of FRM and FFLAGS. But it might be
easier to reference both FRM and FFLAGS in implicit defs/uses
for anything we were to do with "fcsr".

Reviewed By: sepavloff

Differential Revision: https://reviews.llvm.org/D115455
2021-12-10 09:24:13 -08:00
eopXD a4bf1b449d [RISCV] Unify depedency check and extension implication parsing logics
Originially there are two places that does parsing - `parseArchString` and
`parseFeatures`, each with its code on dependency check and implication.
This patch extracts common parts of the two  as functions of `RISCVISAInfo`
and let them 2 use it.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D112359
2021-12-09 21:16:04 -08:00
Craig Topper 6f7de819b9 [RISCV] Use MULHU for more division by constant cases.
D113805 improved handling of i32 divu/remu on RV64. The basic idea
from that can be extended to (mul (and X, C2), C1) where C2 is any
mask constant.

We can replace the and with an SLLI by shifting by the number of
leading zeros in C2 if we also shift C1 left by XLen - lzcnt(C1)
bits. This will give the full product XLen additional trailing zeros,
putting the result in the output of MULHU. If we can't use ANDI,
ZEXT.H, or ZEXT.W, this will avoid materializing C2 in a register.

The downside is it make take 1 additional instruction to create C1.
But since that's not on the critical path, it can hopefully be
interleaved with other operations.

The previous tablegen pattern is replaced by custom isel code.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D115310
2021-12-09 09:10:14 -08:00
Kito Cheng 39c861719b [RISCV] Fix vm operand constraint to fit GCC's behavior
- `vm` constraint is used for masking operand, which always v0.

- Update testcase, only masking operand should use `vm`, vector mask operations
  should just use `vr` for any vector register.

 - Revise the description of `vm` constraint.

- This patch also fix issue on RISCVRegisterInfo.td and RISCVISelLowering.cpp.

  RISCVRegisterInfo.td:
  - The first VT in the list must be the largest total size since the
    SelectionDAGBuilder uses the first register in the list as the canonical
    type for the register.

  RISCVISelLowering.cpp:
  - Fix RISCVTargetLowering::splitValueIntoRegisterParts and
    RISCVTargetLowering::joinRegisterPartsIntoValue for handling vectors
    with different total size, that will happened on fractional LMUL since
    fractional LMUL is always occupy one vector register.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D112599
2021-12-09 14:46:49 +08:00
Craig Topper de8d26ac02 [RISCV] Improve tracking of EndLoc in the assembly parser.
The SMLoc::getFromPointer(S.getPointer() - 1) pattern used in
several place didn't make sense since that puts the End before the
Start location.

This patch corrects this to properly calculate the end location as
we parse. Unsure how much this matters, a lot of these are for custom
operand parsing. If the custom parsing succeeds, the instruction
matching probably won't fail, so the end loc won't be used to build
a range for a diagnostic.

I've also fixed the creation functions for the CSR and VType operands
to assign the EndLoc of the RISCVOperand class using the StartLoc
instead of an empty SMLoc. I don't think these will ever be accessed,
but it makes the code look less like we forgot to assign it. This is
consistent with some operands on the ARM target.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D115192
2021-12-08 15:32:44 -08:00
Michael Berg 3e363f14e1 Revert "[RISCV] Add target specific loop unrolling and peeling preferences"
This reverts commit 8487981a72.
2021-12-07 15:13:42 -08:00
Michael Berg 8487981a72 [RISCV] Add target specific loop unrolling and peeling preferences
Both these preference helper functions have initial support with
this change. The loop unrolling preferences are set with initial
settings to control thresholds, size and attributes of loops to
unroll with some tuning done.  The peeling preferences may need
some tuning as well as the initial support looks much like what
other architectures utilize.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D113798
2021-12-07 15:06:42 -08:00
Craig Topper 622d689480 [RISCV] Revise RISCVInstPrinter::printVTypeI to not assume there are 3 invalid vtype bits.
Instead of checking [10:8]. Check for non-zero in 8 and above.

Addresses a post-commit comment from @jrtc27 in D114581.
2021-12-07 09:40:50 -08:00
Craig Topper 2a9b2444d9 [RISCV] Replace uses of RISCVOpcode<0b0010011> and RISCVOpcode<0b0011011> with existing named objects. NFC
These are already instantiated with names as OPC_OP_IMM and
OPC_OP_IMM_32.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D115172
2021-12-07 08:07:14 -08:00
Evandro Menezes 2ba7698423 [RISCV] Add scheduling resources for Vector pseudo instructions.
Add the scheduling resources for the V extension pseudo instructions.

Authored-by: Evandro Menezes <evandro.menezes@sifive.com>

Differential Revision: https://reviews.llvm.org/D113353
2021-12-07 09:14:28 +08:00
Craig Topper acdbd34cfb [RISCV] Loosen some restrictions on lowering constant BUILD_VECTORs using vid.v.
The immediate size check on StepNumerator did not take into account
that vmul.vi does not exist. It also did not account for power of 2
constants that can be done with vshl.vi.

This patch fixes this by moving the conversion from mul to shift
further up. Then we can consider the immediates separately for MUL
vs SHL. For MUL I've allowed simm12 which requires a single addi
before a vmul.vx. For SHL I've allowed any uimm5 which works with
vshl.vi. We could relax these further in the future. This is a
starting point that allows us to emit the same number of instructions
we were already using for smaller numerators.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D115081
2021-12-06 09:34:40 -08:00
Victor Perez 9eb7322748 [RISCV][VP] Add RVV codegen for vp.select
Lower vp.select instrinsic to VSELECT_VL.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D114629
2021-12-03 11:02:20 +00:00
Craig Topper 2f6beb7b0e [RISCV] Add inline expansion for vector ftrunc/fceil/ffloor.
This prevents scalarization of fixed vector operations or crashes
on scalable vectors.

We don't have direct support for these operations. To emulate
ftrunc we can convert to the same sized integer and back to fp using
round to zero. We don't need to do a convert if the value is large
enough to have no fractional bits or is a nan.

The ceil and floor lowering would be better if we changed FRM, but
we don't model FRM correctly yet. So I've used the trunc lowering
with a conditional add or subtract with 1.0 if the truncate rounded
in the wrong direction.

There are also missed opportunities to use masked instructions.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113543
2021-12-01 11:25:28 -08:00
Craig Topper d8f9eaad89 [RISCV] Teach RISCVTargetLowering::shouldSinkOperands to handle udiv/sdiv/urem/srem.
The V extension supports .vx instructions for integer division and
remainder so we should sink splats for that operand.
2021-11-30 18:47:51 -08:00
David Green 9e8a71caf0 [DAG] Create fptosi.sat from clamped fptosi
This adds a fold in DAGCombine to create fptosi_sat from sequences for
smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
to be handled similarly.

A shouldConvertFpToSat method was added to control when converting may
be profitable. The original fptosi will have a less strict semantics
than the fptosisat, with less values that need to produce defined
behaviour.

This especially helps on ARM/AArch64 where the vcvt instructions
naturally saturate the result.

Differential Revision: https://reviews.llvm.org/D111976
2021-11-30 15:29:14 +00:00
Hans Wennborg a87782c34d Revert "[DAG] Create fptosi.sat from clamped fptosi"
It causes builds to fail with this assert:

llvm/include/llvm/ADT/APInt.h:990:
bool llvm::APInt::operator==(const llvm::APInt &) const:
Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"' failed.

See comment on the code review.

> This adds a fold in DAGCombine to create fptosi_sat from sequences for
> smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
> the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
> it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
> ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
> to be handled similarly.
>
> A shouldConvertFpToSat method was added to control when converting may
> be profitable. The original fptosi will have a less strict semantics
> than the fptosisat, with less values that need to produce defined
> behaviour.
>
> This especially helps on ARM/AArch64 where the vcvt instructions
> naturally saturate the result.
>
> Differential Revision: https://reviews.llvm.org/D111976

This reverts commit 52ff3b0093.
2021-11-30 15:36:56 +01:00
David Green 52ff3b0093 [DAG] Create fptosi.sat from clamped fptosi
This adds a fold in DAGCombine to create fptosi_sat from sequences for
smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
to be handled similarly.

A shouldConvertFpToSat method was added to control when converting may
be profitable. The original fptosi will have a less strict semantics
than the fptosisat, with less values that need to produce defined
behaviour.

This especially helps on ARM/AArch64 where the vcvt instructions
naturally saturate the result.

Differential Revision: https://reviews.llvm.org/D111976
2021-11-30 11:05:32 +00:00
Ben Shi 29d4230d6b [RISCV] Decode vtype with reserved fields to raw immediate
This patch fixes a crash when doing "llvm-objdump -D --mattr=+experimental-v"
against an object file which happens to keep a word that can be decoded to
VSETVLI & VSETIVLI with reserved vlmul[2:0]=4. All vtype values with
reserved fields (vlmul[2:0]=4, vsew[2:0]=0b1xx, non-zero bits 8/9/10) are
printed to raw immediate.

Reviewed By: jhenderson, jrtc27, craig.topper

Differential Revision: https://reviews.llvm.org/D114581
2021-11-30 08:31:20 +00:00
Craig Topper b121d23a9c [RISCV] Promote f16 log/pow/exp/sin/cos/etc. to f32 libcalls.
Prevents crashes or cannot select errors.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113822
2021-11-29 18:49:11 -08:00
Hsiangkai Wang 9a88566537 [RISCV] Fix a bug in RISCVFrameLowering.
When we have out-going arguments passing through stack and we do not
reserve the stack space in the prologue. Use BP to access stack objects
after adjusting the stack pointer before function calls.

callseq_start  ->  sp = sp - reserved_space
//
// Use FP to access fixed stack objects.
// Use BP to access non-fixed stack objects.
//
call @foo
callseq_end    ->  sp = sp + reserved_space

Differential Revision: https://reviews.llvm.org/D114246
2021-11-30 10:39:35 +08:00
Hsiangkai Wang b0c7421524 [RISCV] Emit DWARF location expression for RVV stack objects.
VLENB is the length of a vector register in bytes. We use
<vscale x 64 bits> to represent one vector register. The dwarf offset is
VLENB * scalable_offset / 8.

For the mask vector, it occupies one vector register.

Differential Revision: https://reviews.llvm.org/D107432
2021-11-27 15:13:10 +08:00
Hsiangkai Wang 137d3474ca [RISCV] Reverse the order of loading/storing callee-saved registers.
Currently, we restore the return address register as the last restoring
instruction in the epilog. The next instruction is `ret` usually. It is
a use of return address register. In some microarchitectures, there is
load-to-use data hazard. To avoid the load-to-use data hazard, we could
separate the load instruction from its use as far as possible. In this
patch, we reverse the order of restoring callee-saved registers to
increase the distance of `load ra` and `ret` in the epilog.

Differential Revision: https://reviews.llvm.org/D113967
2021-11-22 23:02:11 +08:00
wangpc af0ecfccae [RISCV] Generate pseudo instruction li
Add an alias of `addi [x], zero, imm` to generate pseudo
instruction li, which makes assembly mush more readable.
For existed tests, users can update them by running script
`llvm/utils/update_llc_test_checks.py`.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D112692
2021-11-22 14:01:37 +08:00
Philipp Tomsich af57a71d18 [RISCV] Don't call setHasMultipleConditionRegisters(), so icmp is sunk
On RISC-V, icmp is not sunk (as the following snippet shows) which
generates the following suboptimal branch pattern:
```
  core_list_find:
	lh	a2, 2(a1)
	seqz	a3, a0         <<
	bltz	a2, .LBB0_5
	bnez	a3, .LBB0_9    << should sink the seqz
        [...]
	j	.LBB0_9
  .LBB0_5:
	bnez	a3, .LBB0_9    << should sink the seqz
	lh	a1, 0(a1)
        [...]
```
due to an icmp not being sunk.

The blocks after `codegenprepare` look as follows:
```
  define dso_local %struct.list_head_s* @core_list_find(%struct.list_head_s* readonly %list, %struct.list_data_s* nocapture readonly %info) local_unnamed_addr #0 {
  entry:
    %idx = getelementptr inbounds %struct.list_data_s, %struct.list_data_s* %info, i64 0, i32 1
    %0 = load i16, i16* %idx, align 2, !tbaa !4
    %cmp = icmp sgt i16 %0, -1
    %tobool.not37 = icmp eq %struct.list_head_s* %list, null
    br i1 %cmp, label %while.cond.preheader, label %while.cond9.preheader

  while.cond9.preheader:                            ; preds = %entry
    br i1 %tobool.not37, label %return, label %land.rhs11.lr.ph
```
where the `%tobool.not37` is the result of the icmp that is not sunk.
Note that it is computed in the basic-block up until what becomes the
`bltz` instruction and the `bnez` is a basic-block of its own.

Compare this to what happens on AArch64 (where the icmp is correctly sunk):
```
  define dso_local %struct.list_head_s* @core_list_find(%struct.list_head_s* readonly %list, %struct.list_data_s* nocapture readonly %info) local_unnamed_addr #0 {
  entry:
    %idx = getelementptr inbounds %struct.list_data_s, %struct.list_data_s* %info, i64 0, i32 1
    %0 = load i16, i16* %idx, align 2, !tbaa !6
    %cmp = icmp sgt i16 %0, -1
    br i1 %cmp, label %while.cond.preheader, label %while.cond9.preheader

  while.cond9.preheader:                            ; preds = %entry
    %1 = icmp eq %struct.list_head_s* %list, null
    br i1 %1, label %return, label %land.rhs11.lr.ph
```

This is caused by sinkCmpExpression() being skipped, if multiple
condition registers are supported.

Given that the check for multiple condition registers affect only
sinkCmpExpression() and shouldNormalizeToSelectSequence(), this change
adjusts the RISC-V target as follows:
 * we no longer signal multiple condition registers (thus changing
   the behaviour of sinkCmpExpression() back to sinking the icmp)
 * we override shouldNormalizeToSelectSequence() to let always select
   the preferred normalisation strategy for our backend

With both changes, the test results remain unchanged.  Note that without
the target-specific override to shouldNormalizeToSelectSequence(), there
is worse code (more branches) generated for select-and.ll and select-or.ll.

The original test case changes as expected:
```
  core_list_find:
	lh	a2, 2(a1)
	bltz	a2, .LBB0_5
	beqz	a0, .LBB0_9    <<
        [...]
	j	.LBB0_9
.LBB0_5:
	beqz	a0, .LBB0_9    <<
	lh	a1, 0(a1)
        [...]
```

Differential Revision: https://reviews.llvm.org/D98932
2021-11-19 08:32:59 -08:00
Zi Xuan Wu 24d1673c8b [llvm-tblgen][RISCV] Make llvm-tblgen RISCVCompressInstEmitter to be common infra across different targets
Not only RISCV but also other target such as CSKY, there are compressed instructions mixed with normal instructions.
To reuse the basic infra to compress/uncompress and predict instruction, we need reconstruct the RISCVCompressInstEmitter
and make it more general and suitable for other target.

Differential Revision: https://reviews.llvm.org/D113475
2021-11-18 11:14:27 +08:00
Zarko Todorovski 5b8bbbecfa [NFC][llvm] Inclusive language: reword and remove uses of sanity in llvm/lib/Target
Reworded removed code comments that contain `sanity check` and `sanity
test`.
2021-11-17 21:59:00 -05:00
Craig Topper 0274be28d7 [RISCV] Lower vector CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF by converting to FP and extracting the exponent.
If we have a large enough floating point type that can exactly
represent the integer value, we can convert the value to FP and
use the exponent to calculate the leading/trailing zeros.

The exponent will contain log2 of the value plus the exponent bias.
We can then remove the bias and convert from log2 to leading/trailing
zeros.

This doesn't work for zero since the exponent of zero is zero so we
can only do this for CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF. If we need
a value for zero we can use a vmseq and a vmerge to handle it.

We need to be careful to make sure the floating point type is legal.
If it isn't we'll continue using the integer expansion. We could split the vector
and concatenate the results but that needs some additional work and evaluation.

Differential Revision: https://reviews.llvm.org/D111904
2021-11-17 10:29:41 -08:00
Jay Foad 3264e95938 [CodeGen] Update LiveIntervals in TargetInstrInfo::convertToThreeAddress
Delegate updating of LiveIntervals to each target's
convertToThreeAddress implementation, instead of repairing LiveIntervals
after the fact in TwoAddressInstruction::convertInstTo3Addr.

Differential Revision: https://reviews.llvm.org/D113493
2021-11-17 10:16:47 +00:00
jacquesguan 6405e8b584 [RISCV] Refactor some rvv instructions' definition with foreach.
Simplify rvv instructions that use eew in their mnemonic and encoding with foreach. And fix a scheduling bug.

Differential Revision: https://reviews.llvm.org/D113453
2021-11-16 15:20:45 +08:00
Craig Topper 391b0ba603 [RISCV] Override TargetLowering::hasAndNot for Zbb.
Differential Revision: https://reviews.llvm.org/D113937
2021-11-15 18:44:07 -08:00
Ben Shi 4c3d916c4b [RISCV] Optimize immediate materialisation with SH*ADD
Use LUI+SH*ADD+ADDI to compose specific immediates.

Reviewed By: craig.topper, luismarques

Differential Revision: https://reviews.llvm.org/D113568
2021-11-15 23:34:28 +00:00
Craig Topper f59307bfdc [RISCV] Teach needVSETVLIPHI to handle mask register instructions.
This handles the case where the mask register instruction input
comes from a Phi of vsetvlis. If the VLMAX is the same as the VLMAX
required by the mask register instruction, we can avoid a vsetvli.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113204
2021-11-15 09:57:28 -08:00
Craig Topper 02bed66cd5 [RISCV] Improve codegen for i32 udiv/urem by constant on RV64.
The division by constant optimization often produces constants that
are uimm32, but not simm32. These constants require 3 or 4 instructions
to materialize without Zba.

Since these instructions are often used by a multiply with a LHS
that needs to be zero extended with an AND, we can switch the MUL
to a MULHU by shifting both inputs left by 32. Once we shift the
constant left, the upper 32 bits no longer need to be 0 so constant
materialization is free to use LUI+ADDIW. This reduces the constant
materialization from 4 instructions to 3 in some cases while also
reducing the zero extend of the LHS from 2 shifts to 1.

Differential Revision: https://reviews.llvm.org/D113805
2021-11-12 14:49:10 -08:00
Craig Topper ee7a006ce4 [RISCV] Promote f16 ceil/floor/round/roundeven/nearbyint/rint/trunc intrinsics to f32 libcalls.
Previously these would crash. I don't think these can be generated
directly from C. Not sure if any optimizations can introduce them.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D113527
2021-11-11 08:28:41 -08:00
Craig Topper 4183522e80 [RISCV] Promote f16 frem with Zfh.
Add riscv64 coverage for f32 and f64 frem.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D113531
2021-11-10 17:35:07 -08:00
Craig Topper 9ee5cec688 [RISCV] Prevent bad legalizer behavior when bitcasting fixed vectors to i64 on RV32 with Zve32.
Similar to D113219, we need to make sure we don't create a vXi64
vector when it isn't legal. This fixes an error found by an
expensive checks build.
2021-11-10 11:58:49 -08:00
Craig Topper 57bc7b1089 [RISCV] Prevent crashes when bitcasting between fixed vectors and scalars.
Not all scalar element types are allowed in vectors so we may not
be able to bitcast to a 1 element vector to use insert/extract.

This will become a bigger issue when the Zve extensions are commited.
For now, I'm using the ELEN limit to limit the element types.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113219
2021-11-10 09:21:52 -08:00
Shao-Ce SUN 1c81941f19 [NFC][RISCV] Fix wrong predicates of vfwredsum 2021-11-09 17:19:50 +08:00
Craig Topper 376233113e [RISCV] Use TargetConstant for CSR number for READ_CSR/WRITE_CSR.
This is consistent with what we do for other operands that are required
to be constants.

I don't think this results in any real changes. The pattern match
code for isel treats ConstantSDNode and TargetConstantSDNode the same.
2021-11-08 15:10:24 -08:00
Craig Topper 304edbb553 [RISCV] SMUL_LOHI/UMUL_LOHI should expand for RVV.
These and MULHS/MULHU both default to Legal. Targets need to set
the ones they don't support to Expand.

I think MULHS/MULHU likely has priority in most places so this
change probably isn't directly testable. I found it while looking
at disabling MULHS/MULHU for nxvXi64 as required for Zve64x.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113325
2021-11-08 09:38:36 -08:00
Ben Shi e32cf690df [RISCV] Optimize (add (mul r, c0), c1)
Optimize (add (mul x, c0), c1) ->
         (add (mul (add x, c1/c0+1), c0), c1%c0-c0),
if c1/c0+1 and c1%c0-c0 are simm12, while c1 is not.

Optimize (add (mul x, c0), c1) ->
         (add (mul (add x, c1/c0-1), c0), c1%c0+c0),
if c1/c0-1 and c1%c0+c0 are simm12, while c1 is not.

Reviewed By: craig.topper, asb

Differential Revision: https://reviews.llvm.org/D111141
2021-11-08 02:58:25 +00:00
Bin Cheng 54d891a7d5 [RISCV]: Fix typo by abstracting VWholeLoad* classes
This patch abstracts VWholeLoad* classes into VWholeLoadN, simplifies
existing code as well as fixes a typo.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D109319
2021-11-06 10:48:03 +08:00
Bin Cheng d488f1fff2 [RISCV][NFC]: Refactor classes for load/store instructions of RVV
This patch refactors classes for load/store of V extension by:
- Introduce new class for VUnitStrideLoadFF and VUnitStrideSegmentLoadFF
  so that uses of L/SUMOP* are not spread around different places.
- Reorder classes for Unit-Stride load/store in line with table
  describing lumop/sumop in riscv-v-spec.pdf.

Reviewed By: HsiangKai, craig.topper

Differential Revision: https://reviews.llvm.org/D109318
2021-11-06 10:48:03 +08:00
Shao-Ce SUN 5c3d7184b4 [RISCV] Support Zfhmin extension
According to RISC-V Unprivileged ISA 15.6.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D111866
2021-11-06 01:41:02 +08:00
Zakk Chen 0649dfebba [RISCV] Rename some assembler mnemonic and intrinsic functions for RVV 1.0.
Rename vpopc/vmandnot/vmornot to vcpop/vmandn/vmorn assembler mnemonic.

Reviewed By: frasercrmck, jrtc27, craig.topper

Differential Revision: https://reviews.llvm.org/D111062
2021-11-04 10:08:01 -07:00
Craig Topper 5022ac0771 [RISCV] Use HasVInstructions and HasVInstructionsAnyF in more place in TableGen. NFC
Change RISCVSubtarget.hasVInstructionAnyF() to call hasVInstructionsF32
so that any changes to hasVInstructionsF32 are reflected.

The files were missed in D112496.
2021-11-03 14:32:45 -07:00
Fraser Cormack d065b03801 [RISCV] Optimize vp.load with an all-ones mask
Similar to D110206, this patch optimizes unmasked vp.load intrinsics to
avoid the need of a vmset instruction to set the mask. It does so by
selecting a riscv_vle intrinsic rather than a riscv_vle_mask intrinsic.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D113022
2021-11-02 17:23:39 +00:00
David Callahan 4ec1b8eeac [RISCV] Fix invalid kill on callee save
A callee save may be live (specifically X1) on entry and so a spill
should not mark it killed.

Differential Revision: https://reviews.llvm.org/D111285
2021-11-02 11:56:54 +00:00
Craig Topper ada5458521 [RISCV] Expand scalable vector bswap. Fix crash for bitreverse.
Fix LegalizeVectorOps to not try shuffle or unrolling expansions for
scalable vectors.

Differential Revision: https://reviews.llvm.org/D112236
2021-10-31 10:01:27 -07:00
Craig Topper aefcd59895 [RISCV] Teach RISCVInsertVSETVLI::needVSETVLI to handle mask register instructions better.
If the VL operand of a mask register instruction comes from an
explicit vsetvli with a different VTYPE, we can still avoid needing
a vsetvli as long as the SEW/LMUL ratio is the same and policy bits
match.

Differential Revision: https://reviews.llvm.org/D112762
2021-10-29 09:49:36 -07:00
Hsiangkai Wang 7051f73d69 [RISCV] Sync Zvlsseg register order as the same as vector registers.
Sync the order of Zvlsseg registers with vector registers to avoid
unnecessary register copies between vector instructions and zvlsseg
instructions.

Differential Revision: https://reviews.llvm.org/D110250
2021-10-28 13:34:53 +08:00
Hsiangkai Wang 0a9b82960c [RISCV] Use vmv.v.[v|i] if we know COPY is under the same vl and vtype.
If we know the source operand of COPY is defined by a vector instruction
with tail agnostic and the same LMUL and there is no vsetvli between
COPY and the define instruction to change the vl and vtype, we could use
vmv.v.v or vmv.v.i to copy vector registers to get better performance than
the whole vector register move instructions.

If the source of COPY is from vmv.v.i, we could use vmv.v.i for the
COPY.

This patch only considers all these instructions within one basic block.

Case 1:
```
bb.0:
  ...
  VSETVLI          # The first VSETVLI before COPY and VOP.
  ...              # Use this VSETVLI to check LMUL and tail agnostic.
  ...
  vy = VOP va, vb  # Define vy.
  ...              # There is no vsetvli between VOP and COPY.
  vx = COPY vy
```

Case 2:
```
bb.0:
  ...
  VSETVLI          # The first VSETVLI before VOP.
  ...              # Use this VSETVLI to check LMUL and tail agnostic.
  ...
  vy = VOP va, vb  # Define vy.
  ...              # There is no vsetvli to change vl between VOP and COPY.
  ...
  VSETVLI          # The first VSETVLI before COPY.
  ...              # This VSETVLI does not change vl and vtype.
  ...
  vx = COPY vy
```

Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Co-Authored-by: Kito Cheng <kito.cheng@sifive.com>

Differential Revision: https://reviews.llvm.org/D103510
2021-10-28 11:39:04 +08:00
Craig Topper 1387483e72 [RISCV] Replace most uses of RISCVSubtarget::hasStdExtV. NFCI
Add new hasVInstructions() which is currently equivalent.

Replace vector uses of hasStdExtZfh/F/D with new vector specific
versions. The vector spec no longer requires that the vectors implement the
same types as scalar. It only requires that the scalar type is
the maximum size the vectors can support. This is currently
implemented using the scalar rule we were using before.

Add new hasVInstructionsI64() begin using to qualify code that
requires i64 vector elements.

This is all NFC for now, but we can start using this to better
implement D112408 which introduces the Zve extensions.

Reviewed By: frasercrmck, eopXD

Differential Revision: https://reviews.llvm.org/D112496
2021-10-27 19:33:48 -07:00
Michael Liao e6a4ba3aa6 [amdgpu] Handle the case where there is no scavenged register.
- When an unconditional branch is expanded into an indirect branch, if
  there is no scavenged register, an SGPR pair needs spilling to enable
  the destination PC calculation. In addition, before jumping into the
  destination, that clobbered SGPR pair need restoring.
- As SGPR cannot be spilled to or restored from memory directly, the
  spilling/restoring of that SGPR pair reuses the regular SGPR spilling
  support but without spilling it into memory. As that spilling and
  restoring points are fully controlled, we only need to spill that SGPR
  into the temporary VGPR, which needs spilling into its emergency slot.
- The target-specific hook is revised to take additional restore block,
  where the restoring code is filled. After that, the relaxation will
  place that restore block directly before the destination block and
  insert an unconditional branch in any fall-through block into the
  destination block.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D106449
2021-10-27 18:37:27 -04:00
Craig Topper 2783a5cfaf [RISCV] Add ICmp and FCmp to shouldSinkOperands. 2021-10-26 22:23:54 -07:00
Ben Shi 97e52e1c35 [RISCV] Optimize immediate materialisation with SLLI.UW in the Zba extension
Simplify "LUI+SLLI+ADDI+SLLI" and "LUI+ADDIW+SLLI+ADDI+SLLI" to
"LUI+ADDIW+SLLIUW" to reduce total instruction amount.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111933
2021-10-27 02:48:38 +00:00
Craig Topper e2b7aabb57 [RISCV] Reduce the number of RISCV vector builtins by an order of magnitude.
All but 2 of the vector builtins are only used by clang_builtin_alias.
When using clang_builtin_alias, the type string of the builtin is never
checked. Only the types in the function definition used for the alias
are checked.

This patch takes advantage of this to share a single builtin for
many different types. We already used type overloads on the IR intrinsic
so the codegen for the builtins that are being merge were already
the same. This extends the type overloading to the builtins.

I had to make a few tweaks to make this work.
-Floating point vector-vector vmerge now uses the vmerge intrinsic
 instead of the vfmerge intrinsic. New isel patterns and tests are
 added to support this.
-The SemaChecking for the immediate of vset_v/vget_v has been removed.
 Determining the valid range is harder now. I've added masking to
 ManualCodegen to ensure valid IR for invalid input.

This reduces the number of builtins from ~25000 to ~1100.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D112102
2021-10-25 09:03:59 -07:00
Craig Topper 210b586a85 [RISCV] Add vcsr CSR name for V extension.
Reviewed By: frasercrmck, kito-cheng

Differential Revision: https://reviews.llvm.org/D112342
2021-10-25 08:56:25 -07:00
Fraser Cormack 74c6895b39 [RISCV] Fix missing cross-block VSETVLI insertion
This patch fixes a codegen bug, the test for which was introduced in
D112223.

When merging VSETVLIInfo across blocks, if the 'exit' VSETVLIInfo
produced by a block is found to be compatible with the VSETVLIInfo
computed as the intersection of the 'exit' VSETVLIInfo produced by the
block's predecessors, that blocks' 'exit' info is discarded and the
intersected value is taken in its place.

However, we have one authority on what constitutes VSETVLIInfo
compatibility and we are using it in two different contexts.

Compatibility is used in one context to elide VSETVLIs between
straight-line vector instructions. But compatibility when evaluated
between two blocks' exit infos ignores any info produced *inside* each
respective block before the exit points. As such it does not guarantee
that a block will not produce a VSETVLI which is incompatible with the
'previous' block.

As such, we must ensure that any merging of VSETVLIInfo is performed
using some notion of "strict" compatibility. I've defined this as a full
vtype match, but this is perhaps too pessimistic. Given that test
coverage in this regard is lacking -- the only change is in the failing
test -- I think this is a good starting point.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D112228
2021-10-22 10:45:10 +01:00
Craig Topper d55be79d75 [RISCV] Expand scalable vector CTTZ/CTLZ/CTPOP.
Differential Revision: https://reviews.llvm.org/D112233
2021-10-21 10:50:04 -07:00
Hsiangkai Wang facff468b6 [RISCV] Reorder the vector register allocation order.
GPR uses argument registers as the first group of registers to allocate.
This patch uses vector argument registers, v8 to v23, as the first group
to allocate.

Differential Revision: https://reviews.llvm.org/D111304
2021-10-19 09:30:13 +08:00
Craig Topper 84d9bc51a3 [RISCV] Rewrite forwardCopyWillClobberTuple to not assume that there are exactly 32 registers. NFC
This function was copied from ARM where register pairs/triples/quads can wrap around the 32 encoding space. So register 31 can pair with register 0. This is not true for RISCV vectors. The spec specifically mentions the possibility of a future encoding that has more than 32 registers.

This patch removes the modulo from the code and directly checks that destination register is in the source register range and not the beginning of the range. Though I don't expect an identity copy will occur.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D111467
2021-10-18 09:57:38 -07:00
Kito Cheng ff13189c5d [RISCV] Unify the arch string parsing logic to to RISCVISAInfo.
How many place you need to modify when implementing a new extension for RISC-V?

At least 7 places as I know:

- Add new SubtargetFeature at RISCV.td
- -march parser in RISCV.cpp
- RISCVTargetInfo::initFeatureMap@RISCV.cpp for handling feature vector.
- RISCVTargetInfo::getTargetDefines@RISCV.cpp for pre-define marco.
- Arch string parser for ELF attribute in RISCVAsmParser.cpp
- ELF attribute emittion in RISCVAsmParser.cpp, and make sure it's in
  canonical order...
- ELF attribute emittion in RISCVTargetStreamer.cpp, and again, must in
  canonical order...

And now, this patch provide an unified infrastructure for handling (almost)
everything of RISC-V arch string.

After this patch, you only need to update 2 places for implement an extension
for RISC-V:
- Add new SubtargetFeature at RISCV.td, hmmm, it's hard to avoid.
- Add new entry to RISCVSupportedExtension@RISCVISAInfo.cpp or
  SupportedExperimentalExtensions@RISCVISAInfo.cpp .

Most codes are come from existing -march parser, but with few new feature/bug
fixes:
- Accept version for -march, e.g. -march=rv32i2p0.
- Reject version info with `p` but without minor version number like `rv32i2p`.

Differential Revision: https://reviews.llvm.org/D105168
2021-10-17 16:25:23 +08:00
Jim Lin 2ccdc7315e [RISCV] Add invalid match case for uimm2, uimm3 and uimm7
Reviewed By: craig.topper, jrtc27

Differential Revision: https://reviews.llvm.org/D110308
2021-10-15 14:54:48 +08:00
Ben Shi 4fe5ab4b00 [RISCV] Optimize immediate materialisation with SH*ADD
Use SH1ADD/SH2ADD/SH3ADD along with LUI+ADDI to compose int32*3,
int32*5 and int32*9.

Reviewed By: craig.topper, luismarques

Differential Revision: https://reviews.llvm.org/D111484
2021-10-15 06:46:41 +00:00
Craig Topper 79ae9562cc [RISCV] Remove unused member variable. NFC 2021-10-14 12:56:47 -07:00
Craig Topper f7ba572483 [RISCV] Update Zba, Zbb, Zbc, and Zbs version from 0.93 to 1.0.
I've removed the Zbs W instructions that are not part of the frozen spec.

References to B as an extension name have been removed. Tests are updated or split accordingly.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D110669
2021-10-14 09:25:03 -07:00
Ben Shi 7e81526126 [RISCV] Optimize immediate materialisation with BSETI/BCLRI
Opitimize immediate materialisation in the following way if profitable:
1. Use BCLRI for upper 32 bits if the lower 32 bits are negative int32.
2. Use BSETI for upper 32 bits if the lower 32 bits are positive int32.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111508
2021-10-14 04:56:47 +00:00
Ben Shi 481db13fec [RISCV] Optimize immediate materialisation with SLLI.UW
Use LUI+SLLI.UW to compose the upper bits instead of LUI+SLLI.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111705
2021-10-14 02:24:50 +00:00
Ben Shi 787eeb8597 [RISCV] Optimize immediate materialisation with BCLRI
Do the following optimization for immediate materialisation:

1. For values in range 0xffffffff 7fffffff ~ 0xffffffff 00000000, first
   generate the lower 32-bit with Val|0x80000000 (which is expected be an
   int32), then emit (BCLRI r, 31).

2. For values in range 0x80000000 ~ 0xffffffff, first generate the lower
   32-bit with Val&~0x80000000 (which is expected to be an int32), then
   emit (BSETI r, 31).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111532
2021-10-13 00:59:23 +00:00
jacquesguan 0608bbd4e8 [RISCV] Rename assembler mnemonic of unordered floating-point reductions for v1.0-rc change
Rename vfredsum and vfwredsum to vfredusum and vfwredusum. Add aliases for vfredsum and vfwredsum.

Reviewed By: luismarques, HsiangKai, khchen, frasercrmck, kito-cheng, craig.topper

Differential Revision: https://reviews.llvm.org/D105690
2021-10-12 06:46:46 +00:00
Reid Kleckner b3a6d096d7 Fix shlib builds for all lib/Target/*/TargetInfo libs
They all must depend on MC now that the target registry is in MC.
Also fix llvm-cxxdump
2021-10-08 15:21:13 -07:00
Reid Kleckner 89b57061f7 Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.

This allows us to ensure that Support doesn't have includes from MC/*.

Differential Revision: https://reviews.llvm.org/D111454
2021-10-08 14:51:48 -07:00
Craig Topper f2ad8c9dc6 [RISCV] Remove experimental-b extension that includes all Zb* extensions
At this point it looks like a B extension will never exist. Instead
Zba, Zbb, Zbc, and Zbs are individual extensions being ratified
together as a package. Unknown at this time when or if the other
Zb* extensions will be ratified.

This patch removes references to the B extension. I've updated and
split tests accordingly.

This has been split from D110669 to make review a little easier.

Differential Revision: https://reviews.llvm.org/D111338
2021-10-07 20:47:17 -07:00
Craig Topper c4803bd416 [RISCV] Handle vector of pointer in getTgtMemIntrinsic for strided load/store.
getScalarSizeInBits() doesn't work if the scalar type is a pointer.
For that we need to go through DataLayout.
2021-10-07 10:11:56 -07:00
Shivam Gupta 7a189333ed [NFC] Add doxygen comment for hasFp in RISCVFrameLowering.cpp 2021-10-06 22:35:28 +05:30
Hsiangkai Wang 80a6456306 [RISCV] Update to vlm.v and vsm.v according to v1.0-rc1.
vle1.v  -> vlm.v
vse1.v  -> vsm.v

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D106044
2021-10-05 21:49:54 +08:00
Craig Topper 33d20977b7 Revert "[RISCV] Add an GPR def to the Zvlseg SPILL/RELOAD pseudos"
This reverts commit 1f16191906.

We're seeing some issues with this internally. It seems that when
the spill is created by register allocation, the GPR doesn't get
allocated and an assertion fires during virtual register rewriting.

The .mir test case contains the spill before register allocation so
register allocation sees it as any other instruction.
2021-10-02 10:44:11 -07:00
Fraser Cormack 52c60459f5 [RISCV][NFC] Reformat a line of frame lowering code 2021-10-01 14:25:47 +01:00
Fraser Cormack fcaa64d947 [RISCV][NFC] Add closing parentheses to frame layout comments 2021-10-01 11:58:55 +01:00
Craig Topper a21c557955 [RISCV] Remove Zbproposedc extension
This consists of 3 compressed instructions, c.not, c.neg, and c.zext.w.
I believe these have been picked up by the Zce effort using different
encodings. I don't think it makes sense to keep them in bitmanip. It
will eventually cause a conflict if/when Zce is implemented in llvm.

Differential Revision: https://reviews.llvm.org/D110871
2021-09-30 14:23:05 -07:00
Craig Topper a2a07e8db3 [RISCV] Fold store of vmv.x.s to a vse with VL=1.
This can avoid a loss of decoupling with the scalar unit on cores
with decoupled scalar and vector units.

We should support FP too, but those use extract_element and not a
custom ISD node so it is a little different. I also left a FIXME
in the test for i64 extract and store on RV32.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D109482
2021-09-27 09:54:46 -07:00
Craig Topper 933182e948 [RISCV] Improve support for forming widening multiplies when one input is a scalar splat.
If one input of a fixed vector multiply is a sign/zero extend and
the other operand is a splat of a scalar, we can use a widening
multiply if the scalar value has sufficient sign/zero bits.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D110028
2021-09-27 09:37:07 -07:00
Fraser Cormack d48f6df1f8 [RISCV] Create the correct mask type when lowering EXTRACT_VECTOR_ELT
This particular case was creating a `VMSET_VL` using the old
fixed-length type in order to pass a mask to other custom nodes
operating on the scalable container type. This kind of thing wasn't
caught for us; I only noticed when experimenting with odd-length
vectors, where it was trying to generate an invalid `v3i1` MVT.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D110420
2021-09-27 09:43:40 +01:00
Kazu Hirata c4ae4a745d [RISCV] Remove redundant declaration RISCVMnemonicSpellCheck (NFC)
Note that RISCVMnemonicSpellCheck is defined in
RISCVGenAsmMatcher.inc, which RISCVAsmParser.cpp includes.

Identified with readability-redundant-declaration.
2021-09-26 09:26:57 -07:00
Jim Lin ed687c0211 [RISCV] Fix incorrect operand type of inst alias for InstR4
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D110381
2021-09-25 11:25:12 +08:00
Craig Topper 715cf6ffb9 [RISCV] Add another isel optimization for (and (shl X, c2), c1).
Where c1 is a shifted mask with 32-c2 leading zeros and c3 trailing
zeros and c3>c2. We can select it as (slli (srliw X, c3-c2), c3).
2021-09-24 15:10:25 -07:00
Hsiangkai Wang 7d39a8a921 [RISCV] (1/2) Add the tail policy argument to builtins/intrinsics.
Add the tail policy argument to LLVM IR intrinsics. There are two policies for tail elements. Tail agnostic means users do not care about the values in the tail elements and tail undisturbed means the values in the tail elements need to be kept after the operation. In order to let users control the tail policy, we add an additional argument at the end of the argument list.

For unmasked operations, we have no maskedoff and the tail policy is always tail agnostic. If users want to keep tail elements under unmasked operations, they could use all one mask in the masked operations to do it. So, we only add the additional argument for masked operations for most cases. There are exceptions listed below.

In this patch, we do not handle the following cases to reduce the complexity of the patch. There could be two separate patches for them.

* Use dest argument to control tail policy
vmerge.vvm/vmerge.vxm/vmerge.vim (add _t builtins with additional dest argument)
vfmerge.vfm (add _t builtins with additional dest argument)
vmv.v.v (add _t builtins with additional dest argument)
vmv.v.x (add _t builtins with additional dest argument)
vmv.v.i (add _t builtins with additional dest argument)
vfmv.v.f (add _t builtins with additional dest argument)
vadc.vvm/vadc.vxm/vadc.vim (add _t builtins with additional dest argument)
vsbc.vvm/vsbc.vxm (add _t builtins with additional dest argument)

* Always has tail argument for masked/unmasked intrinsics
Vector Single-Width Integer Multiply-Add Instructions (add _t and _mt builtins)
Vector Widening Integer Multiply-Add Instructions (add _t and _mt builtins)
Vector Single-Width Floating-Point Fused Multiply-Add Instructions (add _t and _mt builtins)
Vector Widening Floating-Point Fused Multiply-Add Instructions (add _t and _mt builtins)
Vector Reduction Operations (add _t and _mt builtins)
Vector Slideup Instructions (add _t and _mt builtins)
Vector Slidedown Instructions (add _t and _mt builtins)

Discussion: https://github.com/riscv/rvv-intrinsic-doc/pull/101

Differential Revision: https://reviews.llvm.org/D105092
2021-09-24 17:09:50 +08:00
Craig Topper 40b230f685 [RISCV] Limit transformAddImmMulImm to prevent an infinite loop.
This fixes an issue reported in D108607.
2021-09-23 15:53:11 -07:00
Craig Topper 70f50114f3 [RISCV] Add another isel optimization for (and (shl x, c2), c1)
Turn (and (shl x, c2), c1) -> (slli (srli x, c3-c2), c3) if c1 is a
shifted mask with no leading zeros and c3 trailing zeros where c3
is greater than c2.
2021-09-23 14:18:07 -07:00
Craig Topper 4a69551d66 [RISCV] Add more isel optimizations for (and (shr x, c2), c1).
Turn (and (shr x, c2), c1) -> (slli (srli x, c2+c3), c3) if c1 is a
shifted mask with c2 leading zeros and c3 trailing zeros.

When the leading zeros is C2+32 we can use SRLIW in place of SRLI.
2021-09-23 11:29:04 -07:00
Jim Lin fbacf5ad38 [RISCV] Add missing op type OPERAND_UIMM2, OPERAND_UIMM3 and OPERAND_UIMM7 for verifyInstruction
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D110307
2021-09-23 19:30:46 +08:00
Fraser Cormack e7c879a69d [RISCV][VP] Add support for VP_REDUCE_* operations
This patch adds codegen support for lowering the vector-predicated
reduction intrinsics to RVV instructions. The process is similar to that
of the other reduction intrinsics, save for the fact that every VP
reduction has a start value. We reuse the existing custom "VL" nodes,
adding extra patterns where required to handle non-true masks.

To support these nodes, the `RISCVISD::VECREDUCE_*_VL` nodes have been
given an explicit "merge" operand. This is to faciliate the VP
reductions, where we must be careful to ensure that even if no operation
is performed (when VL=0) we still produce the start value. The RVV
reductions don't update the destination register under these conditions,
so we tie the splatted start value to the output register.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D107657
2021-09-23 11:11:05 +01:00
Jay Foad 6cef28ed2d [TII] Remove the MFI argument to convertToThreeAddress. NFC.
This simplifies the API and addresses a FIXME in
TwoAddressInstructionPass::convertInstTo3Addr.

Differential Revision: https://reviews.llvm.org/D110229
2021-09-23 08:58:46 +01:00
Craig Topper f0a422f935 [RISCV] Add fcvt.s.w(u)/fcvt.d.w(u)/fcvt.h.w(u) to hasAllNBitUsers
These instructions only read the lower 32 bits of their input.
2021-09-22 14:24:26 -07:00
Craig Topper b33a1cc05b [RISCV] Optimize vp.store with an all ones mask to avoid a vmset.
We can use riscv_vse intrinsic instead of riscv_vse_mask. The code here
is based on similar code for handling masked.scatter and vp.scatter.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D110206
2021-09-22 09:12:47 -07:00
Craig Topper 7c975665b4 [RISCV] Make some arrays of constants 'static const'. NFC
This helps the compiler generate better code.
2021-09-21 10:52:47 -07:00
Craig Topper aeb63d464f [RISCV] Teach RISCVTargetLowering::shouldSinkOperands to sink splats for and/or/xor.
This requires a minor change to CodeGenPrepare to ensure that
shouldSinkOperands will be called for And.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D110106
2021-09-21 10:07:29 -07:00
Ben Shi b3052013b4 [RISCV] Optimize (add (mul x, c0), c1)
Optimize (add (mul x, c0), c1) -> (ADDI (MUL (ADDI, c1/c0), c0), c1%c0),
if c1/c0 and c1%c0 are simm12, while c1 is not.

Optimize (add (mul x, c0), c1) -> (MUL (ADDI, c1/c0), c0),
if c1%c0 is zero, and c1/c0 is simm12 while c1 is not.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D108607
2021-09-21 14:13:14 +00:00
Craig Topper a95ba81073 [RISCV] Teach RISCVTargetLowering::shouldSinkOperands to sink splats for FMA.
If either of the multiplicands is a splat, we can sink it to use
vfmacc.vf or similar.
2021-09-20 11:49:50 -07:00
Craig Topper 04ab6c85ef [RISCV] Teach RISCVTargetLowering::shouldSinkOperands to sink splats for FAdd/FSub/FMul/FDiv. 2021-09-20 10:25:46 -07:00
Craig Topper d85e347a28 [RISCV] Add a pass to recognize VLS strided loads/store from gather/scatter.
For strided accesses the loop vectorizer seems to prefer creating a
vector induction variable with a start value of the form
<i32 0, i32 1, i32 2, ...>. This value will be incremented each
loop iteration by a splat constant equal to the length of the vector.
Within the loop, arithmetic using splat values will be done on this
vector induction variable to produce indices for a vector GEP.

This pass attempts to dig through the arithmetic back to the phi
to create a new scalar induction variable and a stride. We push
all of the arithmetic out of the loop by folding it into the start,
step, and stride values. Then we create a scalar GEP to use as the
base pointer for a strided load or store using the computed stride.
Loop strength reduce will run after this pass and can do some
cleanups to the scalar GEP and induction variable.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D107790
2021-09-20 09:39:44 -07:00
Ben Shi dee5a8ca32 [RISCV] Optimize (add (shl x, c0), (shl y, c1)) with SH*ADD
Optimize (add (shl x, c0), (shl y, c1)) ->
         (SLLI (SH*ADD x, y), c1), if c0-c1 == 1/2/3.

Reviewed By: craig.topper, luismarques

Differential Revision: https://reviews.llvm.org/D108916
2021-09-19 16:35:12 +08:00
Craig Topper 73e5b9ea90 [RISCV] Select (srl (sext_inreg X, i32), uimm5) to SRAIW if only lower 32 bits are used.
SimplifyDemandedBits can turn srl into sra if the bits being shifted
in aren't demanded. This patch can recover the original sra in some cases.

I've renamed the tablegen class for detecting W users since the "overflowing operator"
term I originally borrowed from Operator.h does not include srl.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D109162
2021-09-16 11:03:35 -07:00
Jim Lin f29336104d [RISCV] Rename prefix `FeatureExt*` to `FeatureStdExt*` for all sub-extension
Rename prefix `FeatureExt*` to `FeatureStdExt*` for all sub-extension for consistency

Reviewed By: HsiangKai, asb

Differential Revision: https://reviews.llvm.org/D108187
2021-09-13 16:24:15 +08:00
Craig Topper 283879793d [RISCV] Initial support .insn directive for the assembler.
This allows for a custom encoding to be emitted. It can also be
used with inline assembly to allow the custom instruction to be
register allocated like other instructions.

I initially started from SystemZ's implementation, but some of
the formats allow operands to be specified in multiple ways so I
had to add support for matching different operand class lists for
the same format. That implementation is a simplified version of
what is emitted by tablegen for regular instructions.

I've left out the compressed formats. And I haven't supported the
named opcodes like LUI or OP_IMM_32. Those can be added in future
patches.

Documentation can be found here https://sourceware.org/binutils/docs-2.37/as/RISC_002dV_002dFormats.html

Reviewed By: jrtc27, MaskRay

Differential Revision: https://reviews.llvm.org/D108602
2021-09-12 15:56:12 -07:00
Craig Topper 1b736bda3b [RISCV] Enable CGP to sink splat operands of Add/Sub/Mul/Shl/LShr/AShr
LICM may have pulled out a splat, but with .vx instructions we
can fold it into an operation.

This patch enables CGP to reverse the LICM transform and move the
splat back into the loop.

I've started with the commutable integer operations and shifts, but we can
extend this with more operations in future patches.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D109394
2021-09-10 09:04:01 -07:00
Craig Topper 6c7cadb8c1 [RISCV] Teach vsetvli insertion that stores don't use the policy bits in vtype.
This can avoid a vsetvl after a tail undisturbed operation.

Differential Revision: https://reviews.llvm.org/D109549
2021-09-10 09:03:20 -07:00
Craig Topper 9af8f1b18e [SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode.
Soft deprecrate isNullValue/isAllOnesValue and update in tree
callers. This matches the changes to the APInt interface from
D109483.

Reviewed By: lattner

Differential Revision: https://reviews.llvm.org/D109535
2021-09-09 13:28:30 -07:00
Alexander Pivovarov 4bc8dbe0ca [RISCV] Add SiFive cores E and S series
Add SiFive cores E20, E21, E24, E34, S21, S54 and S76

Differential Revision: https://reviews.llvm.org/D109260
2021-09-08 23:59:04 -07:00
Yvan Roux 261cbe98c3 [RISCV] Fix Machine Outliner jump table handling.
Don't outline machine instructions which are using jump table indexes
since they are materialized as local labels (like the already handled
case of constant pools).

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D109436
2021-09-09 07:32:30 +02:00
Craig Topper a574f0e0c3 [RISCV] Disable use of i128 shift libcalls on RV32.
Since i128 isn't a legal C type on RV32, I don't believe
libgcc implements these functions for RV32. compiler-rt
does implement them because i128 support is enabled
in order to handle long double.

This is consistent with 32-bit X86 and ARM.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D109383
2021-09-08 14:26:07 -07:00
Craig Topper aca14c8cf1 [RISCV] Remove unused tablegen template parameters. NFC
Identified in D109359
2021-09-08 10:01:42 -07:00
Craig Topper b04c09c07c [RISCV] Use V0 instead of VMV0: for mask vectors in isel patterns.
This is consistent with the RVV intrinsic patterns. This has been
shown to prevent some "ran out of registers" errors in our internal
testing.

Unfortunately, there are some regressions on LMUL=8 tests in here.
I think the lack of registers with LMUL=8 just makes it very hard
to schedule correctly.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D109245
2021-09-08 09:46:21 -07:00
Craig Topper 1f16191906 [RISCV] Add an GPR def to the Zvlseg SPILL/RELOAD pseudos
The expansion of these pseudos creates ADD instructions. Those
ADDs modify a GPR so that it is no longer contains the same value
as the input base pointer. Therefore, I believe we should have a
GPR as a Def on these instructions and expansion should get the
destination register for the ADDs from that operand.

At least in our tests here this works out so that register
scavenging picks the same register as the base pointer.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D109405
2021-09-08 09:23:33 -07:00
Kazu Hirata 5c6338de16 [RISCV] Fix "set but not used" warnings 2021-09-07 09:19:31 -07:00
Peter Smith e63455d5e0 [MC] Use local MCSubtargetInfo in writeNops
On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.

On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.

For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.

This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.

I've attempted to take into account the in tree experimental backends.

Differential Revision: https://reviews.llvm.org/D45962
2021-09-07 15:46:19 +01:00
Fraser Cormack a823bdf3ab [RISCV][VP] Custom lower VP_STORE and VP_LOAD
This patch adds support for the vector-predicated `VP_STORE` and
`VP_LOAD` nodes. We do this in the same way we lower `MSTORE` and
`MLOAD`: to regular load/store instructions via intrinsics.

One necessary change was made to `SelectionDAGLegalize` so that
`VP_STORE` nodes' operation actions are taken from the stored "value"
operands, in the same vein as `STORE` or `MSTORE`.

Reviewed By: craig.topper, rogfer01

Differential Revision: https://reviews.llvm.org/D108999
2021-09-07 10:53:25 +01:00
Fraser Cormack f4dee8cb82 [RISCV][VP] Custom lower VP_SCATTER and VP_GATHER
This patch adds support for the `VP_SCATTER` and `VP_GATHER` nodes by
lowering them to RVV's `vsox`/`vlux` instructions, respectively. This
process is almost identical to the existing `MSCATTER`/`MGATHER` support.

One extra change was made to `SelectionDAGLegalize` so that
`VP_SCATTER`'s operation action is derived from its stored "value"
operand rather than its return type (which is always the chain).

Reviewed By: craig.topper, rogfer01

Differential Revision: https://reviews.llvm.org/D108987
2021-09-07 10:43:07 +01:00
Craig Topper 75620fadf5 [RISCV] Change how we encode AVL operands in vector pseudoinstructions to use GPRNoX0.
This patch changes the register class to avoid accidentally setting
the AVL operand to X0 through MachineIR optimizations.

There are cases where we really want to use X0, but we can't get that
past the MachineVerifier with the register class as GPRNoX0. So I've
use a 64-bit -1 as a sentinel for X0. All other immediate values should
be uimm5. I convert it to X0 at the earliest possible point in the VSETVLI
insertion pass to avoid touching the rest of the algorithm. In
SelectionDAG lowering I'm using a -1 TargetConstant to hide it from
instruction selection and treat it differently than if the user
used -1. A user -1 should be selected to a register since it doesn't
fit in uimm5.

This is the rest of the changes started in D109110. As mentioned there,
I don't have a failing test from MachineIR optimizations anymore.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D109116
2021-09-03 09:19:25 -07:00
Alexander Pivovarov 6cd4b508a8 [RISCV] Add SiFive core S51
Add SiFive core s51 as rv64imac RocketModel

Reviewed-By: MaskRay, evandro
Differential Revision: https://reviews.llvm.org/D108886
2021-09-02 18:45:25 -07:00
Alexander Pivovarov 1104e3258b Fix typo in RISCVMatInt.cpp comments 2021-09-02 18:11:09 -07:00
Craig Topper b5fd6b46f5 [RISCV] Teach instruction selection to elide sext.w in some cases.
If a sext_inreg is up for isel, and all its users are W instructions,
we can skip emitting the sext_inreg. This helpful if the producing
instruction can't become a W instruction.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D108966
2021-09-02 07:54:34 -07:00
Evandro Menezes 5ebdb07e7e [RISCV] Enable shrink wrap by default
Differential Revision: https://reviews.llvm.org/D109037
2021-09-02 09:47:58 -05:00
Craig Topper e4e69ba4d1 [RISCV] Split PseudoVSETVLI into 2 instructions to allow different register classes for rs1.
X0 has special meaning for vsetvli, we need to make sure we never
create it a vsetvli that uses it by accident. This could happen
if the register coalescer coalesces a copy from X0 into this
instruction.

This patch splits the instruction so that we can have GPRNoX0
register class to use for the cases where we don't want the source
to be X0. The verifier won't let us explicitly use X0 on a GPRNoX0
operand so we need a separate pseudo for those cases.

I don't currently have a failing example for this. There was a
failure in D107957, but the coalescable copy from that example
should have been optimized away much earlier so I've fixed that.

This is not a complete fix. We still need to prevent the same
possible issue on the AVL operand of all of the vector instruction
pseudos. I don't want to make two versions of all of those so we
need to find a different solution for those. I have an idea I'm
going to try.

Differential Revision: https://reviews.llvm.org/D109110
2021-09-02 07:45:31 -07:00
Alexander Pivovarov 4b04d54206 [RISCV] Fix typo in RISCVSchedSiFive7.td
Fix typo in "microarchitecure".

Differential Revision: https://reviews.llvm.org/D109006
2021-09-01 16:39:48 -05:00
Craig Topper ccbb4c8b4f [RISCV] Fold (RISCVISD::SELECT_CC X, Y, CC, Z, Z) -> Z.
If the true and false values are the same, we don't need a SELECT_CC.

This would normally be folded before a select is legalized to
select_cc. The test case exploits the late legalization of vscale
to trigger a case where they become identical after legalization.

This works around an issue found on a test case in D107957. In that
case the true/false values were both eventually 0 and the select was
used by a vector AVL operand. The select_cc got expanded to control
flow and a phi, but the phi inputs were both copies from X0. MachineIR
optimizations simplified this to a single copy from X0 going into the
vector instruction. This became the input of a vsetvli after vsetvli
insertion. Then register coalescing folded the copy into the vsetvli.
X0 as the source of a vsetvli is a special encoding and should not be
created by coalesing. We need to fix our vsetvli handling to make sure
this can never happen any other way, but removing the unneeded select
is still a worthwhile optimization.
2021-09-01 12:37:52 -07:00
Luke a78dd726f4 [SLP][RISCV] Implement unsigned getMinVectorRegisterBitWidth() for RISCV
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D108973
2021-09-01 14:25:15 +08:00
Nick Desaulniers e9b3f25730 [RISCVISelLowering] avoid emitting libcalls to __mulodi4() and __multi3()
Similar to D108842, D108844, D108926, D108928, and D108936.

__has_builtin(builtin_mul_overflow) returns true for 32b RISCV targets,
but Clang is deferring to compiler RT when encountering long long types.

If the semantics of __has_builtin mean "the compiler resolves these,
always" then we shouldn't conditionally emit a libcall.

Link: https://bugs.llvm.org/show_bug.cgi?id=28629

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D108939
2021-08-31 11:23:56 -07:00
Nikita Popov c1b7540645 [TTI] Sink IVDescriptors.h include (NFC)
Forward declare RecurrenceDescriptor and include IVDescritor.h
only in implementation code that actually needs it.
2021-08-30 22:41:58 +02:00
Craig Topper 0560a4adb3 [RISCV] Enable CONCAT_VECTORS for fixed FP vectors.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D108487
2021-08-30 08:47:45 -07:00
Craig Topper dbf0d8118c [RISCV] Use ~0ULL instead of ~0U when checking for invalid ErrorInfo.
ErrorInfo is a uint64_t and is initialized to all 1s.

Not sure how to test this. Noticed while working on .insn support.
2021-08-27 12:30:33 -07:00
Philipp Krones 54e8cae565 [MC][RISCV] Add RISCV MCObjectFileInfo
This makes sure, that the text section will have a 2-byte alignment, if
the +c extension is enabled.

Reviewed By: MaskRay, luismarques

Differential Revision: https://reviews.llvm.org/D102052
2021-08-27 18:23:29 +01:00
Craig Topper 0eeab8b282 [RISCV] Add -riscv-v-fixed-length-vector-elen-max to limit the ELEN used for fixed length vectorization.
This adds an ELEN limit for fixed length vectors. This will scalarize
any elements larger than this. It will also disable some fractional
LMULs. For example, if ELEN=32 then mf8 becomes illegal, i32/f32
vectors can't use any fractional LMULs, i16/f16 can only use mf2,
and i8 can use mf2 and mf4.

We may also need something for the scalable vectors, but that has
interactions with the intrinsics and we can't scalarize a scalable
vector.

Longer term this should come from one of the Zve* features
2021-08-27 10:17:35 -07:00
Craig Topper 1b9417454e [RISCV] Insert a sext_inreg when type legalizing i32 shl by constant on RV64.
Similar to what we do for add/sub/mul.

This can help remove some sext.w. There are some regressions on
some bswap tests, but I have an idea how to fix that for a follow up.

A new PACKW pattern is added to handle the new sext_inreg placement.

Differential Revision: https://reviews.llvm.org/D108663
2021-08-26 10:20:19 -07:00
Ben Shi f69fb7ac72 [DAGCombiner] Add target hook function to decide folding (mul (add x, c1), c2)
Reviewed by: lebedev.ri, spatel, craig.topper, luismarques, jrtc27

Differential Revision: https://reviews.llvm.org/D107711
2021-08-22 16:53:32 +08:00
Ben Shi 5b6c9a5ab0 [RISCV] Optimize add in the zba extension with SH*ADD
Optimize (add x, c) to (SH*ADD (c>>b), x) if c is not simm12
while (c>>b) is simm12 and c has b trailing zeros.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D108193
2021-08-20 22:41:49 +08:00
Fraser Cormack 5b06cbac11 [RISCV] Fix reporting of incorrect commutable operand indices
This patch fixes an issue where RISCV's `findCommutedOpIndices` would
incorrectly return the pseudo `CommuteAnyOperandIndex` as a commutable
operand index, rather than fixing a specific index.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D108206
2021-08-20 10:27:15 +01:00
Craig Topper 36d8316cc8 [RISCV] Reduce duplicate code for calling SimplifyDemandedBits.
This encapsulates the APInt creation and worklist management into
a helper function.

To keep one common interface I've use Log2_32 in places that
previously created a mask by subtracting 1 from a power of 2.

Differential Revision: https://reviews.llvm.org/D108324
2021-08-19 07:09:38 -07:00
Craig Topper 3f9b37ccb1 [RISCV] Remove sext_inreg+add/sub/mul/shl isel patterns.
Let the sext_inreg be selected to sext.w. Remove unneeded sext.w
during PostProcessISelDAG.

This gives opportunities for some other isel patterns to match
like the ADDIPair or matching mul with immediate to shXadd.

This becomes possible after D107658 started selecting W instructions
based on users. The sext.w will be considered a W user so isel
will often select a W instruction for the sext.w input and we can
just remove the sext.w. Otherwise we can combine the sext.w with
a ADD/SUB/MUL/SLLI to create a new W instruction in parallel
to the the original instruction.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D107708
2021-08-18 11:07:11 -07:00
Craig Topper 6d7ea597ef [RISCV] Insert sext_inreg when type legalizing add/sub/mul with constant LHS.
We already do this for non-constants RHS. This just removes the
special case. I believe the special case may have been needed
because the ANY_EXTEND of a constant used to create zero extended
constants, but we recently changed that to produce sign extended
constants.

D107658 is needed to prevent some regressions.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D107697
2021-08-18 10:44:25 -07:00
Craig Topper 20e6265873 [RISCV] Improve constant materialization for stores of i16 or i32 negative constants.
DAGCombiner::visitStore can clear the upper bits of constants
used by stores. This leads prevents them from being recognized as
sign extended negative values making them more expensive to
materialize.

This patch uses the hasAllNBitUsers method from D107658 to make
a negative constant if none of the users care about the upper bits.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D108052
2021-08-18 10:25:12 -07:00
Craig Topper d9ba1a9c5c [RISCV] Teach isel to select ADDW/SUBW/MULW/SLLIW when only the lower 32-bits are used.
We normally select these when the root node is a sext_inreg, but
SimplifyDemandedBits can sometimes bypass the sext_inreg for some
users. This can create situation where sext_inreg+add/sub/mul/shl
is selected to a W instruction, and then the add/sub/mul/shl is
separately selected to a non-W instruction with the same inputs.

This patch tries to detect when it would still be ok to use a W
instruction without the sext_inreg by checking the direct users.
This can allow the W instruction to CSE with one created for a
sext_inreg+add/sub/mul/shl. To minimize complexity and cost of
checking, we make no attempt to determine if the CSE will happen
and just always use a W instruction when we can.

Differential Revision: https://reviews.llvm.org/D107658
2021-08-18 10:22:00 -07:00
Craig Topper f70238914a [RISCV] Add zext.h/zext.w to RISCVTTIImpl::getIntImmCostInst.
If we have these instructions, we don't need to hoist the immediate
for an AND that would match them.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D107783
2021-08-18 09:40:40 -07:00
Craig Topper 8f6cea43e7 [RISCV] Use RISCV::RVVBitsPerBlock for RGK_ScalableVector in getRegisterBitWidth.
I might be wrong, but I think this is should be width of the known
min size we use for scalable vectors. It shouldn't scale with
minimum vlen.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D107945
2021-08-17 11:13:15 -07:00
Craig Topper d63f117210 [RISCV] Support RISCVISD::SELECT_CC in ComputeNumSignBitsForTargetNode. 2021-08-13 18:00:09 -07:00
Craig Topper 79fbddbea0 [RISCV] Teach vsetvli insertion pass that it doesn't need to insert vsetvli for unit-stride or strided loads/stores in some cases.
For unit-stride and strided load/stores we set the SEW operand of
the pseudo instruction equal the EEW in the opcode. The LMUL
of the pseudo instruction is the LMUL we want.

These instructions calculate EMUL=(EEW/SEW) * LMUL. We can use
this to avoid changing vtype if the SEW/LMUL of the previous
vtype matches the EEW/EMUL ratio we need for the instruction.

Due to how the global analysis works, we can only do this
optimization when the previous vsetvli was produced in the block
containing the store. We need to know in the first phase if the
vsetvli will be inserted so we can propagate information to
the successors in the second phase correctly. This means we can't
depend on predecessors.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D106601
2021-08-12 10:05:27 -07:00
Craig Topper 6f5edc3487 [RISCV] Fold (add (select lhs, rhs, cc, 0, y), x) -> (select lhs, rhs, cc, x, (add x, y))
Similar for sub except sub isn't commutative.

Modify the existing and/or/xor folds to also work on ISD::SELECT
and not just RISCVISD::SELECT_CC. This is needed to make sure
we do this transform before type legalization turns i32 add/sub
into add/sub+sign_extend_inreg on RV64. If we don't do this before
that, the sign_extend_inreg will still be after the select.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D107603
2021-08-10 09:02:56 -07:00
Fraser Cormack 2b4a1d4b86 [RISCV] Improve codegen for shuffles with LHS/RHS splats
Shuffles which are broken into separate halves reveal splats in which
a half is accessed via one index; such operations can be optimized to
use "vrgather.vi".

This optimization could be achieved by adding extra patterns to match
`vrgather_vv_vl` which uses a splat as an index operand, but this patch
instead identifies splat earlier. This way, future optimizations can
build on top of the data gathered here, e.g., to splat-gather dominant
indices and insert any leftovers.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D107449
2021-08-09 10:31:40 +01:00
Craig Topper 2f3b738960 [RISCV] Add optimizations for FMV_X_ANYEXTH similar to FMV_X_ANYEXTW_RV64.
This enables the fneg and fabs combines we have for FMV_X_ANYEXTW_RV64.
2021-08-08 18:30:48 -07:00
Craig Topper 88bc29f5f2 [RISCV] Introduce a RISCV CondCode enum instead of using ISD:SET* in MIR. NFC
Previously we converted ISD condition codes to integers and stored
them directly in our MIR instructions. The ISD enum kind of belongs
to SelectionDAG so that seems like incorrect layering.

This patch instead uses a CondCode node on RISCV::SELECT_CC until
isel and then converts it from ISD encoding to a RISCV specific value.
This value can be converted to/from the RISCV branch opcodes in the
RISCV namespace.

My larger motivation is to possibly support a microarchitectural
feature of some CPUs where a short forward branch over a single
instruction can be predicated internally. This will require a new
pseudo instruction for select that needs to carry a branch condition
and live probably until RISCVExpandPseudos. At that point it can be
expanded to control flow without other instructions ending up in the
predicated basic block. Using an ISD encoding in RISCVExpandPseudos
doesn't seem like correct layering.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D107400
2021-08-08 17:25:37 -07:00
Craig Topper 20dfe051ab [RISCV] Move the $rs operand of PseudoStore from outs to ins. NFC
This is the data to be stored so it should be an input.

To keep operand order similar between loads and stores, move the temp
register to the first dest operand of floating point loads. Rework
the assembler code accordingly.

This doesn't have any functional effect because this Pseudo is only
used by the assembler which doesn't use ins/outs.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D107309
2021-08-08 15:58:24 -07:00