
2346 Commits

Author SHA1 Message Date
Haojian Wu 7ed68182d7 Fix a -Wswitch warning. 2022-09-13 08:57:43 +02:00
jacquesguan b98b4fae75 [RISCV] Add cost model for compare and select instructions.
This patch adds a cost model for vector compare and select instructions. For vector FP compare instructions, it only adds the comparisons supported natively.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D132296
2022-09-13 14:44:46 +08:00
Yeting Kuo 5fcb5d7759 [RISCV] Add assertion of hasVecPolicyOp to catch masked intrinsic without policy operand.
The original code may produce an incorrect result if a masked instruction
without a policy operand causes us to set its policy to TUMU. This patch adds
an assertion to catch such an instruction.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D133302
2022-09-13 10:09:49 +08:00
Craig Topper d49280e0a4 [RISCV] Rename WriteFALU* and ReadFALU* to WriteFAdd*/ReadFAdd*.
ALU seems a little vague. FAdd felt more precise even though it
also includes FSUB instructions.

Reviewed By: monkchiang

Differential Revision: https://reviews.llvm.org/D133632
2022-09-12 09:37:28 -07:00
Craig Topper 4186a49d79 [RISCV] Custom type legalize i32 loads by sign extending.
The default is to use extload which can become a zextload or
sextload if it is followed by an 'and' or sext_inreg.

Sometimes type legalization will introduce an 'and' from promoting
something like 'srl X, C' and a sext_inreg from a setcc. The
'and' could be freely folded with the promoted 'srl' by using srliw,
but the sext_inreg can't be folded into a compare. DAG combiner
will see both of these choices and may decide to fold the 'and'
instead of the 'sext_inreg'. This forces the sext_inreg to become
a sext.w.

By picking sextload in the type legalizer we take this choice away.
Looking at spec2006 compiled with Zba and Zbb, this appeared to be a
net reduction in lines of code in the objdump disassembly output.

This is similar to what we do with i32 add/sub/mul/shl in
type legalization where we always emit a sext_inreg.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D130397
2022-09-12 09:13:07 -07:00
Alex Bradbury 51ae462447 [RISCV] Add the GlobalMerge pass (disabled by default)
Split out from D129178, this just adds the GlobalMerge tests (other than global-merge-minsize.ll which is testing a specific configuration of the pass when it's enabled) and exposes `-riscv-enable-global-merge` and doesn't enable it by default.

Note that the comment "// FIXME: Unify control over GlobalMerge." is copied from the Arm and AArch64 backends, which expose the same flag. Presumably the author is imagining some later refactoring that provides a target-independent flag.

Reviewed By: craig.topper, reames, hiraditya

Differential Revision: https://reviews.llvm.org/D130481
2022-09-08 18:40:38 -07:00
Craig Topper 5f3a8b585b [RISCV] Add RecurKind::FMulAdd to isLegalToVectorizeReduction for scalable vectors.
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D133511
2022-09-08 12:34:59 -07:00
Joe Loser 5e96cea1db [llvm] Use std::size instead of llvm::array_lengthof
LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.

Change call sites to use `std::size` instead.

Differential Revision: https://reviews.llvm.org/D133429
2022-09-08 09:01:53 -06:00
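
As a minimal sketch of the replacement described in 5e96cea1db: `std::size` (C++17, from `<iterator>`) accepts a C-style array directly, so a hand-written length helper is no longer needed. The table `ExtensionNames` and the helper `arrayLengthOf` are hypothetical names used only for illustration; `arrayLengthOf` stands in for the pre-C++17 style and is not LLVM's actual definition.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator> // std::size (C++17)

// Hypothetical table, for illustration only.
static constexpr const char *ExtensionNames[] = {"zba", "zbb", "zbs"};

// Pre-C++17 style helper: deduce the array bound from the reference type.
// This is an illustrative stand-in, not LLVM's implementation.
template <typename T, std::size_t N>
constexpr std::size_t arrayLengthOf(T (&)[N]) {
  return N;
}

// C++17 style: std::size handles C-style arrays without a custom helper.
constexpr std::size_t numExtensionNames() { return std::size(ExtensionNames); }

// Both spellings agree at compile time.
static_assert(std::size(ExtensionNames) == arrayLengthOf(ExtensionNames),
              "std::size and the helper must report the same length");
```

Because both forms are `constexpr`, the swap at a call site should not change generated code; it only removes a project-specific utility.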
liqinweng 9b4e75ee76 [RISCV][COST] Add cost model for mask vector select instruction when its condition is a scalar type
Reviewed By: jacquesguan

Differential Revision: https://reviews.llvm.org/D132992
2022-09-08 18:55:49 +08:00
Philip Reames a4a29438f4 [RISCV][MC] Add minimal support for Ztso extension
This is a minimalist implementation which simply adds the extension (in the experimental namespace since it's not ratified), and wires up the setting of the required ELF header flag. Future changes will include codegen changes to exploit the stronger memory model.

This is intended to implement v0.1 of the proposed specification which can be found in Chapter 25 of https://github.com/riscv/riscv-isa-manual/releases/download/draft-20220723-10eea63/riscv-spec.pdf.

Differential Revision: https://reviews.llvm.org/D133239
2022-09-07 09:30:57 -07:00
Craig Topper 5d30565d80 [RISCV] Improve vector fround lowering by changing FRM.
This is a follow up to D133238 which did this for ceil/floor.

Reviewed By: arcbbb, frasercrmck

Differential Revision: https://reviews.llvm.org/D133335
2022-09-06 09:33:13 -07:00
Craig Topper f0332d12ae [RISCV] Improve vector fceil/ffloor lowering by changing FRM.
This adds new VFCVT pseudoinstructions that take a rounding mode operand. A custom inserter is used to insert additional instructions to change FRM around the
VFCVT.

Some of this is borrowed from D122860, but takes a somewhat different direction. We may migrate to that patch, but for now I was trying to keep this as independent from
RVV intrinsics as I could.

A followup patch will use this approach for FROUND too.

Still need to fix the cost model.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D133238
2022-09-05 19:03:44 -07:00
Craig Topper 11881a8f3f [RISCV] Rename some V extension multiclasses for consistency. NFC
Using "SDNode" in the name is the convention for the VLMax patterns
in RISCVInstrInfoVSDPatterns.td. This file uses "VL".
2022-09-01 22:17:08 -07:00
Alex Bradbury 6e1897ce95 [RISCV][NFC] Fix typo in comment in RISCVInstrInfoZicbo.td
Zicbop->Zicbom typo.
2022-09-01 13:49:55 +01:00
liqinweng c45810f810 [RISCV] When ISD::SETUGT && Imm == -1, has processed before lowering
The case ISD::SETUGT && Imm == -1 has already been processed before lowering. Replace the check with an assert.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D132373
2022-09-01 15:38:16 +08:00
Craig Topper 6e0ae7e940 [RISCV] Slightly simplify code in combineVWADD_W_VL_VWSUB_W_VL and combineMUL_VLToVWMUL_VL. NFC
Use computeMaxSignificantBits instead of ComputeNumSignBits. Create
APInt as part of call to MaskedValueIsZero instead of creating
a named temporary.
2022-08-31 15:02:03 -07:00
Michael Maitland 30a4264f5f [RISCV][CodeGen] add assertion to RISCVTargetStreamer getTargetStreamer()
X86 and ARM AsmParsers have this same assertion. This assertion provides better reporting when the RISCVTargetStreamer is null and helps to prevent null pointer access.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D132863
2022-08-31 11:15:47 -07:00
jacquesguan 45c1ce321d [RISCV] Add cost model for select and integer compare instructions.
This patch adds a cost model for vector select and integer compare instructions.
2022-08-31 11:32:58 +08:00
Craig Topper 7973346d16 [RISCV] Use uint64_t countTrailingZeros/Ones instead of APInt. NFC
We know the type is 32 or 64 bits, we can use getZExtValue and
bypass the slow path check in APInt.
2022-08-30 12:39:36 -07:00
Craig Topper 893f5e95e2 [RISCV] Improve isel of AND with shiftedMask containing 32 leading zeros and some trailing zeros.
We can use srliw to shift out the trailing bits and slli to shift
back in zeros. The sign extend of srliw will 0 the upper 32 bits
since we will be shifting a 0 into bit 31.
2022-08-30 12:22:46 -07:00
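
The identity behind 893f5e95e2 can be checked with a small model. This is a sketch only: `srliw` and `slli` below are hand-written models of the RV64 instructions, and `shiftedMask` and `andViaShifts` are hypothetical helper names, not functions from the backend. It assumes the mask has exactly 32 leading zeros and between 1 and 31 trailing zeros.

```cpp
#include <cassert>
#include <cstdint>

// Model of RV64 `srliw`: logical right shift of the low 32 bits, with the
// 32-bit result sign-extended to 64 bits.
inline uint64_t srliw(uint64_t X, unsigned Shamt) {
  assert(Shamt < 32 && "srliw takes a 5-bit shift amount");
  uint64_t Lo = static_cast<uint32_t>(X) >> Shamt;
  if (Lo & 0x80000000ULL) // only possible when Shamt == 0
    Lo |= 0xFFFFFFFF00000000ULL;
  return Lo;
}

// Model of RV64 `slli`: plain 64-bit logical left shift.
inline uint64_t slli(uint64_t X, unsigned Shamt) {
  assert(Shamt < 64 && "slli takes a 6-bit shift amount");
  return X << Shamt;
}

// Mask with 32 leading zeros and TZ trailing zeros (hypothetical helper).
inline uint64_t shiftedMask(unsigned TZ) {
  assert(TZ > 0 && TZ < 32);
  return (0xFFFFFFFFULL >> TZ) << TZ;
}

// X & shiftedMask(TZ) computed without materializing the mask.
// Since TZ > 0, srliw shifts a 0 into bit 31, so its sign extension clears
// the upper 32 bits; slli then shifts zeros back into the low TZ bits.
inline uint64_t andViaShifts(uint64_t X, unsigned TZ) {
  assert(TZ > 0 && TZ < 32);
  return slli(srliw(X, TZ), TZ);
}
```

For example, `andViaShifts(X, 8)` should agree with `X & 0xFFFFFF00` for any 64-bit `X`.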
liqinweng 72c9f811d8 [RISCV][COST] Refactor for costs of integer saturating add/sub
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D132822
2022-08-30 11:39:55 +08:00
Craig Topper e25eb61d03 [RISCV] Enable (srl (and X, C2), C) to form SRLIW in more cases.
Don't require the AND has one use and don't depend on
targetShrinkDemandedConstant turning C2 into 0xffffffff. Instead,
check that the constant is 0xffffffff after replacing any bits
that will be shifted out with 1s.

Another way to fix this might be to prevent SimplifyDemandedBits
from destroying the ANDI after type legalization using
targetShrinkDemandedBits. That would prevent the CSE that created
this mess. targetShrinkDemandedBits is currently only enabled after
legalize ops. A quick experiment shows we can't just change when it
runs, we would need to try a different heuristic for post type
legalization.
2022-08-29 15:52:08 -07:00
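
The check described in e25eb61d03 can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: `canFormSRLIW` is a hypothetical name, `srliw` is a hand-written model of the instruction, and the shift amount is assumed to be in the range 1 to 31.

```cpp
#include <cassert>
#include <cstdint>

// Model of RV64 `srliw`: logical right shift of the low 32 bits, with the
// 32-bit result sign-extended to 64 bits.
inline uint64_t srliw(uint64_t X, unsigned Shamt) {
  assert(Shamt < 32 && "srliw takes a 5-bit shift amount");
  uint64_t Lo = static_cast<uint32_t>(X) >> Shamt;
  if (Lo & 0x80000000ULL) // only possible when Shamt == 0
    Lo |= 0xFFFFFFFF00000000ULL;
  return Lo;
}

// Hypothetical predicate: can (srl (and X, C2), C) be selected as srliw X, C?
// The low C bits of C2 are shifted out anyway, so force them to 1 and then
// require the mask to be exactly 0xffffffff.
inline bool canFormSRLIW(uint64_t C2, unsigned C) {
  assert(C > 0 && C < 32 && "sketch assumes a shift amount of 1..31");
  uint64_t ShiftedOut = (1ULL << C) - 1;
  return (C2 | ShiftedOut) == 0xFFFFFFFFULL;
}
```

With `C2 = 0xFFFFFFF0` and `C = 4` the predicate holds, and `(X & C2) >> 4` matches `srliw(X, 4)`; with `C2 = 0xFFFF0000` it does not hold, because bits that survive the shift are cleared by the mask.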
Craig Topper 0fbe71e91f [RISCV] Use hasAllWUsers to recover ANDI.
SimplifyDemandedBits can 0 the upper bits and targetShrinkDemandedConstant
isn't always able to recover it.

At least part of that may be because targetShrinkDemandedConstant
only runs in the last DAGCombine. Might be worth seeing what happens
if we move it post type legalization.
2022-08-29 14:11:09 -07:00
Craig Topper 1c334b306e [RISCV] Add more invertible setccs to tryDemorganOfBooleanCondition.
This builds on D132771 to invert (setlt 0, X) to (setlt X, 1) and
vice versa.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D132798
2022-08-29 12:23:03 -07:00
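
The inversion used in 1c334b306e relies on a simple fact about signed integers: `0 < X` and `X < 1` are logical complements. A minimal sketch, with hypothetical function names that model the two setcc forms on 64-bit signed values:

```cpp
#include <cassert>
#include <cstdint>

// Models (setlt 0, X): true when X is strictly positive.
inline bool setltZeroX(int64_t X) { return 0 < X; }

// Models (setlt X, 1): true when X is zero or negative.
inline bool setltXOne(int64_t X) { return X < 1; }

// The two conditions are complements for every signed X, so inverting one
// yields the other without needing an extra xor.
inline bool areComplements(int64_t X) { return setltZeroX(X) != setltXOne(X); }
```

Checking the boundary values 0 and 1, along with the minimum and maximum representable values, is enough to see that no case is missed.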
Craig Topper 9d12bb77f9 [RISCV] Apply DeMorgan to (beqz (and/or (seteq), (xor Z, 1))) to remove the xor.
We can rewrite to (bnez (or/and (setne), Z)) if Z is 0/1.

Alternatively, we could canonicalize to (xor (or/and (setne), Z), 1)
even if there is no branch. The xor would not always get removed,
but it might enable other DeMorgan combines. I decided to be
conservative for this first patch and require the xor to be removed.

I have a couple other invertible setccs I will add in a follow up
patch.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D132771
2022-08-29 12:16:34 -07:00
Craig Topper 2f811a6c7f [VP][RISCV] Add vp.fabs intrinsic and RISC-V support.
Mostly just modeled after vp.fneg except there is a
"functional instruction" for fneg while fabs is always an
intrinsic.

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D132793
2022-08-29 09:32:06 -07:00
Craig Topper e0a9da2562 [RISCV] Add Uses=[FRM] and mayRaiseFPException to VF(N/W)CVT instructions.
Reviewed By: arcbbb, kito-cheng

Differential Revision: https://reviews.llvm.org/D132792
2022-08-29 09:26:33 -07:00
Craig Topper 6732896bbf [RISCV] Use analyzeBranch in RISCVRedundantCopyElimination.
The existing code was incorrect if we had more than one conditional
branch instruction in a basic block. Though I don't think that will
occur, using analyzeBranch detects that as an unsupported case.

Overall this results in simpler code in RISCVRedundantCopyElimination.

Reviewed By: reames, kito-cheng

Differential Revision: https://reviews.llvm.org/D132347
2022-08-29 09:05:53 -07:00
Yeting Kuo abf0416328 [RISCV] Merge vmerge.vvm and unmasked intrinsic with VLMAX vector length.
The motivation of this patch is to lower the IR pattern
(vp.merge mask, (add x, y), false, vl) to
(PseudoVADD_VV_<LMUL>_MASK false, x, y, mask, vl).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D131841
2022-08-29 11:44:51 +08:00
liqinweng 6fdd62c98d [RISCV] Remove unused code
Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D132281
2022-08-29 10:16:44 +08:00
liqinweng a42e21deb8 [RISCV] Refactor for costs of integer min/max
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D132724
2022-08-29 10:13:50 +08:00
Kazu Hirata 2833760c57 [Target] Qualify auto in range-based for loops (NFC) 2022-08-28 17:35:09 -07:00
Benjamin Kramer b69086e6c7 [RISC-V][HWASAN] Fold variable into assert 2022-08-29 00:32:37 +02:00
Alexey Baturo e3485345d3 [RISC-V][HWASAN] Add support for lowering HWASAN intrinsic for RISC-V
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D131343
2022-08-28 21:22:13 +03:00
Alexey Baturo 0636aec330 [RISC-V][HWASAN] Add intrinsics required for HWASAN support for RISC-V
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D131340
2022-08-28 18:05:43 +03:00
Philip Reames b45a262679 [RISCV] Enable fixed length vectors and loop vectorization with same
This change enables the use of RISCV's variable length vector registers for fixed length vectors in the IR, and implicitly enables various IR transforms which generate fixed length vectors if legal (e.g. LoopVectorize). Specifically, this enables fixed length vectors which are known to be inbounds of the underlying variable hardware size.

For context, remember that the +V extension provides a minimum VLEN of 128. The embedded variants provide lower minimums. The analogy here is essentially vectorizing for SSE on a machine which may or may not include AVX2/AVX512. We won't get full utilization by default, but we will get some benefit. And of course, with an explicit mcpu we can vectorize to the exact target hardware.

The LV impact is mostly related to vectorizer robustness. In cases we haven't yet fully implemented scalable vectorization support, we can fall back to fixed length vectorization.

SLP has been disabled for now, even when fixed vectors are enabled. See a310637 and associated review. There are a few additional code quality issues which need to be worked through before turning SLP on would be reasonable.

Differential Revision: https://reviews.llvm.org/D131508
2022-08-26 14:45:23 -07:00
Philip Reames a310637132 [RISCV] Disable SLP vectorization by default due to unresolved profitability issues
This change implements a TTI query with the goal of disabling SLP vectorization on RISCV. The current default configuration disables SLP already, but it is currently tied to the ability to lower fixed length vectors. Over in D131508, I want to enable fixed length vectors for purposes of LoopVectorizer, but preliminary analysis has revealed a couple of SLP specific issues we need to resolve before enabling it by default. This change exists to allow us to enable LV without SLP.

Differential Revision: https://reviews.llvm.org/D132680
2022-08-26 14:11:22 -07:00
Yunze Zhu 3846e3970f [RISCV] Generate correct ELF abi flag when empty .ll file has target-abi attribute
In patch D121183, the target ABI is read from the .ll file's target-abi
attribute and set in the RISCVAsmPrinter::emitFunctionEntryLabel
function. In https://github.com/llvm/llvm-project/issues/57242,
an ABI mismatch error may be caused by failing to call
RISCVAsmPrinter::emitFunctionEntryLabel to set target-abi to the
correct one when the .ll file is empty or a module has no functions.

This patch moves the target-abi setting to
RISCVAsmPrinter::emitStartOfAsmFile, making sure every .ll file and
every module in LTO reads target-abi from the module flag and sets it,
with or without functions.

Signed-off-by: xiaojing.zhang <xiaojing.zhang@xcalibyte.com>
Signed-off-by: jianxin.lai <jianxin.lai@xcalibyte.com>

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D132204
2022-08-26 14:39:39 +08:00
LiaoChunyu 6b098bf35a [RISCV] : Add support for simm10_lsb0000nonzero operand.
When running llvm-exegesis on a RISC-V machine, I ran into a problem: C_ADDI16SP can't be measured, because its immediate has type simm10_lsb0000nonzero.

This patch adds support for processing this immediate operand type.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D132650
2022-08-26 14:37:37 +08:00
Craig Topper e4177201eb [RISCV][M68k] Replace fixed size BitVector with std::bitset.
Saves a heap allocation and avoids an explicit call to the BitVector constructor.

Reviewed By: reames, myhsu

Differential Revision: https://reviews.llvm.org/D132674
2022-08-25 12:45:08 -07:00
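
A minimal sketch of the idea in e4177201eb: when the number of bits is a compile-time constant, `std::bitset` keeps its storage inside the object, so no heap allocation is needed. `NumRegs` and `reservedRegs` are hypothetical names chosen for illustration, and the register numbers are arbitrary.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical fixed register count, for illustration only.
constexpr std::size_t NumRegs = 32;

// Build a fixed-size set of reserved registers. The size is part of the
// type, so the bits live inline in the returned object.
inline std::bitset<NumRegs> reservedRegs() {
  std::bitset<NumRegs> Reserved; // value-initialized to all zeros
  Reserved.set(0);               // arbitrary example entries
  Reserved.set(2);
  return Reserved;
}
```

The trade-off is that the size must be known at compile time; a dynamically sized container is still needed when the bit count is only known at run time.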
Craig Topper 41a3b5739b [RISCV] Teach combineDeMorganOfBoolean to handle (and (xor X, 1), (not Y)).
SimplifyDemandedBits tries to aggressively turn xor immediates into -1
to match a 'not' instruction. In this case, because X is a boolean, the
upper bits of (xor X, 1) are known to be 0. Because this is an AND
instruction, that means those bits aren't demanded from the other
operand, and thus SimplifyDemandedBits can turn (xor Y, 1) to (not Y).

We need to detect that this has happened to enable the DeMorgan
optimization. To do this we allow one of the xors to use -1 when
the outer operation is And.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D132671
2022-08-25 10:55:45 -07:00
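
The reasoning in 41a3b5739b can be checked exhaustively for boolean inputs. This sketch uses hypothetical function names and plain 64-bit integers to model the DAG nodes; it only demonstrates why the `not` form is equivalent under an AND, not how the combine is implemented.

```cpp
#include <cassert>
#include <cstdint>

// (and (xor X, 1), (xor Y, 1)): the form before SimplifyDemandedBits.
inline uint64_t andXorXor(uint64_t X, uint64_t Y) { return (X ^ 1) & (Y ^ 1); }

// (and (xor X, 1), (not Y)): the form after (xor Y, 1) became (not Y).
inline uint64_t andXorNot(uint64_t X, uint64_t Y) { return (X ^ 1) & ~Y; }

// The DeMorgan result: (xor (or X, Y), 1).
inline uint64_t xorOfOr(uint64_t X, uint64_t Y) { return (X | Y) ^ 1; }

// For X, Y in {0, 1}: (xor X, 1) has zero upper bits, so only bit 0 of the
// other AND operand matters, and bit 0 of ~Y equals bit 0 of Y ^ 1.
inline bool formsAgreeForBooleans() {
  for (uint64_t X = 0; X <= 1; ++X)
    for (uint64_t Y = 0; Y <= 1; ++Y)
      if (andXorNot(X, Y) != andXorXor(X, Y) ||
          andXorNot(X, Y) != xorOfOr(X, Y))
        return false;
  return true;
}
```

Note that the equivalence depends on the outer operation being an AND; under an OR the upper bits of `~Y` would reach the result.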
Philip Reames 53f738ce7e [RISCV] Add empirical costs for integer min/max and saturating add/sub
All of these are lowered to a single instruction for all legal vector types.
2022-08-25 09:27:17 -07:00
Craig Topper ec91d761ac [RISCV] Apply DeMorgan's law to (and/or (xor X, 1), (xor Y, 1)) if X and Y are 0/1.
This optimizes xors that appear due to legalizing setge/setle which
require an xor with 1. This reduces the number of xors and may
allow the xor to fold with a beqz or bnez.

Differential Revision: https://reviews.llvm.org/D132614
2022-08-25 08:49:30 -07:00
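
The rewrite in ec91d761ac is DeMorgan's law restricted to 0/1 values, which can be verified exhaustively. A minimal sketch with hypothetical function names; integers stand in for the DAG values, and the point is only to show that two xors collapse into one.

```cpp
#include <cassert>
#include <cstdint>

// Before: two xors feeding an and/or.
inline uint64_t andOfInverted(uint64_t X, uint64_t Y) { return (X ^ 1) & (Y ^ 1); }
inline uint64_t orOfInverted(uint64_t X, uint64_t Y) { return (X ^ 1) | (Y ^ 1); }

// After: a single xor applied to the opposite operation.
inline uint64_t invertedOr(uint64_t X, uint64_t Y) { return (X | Y) ^ 1; }
inline uint64_t invertedAnd(uint64_t X, uint64_t Y) { return (X & Y) ^ 1; }

// Check both directions of DeMorgan's law for every 0/1 input pair.
inline bool deMorganHoldsForBooleans() {
  for (uint64_t X = 0; X <= 1; ++X)
    for (uint64_t Y = 0; Y <= 1; ++Y)
      if (andOfInverted(X, Y) != invertedOr(X, Y) ||
          orOfInverted(X, Y) != invertedAnd(X, Y))
        return false;
  return true;
}
```

The restriction to 0/1 matters: for wider values `X ^ 1` only flips bit 0, so it is not a full logical negation and the identity no longer holds in general.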
Philip Reames 03798f268b [RISCV] Backout cttz/ctlz instruction costs
Craig points out correctly in post-commit review that these depend on the availability of floating point extensions.
2022-08-24 15:40:48 -07:00
Philip Reames d4d6e71ea2 [RISCV] Add empirical costs for bswap/bitreverse/ctpop/ctlz/cttz
If anyone is looking for a source of ideas on vector codegen improvements, the lowerings for several of these seem to include pretty obvious fixits.
2022-08-24 15:09:21 -07:00
Philip Reames 42af1a776a [RISCV] Add empirically measured vector sqrt intrinsic costs 2022-08-24 14:27:57 -07:00
Philip Reames 4d3134866f [RISCV] Add vector fabs intrinsic costs
We have a fabs vector instruction, and are using it for current lowering.
2022-08-24 14:09:51 -07:00
Saleem Abdulrasool 8f45b5a7a9 RISCV: permit unaligned nop-slide padding emission
We may be requested to emit an unaligned nop sequence (e.g. 7-bytes or
3-bytes).  These should be 0-filled even though that is not a valid
instruction.  This matches the behaviour on other architectures like
ARM, X86, and MIPS.  When a custom section is emitted, it may be
classified as text even though it may be a data section or we may be
emitting data into a text segment (e.g. a literal pool).  In such cases,
we should be resilient to the emission request.

This was originally identified by the Linux kernel build and reported on
D131270 by Nathan Chancellor.

Differential Revision: https://reviews.llvm.org/D132482
Reviewed By: luismarques
Tested By: Nathan Chancellor
2022-08-24 20:26:48 +00:00
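
The padding behaviour described in 8f45b5a7a9 can be sketched roughly as below. This is a simplified illustration and not the backend's code: `nopPadding` is a hypothetical name, the compressed `c.nop` case is ignored, and placing the zero bytes before the nops is an assumption of this sketch. The 4-byte nop is `addi x0, x0, 0`, encoded as 0x00000013 and stored little-endian.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Produce Count bytes of padding. Bytes that cannot hold a whole 4-byte nop
// are zero-filled (not a valid instruction, but harmless as data); the rest
// is filled with 4-byte nops.
inline std::vector<uint8_t> nopPadding(uint64_t Count) {
  std::vector<uint8_t> Out;
  Out.reserve(Count);

  // Zero-fill the leftover 1 to 3 bytes (assumed to come first).
  for (uint64_t I = 0, E = Count % 4; I != E; ++I)
    Out.push_back(0x00);

  // addi x0, x0, 0 == 0x00000013, little-endian byte order.
  for (uint64_t I = 0, E = Count / 4; I != E; ++I) {
    Out.push_back(0x13);
    Out.push_back(0x00);
    Out.push_back(0x00);
    Out.push_back(0x00);
  }
  return Out;
}
```

A 7-byte request therefore yields three zero bytes followed by one nop, and a 3-byte request yields only zeros, rather than failing because the size is not a multiple of the instruction width.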
Simon Pilgrim f9de13232f [X86] Promote i8/i16 CTTZ (BSF) instructions and remove speculation branch
This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis.

For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling.

Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU.

Differential Revision: https://reviews.llvm.org/D132520
2022-08-24 17:28:18 +01:00
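
The promotion trick described in f9de13232f can be illustrated with a portable model. This is a sketch under assumptions: `cttz32NonZero` stands in for a 32-bit BSF-like operation whose result is undefined for zero, and `cttz8` is a hypothetical name for the promoted i8 operation. Setting the bit just above the i8 msb guarantees a non-zero source, so no zero check is required.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zeros of a non-zero 32-bit value. Like BSF, the result for
// zero is not defined here, so the precondition is asserted.
inline unsigned cttz32NonZero(uint32_t V) {
  assert(V != 0 && "source must be known non-zero");
  unsigned N = 0;
  while ((V & 1u) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// i8 cttz promoted to 32 bits: OR in bit 8 (one above the i8 msb) so the
// source is never zero. For V == 0 the answer is 8, the i8 bit width.
inline unsigned cttz8(uint8_t V) {
  return cttz32NonZero(static_cast<uint32_t>(V) | 0x100u);
}
```

The same approach extends to i16 by setting bit 16 instead of bit 8.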
Kito Cheng 8e8a62006e [RISCV][NFC] Minor cleanup in RISCVInstrInfo::getOutliningType
The only use of TM is checking the result of TargetMachine::getFunctionSections,
so check that directly instead of introducing a local variable.
2022-08-24 23:42:34 +08:00