llvm-project

Commit Graph

Author	SHA1	Message	Date
Haojian Wu	7ed68182d7	Fix a -Wswitch warning.	2022-09-13 08:57:43 +02:00
jacquesguan	b98b4fae75	[RISCV] Add cost model for compare and select instructions. This patch adds a cost model for vector compare and select instructions. For vector FP compare instructions, it only adds the comparisons supported natively. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D132296	2022-09-13 14:44:46 +08:00
Yeting Kuo	5fcb5d7759	[RISCV] Add assertion of hasVecPolicyOp to catch masked intrinsic without policy operand. The original code may have incorrect result if there is a masked instruction without policy operand to make us set its policy to TUMU. The patch adds an assertion to catch the instruction. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D133302	2022-09-13 10:09:49 +08:00
Craig Topper	d49280e0a4	[RISCV] Rename WriteFALU* and ReadFALU* to WriteFAdd/ReadFAdd. ALU seems a little vague. FAdd felt more precise even though it also includes FSUB instructions. Reviewed By: monkchiang Differential Revision: https://reviews.llvm.org/D133632	2022-09-12 09:37:28 -07:00
Craig Topper	4186a49d79	[RISCV] Custom type legalize i32 loads by sign extending. The default is to use extload which can become a zextload or sextload if it is followed by an 'and' or sext_inreg. Sometimes type legalization will introduce an 'and' from promoting something like 'srl X, C' and a sext_inreg from a setcc. The 'and' could be freely folded with the promoted 'srl' by using srliw, but the sext_inreg can't be folded into a compare. DAG combiner will see both of these choices and may decide to fold the 'and' instead of the 'sext_inreg'. This forces the sext_inreg to become a sext.w. By picking sextload in the type legalizer we take this choice away. Looking at spec2006 compiled with Zba and Zbb this appeared to be a net reduction in lines of code in the objdump disassembly output. This is similar to what we do with i32 add/sub/mul/shl in type legalization where we always emit a sext_inreg. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D130397	2022-09-12 09:13:07 -07:00
Alex Bradbury	51ae462447	[RISCV] Add the GlobalMerge pass (disabled by default) Split out from D129178, this just adds the GlobalMerge tests (other than global-merge-minsize.ll which is testing a specific configuration of the pass when it's enabled) and exposes `-riscv-enable-global-merge` and //doesn't enable it by default//. Note that the comment "// FIXME: Unify control over GlobalMerge." is copied from the Arm and AArch64 backends, which expose the same flag. Presumably the author is imagining some later refactoring that provides a target-independent flag. Reviewed By: craig.topper, reames, hiraditya Differential Revision: https://reviews.llvm.org/D130481	2022-09-08 18:40:38 -07:00
Craig Topper	5f3a8b585b	[RISCV] Add RecurKind::FMulAdd to isLegalToVectorizeReduction for scalable vectors. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D133511	2022-09-08 12:34:59 -07:00
Joe Loser	5e96cea1db	[llvm] Use std::size instead of llvm::array_lengthof LLVM contains a helpful function for getting the size of a C-style array: `llvm::array_lengthof`. This is useful prior to C++17, but not as helpful for C++17 or later: `std::size` already has support for C-style arrays. Change call sites to use `std::size` instead. Differential Revision: https://reviews.llvm.org/D133429	2022-09-08 09:01:53 -06:00
liqinweng	9b4e75ee76	[RISCV][COST] Add cost model for mask vector select instruction when its condition is a scalar type Reviewed By: jacquesguan Differential Revision: https://reviews.llvm.org/D132992	2022-09-08 18:55:49 +08:00
Philip Reames	a4a29438f4	[RISCV][MC] Add minimal support for Ztso extension This is a minimalist implementation which simply adds the extension (in the experimental namespace since it's not ratified), and wires up the setting of the required ELF header flag. Future changes will include codegen changes to exploit the stronger memory model. This is intended to implement v0.1 of the proposed specification which can be found in Chapter 25 of https://github.com/riscv/riscv-isa-manual/releases/download/draft-20220723-10eea63/riscv-spec.pdf. Differential Revision: https://reviews.llvm.org/D133239	2022-09-07 09:30:57 -07:00
Craig Topper	5d30565d80	[RISCV] Improve vector fround lowering by changing FRM. This is a follow up to D133238 which did this for ceil/floor. Reviewed By: arcbbb, frasercrmck Differential Revision: https://reviews.llvm.org/D133335	2022-09-06 09:33:13 -07:00
Craig Topper	f0332d12ae	[RISCV] Improve vector fceil/ffloor lowering by changing FRM. This adds new VFCVT pseudoinstructions that take a rounding mode operand. A custom inserter is used to insert additional instructions to change FRM around the VFCVT. Some of this is borrowed from D122860, but takes a somewhat different direction. We may migrate to that patch, but for now I was trying to keep this as independent from RVV intrinsics as I could. A followup patch will use this approach for FROUND too. Still need to fix the cost model. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D133238	2022-09-05 19:03:44 -07:00
Craig Topper	11881a8f3f	[RISCV] Rename some V extension multiclasses for consistency. NFC Using "SDNode" in the name is the convention for the VLMax patterns in RISCVInstrInfoVSDPatterns.td. This file uses "VL".	2022-09-01 22:17:08 -07:00
Alex Bradbury	6e1897ce95	[RISCV][NFC] Fix typo in comment in RISCVInstrInfoZicbo.td Zicbop->Zicbom typo.	2022-09-01 13:49:55 +01:00
liqinweng	c45810f810	[RISCV] ISD::SETUGT && Imm == -1 has been processed before lowering. The ISD::SETUGT && Imm == -1 case is already handled before lowering, so use an assert in its place. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D132373	2022-09-01 15:38:16 +08:00
Craig Topper	6e0ae7e940	[RISCV] Slightly simplify code in combineVWADD_W_VL_VWSUB_W_VL and combineMUL_VLToVWMUL_VL. NFC Use computeMaxSignificantBits instead of ComputeNumSignBits. Create APInt as part of call to MaskedValueIsZero instead of creating a named temporary.	2022-08-31 15:02:03 -07:00
Michael Maitland	30a4264f5f	[RISCV][CodeGen] add assertion to RISCVTargetStreamer getTargetStreamer() X86 and ARM AsmParsers have this same assertion. This assertion provides better reporting when the RISCVTargetStreamer is null and helps to prevent null pointer access. Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D132863	2022-08-31 11:15:47 -07:00
jacquesguan	45c1ce321d	[RISCV] Add cost model for select and integer compare instructions. This patch adds cost model for vector select and integer compare instructions.	2022-08-31 11:32:58 +08:00
Craig Topper	7973346d16	[RISCV] Use uint64_t countTrailingZeros/Ones instead of APInt. NFC We know the type is 32 or 64 bits, we can use getZExtValue and bypass the slow path check in APInt.	2022-08-30 12:39:36 -07:00
Craig Topper	893f5e95e2	[RISCV] Improve isel of AND with shiftedMask containing 32 leading zeros and some trailing zeros. We can use srliw to shift out the trailing bits and slli to shift back in zeros. The sign extend of srliw will 0 the upper 32 bits since we will be shifting a 0 into bit 31.	2022-08-30 12:22:46 -07:00
liqinweng	72c9f811d8	[RISCV][COST] Refactor for costs of integer saturating add/sub Reviewed By: reames Differential Revision: https://reviews.llvm.org/D132822	2022-08-30 11:39:55 +08:00
Craig Topper	e25eb61d03	[RISCV] Enable (srl (and X, C2), C) to form SRLIW in more cases. Don't require the AND has one use and don't depend on targetShrinkDemandedConstant turning C2 into 0xffffffff. Instead, check that the constant is 0xffffffff after replacing any bits that will be shifted out with 1s. Another way to fix this might be to prevent SimplifyDemandedBits from destroying the ANDI after type legalization using targetShrinkDemandedBits. That would prevent the CSE that created this mess. targetShrinkDemandedBits is currently only enabled after legalize ops. Quick experiment shows we can't just change when it runs, we would need to try a different heuristic for post type legalization.	2022-08-29 15:52:08 -07:00
Craig Topper	0fbe71e91f	[RISCV] Use hasAllWUsers to recover ANDI. SimplifyDemandedBits can 0 the upper bits and targetShrinkDemandedConstant isn't always able to recover it. At least part of that may be because targetShrinkDemandedConstant only runs in the last DAGCombine. Might be worth seeing what happens if we move it post type legalization.	2022-08-29 14:11:09 -07:00
Craig Topper	1c334b306e	[RISCV] Add more invertible setccs to tryDemorganOfBooleanCondition. This builds on D132771 to invert (setlt 0, X) to (setlt X, 1) and vice versa. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D132798	2022-08-29 12:23:03 -07:00
Craig Topper	9d12bb77f9	[RISCV] Apply DeMorgan to (beqz (and/or (seteq), (xor Z, 1))) to remove the xor. We can rewrite to (bnez (or/and (setne), Z)) if Z is 0/1. Alternatively, we could canonicalize to (xor (or/and (setne), Z), 1) even if there is no branch. The xor would not always get removed, but it might enable other DeMorgan combines. I decided to be conservative for this first patch and require the xor to be removed. I have a couple other invertible setccs I will add in a follow up patch. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D132771	2022-08-29 12:16:34 -07:00
Craig Topper	2f811a6c7f	[VP][RISCV] Add vp.fabs intrinsic and RISC-V support. Mostly just modeled after vp.fneg except there is a "functional instruction" for fneg while fabs is always an intrinsic. Reviewed By: fakepaper56 Differential Revision: https://reviews.llvm.org/D132793	2022-08-29 09:32:06 -07:00
Craig Topper	e0a9da2562	[RISCV] Add Uses=[FRM] and mayRaiseFPException to VF(N/W)CVT instructions. Reviewed By: arcbbb, kito-cheng Differential Revision: https://reviews.llvm.org/D132792	2022-08-29 09:26:33 -07:00
Craig Topper	6732896bbf	[RISCV] Use analyzeBranch in RISCVRedundantCopyElimination. The existing code was incorrect if we had more than one conditional branch instruction in a basic block. Though I don't think that will occur, using analyzeBranch detects that as an unsupported case. Overall this results in simpler code in RISCVRedundantCopyElimination. Reviewed By: reames, kito-cheng Differential Revision: https://reviews.llvm.org/D132347	2022-08-29 09:05:53 -07:00
Yeting Kuo	abf0416328	[RISCV] Merge vmerge.vvm and unmasked intrinsic with VLMAX vector length. The motivation of this patch is to lower the IR pattern (vp.merge mask, (add x, y), false, vl) to (PseudoVADD_VV_<LMUL>_MASK false, x, y, mask, vl). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D131841	2022-08-29 11:44:51 +08:00
liqinweng	6fdd62c98d	[RISCV] Remove unused code Reviewed By: benshi001 Differential Revision: https://reviews.llvm.org/D132281	2022-08-29 10:16:44 +08:00
liqinweng	a42e21deb8	[RISCV] Refactor for costs of integer min/max Reviewed By: reames Differential Revision: https://reviews.llvm.org/D132724	2022-08-29 10:13:50 +08:00
Kazu Hirata	2833760c57	[Target] Qualify auto in range-based for loops (NFC)	2022-08-28 17:35:09 -07:00
Benjamin Kramer	b69086e6c7	[RISC-V][HWASAN] Fold variable into assert	2022-08-29 00:32:37 +02:00
Alexey Baturo	e3485345d3	[RISC-V][HWASAN] Add support for lowering HWASAN intrinsic for RISC-V Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D131343	2022-08-28 21:22:13 +03:00
Alexey Baturo	0636aec330	[RISC-V][HWASAN] Add intrinsics required for HWASAN support for RISC-V Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D131340	2022-08-28 18:05:43 +03:00
Philip Reames	b45a262679	[RISCV] Enable fixed length vectors and loop vectorization with same This change enables the use of RISCV's variable length vector registers for fixed length vectors in the IR, and implicitly enables various IR transforms which generate fixed length vectors if legal (e.g. LoopVectorize). Specifically, this enables fixed length vectors which are known to be inbounds of the underlying variable hardware size. For context, remember that the +V extension provides a minimum VLEN of 128. The embedded variants provide lower minimums. The analogy here is essentially vectorizing for SSE on a machine which may or may not include AVX2/AVX512. We won't get full utilization by default, but we will get some benefit. And of course, with an explicit mcpu we can vectorize to the exact target hardware. The LV impact is mostly related to vectorizer robustness. In cases we haven't yet fully implemented scalable vectorization support, we can fall back to fixed length vectorization. SLP has been disabled for now, even when fixed vectors are enabled. See `a310637` and associated review. There are a few additional code quality issues which need to be worked through before turning SLP on would be reasonable. Differential Revision: https://reviews.llvm.org/D131508	2022-08-26 14:45:23 -07:00
Philip Reames	a310637132	[RISCV] Disable SLP vectorization by default due to unresolved profitability issues This change implements a TTI query with the goal of disabling slp vectorization on RISCV. The current default configuration disables SLP already, but it's currently tied to the ability to lower fixed length vectors. Over in D131508, I want to enable fixed length vectors for purposes of LoopVectorizer, but preliminary analysis has revealed a couple of SLP specific issues we need to resolve before enabling it by default. This change exists to allow us to enable LV without SLP. Differential Revision: https://reviews.llvm.org/D132680	2022-08-26 14:11:22 -07:00
Yunze Zhu	3846e3970f	[RISCV] Generate correct ELF abi flag when empty .ll file has target-abi attribute In patch D121183, the target ABI is read from the .ll file's target-abi attribute and set in the RISCVAsmPrinter::emitFunctionEntryLabel function. In https://github.com/llvm/llvm-project/issues/57242, an ABI mismatch error may be caused by failing to call RISCVAsmPrinter::emitFunctionEntryLabel to set the target ABI to the correct one when the .ll file is empty or a module has no functions. This patch moves the target-abi setting to RISCVAsmPrinter::emitStartOfAsmFile, making sure that every .ll file and every module in LTO reads target-abi from the module flag and sets it, with or without functions. Signed-off-by: xiaojing.zhang <xiaojing.zhang@xcalibyte.com> Signed-off-by: jianxin.lai <jianxin.lai@xcalibyte.com> Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D132204	2022-08-26 14:39:39 +08:00
LiaoChunyu	6b098bf35a	[RISCV] Add support for simm10_lsb0000nonzero operand. When running llvm-exegesis on a RISC-V machine, I ran into a problem: C_ADDI16SP can't be measured because its immediate has type simm10_lsb0000nonzero. This patch adds support for processing this immediate operand type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D132650	2022-08-26 14:37:37 +08:00
Craig Topper	e4177201eb	[RISCV][M68k] Replace fixed size BitVector with std::bitset. Saves a heap allocation and avoids an explicit call to the BitVector constructor. Reviewed By: reames, myhsu Differential Revision: https://reviews.llvm.org/D132674	2022-08-25 12:45:08 -07:00
Craig Topper	41a3b5739b	[RISCV] Teach combineDeMorganOfBoolean to handle (and (xor X, 1), (not Y)). SimplifyDemandedBits tries to aggressively turn xor immediates into -1 to match a 'not' instruction. In this case, because X is a boolean, the upper bits of (xor X, 1) are known to be 0. Because this is an AND instruction, that means those bits aren't demanded from the other operand, and thus SimplifyDemandedBits can turn (xor Y, 1) to (not Y). We need to detect that this has happened to enable the DeMorgan optimization. To do this we allow one of the xors to use -1 when the outer operation is And. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D132671	2022-08-25 10:55:45 -07:00
Philip Reames	53f738ce7e	[RISCV] Add empirical costs for integer min/max and saturating add/sub All of these are lowered to a single instruction for all legal vector types.	2022-08-25 09:27:17 -07:00
Craig Topper	ec91d761ac	[RISCV] Apply DeMorgan's law to (and/or (xor X, 1), (xor Y, 1)) if X and Y are 0/1. This optimizes xors that appear due to legalizing setge/setle which require an xor with 1. This reduces the number of xors and may allow the xor to fold with a beqz or bnez. Differential Revision: https://reviews.llvm.org/D132614	2022-08-25 08:49:30 -07:00
Philip Reames	03798f268b	[RISCV] Backout cttz/ctlz instruction costs Craig points out correctly in post-commit review that these depend on the availability of floating point extensions.	2022-08-24 15:40:48 -07:00
Philip Reames	d4d6e71ea2	[RISCV] Add empirical costs for bswap/bitreverse/ctpop/ctlz/cttz If anyone is looking for a source of ideas on vector codegen improvements, the lowerings for several of these seem to include pretty obvious fixits.	2022-08-24 15:09:21 -07:00
Philip Reames	42af1a776a	[RISCV] Add empirically measured vector sqrt intrinsic costs	2022-08-24 14:27:57 -07:00
Philip Reames	4d3134866f	[RISCV] Add vector fabs intrinsic costs We have a fabs vector instruction, and are using it for current lowering.	2022-08-24 14:09:51 -07:00
Saleem Abdulrasool	8f45b5a7a9	RISCV: permit unaligned nop-slide padding emission We may be requested to emit an unaligned nop sequence (e.g. 7-bytes or 3-bytes). These should be 0-filled even though that is not a valid instruction. This matches the behaviour on other architectures like ARM, X86, and MIPS. When a custom section is emitted, it may be classified as text even though it may be a data section or we may be emitting data into a text segment (e.g. a literal pool). In such cases, we should be resilient to the emission request. This was originally identified by the Linux kernel build and reported on D131270 by Nathan Chancellor. Differential Revision: https://reviews.llvm.org/D132482 Reviewed By: luismarques Tested By: Nathan Chancellor	2022-08-24 20:26:48 +00:00
Simon Pilgrim	f9de13232f	[X86] Promote i8/i16 CTTZ (BSF) instructions and remove speculation branch This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis. For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling. Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU. Differential Revision: https://reviews.llvm.org/D132520	2022-08-24 17:28:18 +01:00
Kito Cheng	8e8a62006e	[RISCV][NFC] Minor cleanup in RISCVInstrInfo::getOutliningType The only use of TM is checking result of TargetMachine::getFunctionSections, check that directly instead of introducing a local variable.	2022-08-24 23:42:34 +08:00
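
Commit 893f5e95e2 above describes selecting an AND with a shifted mask (32 leading zeros, some trailing zeros) as an srliw followed by an slli. The arithmetic behind that claim can be checked with a small model. This is only a sketch in Python, not the LLVM implementation: the helper names `srliw`, `slli`, `and_shifted_mask`, and `via_shifts` are hypothetical, and the instruction semantics are modeled from the RV64 definitions (srliw shifts the low 32 bits logically and sign-extends the 32-bit result).

```python
MASK64 = (1 << 64) - 1  # all registers are modeled as unsigned 64-bit values


def srliw(x, shamt):
    """Model of RV64 srliw: logical right shift of the low 32 bits,
    with the 32-bit result sign-extended to 64 bits."""
    lo = (x & 0xFFFFFFFF) >> shamt
    if lo & 0x80000000:  # bit 31 set: sign-extend into the upper 32 bits
        lo |= 0xFFFFFFFF00000000
    return lo


def slli(x, shamt):
    """Model of RV64 slli: 64-bit logical left shift."""
    return (x << shamt) & MASK64


def and_shifted_mask(x, trailing_zeros):
    """Reference result: AND with a mask that has 32 leading zeros,
    `trailing_zeros` trailing zeros, and ones in between."""
    mask = 0xFFFFFFFF & ~((1 << trailing_zeros) - 1)
    return x & mask


def via_shifts(x, trailing_zeros):
    """The srliw + slli sequence described in the commit message."""
    return slli(srliw(x, trailing_zeros), trailing_zeros)
```

For any shift amount from 1 to 31, srliw shifts a 0 into bit 31, so its sign extension clears the upper 32 bits, and the slli then restores the trailing zeros. That is why `via_shifts(x, c)` agrees with `and_shifted_mask(x, c)` in this model.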

2346 Commits