llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	41759c3d92	[RISCV] Add RISCVISD::BR_CC similar to RISCVISD::SELECT_CC. This allows me to introduce similar combines for branches as we have recently added for SELECT_CC. Some of them are less useful for standalone setccs and only help branch instructions. By having a BR_CC node its easier to only affect branches. I'm using CondCodeSDNode to make isel patterns easier to write so we can refer to the codes by name. SELECT_CC uses a constant instead. I've translated the condition code just like SELECT_CC so we need less patterns for the swapped conditions. This includes special cases for X < 1 and X > -1 that get translated to blez and bgez by using a 0 constant. computeKnownBitsForTargetNode support for SELECT_CC is added to allow MaskedValueIsZero to work for cases where the true and false values of the SELECT_CC are setccs and the result of the SELECT_CC is used by a BR_CC. This was needed to avoid regressions in some of the overflow tests. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98159	2021-03-15 11:54:01 -07:00
Philipp Tomsich	018e96f71f	[RISCV] Add isel-patterns to optimize (a < 1) into blez (a <= 0) The following code-sequence showed up in a testcase (isolated from SPEC2017) for if-conversion and vectorization when searching for the maximum in an array: addi a2, zero, 1 blt a1, a2, .LBB0_5 which can be expressed as `bge zero,a1,.LBB0_5`/`blez a1,/LBB0_5`. More generally, we want to express (a < 1) as (a <= 0). This adds the required isel-pattern and updates the testcases. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98449	2021-03-15 11:32:43 -07:00
Craig Topper	3dc5b533e0	[RISCV] Improve legalization of i32 UADDO/USUBO on RV64. The default legalization uses zero extends that require pair of shifts on RISCV. Instead we can take advantage of the fact that unsigned compares work equally well on sign extended inputs. This allows us to use addw/subw and sext.w. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98233	2021-03-15 09:30:23 -07:00
Fraser Cormack	0c5b789c73	[RISCV] Support fixed-length vectors in the calling convention This patch adds fixed-length vector support to the calling convention when RVV is used to lower fixed-length vectors. The scheme follows the regular vector calling convention for the argument/return registers, but uses scalable vector container types as the LocVTs, and converts to/from the fixed-length vector value types as required. Fixed-length vector types may be split when the combination of minimum VLEN and the maximum allowable LMUL is not large enough to fully contain the vector. In this case the behaviour differs between fixed-length vectors passed as parameters and as return values: 1. For return values, vectors must be passed entirely via registers or via the stack. 2. For parameters, unlike scalar values, split vectors continue to be passed by value, and are split across multiple registers until there are no remaining registers. Thus vector parameters may be found partly in registers and partly on the stack. As with scalable vectors, the first fixed-length mask vector is passed via v0. Split mask fixed-length vectors are passed first via v0 and then via the next available vector register: v8,v9,etc. The handling of vector return values uses all available argument registers v8-v23 which does not adhere to the calling convention we're supposedly implementing, but since this issue affects both fixed-length and scalable-vector values, it was left as-is. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97954	2021-03-15 10:43:51 +00:00
Hsiangkai Wang	a81dff1e58	[RISCV] Support inline asm for vector instructions. Types of fractional LMUL and LMUL=1 are all using VR register class. When using inline asm, it will use the first type in the register class as the type for the register. It is not necessary the same as the value type. We need to use INSERT_SUBVECTOR/EXTRACT_SUBVECToR/BITCAST to make it legal to put the value in the corresponding register class. Differential Revision: https://reviews.llvm.org/D97480	2021-03-15 11:02:18 +08:00
Craig Topper	fcdf7f6224	[RISCV] Give an explicit error if 'generic' CPU is passed instead of 'generic-rv32' or 'generic-rv64'. Validate 64Bit feature against the triple. I encountered a project that uses llvm that passes "generic" by default. While I could fix that project, I wouldn't be surprised if other projects did something similar. So it seems like a good idea to provide a better error here. I've also added validation of the 64Bit feature against the triple so that we can catch a mismatched CPU before failing in a mysterious way. We can make it pretty far in isel because we calculate XLenVT from the triple and use that to set up the legal integer type. Reviewed By: luismarques, khchen Differential Revision: https://reviews.llvm.org/D98307	2021-03-14 17:21:31 -07:00
luxufan	a9b9c64fd4	change rvv frame layout This patch change the rvv frame layout that proposed in D94465. In patch D94465, In the eliminateFrameIndex function, to eliminate the rvv frame index, create temp virtual register is needed. This virtual register should be scavenged by class RegsiterScavenger. If the machine function has other unused registers, there is no problem. But if there isn't unused registers, we need a emergency spill slot. Because of the emergency spill slot belongs to the scalar local variables field, to access emergency spill slot, we need a temp virtual register again. This makes the compiler report the "Incomplete scavenging after 2nd pass" error. So I change the rvv frame layout as follows: ``` \|--------------------------------------\| \| arguments passed on the stack \| \|--------------------------------------\|<--- fp \| callee saved registers \| \|--------------------------------------\| \| rvv vector objects(local variables \| \| and outgoing arguments \| \|--------------------------------------\| \| realignment field \| \|--------------------------------------\| \| scalar local variable(also contains\| \| emergency spill slot) \| \|--------------------------------------\|<--- bp \| variable-sized local variables \| \|--------------------------------------\|<--- sp ``` Differential Revision: https://reviews.llvm.org/D97111	2021-03-13 16:05:55 +08:00
Craig Topper	51151828ac	[RISCV] Teach normaliseSetCC to canonicalize X > -1 to X >= 0 and X < 1 to 0 >= X. This allows the use of BGE with X0 instead of puting -1/1 in a register. Reviewed By: jrtc27 Differential Revision: https://reviews.llvm.org/D98542	2021-03-12 11:50:10 -08:00
Craig Topper	45d3ed0304	[RISCV] Add support for scalable vector masked load/store. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98460	2021-03-12 10:32:33 -08:00
Fraser Cormack	641f5700f9	[RISCV] Optimize INSERT_VECTOR_ELT sequences This patch optimizes the codegen for INSERT_VECTOR_ELT in various ways. Primarily, it removes the use of vslidedown during lowering, and the vector element is inserted entirely using vslideup with a custom VL and slide index. Additionally, lowering of i64-element vectors on RV32 has been optimized in several ways. When the 64-bit value to insert is the same as the sign-extension of the lower 32-bits, the codegen can follow the regular path. When this is not possible, a new sequence of two i32 vslide1up instructions is used to get the vector element into a vector. This sequence was suggested by @craig.topper. From there, the value is slid into the final position for more consistent lowering across RV32 and RV64. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98250	2021-03-12 09:13:38 +00:00
Fraser Cormack	4d2d5855c7	[RISCV] Fix up stale VECREDUCE comments. NFC. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98399	2021-03-12 08:49:46 +00:00
Craig Topper	1d26bbcf9b	[RISCV] Return false from isShuffleMaskLegal except for splats. We don't support any other shuffles currently. This changes the bswap/bitreverse tests that check for this in their expansion code. Previously we expanded a byte swapping shuffle through memory. Now we're scalarizing and doing bit operations on scalars to swap bytes. In the future we can probably use vrgather.vx to do a byte swap shuffle.	2021-03-11 20:02:49 -08:00
Craig Topper	c82f442954	[RISCV] Support fixed vector copysign. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98394	2021-03-11 09:57:24 -08:00
Craig Topper	0dff8a9627	[RISCV] Handle vmv.x.s intrinsic for i64 vectors on RV32. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98372	2021-03-11 09:39:50 -08:00
Craig Topper	9c841cb8e8	[RISCV] Support extract_vector_elt for fixed and scalable masked registers. This uses a really simple approach of converting to an i8 vector and extracting. This is probably not the best approach especially if you know the index is constant. Other ideas: -Store to stack temporary using vse1, load as scalar and shift. -Sort of bitcast the vector to a vector of i8, slide down the appropriate 8 bit element, copy to scalar, shift down the correct bit within the 8 bits we extracted. Not exactly sure how to describe such a bitcast from i1 vector to i8 vector within the type system for elements less than 8. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98310	2021-03-11 09:26:44 -08:00
Craig Topper	9106d04554	[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32. On riscv32, i64 isn't a legal scalar type but we would like to support scalable vectors of i64. This patch introduces a new node that can represent a splat made of multiple scalar values. I've used this new node to solve the current crashes we experience when getConstant is used after type legalization. For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS when needed and then handling the SPLAT_VECTOR_PARTS later during LegalizeOps. I've remove the special case I previously put in for ABS for D97991 as the default expansion is now able to succesfully use getConstant. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98004	2021-03-10 09:46:18 -08:00
Craig Topper	0c73a506e8	[RISCV] Starting fixing issues that prevent us from testing vXi64 intrinsics on RV32. Currently we crash in type legalization any time an intrinsic uses a scalar i64 on RV32. This patch adds support for type legalizing this to prevent crashing. I don't promise that it uses the best possible codegen just that it is functional. This first version handles 3 cases. vmv.v.x intrinsic, vmv.s.x intrinsic and intrinsics that take a scalar input, splat it and then do some operation. For vmv.v.x we'll either rely on hardware sign extension for constants or we'll convert it to multiple splats and bit manipulation. For vmv.s.x we use a really unoptimal sequence inspired by what we do for an INSERT_VECTOR_ELT. For the third case we'll either try to use the .vi form for constants or convert to a complicated splat and bitmanip and use the .vv form of the operation. I've renamed the ExtendOperand field to SplatOperand now use it specifically for the third case. The first two cases are handled by custom lowering specifically for those intrinsics. I haven't updated all tests yet, but I tried to cover a subset that includes single-width, widening, and narrowing. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97895	2021-03-10 09:45:38 -08:00
Craig Topper	1e39118638	[RISCV] Manually split vector operands to VECREDUCE when handling vXi64 vectors on RV32. The type legalizer will visit the result before the operands. To avoid creating an illegal target specific node or falling back to scalarization, we need to manually split vector operands. This still doesn't handle the case of non-power of 2 operands which need to be widened. I'm not sure the type legalizer is ready for it. I think we would need to insert an INSERT_SUBVECTOR with the power of 2 type we want, with an undef first operand, and the non-power of 2 orignal operand as the vector to insert. Then fill in the neutral elements into the elements the padded elements. Alternatively we INSERT_SUBVECTOR into a neutral vector. From there we carry on splitting if needed to get to a legal type then do the target specific code. The problem with this is the type legalizer doesn't know how to widen an insert_subvector yet. We would need to add that including the handling for a non-undef first vector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98292	2021-03-10 09:27:38 -08:00
Craig Topper	351844edf1	[RISCV] Add support for VECTOR_REVERSE for scalable vector types. I've left mask registers to a future patch as we'll need to convert them to full vectors, shuffle, and then truncate. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97609	2021-03-09 10:03:45 -08:00
Craig Topper	77ac3166e5	[RISCV] Add support for fixed vector reductions. I've included tests that require type legalization to split the vector. The i64 version of these scalarizes on RV32 due to type legalization visiting the result before the vector type. So we have to abort our custom expansion to avoid creating target specific nodes with an illegal type. Then type legalization ends up scalarizing. We might be able to fix this by doing custom splitting for large vectors in our handler to get down to a legal type. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98102	2021-03-09 09:39:59 -08:00
Craig Topper	1c7ad4dd88	[RISCV] Don't modify the SEW immediate on the V extension pseudo instructions after inserting VSETVLI. Previously we set the value to -1, but the SEW information could be useful for scheduling. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D98062	2021-03-09 09:02:19 -08:00
Craig Topper	72ecf2f43f	[RISCV] Optimize fixed vector ABS. Fix crash on scalable vector ABS for SEW=64 with RV32. The default fixed vector expansion uses sra+xor+add since it can't see that smax is legal due to our custom handling. So we select smax(X, sub(0, X)) manually. Scalable vectors are able to use the smax expansion automatically for most cases. It crashes in one case because getConstant can't build a SPLAT_VECTOR for nxvXi64 when i64 scalars aren't legal. So we manually emit a SPLAT_VECTOR_I64 for that case. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97991	2021-03-09 08:51:03 -08:00
Craig Topper	478317fbb7	[RISCV] Make the hasStdExtM() check in RISCVInstrInfo::getVLENFactoredAmount emit a diagnostic rather than an assert. As far as I know we're not enforcing the StdExtM must be enabled to use the V extension. If we use an assert here and hit this code in a release build we'll silently emit an invalid instruction. By using a diagnostic we report the error to the user in release builds. I think there may still be a later fatal error from the code emitter though. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97970	2021-03-09 08:50:02 -08:00
ShihPo Hung	5cdb2e9860	[RISCV][MC] Fix nf encoding for vector ld/st whole register The three bit nf is one less than the number of NFIELDS, so we manually decrement 1 for VS1/2/4/8R & VL1/2/4/8R. Reviewed By: craig.topper Differential revision: https://reviews.llvm.org/D98185	2021-03-08 19:30:24 -08:00
Craig Topper	7a64cc4a76	[RISCV] Make use of DAG.getNeutralElement in lowerVECREDUCE to avoid repeating the same list of constants. NFC Reviewed By: frasercrmck, khchen Differential Revision: https://reviews.llvm.org/D98091	2021-03-08 09:11:10 -08:00
Craig Topper	a2651266c5	[RISCV] Add explicit i64 types to RV64 isel patterns to stop tablegen from generating unneeded i32 patterns for RV32 HwMode.	2021-03-08 09:06:56 -08:00
Fraser Cormack	18173c57bd	[RISCV] Add new entry points to getContainerForFixedLengthVector While working on adding fixed-length vectors to the calling convention, it was necessary to be able to query for a fixed-length vector container type without access to an instance of SelectionDAG. This patch modifies the "main" getContainerForFixedLengthVector function to use an instance of TargetLowering rather than SelectionDAG, and preserves the SelectionDAG overload as a wrapper. An additional non-static version of the function was also added to simplify the common case in RISCVTargetLowering. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97925	2021-03-08 09:26:19 +00:00
Craig Topper	c91b3c9e63	[RISCV] Fold (select_cc (setlt X, Y), 0, ne, trueV, falseV) -> (select_cc X, Y, lt, trueV, falseV) A setcc can be created during LegalizeDAG after select_cc has been created. This combine will enable us to fold these late setccs. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98132	2021-03-07 09:44:56 -08:00
Craig Topper	fdbd5d3206	[RISCV] Fold (select_cc (xor X, Y), 0, eq/ne, trueV, falseV) -> (select_cc X, Y, eq/ne, trueV, falseV) This pattern occurs when lowering for overflow operations introduce an xor after select_cc has already been formed. I had to rework another combine that looked for select_cc of an xor with 1. That xor will now get combined away so we just need to look for the RHS of the select_cc being 1. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98130	2021-03-07 09:29:55 -08:00
Fangrui Song	2d922de3af	[MC][RISCV] Support .reloc , BFD_RELOC_{NONE,32,64}, BFD_RELOC_NONE is useful for ld --gc-sections: it provides a generic way indicating a dependency between two sections.	2021-03-05 21:45:11 -08:00
Luke	d28297ff68	[RISCV] Enable fixed-length vectorization of LoopVectorizer for RISC-V Vector By implementing the method "unsigned RISCVTTIImpl::getRegisterBitWidth(bool Vector)", fixed-length vectorization is enabled when possible. Without this method, the "#pragma clang loop" directive is needed to enable vectorization(or the cost model may inform LLVM that "Vectorization is possible but not beneficial"). Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97549	2021-03-05 10:54:51 +08:00
Fraser Cormack	8e7ceffd0b	[RISCV] Fix crash when inserting large fixed-length subvectors This patch addresses a compiler crash resulting from passing a fixed-length type to one that expects scalable vector types. An assertion was added to prevent this regressing in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97868	2021-03-04 09:27:16 +00:00
Fraser Cormack	d8e1d2ebf4	[RISCV] Preserve fixed-length VL on insert_vector_elt in more cases This patch fixes up one case where the fixed-length-vector VL was dropped (falling back to VLMAX) when inserting vector elements, as the code would lower via ISD::INSERT_VECTOR_ELT (at index 0) which loses the fixed-length vector information. To this end, a custom node, VMV_S_XF_VL, was introduced to carry the VL operand through to the final instruction. This node wraps the RVV vmv.s.x and vmv.s.f instructions, which were being selected by insert_vector_elt anyway. There should be no observable difference in scalable-vector codegen. There is still one outstanding drop from fixed-length VL to VLMAX, when an i64 element is inserted into a vector on RV32; the splat (which is custom legalized) has no notion of the original fixed-length vector type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97842	2021-03-04 09:21:10 +00:00
Amara Emerson	8a316045ed	[AArch64][GlobalISel] Enable use of the optsize predicate in the selector. To do this while supporting the existing functionality in SelectionDAG of using PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis dependencies to the instruction selector pass. Then, use the predicate to generate constant pool loads for f32 materialization, if we're targeting optsize/minsize. Differential Revision: https://reviews.llvm.org/D97732	2021-03-02 12:55:51 -08:00
Fraser Cormack	c1695ddf7d	[RISCV] Support fixed-length INSERT_VECTOR_ELT This patch enables support for lowering INSERT_VECTOR_ELT on fixed-length vector types. The strategy follows that for scalable vector types. This patch also includes a quick fix to prevent the compiler infinitely looping between lowering BUILD_VECTOR as VECTOR_SHUFFLE and back again. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97698	2021-03-02 16:48:38 +00:00
Fraser Cormack	de2b70010a	[RISCV] Lower CONCAT_VECTORS to INSERT_SUBVECTOR nodes The default expansion of CONCAT_VECTORS goes through the stack. This patch avoids that penalty by custom-lowering CONCAT_VECTORS to a series of INSERT_SUBVECTOR nodes. Futher optimizations are possible, but this is a good start. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97692	2021-03-02 11:13:59 +00:00
Fraser Cormack	3fea9226ee	[RISCV] Support INSERT_SUBVECTOR on vector masks Like with EXTRACT_SUBVECTOR, INSERT_SUBVECTOR poses a problem for vector masks as RVV isn't able to slide mask types around. We choose instead to bitcast to equivalently-sized i8 types where we can, else we zero-extend, perform the operation, and truncate back down. One test was left disabled due to a crash in the legalizer. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97559	2021-03-01 12:04:11 +00:00
Fraser Cormack	e80ca3af82	[RISCV] Fix INSERT/EXTRACT_SUBVECTOR on fractional LMUL types This patch fixes a bug where the lowering for INSERT_SUBVECTOR and EXTRACT_SUBVECTOR would insist on first extracting a register-aligned LMUL1 vector type before perfoming the slide up/down. This was even if the vector was a fractional LMUL type, in which case the aligned EXTRACT_SUBVECTOR was invalid. This issue only occurred for scalable vector types, but a variety of tests for both scalable and fixed-length vectors have been added to ensure this does not regress in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97556	2021-03-01 11:51:05 +00:00
Fraser Cormack	4ea734e6ec	[RISCV] Unify scalable- and fixed-vector INSERT_SUBVECTOR lowering This patch unifies the two disparate paths for lowering INSERT_SUBVECTOR operations under one roof. Consequently, with this patch it is possible to support any fixed-length subvector insertion, not just "cast-like" ones. As before, support for the insertion of mask vectors will come in a separate patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97543	2021-03-01 11:38:47 +00:00
Fraser Cormack	bd4d421688	[RISCV] Support EXTRACT_SUBVECTOR on vector masks This patch adds support for extracting subvectors from vector masks. This can be either extracting a scalable vector from another, or a fixed-length vector from a fixed-length or scalable vector. Since RVV lacks a way to slide vector masks down on an element-wise basis and we don't know the true length of the vector registers, in many cases we must resort to using equivalently-sized i8 vectors to perform the operation. When this is not possible we fall back and extend to a suitable i8 vector. Support was also added for fixed-length truncation to mask types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97475	2021-03-01 11:20:09 +00:00
Fangrui Song	47c5576d7d	ELF: Create unique SHF_GNU_RETAIN sections for llvm.used global objects If a global object is listed in `@llvm.used`, place it in a unique section with the `SHF_GNU_RETAIN` flag. The section is a GC root under `ld --gc-sections` with LLD>=13 or GNU ld>=2.36. For front ends which do not expect to see multiple sections of the same name, consider emitting `@llvm.compiler.used` instead of `@llvm.used`. SHF_GNU_RETAIN is restricted to ELFOSABI_GNU and ELFOSABI_FREEBSD in binutils. We don't do the restriction - see the rationale in D95749. The integrated assembler has supported SHF_GNU_RETAIN since D95730. GNU as>=2.36 supports section flag 'R'. We don't need to worry about GNU ld support because older GNU ld just ignores the unknown SHF_GNU_RETAIN. With this change, `__attribute__((retain))` functions/variables emitted by clang will get the SHF_GNU_RETAIN flag. Differential Revision: https://reviews.llvm.org/D97448	2021-02-26 16:38:44 -08:00
Craig Topper	b183cbfacd	[RISCV] Call SelectBaseAddr on the base pointer in the custom isel for vector loads and stores. This will allow FrameIndex as the base address instead of emitting a separate ADDI from isel. eliminateFrameIndex will likely turn it back into an ADDI, but this makes things consistent with the SDPatterns and VLPatterns. I only tested one case for simplicity. I can test more if reviewers want. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97221	2021-02-26 11:38:23 -08:00
Fraser Cormack	37014db013	[RISCV] Use existing method for the LMUL1 type. NFCI. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97467	2021-02-26 09:44:05 +00:00
Craig Topper	d7fca3f0bf	[RISCV] Support fixed vector extract_element for FP types.	2021-02-25 16:30:28 -08:00
Craig Topper	95c6824995	[RISCV] Teach CleanupVSETVLI to remove 'vsetvli zero, zero, vtype' when the vtype matches the previous vsetvli or vsetivli Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D97408	2021-02-25 07:51:19 -08:00
Craig Topper	25c6b7ddd2	[RISCV] Add isel pattern to match X > -1 to bgez. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D97262	2021-02-25 07:42:22 -08:00
Fraser Cormack	0ad86f879f	[RISCV] Update RVV ISA section-header comments. NFC. Some of the section headers had become stale with the transition from RVV specification version 0.9 to 0.10. This patch brings them up to date.	2021-02-25 14:15:28 +00:00
Fraser Cormack	02f435db0b	[RISCV] Support fixed-length vector i2fp/fp2i conversions This patch extends the support for scalable-vector int->fp and fp->int conversions by additionally handling fixed-length vectors. The existing scalable-vector lowering re-expresses widening/narrowing by x4+ conversions as standard nodes. The fixed-length vector support slots in at "the end" of this process by lowering the now equally-sized and widening/narrowing by x2 nodes to our custom VL versions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97374	2021-02-25 13:47:58 +00:00
Fraser Cormack	9620ce90d7	[RISCV] Support fixed-length vector FP_ROUND & FP_EXTEND This patch extends the support for vector FP_ROUND and FP_EXTEND by including support for fixed-length vector types. Since fixed-length vectors use "VL" nodes and scalable vectors can use the standard nodes, there is slightly more to do in the fixed-length case. A helper function was introduced to try and reduce the divergent paths. It is expected that this function will similarly come in useful for lowering the int-to-fp and fp-to-int operations for fixed-length vectors. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97301	2021-02-25 12:16:06 +00:00
Fraser Cormack	84413e1947	[RISCV] Support fixed-length vector truncates This patch extends support for our custom-lowering of scalable-vector truncates to include those of fixed-length vectors. It does this by co-opting the custom RISCVISD::TRUNCATE_VECTOR node and adding mask and VL operands. This avoids unnecessary duplication of patterns and inflation of the ISel table. Some truncates go through CONCAT_VECTORS which currently isn't efficiently handled, as it goes through the stack. This can be improved upon in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97202	2021-02-25 12:11:34 +00:00
Fraser Cormack	3bc5ed3875	[RISCV] Support fixed-length vector sign/zero extension This patch adds support for the custom lowering sign- and zero-extension of fixed-length vector types. It does so through custom nodes. Since the source and destination types are (necessarily) of different sizes, it is possible that the source type is legal whilst the larger destination type isn't. In this case the legalization makes heavy use of EXTRACT_SUBVECTOR. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97194	2021-02-25 12:05:17 +00:00
Fraser Cormack	821f8bb29a	[RISCV] Unify scalable- and fixed-vector EXTRACT_SUBVECTOR lowering This patch unifies the two disparate paths for lowering EXTRACT_SUBVECTOR operations under one roof. Consequently, with this patch it is possible to support any fixed-length subvector extraction, not just "cast-like" ones. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97192	2021-02-25 11:46:57 +00:00
Craig Topper	159f78fc2f	[RISCV] Reuse existing SDLoc and XLenVT in the switch in RISCVISelDAGToDAG::Select. NFC A SDLoc and XLenVT were already created above the switch.	2021-02-24 21:39:00 -08:00
Craig Topper	efcdd598b7	[RISCV] Teach VSETVLI inserter to use VSETIVLI when possible. We always create the VL operand using a register, but if we can determine that it came from an ADDI X0, imm with a sufficiently small immediate, we can use VSETIVLI. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97332	2021-02-24 16:07:33 -08:00
Craig Topper	9bde29629d	[RISCV] Use a ComplexPattern for zexti32 to match sexti32. We just started using a ComplexPattern for sexti32. This updates zexti32 to match. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D97231	2021-02-24 16:06:29 -08:00
Craig Topper	086670d367	[RISCV] Support fixed vector extract element. Use VL=1 for scalable vector extract element. I've changed to use VL=1 for slidedown and shifts to avoid extra element processing that we don't need. The i64 fixed vector handling on i32 isn't great if the vector type isn't legal due to an ordering issue in type legalization. If the vector type isn't legal, we fall back to default legalization which will bitcast the vector to vXi32 and use two independent extracts. Doing better will require handling several different cases by manually inserting insert_subvector/extract_subvector to adjust the type to a legal vector before emitting custom nodes. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97319	2021-02-24 10:17:00 -08:00
Hsiangkai Wang	53c4c2b9f7	[RISCV] vle1.v/vse1.v should be unmasked instructions. vle1.v/vse1.v should be unmasked instructions. The vm encoding is 1 for unmasked instructions. Differential Revision: https://reviews.llvm.org/D97237	2021-02-23 19:59:22 +08:00
Fraser Cormack	dd68f3cf28	[RISCV] Support insertion of misaligned subvectors This patch extends the support for RVV INSERT_SUBVECTOR to cover those which don't align to a vector register boundary. Like the support for EXTRACT_SUBVECTOR in D96959, it accomplishes this by extracting the nearest register-sized subvector (a subregister operation), then sliding the vector down with VSLIDEDOWN, inserting the subvector to the first position, and sliding the vector back up again afterwards. Unlike subvector extraction, for vectors that occupy less than a full vector register we must preserve the untouched elements. We do this by lowering to an LMUL=1 INSERT_SUBVECTOR using the above method and lowering that to a VSLIDEUP with a zero offset. This uses a tail-undisturbed policy and so has the effect of "sliding in" the subvector elements while preserving the surrounding ones. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96972	2021-02-23 10:31:06 +00:00
Craig Topper	3231607ce9	[RISCV] Have sexti32 also recognize AssertZExt from types smaller than i32. An i64 AssertZExt from a type smaller than i32 has at least 33 leading zeros which mean it has at least 33 sign bits. Since we have a couple patterns that use two sexti32, I've switched to a ComplexPattern so tablegen didn't have to generate 9 different permutations. As noted in the FIXME, maybe we should just call computeNumSignBits, but we don't have tests that benefit from that yet. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D97130	2021-02-22 14:56:22 -08:00
Craig Topper	1cd2a5a7da	[RISCV] Add isel support for bitcasts between fixed vector types. This should fix the issue reported in D96972. I don't have a good test case for this without those changes. Differential Revision: https://reviews.llvm.org/D97082	2021-02-22 12:05:46 -08:00
Craig Topper	1aeb927fed	[RISCV] Custom isel the rest of the vector load/store intrinsics. A previous patch moved the index versions. This moves the rest. I also removed the custom lowering for VLEFF since we can now do everything directly in the isel handling. I had to update getLMUL to handle mask registers to index the pseudo table correctly for VLE1/VSE1. This is good for another 15K reduction in llc size. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97097	2021-02-22 09:53:46 -08:00
Craig Topper	1a6c1ac686	[SelectionDAG][RISCV] Teach ComputeNumSignBits to handle SREM. This also removes a pattern from RISCV that is no longer needed since the sexti32 on the LHS of the srem in the pattern implies the result is sign extended so the sign_extend_inreg should be removed in DAG combine now. Reviewed By: luismarques, RKSimon Differential Revision: https://reviews.llvm.org/D97133	2021-02-21 11:13:36 -08:00
Fraser Cormack	3e1317fd32	[RISCV] Support extraction of misaligned subvectors This patch extends the support for RVV EXTRACT_SUBVECTOR to cover those which don't align to a vector register boundary. It accomplishes this by extracting the nearest register-sized subvector (a subregister operation), then sliding the vector down with VSLIDEDOWN and extracting the subvector from the first position (a COPY operation). Since this procedure involves the use of VSCALE and multiplication, the handling of such operations is done during lowering to simplify the implementation and make use of DAG combining. This necessitated moving some helper functions from RISCVISelDAGToDAG to RISCVTargetLowering. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96959	2021-02-20 15:43:54 +00:00
Fraser Cormack	9aa20caee6	[RISCV] Improve register allocation around vector masks With vector mask registers only allocatable to V0 (VMV0Regs) it is relatively simple to generate code which uses multiple masks and naively requires spilling. This patch aims to improve codegen in such cases by telling LLVM it can use VRRegs to hold masks. This will prevent spilling in many cases by having LLVM copy to an available VR register. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97055	2021-02-20 14:47:51 +00:00
Craig Topper	71b68fe532	[RISCV] Teach our custom vector load/store intrinsic isel code to propagate memory operands if we have them. We don't currently create memory operands for these intrinsics, but there was a suggestion of using the indexed load/store intrinsics to implement isel for scalable vector gather/scatter. That may propagate the memory operand from the gather/scatter ISD nodes.	2021-02-19 19:12:20 -08:00
Craig Topper	7e54d7304b	[RISCV] Remove VPatILoad and VPatIStore multiclasses that are no longer used. NFC	2021-02-19 13:23:08 -08:00
Craig Topper	e7c86f4ac4	[RISCV] Use inheritance to reduce some repeated code in tablegen. NFC The VLX and VSX searchable tables, share the same format so we can have a common base class for them.	2021-02-19 10:42:18 -08:00
Craig Topper	7f5b3886e4	[RISCV] Remove unneeded indexed segment load/store vector pseudo instruction. We had more combinations of data and index lmuls than we needed. Also add some asserts to verify that the IndexVT and data VT have the same element count when we isel these pseudo instructions.	2021-02-19 10:28:48 -08:00
Craig Topper	d056d5decf	[RISCV] Use custom isel for vector indexed load/store intrinsics. There are many legal combinations of index and data VTs supported for these intrinsics. This results in a lot of isel patterns in RISCVGenDAGISel.inc. By adding a separate table similar to what we use for segment load/stores, we can more efficiently manually select these intrinsics. We should also be able to reuse this table scalable vector gather/scatter. This reduces the llc binary size by ~56K. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D97033	2021-02-19 10:10:06 -08:00
Craig Topper	dbf910f0d9	[RISCV] Prevent selecting a 0 VL to X0 for the segment load/store intrinsics. Just like we do for isel patterns, we need to call selectVLOp to prevent 0 from being selected to X0 by the default isel. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97021	2021-02-19 10:07:12 -08:00
Craig Topper	98dff5e804	[RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64 We previously used isel patterns for this, but that used quite a bit of space in the isel table due to OR being associative and commutative. It also wouldn't handle shifts/ands being in reversed order. This generalizes the shift/and matching from GREVI to take the expected mask table as input so we can reuse it for SHFLI. There is no SHFLIW instruction, but we can promote a 32-bit SHFLI to i64 on RV64. As long as bit 4 of the control bit isn't set, a 64-bit SHFLI will preserve 33 sign bits if the input had at least 33 sign bits. ComputeNumSignBits has been updated to account for that to avoid sext.w in the tests. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96661	2021-02-19 10:07:12 -08:00
Fraser Cormack	d9531a3097	[RISCV] Address some clang-tidy warnings. NFCI.	2021-02-19 12:10:28 +00:00
Craig Topper	cd4051ac80	[RISCV] Prune unneeded indexed load/store pseudo instructions. We were creating more combinations of value and index lmul than we needed. I've copied the loop structure used here from VPseudoAMOEI with all data sew values instead of just 32/64. Similar can be done for segment loads/store. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D97008	2021-02-18 23:08:39 -08:00
Craig Topper	8ed3bbbcc3	[RISCV] Split zvlsseg searchable table into 4 separate tables. Index by properties rather than intrinsic ID. Intrinsic ID is a 32-bit value which made each row of the table 4 byte aligned. The remaining fields used 5 bytes. This meant 3 bytes of padding per row. This patch breaks the table into 4 separate tables and indexes them by properties we know about the intrinsic. NF, masked, strided, ordered, etc. The indexed load/store tables have no padding in their rows now. All together this reduces the size of llc binary by ~28K. I'm considering adding similar tables for isel of non-segment load/store as well to cut down the size of the isel table and probably improve our isel performance. Those tables would need to indexed from intrinsics, IR loads/stores, gathers/scatters, and RISCVISD opcodes. So having a table that can be indexed without using intrinsic ID is more flexible. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D96894	2021-02-18 19:00:49 -08:00
Craig Topper	cf34559104	[RISCV] Enable PrimaryKeyEarlyOut on RISCVVPseudosTable. This table is queried in RISCVMCInstLower without knowing whether the instruction is a vector pseudo. Due to the way the binary search works, we have to do log2(tablesize) checks just to determine a non-vector instruction isn't in the table. Conveniently, all the vector pseudos are pretty tightly packed within the internal instruction enum. By enabling the PrimaryKeyEarlyOut, tablegen will emit a check against the beginning and end of the table before doing the binary search. This gives a quick early out on the search for the majority of non-vector instructions. Differential Revision: https://reviews.llvm.org/D97016	2021-02-18 18:59:32 -08:00
Craig Topper	0db938312a	[RISCV] Simplify VPseudoAMOEI multiclass. NFC lmul was already iterated in one of the loops. We don't need to recreate it from a string.	2021-02-18 12:40:51 -08:00
Jessica Clarke	74df1ffaad	[RISCV] Use XLenRI alias for RegInfoByHwMode instances This avoids tedious repetition and matches what we do for the ValueTypeByHwMode uses. Reviewed By: craig.topper, luismarques Differential Revision: https://reviews.llvm.org/D96649	2021-02-18 19:38:36 +00:00
Craig Topper	156fc07e19	[RISCV] Add support for fixed vector MULHU/MULHS. This uses to division by constant optimization to use MULHU/MULHS. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D96934	2021-02-18 09:15:08 -08:00
Craig Topper	792627be35	[RISCV] Add support for fixed vector sign/zero extend from mask types. Due to vXi64 on RV32, I've directly emitted this using _VL ISD opcodes. If it wasn't for that we could just use fixed vector BUILD_VECTOR and VSELECT and let those each be legalized. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96910	2021-02-18 09:08:10 -08:00
Craig Topper	c7dd92e8a5	[RISCV] Support isel of scalable vector bitcasts These should be NOPs so we can just replace with the input. This matches what SVE does with isel patterns for all permutations. Custom isel saves us from having to list all permurations for all LMULs. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96921	2021-02-18 09:01:13 -08:00
Hsiangkai Wang	065a187f33	[RISCV] Fix typo. Use ValueType instead of LLVMType.	2021-02-18 23:21:27 +08:00
Hsiangkai Wang	f1efa8abaf	[RISCV] Fix bugs in pseudo instructions for masked segment load. For masked segment load, the destination register should not overlap with mask register. It could not be V0. In the original implementation, there is no segment load/store register class without V0. In this patch, I added these register classes and modify `GetVRegNoV0` to get the correct one. Differential Revision: https://reviews.llvm.org/D96937	2021-02-18 22:17:00 +08:00
Hsiangkai Wang	b97d8b32c3	[NFC][RISCV] Use concise way to describe load/store instructions. Differential Revision: https://reviews.llvm.org/D96923	2021-02-18 22:17:00 +08:00
Benjamin Kramer	ae1e6c3557	[RISCV] Rewrite assert to not give unused variable warnings in Release builds NFCI	2021-02-18 11:42:36 +01:00
Fraser Cormack	d876214990	[RISCV] Begin to support more subvector inserts/extracts This patch adds support for INSERT_SUBVECTOR and EXTRACT_SUBVECTOR (nominally where both operands are scalable vector types) where the vector, subvector, and index align sufficiently to allow decomposition to subregister manipulation: * For extracts, the extracted subvector must correctly align with the lower elements of a vector register. * For inserts, the inserted subvector must be at least one full vector register, and correctly align as above. This approach should work for fixed-length vector insertion/extraction too, but that will come later. Reviewed By: craig.topper, khchen, arcbbb Differential Revision: https://reviews.llvm.org/D96873	2021-02-18 10:18:27 +00:00
Craig Topper	016eca8f90	[RISCV] Guard LowerINSERT_VECTOR_ELT against fixed vectors. The type legalizer can call this code based on the scalar type so we need to verify the vector type is a scalable vector. I think due to how type legalization visits nodes, the vector type will have already been legalized so we don't have an issue with using MVT here like we did for EXTRACT_VECTOR_ELT. I've added a test just in case.	2021-02-17 19:27:08 -08:00
Craig Topper	00c4e0a8f6	[RISCV] Guard the ISD::EXTRACT_VECTOR_ELT handling in ReplaceNodeResults against fixed vectors and non-MVT types. The type legalizer is calling this code based on the scalar type so we need to verify the input type is a scalable vector. The vector type has also not been legalized yet when this is called so we need to use EVT for it.	2021-02-17 18:25:38 -08:00
Craig Topper	3bdd02735b	[RISCV] Localize RISCVZvlssegTable to RISCVISelDAGToDAG.cpp, the only place it is used.	2021-02-17 11:37:28 -08:00
Craig Topper	799f7865c8	[RISCV] Use bits<7> instead of bits<11> for the EEW field size in the RISCVZvlsseg searchable table. NFCI We only support 8, 16, 32, and 64 for EEW. These only need 7 bits to represent.	2021-02-17 11:12:36 -08:00
Craig Topper	d4353a3101	[RISCV] Merge the handlers for masked and unmasked segment loads/stores. A lot of the code for the masked and unmasked is the same. This patch adds a boolean to handle the differences so we can share the code. Differential Revision: https://reviews.llvm.org/D96841	2021-02-17 10:08:33 -08:00
Craig Topper	6f30d0035a	[RISCV] Merge the vsetvli and vsetvlimax intrinsic selection These have very similar code just with a different number of operands and handling for vsetivl. Differential Revision: https://reviews.llvm.org/D96834	2021-02-17 10:08:33 -08:00
luxufan	709ea8bc87	[RISCV] Simplify BP initialisation We can re-use copyPhysReg rather than writing a specialised copy. Differential Revision: https://reviews.llvm.org/D95227	2021-02-17 20:33:20 +08:00
Fraser Cormack	d81161646a	[RISCV] Add support for fixed vector vselect This patch adds support for fixed-length vector vselect. It does so by lowering them to a custom unmasked VSELECT_VL node with a vector length operand. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96768	2021-02-17 10:59:00 +00:00
Hsiangkai Wang	a3c783dbf2	[RISCV] Spilling for RISC-V V extension. (2nd version) Differential Revision: https://reviews.llvm.org/D95148	2021-02-17 14:05:19 +08:00
Hsiangkai Wang	5a31a67385	[RISCV] Frame handling for RISC-V V extension. This patch proposes how to deal with RISC-V vector frame objects. The layout of RISC-V vector frame will look like \|---------------------------------\| \| scalar callee-saved registers \| \|---------------------------------\| \| scalar local variables \| \|---------------------------------\| \| scalar outgoing arguments \| \|---------------------------------\| \| RVV local variables && \| \| RVV outgoing arguments \| \|---------------------------------\| <- end of frame (sp) If there is realignment or variable length array in the stack, we will use frame pointer to access fixed objects and stack pointer to access non-fixed objects. \|---------------------------------\| <- frame pointer (fp) \| scalar callee-saved registers \| \|---------------------------------\| \| scalar local variables \| \|---------------------------------\| \| ///// realignment ///// \| \|---------------------------------\| \| scalar outgoing arguments \| \|---------------------------------\| \| RVV local variables && \| \| RVV outgoing arguments \| \|---------------------------------\| <- end of frame (sp) If there are both realignment and variable length array in the stack, we will use frame pointer to access fixed objects and base pointer to access non-fixed objects. \|---------------------------------\| <- frame pointer (fp) \| scalar callee-saved registers \| \|---------------------------------\| \| scalar local variables \| \|---------------------------------\| \| ///// realignment ///// \| \|---------------------------------\| <- base pointer (bp) \| RVV local variables && \| \| RVV outgoing arguments \| \|---------------------------------\| \| /////////////////////////////// \| \| variable length array \| \| /////////////////////////////// \| \|---------------------------------\| <- end of frame (sp) \| scalar outgoing arguments \| \|---------------------------------\| In this version, we do not save the addresses of RVV objects in the stack. We access them directly through the polynomial expression (a x VLENB + b). We do not reserve frame pointer when there is any RVV object in the stack. So, we also access the scalar frame objects through the polynomial expression (a x VLENB + b) if the access across RVV stack area. Differential Revision: https://reviews.llvm.org/D94465	2021-02-17 14:05:19 +08:00
Craig Topper	61a238e6e1	[RISCV] Add isel patterns for fixed vector fmsub/fnmadd/fnmsub.	2021-02-16 12:03:33 -08:00
Craig Topper	07ca13fe07	[RISCV] Add support for fixed vector mask logic operations. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96741	2021-02-16 09:34:00 -08:00
Fraser Cormack	04977ce5ce	[RISCV] Fix a crash in fixed-length build_vector lowering Non-splatted non-integer build_vector nodes were mistakenly being lowered as VID expressions, which should not happen. VID can only be used to select integer build_vector nodes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96718	2021-02-16 10:25:15 +00:00
Fraser Cormack	b870199020	[RISCV] Add patterns for scalable-vector fabs & fcopysign The patterns mostly follow the scalar counterparts, save for some extra optimizations to match the vector/scalar forms. The patch adds a DAGCombine for ISD::FCOPYSIGN to try and reorder ISD::FNEG around any ISD::FP_EXTEND or ISD::FP_TRUNC of the second operand. This helps us achieve better codegen to match vfsgnjn. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96028	2021-02-16 10:21:09 +00:00
Craig Topper	29b894a8d3	[RISCV] Add expicit i32/i64 types to RV32 or RV64 only isel patterns. NFC This stops tablegen from generating patterns with the opposite type in the opposite HwMode. This just adds wasted bytes to the isel table. This reduces the isel table by about 1800 bytes.	2021-02-15 14:36:05 -08:00

1 2 3 4 5 ...

1058 Commits