llvm-project

Commit Graph

Author	SHA1	Message	Date
Cullen Rhodes	3918ef07c4	[AArch64][SVE] Remove redundant ptest after match/nmatch These instructions are flag setting so the ptest is redundant, the TableGen class wasn't setting the element size for the predicate causing the checks in AArch64InstrInfo::optimizePTestInstr to fail.	2022-09-28 08:23:23 +00:00
Paul Walker	0533c39a76	[SVE] Expand DUPM patterns to handle all integer vector types. NOTE: i8 vector splats are ignored because the immediate range of DUP already has full coverage. Differential Revision: https://reviews.llvm.org/D131078	2022-08-05 16:00:08 +00:00
Cullen Rhodes	6082051da1	[AArch64][SVE] Add patterns to select mla/mls Adds patterns for: add(a, select(mask, mul(b, c), splat(0))) -> mla(a, mask, b, c) sub(a, select(mask, mul(b, c), splat(0))) -> mls(a, mask, b, c) Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D130492	2022-07-26 07:52:44 +00:00
Rosie Sumpter	e5edc1b5ee	[AArch64][SVE] Ensure PTEST operands have type nxv16i1 Currently any legal predicate types will be pattern-matched when creating a PTEST instruction. This could be a problem in future since PTEST always uses the .B specifier for the operand, but it is not always guaranteed that the extra lanes of unpacked types (e.g. nxv4i1) are zero. This patch ensures the operands of PTEST are type nxv16i1, where the undef lanes are set to zero. Differential Revision: https://reviews.llvm.org/D129282/	2022-07-12 09:27:59 +01:00
Sander de Smalen	95e08824fa	[AArch64] Add support for various operations on nxv1i1 types. The supported operations are: * Logical operations (and, or, xor, bic) * Logical reductions (and, or, xor, [us]min, [us]max) * Conversions to/from svbool_t * Predicate count (CNTP) Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D128835	2022-07-06 15:57:11 +00:00
Sander de Smalen	690db16422	[AArch64] Make nxv1i1 types a legal type for SVE. One motivation to add support for these types are the LD1Q/ST1Q instructions in SME, for which we have defined a number of load/store intrinsics which at the moment still take a `<vscale x 16 x i1>` predicate regardless of their element type. This patch adds basic support for the nxv1i1 type such that it can be passed/returned from functions, as well as some basic support to support some existing tests that result in a nxv1i1 type. It also adds support for splats. Other operations (e.g. insert/extract subvector, logical ops, etc) will be supported in follow-up patches. Reviewed By: paulwalker-arm, efriedma Differential Revision: https://reviews.llvm.org/D128665	2022-07-01 15:11:13 +00:00
Matt Devereau	018a0dd5c8	[AArch64][SVE] Create AArch64ISD node for DUPQLANE128 Create an AArch64ISD node instead of emitting machine node DUP_ZZI_Q. This allows a simpler DAG combine for work previously attempted in https://reviews.llvm.org/D128503 Differential Revision: https://reviews.llvm.org/D128902	2022-07-01 11:46:24 +00:00
Paul Walker	43f8a6b749	[SVE] Use CPY to zero active lanes of a floating point vector. Patterns exist for the integer case that are trivially expandable to cover 0.0f. Differential Revision: https://reviews.llvm.org/D128669	2022-07-01 00:59:00 +01:00
Paul Walker	2be4a7a209	[SVE] Extend "and(ipg,cmp(x,y))" patterns to cover the case when y is an immediate. Differential Revision: https://reviews.llvm.org/D128479	2022-07-01 00:56:22 +01:00
Bradley Smith	424b2ae9ab	[AArch64][SVE] Match (add x (urshr/srshr y c)) -> ursra/srsra x y c Differential Revision: https://reviews.llvm.org/D128447	2022-06-29 12:10:50 +00:00
Bradley Smith	6f27df5084	[AArch64][SVE] Match (add x (lsr/asr y c)) -> usra/ssra x y c Differential Revision: https://reviews.llvm.org/D128045	2022-06-23 14:56:21 +00:00
Paul Walker	e8716179eb	[SVE] Make ISD::SPLAT_VECTOR a legal operation. The implication of this patch being AArch64ISD::DUP no longer supports scalable vectors. Differential Revision: https://reviews.llvm.org/D128265	2022-06-23 00:42:47 +01:00
Paul Walker	84f486cfab	[NFC][SVE] Simplify SUBR_ZI isel patterns. Differential Revision: https://reviews.llvm.org/D128199	2022-06-22 00:05:18 +01:00
Rosie Sumpter	2c4e44752d	[AArch64][SME] Add load/store intrinsics This patch adds implementations for the load/store SME ACLE intrinsics: - @llvm.aarch64.sme.ld1* - @llvm.aarch64.sme.st1* Differential Revision: https://reviews.llvm.org/D127210	2022-06-14 11:11:22 +01:00
Sander de Smalen	9c38fc111b	[AArch64] Remove references to Streaming SVE from target features. Following discussion on D120261 and D121208 it seems better to remove the concept of Streaming SVE from the subtarget/assembler predicates and instead reason about 'SVE' and 'SME' as its higher level features, rather than trying to model this runtime mode through explicit feature flags. This patch is largely NFC. Reviewed By: paulwalker-arm, david-arm Differential Revision: https://reviews.llvm.org/D125977	2022-05-31 16:25:01 +02:00
Paul Walker	84acdd32ca	[SVEInstrFormats] Ensure scatter instructions are named consistently.	2022-05-23 20:22:14 +01:00
zhongyunde	e1afae0311	[AArch64][SVE] Add some logical operation DestructiveBinaryComm patterns Add DestructiveBinaryComm* patterns for ORR, EOR, AND and BIC. The above instructions requires that the source and destination registers are equal, so use movprfx should be beneficial to performance. note: BIC (i.e. A & ~B) is not a commutative operation. Reviewed By: paulwalker-arm, david-arm Differential Revision: https://reviews.llvm.org/D124224	2022-04-22 20:31:00 +08:00
Peter Waller	f1cb816f90	[AArch64][SVE] Mark {CNT*,RDVL,INDEX} as materializable Differential Revision: https://reviews.llvm.org/D122731	2022-03-31 15:28:24 +00:00
Hsiangkai Wang	b8e296cf6a	[AArch64][SME] Add rdsvl instruction This patch adds support for the following SME instruction: * RDSVL The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-12 Differential Revision: https://reviews.llvm.org/D120603	2022-02-28 23:14:50 +00:00
Hsiangkai Wang	7dd7cb0487	[AArch64][SME] Add addsvl and addspl instructions This patch adds support for the following SME instructions: * ADDSPL, ADDSVL The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-12 Differential Revision: https://reviews.llvm.org/D120554	2022-02-28 23:14:50 +00:00
Paul Walker	7ab78f34cd	[SVE] Refactor complex immediate pattern used by CPY/DUP. SelectSVE8BitLslImm didn't account for constant values that have a larger bit width than the result vector's element type. This only seems to affect a single corner case when lowering fixed length vectors but the code itself is also not consistent with how other related complex patterns are implemented so I've taken the opportunity to refactor the code. Differential Revision: https://reviews.llvm.org/D120440	2022-02-25 16:12:35 +00:00
Paul Walker	8ca5be93cc	[SVE] Don't custom lower constant predicate ISD:SPLAT_VECTOR operations. Differential Revision: https://reviews.llvm.org/D120340	2022-02-25 11:32:37 +00:00
David Truby	be826cf4f7	[AArch64][NEON][SVE] Lower FCOPYSIGN using AArch64ISD::BSP This patch modifies the FCOPYSIGN lowering to go through the BSP pseudo-instruction. This allows the same lowering code for NEON, SVE and SVE2. As part of this, lowering for BSP for SVE and SVE2 is also added. For SVE and NEON this patch is NFC. Differential Revision: https://reviews.llvm.org/D118394	2022-02-07 14:35:26 +00:00
Matt Devereau	6b73a4cc7d	[AArch64][SVE] Remove false register dependency for unary FP convert operations Generate movprfx for floating point convert zeroing pseudo operations Differential Revision: https://reviews.llvm.org/D118617	2022-02-04 09:55:39 +00:00
Matt Devereau	1c6dca96ca	[AArch64][SVE] Fold vselect into predicated fmul, fsub and fadd Fold vselect with an unpredicated fmul/fsub/fadd operand into a predicated fmul/fsub/fadd: (vselect (p) (op (a) (b)) (a)) => (op -> (p) (a) (b)) Differential Revision: https://reviews.llvm.org/D117689	2022-02-03 13:43:15 +00:00
Paul Walker	bcda4c48c8	[SVE] By using SEL when orring predicates we forgo the need for a PTRUE. Differential Revision: https://reviews.llvm.org/D118463	2022-01-31 19:39:23 +00:00
Paul Walker	804915f5dc	[SVE] Extend isel pattern coverage for INCP & DECP. Adds patterns for: add(x, cntp(p, p)) -> incp(x, p) sub(x, cntp(p, p)) -> decp(x, p) Differential Revision: https://reviews.llvm.org/D118567	2022-01-31 19:05:05 +00:00
Paul Walker	30efee764d	[SVE] Remove AArch64ISD::PFALSE. AArch64ISD::PFALSE does not provide any value, in fact it can prevent common combines from firing. We only needed to lower to PFALSE until ISD::SPLAT_VECTOR became generally available. Differential Revision: https://reviews.llvm.org/D118469	2022-01-29 11:31:00 +00:00
Paul Walker	49178a2c4e	[SVE] Extend isel pattern coverage for BIC. Adds patterns of the form "(and a, (not b)) -> bic". NOTE: With this support I'm inclined to remove AArch64ISD::BIC, but will leave that investigation for another time. Differential Revision: https://reviews.llvm.org/D118365	2022-01-28 13:14:46 +00:00
Sander de Smalen	dafd1f29da	[AArch64][SVE] Avoid using ptrue for unpredicated predicate AND. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D118146	2022-01-27 13:00:23 +00:00
Sander de Smalen	d58757e522	[AArch64][SVE] Implement PFALSE with explicit AArch64ISD node. The ISel patterns for PFALSE helps recognise the instructions as being free of side-effects, which helps MachineCSE remove redundant PFALSE instructions. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D118054	2022-01-27 10:30:13 +00:00
Paul Walker	66bd7ebdf7	[SVE] Use DUPM to handling more splat immediate cases. NOTE: Only considers i64 based vectors at this time because smaller element types require extra isel operand parsing. Differential Revision: https://reviews.llvm.org/D118040	2022-01-26 12:04:44 +00:00
Cullen Rhodes	eee993ae4c	[AArch64][SVE] Fold predicate into compare Codegen of added testcase before this patch: ptrue p0.s cmpgt p1.s, p0/z, z0.s, z1.s cmpge p2.s, p0/z, z2.s, z1.s and p0.b, p0/z, p1.b, p2.b ret Patterns originally authored by Will Lovett. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D116749	2022-01-10 10:52:06 +00:00
Paul Walker	22370530a3	[NFC][SVE] Add missing tests for i32 INC/DEC patterns. D111441 included trunc isel patterns for sve_int_pred_pattern_a but no accompanying tests. This patch adds the missing tests and also simplifies the isel patterns that use sve_cnt_shl_imm. Differential Revision: https://reviews.llvm.org/D115512	2021-12-17 13:13:36 +00:00
Andrew Wei	dc7b672f96	[AArch64][SVE] Lower shuffles to permute instructions: rev/revb/revh/revw Attempt to lower a shuffle as a permute instruction(rev/revb/revh/revw) for fixed length SVE. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D114960	2021-12-15 21:53:00 +08:00
Jessica Clarke	a3530dc199	[AArch64][NFC] Alter ComplexPattern types to be consistent with their uses When used as a non-leaf node, TableGen does not currently use the type of a ComplexPattern for type inference, which also means it does not check it doesn't conflict with the use. This differs from when used as a leaf value, where the type is used for inference. Fixing that discrepancy is something I intend to upstream as a subsequent review. AArch64 currently has several ComplexPatterns that are used in contexts where they're expected to be an iPTR. The cases that lead to type contradictions are separated out in D108759, but there are additional differences to the TableGen output when using my locally-patched TableGen. None of these appear to matter, at least for passing all the CodeGen tests, but it's safer to avoid such changes (and similar changes were causing issues on some AMDGPU tests, causing failures to select). Changing these additional ComplexPatterns to use iPTR rather than i64 ensures that the TableGen output remains bit-for-bit identical (compared to without having this patch and my TableGen patch, as well as the intermediate state of having this patch but not my TableGen patch), and more accurately captures the higher-level meaning of these patterns. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D109034	2021-12-03 07:04:59 +00:00
Jessica Clarke	0cb44cfbb7	[AArch64][NFC] Fix ComplexPattern types conflicting with uses When used as a non-leaf node, TableGen does not currently use the type of a ComplexPattern for type inference, which also means it does not check it doesn't conflict with the use. This differs from when used as a leaf value, where the type is used for inference. Fixing that discrepancy is something I intend to upstream as a subsequent review, but these are all the type conflicts found (all legitimate) by my locally-patched TableGen. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D108759	2021-12-03 07:04:59 +00:00
David Sherwood	9cef7c1ca9	[CodeGen][SVE] Add missing isel patterns for vector_reverse We were missing patterns for vector_reverse of unpacked FP vector types, as well as all the supported bfloat vectors. Tests added here: CodeGen/AArch64/named-vector-shuffle-reverse-sve.ll Differential Revision: https://reviews.llvm.org/D114089	2021-11-18 09:59:26 +00:00
Peter Waller	599ea3e73f	[AArch64][SVE] Break false dependencies for inactive lanes of FP unary operations Follow up to D105889, covering instructions using sve_fp_2op_p_zd_HSD: frintn, frintp, frintm, frintz, frinta, frintx, frinti, frecpx and fsqrt. Reviewed By: bsmith Differential Revision: https://reviews.llvm.org/D113485	2021-11-15 09:15:21 +00:00
Ahmed Bougacha	bef777206e	[AArch64] Rename some timm predicates for consistency. NFC. timm isn't the common case, and TImmLeafs should make it clear what they are. We're adding a plain ImmLeaf for 0_65535, so rename i64_imm0_65535 to timm64_0_65535, and imm32_0_7 to timm32_0_7.	2021-10-28 11:41:29 -07:00
David Truby	2e0fb007d6	[llvm][AArch64][SVE] Fold literals into math instructions SVE has predicated literal forms of some instructions for specific literals, which currently are generated correctly when using ACLE but not when those instructions are generated directly. This adds the patterns to generate those instructions when generating from standard LLVM IR instructions. Differential Revision: https://reviews.llvm.org/D99074	2021-10-17 10:57:04 +00:00
Kerry McLaughlin	1a2e90199f	[SVE][CodeGen] Add patterns for ADD/SUB + element count This patch adds patterns to match the following with INC/DEC: - @llvm.aarch64.sve.cnt[b\|h\|w\|d] intrinsics + ADD/SUB - vscale + ADD/SUB For some implementations of SVE, INC/DEC VL is not as cheap as ADD/SUB and so this behaviour is guarded by the "use-scalar-inc-vl" feature flag, which for SVE is off by default. There are no known issues with SVE2, so this feature is enabled by default when targeting SVE2. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D111441	2021-10-13 11:36:15 +01:00
Bradley Smith	5be266db7a	[AArch64][SVE] Improve VECTOR_SPLICE codegen for VL > 128-bit Differential Revision: https://reviews.llvm.org/D111135	2021-10-07 15:28:55 +00:00
Peter Waller	be26e6ff73	[AArch64][SVE] Remove redundant PTEST following PNEXT/PFIRST PNEXT and PFIRST set the NZCV flags, so the subsequent PTEST can be optimized away in AArch64InstrInfo::optimizePTestInstr. See-also: https://reviews.llvm.org/D93292 Differential Revision: https://reviews.llvm.org/D110177	2021-10-05 15:10:48 +00:00
Cullen Rhodes	d42f76fd36	[AArch64][SVE] NFC: Remove unused template args For sve_fp_3op_p_zds_zx we have zero patterns downstream but the intrinsic args can be added again if/when the patterns are implemented. Identified in D109359. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D109429	2021-09-09 07:10:57 +00:00
Cullen Rhodes	5b848a35d2	[AArch64][SVE] NFC: Use stepvector directly in index multiclasses Also fixes a couple of warnings identified in D109359: SVEInstrFormats.td:5099:59: warning: unused template argument: sve_int_index_ri::step_vector SVEInstrFormats.td:5133:59: warning: unused template argument: sve_int_index_rr::step_vector Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D109422	2021-09-09 07:10:57 +00:00
Cullen Rhodes	1fe0e6a380	[AArch64][SME] Support ptrue(s) in streaming mode The ptrue and ptrues instructions are legal in streaming mode, missed in D106272. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06/SVE-Instructions Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D107807	2021-08-11 07:49:36 +00:00
Bradley Smith	81eafb8a37	[AArch64][SVE] Break false dependencies for inactive lanes of unary operations Differential Revision: https://reviews.llvm.org/D105889	2021-07-26 15:01:21 +00:00
Caroline Concatto	0bfc26e3a4	[SVE][AArch64] Improve code generation for vector_splice for Imm > 0 This patch implements vector_splice in tablegen for all cases when the Immediate is positive and lower than the known minimum value of a scalable vector. Vector_splice can be implemented using SVE instruction EXT. For instance : @llvm.experimental.vector.splice(Vector_1, Vector_2, Imm) @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E> EXT Vector_1, Vector_2, Imm // Vector_1 = B, C, D + Vector_2 = E Depends on D105633 Differential Revision: https://reviews.llvm.org/D106273	2021-07-26 11:45:46 +01:00
Eli Friedman	0ca46a1757	[SelectionDAG] Fix the representation of ISD::STEP_VECTOR. The existing rule about the operand type is strange. Instead, just say the operand is a TargetConstant with the right width. (Legalization ignores TargetConstants, so it doesn't matter if that width is legal.) Highlights: 1. I had to substantially rewrite the AArch64 isel patterns to expect a TargetConstant. Nothing too exotic, but maybe a little hairy. Maybe worth considering a target-specific node with some dagcombines instead of this complicated nest of isel patterns. 2. Our behavior on RV32 for vectors of i64 has changed slightly. In particular, we correctly preserve the width of the arithmetic through legalization. This changes the DAG a bit. Maybe room for improvement here. 3. I explicitly defined the behavior around overflow. This is necessary to make the DAGCombine transforms legal, and I don't think it causes any practical issues. Differential Revision: https://reviews.llvm.org/D105673	2021-07-21 10:58:40 -07:00
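
Note on 0bfc26e3a4 (vector_splice for Imm > 0): the message gives the example splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E>. A minimal sketch of that semantics follows, assuming fixed-length Python lists stand in for vectors. The helper name `vector_splice` is hypothetical and is not an LLVM API. It models only the non-negative-immediate case the commit covers, not scalable vector lengths.

```python
def vector_splice(v1, v2, imm):
    """Model the non-negative-immediate splice: concatenate v1 and v2,
    then take len(v1) elements starting at index imm.

    Hypothetical illustration only; real vectors may be scalable, in which
    case imm must be below the known minimum element count.
    """
    if len(v1) != len(v2):
        raise ValueError("both vectors must have the same length")
    if not 0 <= imm < len(v1):
        raise ValueError("imm must satisfy 0 <= imm < len(v1)")
    return (v1 + v2)[imm:imm + len(v1)]


# Example from the commit message: shifting in one element from the second vector.
result = vector_splice(list("ABCD"), list("EFGH"), 1)
print(result)
```

With imm = 1 the result keeps B, C, D from the first vector and takes E from the second, which is the window an EXT with the same immediate would select in this fixed-length model.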


348 Commits