llvm-project

Commit Graph

Author	SHA1	Message	Date
Bradley Smith	8bad8a43c3	[AArch64][SVE] Add patterns to generate FMLA/FMLS/FNMLA/FNMLS/FMAD Adjust generateFMAsInMachineCombiner to return false if SVE is present in order to combine fmul+fadd into fma. Also add new pseudo instructions so as to select the most appropriate of FMLA/FMAD depending on register allocation. Depends on D96599 Differential Revision: https://reviews.llvm.org/D96424	2021-02-18 16:55:16 +00:00
David Green	0f435a544a	[AArch64] Correct some tablegen operand types. NFC	2021-02-06 14:34:14 +00:00
David Sherwood	d1bf26fd94	[AArch64][SVE] Add lowering for llvm abs intrinsic Add functionality to permit lowering of the abs and neg intrinsics using the passthru variants. Differential Revision: https://reviews.llvm.org/D94160	2021-01-08 08:55:25 +00:00
Cameron McInally	f4013359b3	[SVE] Add unpacked scalable floating point ZIP/UZP/TRN patterns Differential Revision: https://reviews.llvm.org/D94193	2021-01-07 09:56:53 -06:00
Bradley Smith	c73ae747cb	[AArch64][SVE] Add optimization to remove redundant ptest instructions Co-Authored-by: Graham Hunter <graham.hunter@arm.com> Co-Authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D93292	2021-01-05 15:28:36 +00:00
Paul Walker	eba6deab22	[SVE] Lower vector CTLZ, CTPOP and CTTZ operations. CTLZ and CTPOP are lowered to CLZ and CNT instructions respectively. CTTZ is not a native SVE operation but is instead lowered to: CTTZ(V) => CTLZ(BITREVERSE(V)) In the case of fixed-length support using SVE we also lower CTTZ operating on NEON sized vectors because of its reliance on BITREVERSE which is also lowered to SVE instructions at these lengths. Differential Revision: https://reviews.llvm.org/D93607	2021-01-05 10:42:35 +00:00
Paul Walker	8eec7294fe	[SVE] Lower vector BITREVERSE and BSWAP operations. These operations are lowered to RBIT and REVB instructions respectively. In the case of fixed-length support using SVE we also lower BITREVERSE operating on NEON sized vectors as this results in fewer instructions. Differential Revision: https://reviews.llvm.org/D93606	2020-12-22 16:49:50 +00:00
Paul Walker	c0bc169cb1	[NFC][SVE] Clean up bfloat isel patterns that emit non-bfloat instructions. During isel there's no need to protect illegal types. Patch also adds a missing unit test for tbl2 intrinsic using bfloat types. Differential Revision: https://reviews.llvm.org/D93404	2020-12-18 13:20:41 +00:00
Paul Walker	632f4d2747	[NFC] Fix a few SVEInstrInfo related stylistic issues.	2020-12-15 16:10:38 +00:00
Kerry McLaughlin	c5ced82c8e	[SVE][CodeGen] Lower scalable floating-point vector reductions Changes in this patch: - Minor changes to the LowerVECREDUCE_SEQ_FADD function added by @cameron.mcinally to also work for scalable types - Added TableGen patterns for FP reductions with unpacked types (nxv2f16, nxv4f16 & nxv2f32) - Asserts added to expandFMINNUM_FMAXNUM & expandVecReduceSeq for scalable types Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D93050	2020-12-14 11:45:42 +00:00
Huihui Zhang	1e113c078a	[AArch64][SVE] Fix umin/umax lowering to handle out of range imm. Immediate must be in an integer range [0,255] for umin/umax instruction. Extend pattern matching helper SelectSVEArithImm() to take in value type bitwidth when checking immediate value is in range or not. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D89831	2020-10-23 09:42:56 -07:00
Paul C. Anagnostopoulos	2871c6c93f	[Aarch64] [TableGen] Clean up !if(!eq(boolean, 1) and related booleans. Differential Revision: https://reviews.llvm.org/D89551	2020-10-19 10:33:55 -04:00
Muhammad Asif Manzoor	aab6f7db47	[AArch64][SVE] Add lowering for llvm fabs Add the functionality to lower fabs for passthru variant Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D88679	2020-10-01 19:41:25 -04:00
Kerry McLaughlin	fcf70e1e3b	[SVE][CodeGen] Lower scalable fp_extend & fp_round operations This patch adds FP_EXTEND_MERGE_PASSTHRU & FP_ROUND_MERGE_PASSTHRU ISD nodes, used to lower scalable vector fp_extend/fp_round operations. fp_round has an additional argument, the 'trunc' flag, which is an integer of zero or one. This also fixes a warning introduced by the new tests added to sve-split-fcvt.ll, resulting from an implicit TypeSize -> uint64_t cast in SplitVecOp_FP_ROUND. Reviewed By: sdesmalen, paulwalker-arm Differential Revision: https://reviews.llvm.org/D88321	2020-10-01 12:17:37 +01:00
Muhammad Asif Manzoor	3a76de4275	[AArch64][SVE] Add lowering for llvm frecpx Add the functionality to lower frecpx for passthru variant Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D88032	2020-09-23 15:23:54 -04:00
Kerry McLaughlin	d0149ba9b4	[SVE][CodeGen] Lower legal integer -> floating point conversions This patch adds new ISD nodes, SCVTZ_MERGE_PASSTHRU & UCVTZ_MERGE_PASSTHRU, which are used to lower both legal scalable vector [S\|U]INT_TO_FP operations and the following intrinsics: - llvm.aarch64.sve.scvtf - llvm.aarch64.sve.ucvtf Reviewed By: sdesmalen, efriedma Differential Revision: https://reviews.llvm.org/D87913	2020-09-23 11:53:53 +01:00
David Sherwood	96e52c1364	[SVE][CodeGen] Mark ptrue/pfalse instructions as rematerializable	2020-09-21 16:44:32 +01:00
Paul Walker	f3fa954b5b	[SVE] Change definition of reduction ISD nodes to have an SVE vector result type. The current nodes, AArch64::SMAXV_PRED for example, are defined to return a NEON vector result. This is incorrect because they modify the complete SVE register and are thus changed to represent such. This patch also adds nodes for UADDV_PRED and SADDV_PRED, which unifies the handling of all SVE reductions. NOTE: Floating-point reductions are already implemented correctly, so this patch is essentially making everything consistent with those. Differential Revision: https://reviews.llvm.org/D87843	2020-09-21 13:16:28 +01:00
Kerry McLaughlin	f7185b271f	[SVE][CodeGen] Lower floating point -> integer conversions This patch adds new ISD nodes, FCVTZS_MERGE_PASSTHRU & FCVTZU_MERGE_PASSTHRU, which are used to lower scalable vector FP_TO_SINT/FP_TO_UINT operations and the following intrinsics: - llvm.aarch64.sve.fcvtzu - llvm.aarch64.sve.fcvtzs Reviewed By: efriedma, paulwalker-arm Differential Revision: https://reviews.llvm.org/D87232	2020-09-17 14:04:22 +01:00
Muhammad Asif Manzoor	fd536eeed9	[AArch64][SVE] Add lowering for llvm fceil Add the functionality to lower fceil for passthru variant Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D84548	2020-08-26 15:59:44 -04:00
Francesco Petrogalli	61dfa00957	[MC][SVE] Fix data operand for instruction alias of `st1d`. The version of `st1d` that operates with vector plus immediate addressing mode uses the alias `st1d { <Zn>.d }, <Pg>, [<Za>.d]` for rendering `st1d { <Zn>.d }, <Pg>, [<Za>.d, #0]`. The disassembler was generating `<Zn>.s` instead of `<Zn>.d`. Differential Revision: https://reviews.llvm.org/D86633	2020-08-26 18:22:17 +00:00
Paul Walker	73ac3c0ede	[SVE] Lower scalable vector ISD::FNEG operations. Also updates isConstOrConstSplatFP to allow the mul(A,-1) -> neg(A) transformation when -1 is expressed as an ISD::SPLAT_VECTOR. Differential Revision: https://reviews.llvm.org/D86415	2020-08-25 11:22:28 +01:00
Paul Walker	0015b8db8e	[SVE] Add ISEL patterns for predicated shifts by an immediate. For scalable vector shifts the predicate is typically all active, which gets selected to an unpredicated shift by immediate. When code generating for fixed length vectors the predicate is based on the vector length and so additional patterns are required to make use of SVE's predicated shift by immediate instructions. Differential Revision: https://reviews.llvm.org/D86204	2020-08-20 11:47:20 +01:00
Eli Friedman	be944c85f3	[AArch64][SVE] Add patterns for integer mla/mls. We probably want to introduce pseudo-instructions at some point, like we have for binary operations, but this seems okay for now. One thing I'm not sure about is whether we should be doing this as a DAGCombine instead of directly pattern-matching it. I don't see any big downside to doing it this way, though. Differential Revision: https://reviews.llvm.org/D85681	2020-08-18 12:51:16 -07:00
Paul Walker	9f63dc3265	[SVE] Fix shift-by-imm patterns used by asr, lsl & lsr intrinsics. Right shift patterns will no longer incorrectly accept a shift amount of zero. At the same time they will allow larger shift amounts that are now saturated to their upper bound. Patterns have been extended to enable immediate forms for shifts taking an arbitrary predicate. This patch also unifies the code path for immediate parsing so the i64 based shifts are no longer treated specially. Differential Revision: https://reviews.llvm.org/D86084	2020-08-18 11:41:26 +01:00
Paul Walker	b6c7b7fa31	[SVE] Add ISD nodes for predicated integer extend inreg operations. These are useful instructions when lowering fixed length vector extends, so I've broken this patch out as kind of NFC like work. Differential Revision: https://reviews.llvm.org/D85546	2020-08-11 11:39:26 +01:00
Paul Walker	0d33a8ef5b	[SVE] Lower scalable vector mul operations. This allows us to remove extra patterns from AArch64SVEInstrInfo.td because we can reuse those required for fixed length vectors. Differential Revision: https://reviews.llvm.org/D85328	2020-08-06 11:15:35 +01:00
Paul Walker	3ed59b775d	[SVE] Implement lowering for fixed length vector multiplication. NOTE: Also uses SVE code generation for NEON size vectors, instead of expanding i64 based vector multiplications. Differential Revision: https://reviews.llvm.org/D85327	2020-08-06 11:01:39 +01:00
Paul Walker	4be13b15d6	[SVE] Replace remaining _MERGE_OP1 nodes with _PRED variants. This is the final bit of work to relax the register allocation requirements when code generating normal LLVM IR, which rarely cares about the result of inactive lanes. By using _PRED nodes we can make better use of SVE's reversed instructions. Also removes a redundant parameter from the min/max tests. Differential Revision: https://reviews.llvm.org/D85142	2020-08-04 11:19:17 +01:00
Francesco Petrogalli	809600d664	[llvm][sve] Reg + Imm addressing mode for ld1ro. Reviewers: kmclaughlin, efriedma, sdesmalen Subscribers: tschuett, hiraditya, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83357	2020-07-24 17:48:47 +00:00
Paul Walker	509351d768	[SVE] Add lowering for scalable vector fadd, fdiv, fmul and fsub operations. Lower the operations to predicated variants. This is prep work required for fixed length code generation but also fixes a bug whereby these operations fail selection when "unpacked" vector types (e.g. MVT::nxv2f32) are used. This patch also adds the missing "unpacked" patterns for FMA. Differential Revision: https://reviews.llvm.org/D83765	2020-07-16 11:31:35 +00:00
Sander de Smalen	8b7b0ad24c	[AArch64][SVE] NFC: Rename isOrig -> isReverseInstr This is a non-functional change to clarify some of the terminology in the AArch64SVEInstrInfo/SVEInstrFormats.td files around the tables for mapping an instruction to its reverse instruction counterpart, and vice versa. e.g. DIV -> DIVR and DIVR -> DIV. Reviewers: paulwalker-arm, cameron.mcinally, rengolin, efriedma Reviewed By: paulwalker-arm, efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D82979	2020-07-02 17:01:15 +01:00
Sander de Smalen	075c440f7b	[AArch64][SVE] Put zeroing pseudos and patterns under flag. This patch puts the _ZERO pseudos and corresponding patterns under the predicate 'UseExperimentalZeroingPseudos', so that they can be enabled/disabled through compile flags. This is done because the zeroing pseudos use MOVPRFX to do merging of the inactive lanes, but it depends on the uarch whether this operation is actually merged with the destructive operation. If not, it may be more profitable to use a SELECT and to give the compiler the freedom to schedule these instructions as normal, rather than keeping them bundled together. Additionally, this feature is not yet fully implemented and there are still known bugs (see D80410) that need to be resolved before the 'experimental' can be dropped from the name. Reviewers: paulwalker-arm, cameron.mcinally, efriedma Reviewed By: paulwalker-arm Tags: #llvm Differential Revision: https://reviews.llvm.org/D82780	2020-07-02 14:24:33 +01:00
Paul Walker	a1aed80a35	[SVE] Relax merge requirement for IR based divides. We currently lower SDIV to SDIV_MERGE_OP1. This forces the value for inactive lanes in a way that can hamper register allocation, however, the lowering has no requirement for inactive lanes. Instead this patch replaces SDIV_MERGE_OP1 with SDIV_PRED thus freeing the register allocator. Once done the only user of SDIV_MERGE_OP1 is intrinsic lowering so I've removed the node and perform ISel on the intrinsic directly. This also allows us to implement MOVPRFX based zeroing in the same manner as SUB. This patch also renames UDIV_MERGE_OP1 and [F]ADD_MERGE_OP1 for the same reason but in the ADD cases the ISel code is already as required. Differential Revision: https://reviews.llvm.org/D82783	2020-07-01 08:18:42 +00:00
Sander de Smalen	39f6a36a24	[AArch64][SVE] NFCI: Choose consistent naming for predicated SDAG nodes This patch proposes a naming convention for operations that take a general predicate (and are thus predicated) that specifies what happens to the false lanes. Currently the _PRED suffix is used, which doesn't really say much other than that it takes a predicate. In some instances this means it has merging predication and in other cases it means zeroing-predication. This patch also changes the order of operands to AArch64ISD::DUP_MERGE_PASSTHRU, to pass the predicate as the first operand, which is in line with all other predicates nodes. It takes the passthru value as an explicit passthru value, which is always passed as the last operand. Reviewers: paulwalker-arm, cameron.mcinally, eli.friedman, dancgr, efriedma Reviewed By: paulwalker-arm Tags: #llvm Differential Revision: https://reviews.llvm.org/D81850	2020-06-29 13:37:30 +01:00
Paul Walker	3a98d5d7e7	[SVE] Code generation for fixed length vector adds. Summary: Teach LowerToPredicatedOp to lower fixed length vector operations. Add AArch64ISD nodes and isel patterns for predicated integer and floating point adds. Together this enables SVE code generation for fixed length vector adds. Reviewers: rengolin, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82483	2020-06-26 19:54:41 +00:00
Cullen Rhodes	c65d4eb5d3	[AArch64][SVE] Guard perm and select bfloat16 intrinsic patterns Summary: Permutation and selection bfloat16 intrinsic patterns should be guarded on the feature flag `+bf16`. Missed in D82182 and D80850. Reviewers: sdesmalen, fpetrogalli, kmclaughlin, efriedma Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D82492	2020-06-26 09:35:36 +00:00
Cullen Rhodes	26502ad609	[AArch64][SVE] Add bfloat16 support to perm and select intrinsics Summary: Added for following intrinsics: * zip1, zip2, zip1q, zip2q * trn1, trn2, trn1q, trn2q * uzp1, uzp2, uzp1q, uzp2q * splice * rev * sel Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D82182	2020-06-24 10:04:51 +00:00
Francesco Petrogalli	ef597eda8e	[sve][acle] Add SVE BFloat16 extensions. Summary: List of intrinsics: svfloat32_t svbfdot[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfdot[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfdot_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svfloat32_t svbfmmla[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalb[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalb[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfmlalb_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svfloat32_t svbfmlalt[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalt[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfmlalt_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svbfloat16_t svcvt_bf16[_f32]_m(svbfloat16_t inactive, svbool_t pg, svfloat32_t op) svbfloat16_t svcvt_bf16[_f32]_x(svbool_t pg, svfloat32_t op) svbfloat16_t svcvt_bf16[_f32]_z(svbool_t pg, svfloat32_t op) svbfloat16_t svcvtnt_bf16[_f32]_m(svbfloat16_t even, svbool_t pg, svfloat32_t op) svbfloat16_t svcvtnt_bf16[_f32]_x(svbfloat16_t even, svbool_t pg, svfloat32_t op) For reference, see section 7.2 of "Arm C Language Extensions for SVE - Version 00bet4" Reviewers: sdesmalen, ctetreau, efriedma, david-arm, rengolin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82141	2020-06-22 16:53:02 +00:00
Francesco Petrogalli	d32c134648	[llvm][SVE] Reg + reg addressing mode for LD1RO. Reviewers: efriedma, sdesmalen Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80741	2020-06-19 03:56:10 +00:00
Francesco Petrogalli	28a00ac9ba	[llvm][SVE] IR intrinsics for quadword permutation instructions. Summary: Adding intrinsics and codegen patterns for: * trn1 <Zd>.q, <Zm>.q, <Zn>.q * trn2 <Zd>.q, <Zm>.q, <Zn>.q * zip1 <Zd>.q, <Zm>.q, <Zn>.q * zip2 <Zd>.q, <Zm>.q, <Zn>.q * uzp1 <Zd>.q, <Zm>.q, <Zn>.q * uzp2 <Zd>.q, <Zm>.q, <Zn>.q These instructions are defined in Armv8.6-A. Reviewers: sdesmalen, efriedma, kmclaughlin Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80850	2020-06-15 16:21:56 +00:00
Francesco Petrogalli	febeaf94a8	[llvm][SVE] IR intrinsic for LD1RO. Reviewers: sdesmalen, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80738	2020-06-03 13:57:16 +00:00
Francesco Petrogalli	b572d9b1a7	[llvm][sve] Intrinsics for SVE sudot and usdot instructions. Summary: This patch adds IR intrinsics for the mnemonics USDOT and SUDOT of the 8.6 extension of Armv8-a. Reviewers: sdesmalen, efriedma, david-arm Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79876	2020-05-18 22:02:19 +00:00
Francesco Petrogalli	01f9d8ce5c	[llvm][SVE] IR intrinsics for matrix multiplication instructions. Summary: Instructions: * SMMLA * UMMLA * USMMLA * FMMLA Reviewers: sdesmalen, efriedma, kmclaughlin Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79638	2020-05-18 22:02:19 +00:00
Eli Friedman	a1ce88b4e3	[AArch64][SVE] Implement AArch64ISD::SETCC_PRED This unifies SETCC operations along the lines of other operations. Differential Revision: https://reviews.llvm.org/D79975	2020-05-15 11:53:21 -07:00
Cameron McInally	b085e51d81	[AArch64][SVE] Add some integer DestructiveBinaryComm* patterns Add DestructiveBinaryComm* patterns for ADD, SUB, and SUBR. Differential Revision: https://reviews.llvm.org/D76711	2020-05-14 16:35:49 -05:00
Eli Friedman	a52f10b5a3	[AArch64][SVE] Add patterns for VSELECT of immediate merged with a variable. This covers forms involving "CPY (immediate, merging)". Differential Revision: https://reviews.llvm.org/D79803	2020-05-13 15:02:08 -07:00
Eli Friedman	a8874c76e8	[AArch64][SVE] Add patterns for VSELECT of immediates. This covers forms involving "CPY (immediate, zeroing)". This doesn't handle the case where the operands are reversed, and the condition is freely invertible. Not sure how to handle that. Maybe a DAGCombine. Differential Revision: https://reviews.llvm.org/D79598	2020-05-11 17:04:22 -07:00
Kerry McLaughlin	3bcd3dd473	[CodeGen][SVE] Lowering of shift operations with scalable types Summary: Adds AArch64ISD nodes for: - SHL_PRED (logical shift left) - SHR_PRED (logical shift right) - SRA_PRED (arithmetic shift right) Existing patterns for unpredicated left shift by immediate have also been moved into the appropriate multiclasses in SVEInstrFormats.td. Reviewers: sdesmalen, efriedma, ctetreau, huihuiz, rengolin Reviewed By: efriedma Subscribers: huihuiz, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79478	2020-05-07 11:43:49 +01:00
Eli Friedman	2c8546107a	[AArch64][SVE] Implement lowering for SIGN_EXTEND etc. of SVE predicates. Now using patterns, since there's a single-instruction lowering. (We could convert to VSELECT and pattern-match that, but there doesn't seem to be much point.) I think this might be the first instruction to use nested multiclasses this way? It seems like a good way to reduce duplication between different integer widths. Let me know if it seems like an improvement. Also, while I'm here, fix the return type of SETCC so we don't try to merge a sign-extend with a SETCC. Differential Revision: https://reviews.llvm.org/D79193	2020-05-06 17:56:32 -07:00


281 Commits
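
Note on commit eba6deab22: its message gives the identity CTTZ(V) => CTLZ(BITREVERSE(V)). The sketch below checks that identity on plain integers, one lane at a time. It is only an illustration, not the LLVM lowering code; the helper names (bitreverse, ctlz, cttz_via_ctlz) and the explicit width parameter are assumptions made for this example.

```python
def bitreverse(v: int, width: int) -> int:
    """Reverse the low `width` bits of v (hypothetical helper)."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (v & 1)
        v >>= 1
    return result


def ctlz(v: int, width: int) -> int:
    """Count leading zeros of v within a `width`-bit lane; returns width for v == 0."""
    return width - v.bit_length()


def cttz_via_ctlz(v: int, width: int) -> int:
    """Count trailing zeros using only bit reversal and a leading-zero count."""
    return ctlz(bitreverse(v, width), width)


def cttz_reference(v: int, width: int) -> int:
    """Straightforward trailing-zero count, used only to cross-check the identity."""
    if v == 0:
        return width
    count = 0
    while v & 1 == 0:
        v >>= 1
        count += 1
    return count


# Exhaustively compare both definitions for every 8-bit value.
assert all(cttz_via_ctlz(v, 8) == cttz_reference(v, 8) for v in range(256))
```

Reversing the bits turns trailing zeros into leading zeros, so a target that has a leading-zero count and a bit-reverse operation needs no separate trailing-zero instruction.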