Commit Graph

930 Commits

Author SHA1 Message Date
Muhammad Asif Manzoor 1ffcbe35ae [AArch64][SVE] Add lowering for rounding operations
Add the functionality to lower SVE rounding operations for passthru variant.
Created a new test case file for all rounding operations.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D86793
2020-09-04 11:16:57 -04:00
Benjamin Kramer 8782c72765 Strength-reduce SmallVectors to arrays. NFCI. 2020-08-28 21:14:20 +02:00
David Sherwood f4257c5832 [SVE] Make ElementCount members private
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
2020-08-28 14:43:53 +01:00
Ties Stuij d678e14c55 [AArch64][CodeGen] Restrict bfloat vector operations to what's actually supported
Previously in addTypeForNeon, we would set the operations for bfloat vectors
like other generic types. But as bfloat is a storage-only type a number of
operations shouldn't be set. This patch fixes that.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D85101
2020-08-28 11:44:37 +01:00
Mikhail Maltsev 23d5e93f34 [AArch64] Optimize instruction selection for certain vector shuffles
This patch adds code to recognize vector shuffles which can be
represented as VDUP (splat) of a vector lane with of a different
(wider) type than the original vector lane type.

For example:
    shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 0, i32 1, i32 0, i32 1>
is essentially:
    shufflevector <2 x i32> %v, <2 x i32> undef, <2 x i32> <i32 0, i32 0>

Such patterns are generated by the SelectionDAG machinery in some cases
(see DAGCombiner::visitBITCAST in DAGCombiner.cpp, the "Remove double
bitcasts from shuffles" part).

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D86225
2020-08-27 11:06:49 +01:00
Paul Walker 81337c915f [SVE] Fallback to default expansion when lowering SIGN_EXTEN_INREG from non-byte based source.
Differential Revision: https://reviews.llvm.org/D86394
2020-08-27 10:57:37 +01:00
Ahmed Bougacha 383f7c8858 [AArch64] Use CCAssignFnForReturn helper in more spots. NFC.
It was added for GISel, but SDAG could use it too!
2020-08-26 14:39:11 -07:00
Muhammad Asif Manzoor fd536eeed9 [AArch64][SVE] Add lowering for llvm fceil
Add the functionality to lower fceil for passthru variant

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D84548
2020-08-26 15:59:44 -04:00
Paul Walker 73ac3c0ede [SVE] Lower scalable vector ISD::FNEG operations.
Also updates isConstOrConstSplatFP to allow the mul(A,-1) -> neg(A)
transformation when -1 is expressed as an ISD::SPLAT_VECTOR.

Differential Revision: https://reviews.llvm.org/D86415
2020-08-25 11:22:28 +01:00
Cameron McInally 36dbb8fc97 [SVE] Lower fixed length UDIV to scalable
Pretty much just a copy of the SDIV patches (D86114 and D85982) with string replacement.

Differential Revision: https://reviews.llvm.org/D86316
2020-08-21 09:01:25 -05:00
Cameron McInally ac63959460 [SVE] Lower fixed length vXi8/vXi16 SDIV to scalable
There are no nxv16i8/nxv8i16 SDIV instructions, so these fixed width operations must be promoted to nxv4i32.

Differential Revision: https://reviews.llvm.org/D86114
2020-08-20 13:47:01 -05:00
Mehdi Amini a407ec9b6d Revert "Revert "[NFC][llvm] Make the contructors of `ElementCount` private.""
Was reverted because MLIR/Flang builds were broken, these APIs have been
fixed in the meantime.
2020-08-19 17:26:36 +00:00
Mehdi Amini 4fc56d70aa Revert "[NFC][llvm] Make the contructors of `ElementCount` private."
This reverts commit 264afb9e6a.
(and dependent 6b742cc48 and fc53bd610f)

MLIR/Flang are broken.
2020-08-19 17:21:37 +00:00
Francesco Petrogalli 264afb9e6a [NFC][llvm] Make the contructors of `ElementCount` private.
Differential Revision: https://reviews.llvm.org/D86120
2020-08-19 16:26:44 +00:00
Eli Friedman bb18532399 [AArch64][SVE] Allow llvm.aarch64.sve.st2/3/4 with vectors of pointers.
This isn't necessaary for ACLE, but could be useful in other situations.
And the change is simple.

Differential Revision: https://reviews.llvm.org/D85251
2020-08-18 12:51:16 -07:00
Paul Walker cb5cc47a65 [SVE] Lower fixed length vector ISD::SPLAT_VECTOR operations.
Also strengthens the CHECK lines for scalable vector splat tests.

Differential Revision: https://reviews.llvm.org/D86070
2020-08-18 11:19:43 +01:00
Cameron McInally 92593f9e77 [SVE] Lower fixed length vXi32/vXi64 SDIV to scalable vectors.
Differential Revision: https://reviews.llvm.org/D85982
2020-08-14 18:47:22 -05:00
Cameron McInally 21810b0e14 [SVE] Lower fixed length vector integer UMIN/UMAX
Differential Revision: https://reviews.llvm.org/D85926
2020-08-13 14:48:36 -05:00
Cameron McInally e1a87f0a9b [SVE] Lower fixed length vector integer SMIN/SMAX
Differential Revision: https://reviews.llvm.org/D85855
2020-08-13 11:41:20 -05:00
Paul Walker e63cc8105a [SVE] Lower fixed length vector integer shifts.
Differential Revision: https://reviews.llvm.org/D85724
2020-08-13 12:35:47 +01:00
Paul Walker 130098228d [SVE] Lower fixed length vector integer ISD::SETCC operations.
Differential Revision: https://reviews.llvm.org/D85831
2020-08-13 12:01:56 +01:00
Paul Walker 9e04895258 [SVE] Lower fixed length integer extend operations.
Differential Revision: https://reviews.llvm.org/D85640
2020-08-13 11:54:53 +01:00
Francesco Petrogalli c561f4d2ec [SVE][VLS] Don't combine logical AND.
Testing is performed when targeting 128, 256 and 512-bit wide vectors.

For 128-bit vectors, the original behavior of using NEON instructions is
preserved.

Differential Revision: https://reviews.llvm.org/D85479
2020-08-12 20:00:07 +01:00
Cameron McInally ce2c991061 [SVE] Lower fixed length FP minnum/maxnum
Lower fixed length MINNUM/MAXNUM to scalable vectors. Cherry-picked from D71767 with added tests.

Differential Revision: https://reviews.llvm.org/D85744
2020-08-12 12:02:52 -05:00
David Sherwood 88bbd30736 [SVE][CodeGen] Fix issues with EXTRACT_SUBVECTOR when using scalable FP vectors
In this patch I have fixed two issues:

1. Our SVE tuple get/set intrinsics were using the wrong constant type
for the index passed to EXTRACT_SUBVECTOR. I have fixed this by using the
function SelectionDAG::getVectorIdxConstant to create the value. Also, I
have updated the documentation for EXTRACT_SUBVECTOR describing what type
the constant index should be and we now enforce this when creating the
node.
2. The AArch64 backend was missing the appropriate patterns for
extracting certain subvectors (nxv4f16 and nxv2f32) from legal SVE types.
I have added them as part of this patch.

The only way that I could find to test the new patterns was to use the
SVE tuple get intrinsics, although I realise it looks a bit unusual.
Tests added here:

  test/CodeGen/AArch64/sve-extract-subvector.ll

Differential Revision: https://reviews.llvm.org/D85516
2020-08-12 08:35:46 +01:00
Paul Walker b6c7b7fa31 [SVE] Add ISD nodes for predicated integer extend inreg operations.
These are useful instructions when lowering fixed length vector
extends, so I've broken this patch out as kind of NFC like work.

Differential Revision: https://reviews.llvm.org/D85546
2020-08-11 11:39:26 +01:00
Paul Walker d542feb8e4 [SVE] Lower fixed length vector integer subtract operations.
Differential Revision: https://reviews.llvm.org/D85665
2020-08-11 11:32:12 +01:00
Paul Walker 0d33a8ef5b [SVE] Lower scalable vector mul operations.
This allows us to remove extra patterns from AArch64SVEInstrInfo.td
because we can reuse those required for fixed length vectors.

Differential Revision: https://reviews.llvm.org/D85328
2020-08-06 11:15:35 +01:00
Paul Walker 3ed59b775d [SVE] Implement lowering for fixed length vector multiplication.
NOTE: Also uses SVE code generation for NEON size vectors, instead
of expanding i64 based vector multiplications.

Differential Revision: https://reviews.llvm.org/D85327
2020-08-06 11:01:39 +01:00
Paul Walker 927fc536ca [SVE] Add lowering for fixed length vector and, or & xor operations.
Since there are no ill effects when performing these operations
with undefined elements, they are lowered to the already supported
unpredicated scalable vector equivalents.

Differential Revision: https://reviews.llvm.org/D85117
2020-08-05 11:28:34 +01:00
Sander de Smalen f2916636f8 [AArch64][SVE] Disable tail calls if callee does not preserve SVE regs.
This fixes an issue triggered by the following code, where emitEpilogue
got confused when trying to restore the SVE registers after the call,
whereas the call to bar() is implemented as a TCReturn:

  int non_sve();
  int sve(svint32_t x) { return non_sve(); }

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84869
2020-08-05 09:38:54 +01:00
Eli Friedman 95efea4b93 [AArch64][SVE] Widen narrow sdiv/udiv operations.
The SVE instruction set only supports sdiv/udiv for 32-bit and 64-bit
integers.  If we see an 8-bit or 16-bit divide, widen the operands to 32
bits, and narrow the result.

Differential Revision: https://reviews.llvm.org/D85170
2020-08-04 13:22:15 -07:00
Paul Walker 4be13b15d6 [SVE] Replace remaining _MERGE_OP1 nodes with _PRED variants.
This is the final bit of work to relax the register allocation
requirements when code generating normal LLVM IR, which rarely
care about the result of inactive lanes. By using _PRED nodes
we can make better use of SVE's reversed instructions.

Also removes a redundant parameter from the min/max tests.

Differential Revision: https://reviews.llvm.org/D85142
2020-08-04 11:19:17 +01:00
David Sherwood 23ad660b5d [SVE][CodeGen] At -O0 fallback to DAG ISel when translating alloca with scalable types
When building code at -O0 We weren't falling back to DAG ISel correctly
when encountering alloca instructions with scalable vector types. This
is because the alloca has no operands that are scalable. I've fixed this by
adding a check in AArch64ISelLowering::fallBackToDAGISel for alloca
instructions with scalable types.

Differential Revision: https://reviews.llvm.org/D84746
2020-07-30 08:40:53 +01:00
Eli Friedman c02aa53ecb [AArch64][SVE] Add "fast" fcmp operations.
dacf8d3 added support for most fcmp operations, but there are some extra
variations I hadn't considered: SelectionDAG supports float comparisons
that are neither ordered nor unordered. Add support for the missing
operations.

Differential Revision: https://reviews.llvm.org/D84460
2020-07-24 13:22:41 -07:00
Francesco Petrogalli 809600d664 [llvm][sve] Reg + Imm addressing mode for ld1ro.
Reviewers: kmclaughlin, efriedma, sdesmalen

Subscribers: tschuett, hiraditya, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83357
2020-07-24 17:48:47 +00:00
Petre-Ionut Tudor 1af9fc8213 [ARM] Generate [SU]HADD from ((a + b) >> 1)
Summary:
Teach LLVM to recognize the above pattern, where the operands are
either signed or unsigned types.

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83777
2020-07-21 13:22:07 +01:00
Eli Friedman b8f765a1e1 [AArch64][SVE] Add support for trunc to <vscale x N x i1>.
This isn't a natively supported operation, so convert it to a
mask+compare.

In addition to the operation itself, fix up some surrounding stuff to
make the testcase work: we need concat_vectors on i1 vectors, we need
legalization of i1 vector truncates, and we need to fix up all the
relevant uses of getVectorNumElements().

Differential Revision: https://reviews.llvm.org/D83811
2020-07-20 13:11:02 -07:00
Paul Walker 6384ec4099 [SVE] Add lowering for fixed length vector fdiv, fma, fmul and fsub operations.
Differential Revision: https://reviews.llvm.org/D84034
2020-07-20 11:57:34 +00:00
Tim Northover 88464a55b4 AArch64: emit @llvm.debugtrap as `brk #0xf000` on all platforms
It's useful for a debugger to be able to distinguish an @llvm.debugtrap
from a (noreturn) @llvm.trap, so this extends the existing Windows
behaviour to other platforms.
2020-07-20 10:31:26 +01:00
Paul Walker 509351d768 [SVE] Add lowering for scalable vector fadd, fdiv, fmul and fsub operations.
Lower the operations to predicated variants.  This is prep work
required for fixed length code generation but also fixes a bug
whereby these operations fail selection when "unpacked" vector
types (e.g. MVT::nxv2f32) are used.

This patch also adds the missing "unpacked" patterns for FMA.

Differential Revision: https://reviews.llvm.org/D83765
2020-07-16 11:31:35 +00:00
Paul Walker 319a97b5e2 [SVE] Ensure fixed length vector fptrunc operations bigger than NEON are not considered legal.
Differential Revision: https://reviews.llvm.org/D83568
2020-07-13 11:16:30 +00:00
Paul Walker f78e6a3095 [SVE] Code generation for fixed length vector truncates.
Lower fixed length vector truncates to a sequence of SVE UZP1 instructions.

Differential Revision: https://reviews.llvm.org/D83395
2020-07-10 10:37:19 +00:00
Eli Friedman 56ae2cebcd [AArch64][SVE] Add lowering for llvm.fma.
This is currently bare-bones; we aren't taking advantage of any of the
FMA variant instructions.  But it's enough to at least generate
code.

Differential Revision: https://reviews.llvm.org/D83444
2020-07-09 16:12:41 -07:00
Paul Walker 614fb09645 [SVE] Disable some BUILD_VECTOR related code generator features.
Fixed length vector code generation for SVE does not yet custom
lower BUILD_VECTOR and instead relies on expansion.  At the same
time custom lowering for VECTOR_SHUFFLE is also not available so
this patch updates isShuffleMaskLegal to reject vector types that
require SVE.

Related to this it also prevents the merging of stores after
legalisation because this only works when BUILD_VECTOR is either
legal or can be elminated.  When this is not the case the code
generator enters an infinite legalisation loop.

Differential Revision: https://reviews.llvm.org/D83408
2020-07-09 10:47:04 +00:00
Paul Walker fb75451775 [SVE] Custom ISel for fixed length extract/insert_subvector.
We use extact_subvector and insert_subvector to "cast" between
fixed length and scalable vectors.  This patch adds custom c++
based ISel for the following cases:

  fixed_vector = ISD::EXTRACT_SUBVECTOR scalable_vector, 0
  scalable_vector = ISD::INSERT_SUBVECTOR undef(scalable_vector), fixed_vector, 0

Which result in either EXTRACT_SUBREG/INSERT_SUBREG for NEON sized
vectors or COPY_TO_REGCLASS otherwise.

Differential Revision: https://reviews.llvm.org/D82871
2020-07-08 09:49:28 +00:00
Petre-Ionut Tudor af80a4353e [ARM] Generate [SU]RHADD from (b - (~a)) >> 1
Summary:
Teach LLVM to recognize the above pattern, which is usually a
transformation of (a + b + 1) >> 1, where the operands are either
signed or unsigned types.

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82669
2020-07-03 16:00:06 +01:00
Guillaume Chatelet 87e2751cf0 [Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D83082
2020-07-03 08:06:43 +00:00
Sander de Smalen 143e324e75 [CodeGen][SVE] Don't drop scalable flag in DAGCombiner::visitEXTRACT_SUBVECTOR
There was a rogue 'assert' in AArch64ISelLowering for the tuple.get intrinsics,
that shouldn't really have been there (I suspect this was a remnant from when
we expected the wider vector always to have come from a vector CONCAT).

When I tried to create a more minimal reproducer, I found a bug in
DAGCombiner where it drops the scalable flag when trying to fold:

      extract_subv (bitcast X), Index --> bitcast (extract_subv X, Index')

This patch fixes both issues.

Reviewers: david-arm, efriedma, spatel

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82910
2020-07-02 10:16:43 +01:00
Guillaume Chatelet ef36f5143d [Alignment] TargetLowering::hasPairedLoad must use Align for RequiredAlignment
As per documentation of `hasPairLoad`:
"`RequiredAlignment` gives the minimal alignment constraints that must be met to be able to select this paired load."
In this sense, `0` is strictly equivalent to `1`. We make this obvious by using `Align` instead of unsigned.
There is only one implementor of this interface.

Differential Revision: https://reviews.llvm.org/D82958
2020-07-01 14:32:30 +00:00