Commit Graph

2250 Commits

Author SHA1 Message Date
Sander de Smalen 699e22a083 [ISEL] Move trivial step_vector folds to FoldConstantArithmetic.
Given that step_vector is practically a constant, doing this early
helps with DAGCombine folds that happen before type legalization.

There is currently no way to test this happens earlier, although existing
tests for step_vector folds continue protect the folds happening at all.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D117863
2022-01-24 16:37:21 +00:00
Craig Topper b8c7cdcc81 [SelectionDAG][RISCV] Teach getNode to fold bswap(bswap(x))->x.
This can show up during when bitreverse is expanded to bswap and
swap of bits within a byte. If the input is already a bswap, we
should cancel them out before we further transform them in a way
that makes it harder to see the redundancy.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D118007
2022-01-24 08:17:46 -08:00
Matt Arsenault 99e8e17313 Reapply "Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction"
This reverts commit a97e20a3a8.
2022-01-24 09:26:52 -05:00
Sander de Smalen 4f8fdf7827 [ISEL] Canonicalise constant splats to RHS.
SelectionDAG::getNode() canonicalises constants to the RHS if the
operation is commutative, but it doesn't do so for constant splat
vectors. Doing this early helps making certain folds on vector types,
simplifying the code required for target DAGCombines that are enabled
before Type legalization.

Somewhat to my surprise, DAGCombine doesn't seem to traverse the
DAG in a post-order DFS, so at the time of doing some custom fold where
the input is a MUL, DAGCombiner::visitMUL hasn't yet reordered the
constant splat to the RHS.

This patch leads to a few improvements, but also a few  minor regressions,
which I traced down to D46492. When I tried reverting this change to see
if the changes were still necessary, I ran into some segfaults. Not sure
if there is some latent bug there.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117794
2022-01-24 09:38:36 +00:00
Simon Pilgrim d6fee6c3b0 [DAG] SelectionDAG::computeKnownBits - add mul(x,x) self-multiply handling (PR48683)
Pass the SelfMultiply flag to KnownBits::mul() - added at D108992

https://alive2.llvm.org/ce/z/NN_eaR
2022-01-19 17:39:32 +00:00
Victor Perez fd1dce35bd [LegalizeTypes][VP] Add splitting support for vp.reduction.*
Split vp.reduction.* intrinsics by splitting the vector to reduce in
two halves, perform the reduction operation in each one of them and
accumulate the results of both operations.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117469
2022-01-18 09:29:24 +00:00
David Sherwood f4515ab858 Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"
This reverts commit 197f3c0deb.

Reverting after miscompilation errors discovered with ffmpeg.
2022-01-18 08:40:20 +00:00
David Sherwood 197f3c0deb [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants
When we know the value we're extending is a negative constant then it
makes sense to use SIGN_EXTEND because this may improve code quality in
some cases, particularly when doing a constant splat of an unpacked vector
type. For example, for SVE when splatting the value -1 into all elements
of a vector of type <vscale x 2 x i32> the element type will get promoted
from i32 -> i64. In this case we want the splat value to sign-extend from
(i32 -1) -> (i64 -1), whereas currently it zero-extends from
(i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use
a single mov immediate instruction.

New tests added here:

  CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing
tests too:

  CodeGen/AArch64/reduce-and.ll
  CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only
occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2022-01-17 11:08:57 +00:00
Fraser Cormack 877d1b3d07 [SelectionDAG][VP] Add splitting/widening for VP_LOAD and VP_STORE
Original patch by @hussainjk.

This patch was split off from D109377 to keep vector legalization
(widening/splitting) separate from vector element legalization
(promoting).

While the original patch added a third overload of
SelectionDAG::getVPStore, this patch takes the liberty of collapsing
those all down to 1, as three overloads seems excessive for a
little-used node.

The original patch also used ModifyToType in places, but that method
still crashes on scalable vector types. Seeing as the other VP
legalization methods only work when all operands need identical
widening, this patch follows in that vein.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117235
2022-01-15 11:41:29 +00:00
James Y Knight a97e20a3a8 Revert "GlobalISel: Add G_ASSERT_ALIGN hint instruction"
This commit sometimes causes a crash when compiling a vtable thunk. E.g.:

clang '--target=aarch64-grtev4-linux-gnu' -xc++ - -c -o /dev/null <<EOF
struct a {
  virtual int f();
};
struct c {
  virtual int &g() const;
};
struct d : a, c {
  int &g() const;
};
int &d::g() const {}
EOF

Some follow-up commits have been reverted as well:
Revert "IR: Make getRetAlign check callee function attributes"
Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC."
Revert "Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFC."

This reverts commit 4f414af6a7.
This reverts commit a5507d2e25.
This reverts commit 3d2d208f6a.
This reverts commit 07ddfa95e3.
2022-01-14 04:50:07 +00:00
David Sherwood ba471ba8d2 Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants"
This reverts commit 31009f0b5a.

It seems to be causing SVE VLA buildbot failures and has introduced a
genuine regression. Reverting for now.
2022-01-13 15:59:43 +00:00
David Sherwood 31009f0b5a [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants
When we know the value we're extending is a negative constant then it
makes sense to use SIGN_EXTEND because this may improve code quality in
some cases, particularly when doing a constant splat of an unpacked vector
type. For example, for SVE when splatting the value -1 into all elements
of a vector of type <vscale x 2 x i32> the element type will get promoted
from i32 -> i64. In this case we want the splat value to sign-extend from
(i32 -1) -> (i64 -1), whereas currently it zero-extends from
(i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use
a single mov immediate instruction.

New tests added here:

  CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing
tests too:

  CodeGen/AArch64/dag-numsignbits.ll
  CodeGen/AArch64/reduce-and.ll
  CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only
occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2022-01-13 09:43:07 +00:00
Matt Arsenault 07ddfa95e3 GlobalISel: Add G_ASSERT_ALIGN hint instruction
Insert it for call return values only for now, which is the only case
the DAG handles also.
2022-01-12 18:20:58 -05:00
Craig Topper a500f7f48f [SelectionDAG] Add FP_TO_UINT_SAT/FP_TO_SINT_SAT to computeKnownBits/computeNumSignBits.
These nodes should saturate to their saturating VT. We can use this
information to know the bits past the VT are all zeros or all sign bits.

I think we might only have test coverage for the unsigned case. I'll
verify and add tests.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D116870
2022-01-09 17:48:05 -08:00
Victor Perez 38efa68b08 [LegalizeTypes][VP] Add splitting support for vp.select
Split vp.select in a similar way as vselect, splitting also the length
parameter.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116651
2022-01-07 08:46:01 +00:00
Craig Topper cbcbbd6ac8 [ValueTracking][SelectionDAG] Rename ComputeMinSignedBits->ComputeMaxSignificantBits. NFC
This function returns an upper bound on the number of bits needed
to represent the signed value. Use "Max" to match similar functions
in KnownBits like countMaxActiveBits.

Rename APInt::getMinSignedBits->getSignificantBits. Keeping the old
name around to keep this patch size down. Will do a bulk rename as
follow up.

Rename KnownBits::countMaxSignedBits->countMaxSignificantBits.

Reviewed By: lebedev.ri, RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D116522
2022-01-03 11:33:30 -08:00
Kazu Hirata 69ccc96162 [llvm] Use the default constructor for SDValue (NFC) 2022-01-01 10:36:59 -08:00
Craig Topper 243b7aaf51 [SelectionDAG] Use KnownBits::countMinSignBits() to simplify the end of ComputeNumSignBits.
This matches what is done in ValueTracking.cpp

Reviewed By: RKSimon, foad

Differential Revision: https://reviews.llvm.org/D116423
2021-12-31 17:29:57 -08:00
Simon Pilgrim 4639461531 [DAG][X86] Add TargetLowering::isSplatValueForTargetNode override
Add callback to enable us to test target nodes if they are splat vectors

Added some basic X86ISD::VBROADCAST + X86ISD::VBROADCAST_LOAD handling
2021-12-22 16:57:44 +00:00
Simon Pilgrim 592e89e636 [DAG] Constify SelectionDAG::isSplatValue()
This doesn't generate any nodes so should be usable by methods with const SelectionDAG &.
2021-12-21 11:19:23 +00:00
Simon Pilgrim c1340b9e78 [DAG] Improve FMINNUM/FMAXNUM/FMINIMUM/FMAXIMUM constant folding
Merge the node combines into a common DAGCombiner::visitFMinMax (like we do for IMINMAX).

Move the constant folding into SelectionDAG::foldConstantFPMath.

This allows us to fold the vecreduce-propagate-sd-flags.ll test as it reduces constants - so I've refactored it to take variables instead.

Differential Revision: https://reviews.llvm.org/D115952
2021-12-19 11:45:51 +00:00
Simon Pilgrim 9d2994311a [DAG] Move foldConstantFPMath() inside FoldConstantArithmetic
Further merging of integer and fp constant folding paths.

This allows us to handle undef vector arguments the same as scalar cases.
2021-12-17 16:06:41 +00:00
Simon Pilgrim 512ab9968d [DAG] foldConstantFPMath - fold vector splats as well as scalar constants 2021-12-17 15:19:26 +00:00
Simon Pilgrim 52611702ea Revert rG22dbc7a48bf7a3942a7e5ff57977ef828d240bd3 "[DAG] foldConstantFPMath - fold vector splats as well as scalar constants"
A followup patch uncovered an issue with allowing undef elements in the splat - I will reapply this with a fixed implementation.
2021-12-17 15:19:25 +00:00
Simon Pilgrim 22dbc7a48b [DAG] foldConstantFPMath - fold vector splats as well as scalar constants 2021-12-17 14:24:36 +00:00
Simon Pilgrim d91b5b0f57 [DAG] foldConstantFPMath - use APFloat& for read-only constant fold arg. NFC.
We just need to copy the 1st arg (which we use for the constant fold result) - use a cheaper const reference for the 2nd arg.
2021-12-17 12:34:03 +00:00
Simon Pilgrim b88f4f271b [DAG] SelectionDAG::isSplatValue - add *_EXTEND_VECTOR_INREG handling
Fixes #52719
2021-12-15 12:26:39 +00:00
Omer Aviram 617ad14060 [SelectionDAG] Add pattern to haveNoCommonBitsSet
Correctly identify the following pattern, which has no common bits: (X & ~M) op (Y & M).

Differential Revision: https://reviews.llvm.org/D113970
2021-12-01 12:04:04 -05:00
Zarko Todorovski 95875d246a [LLVM][NFC]Inclusive language: remove occurances of sanity check/test from llvm
Part of work to use more inclusive language in clang/llvm. Rewording
some comments and change function and variable names.
2021-11-24 17:29:55 -05:00
Simon Moll 1e65b93f3a [VP] Canonicalize macros of VPIntrinsics.def
Usage and naming of macros in VPIntrinsics.def has been inconsistent. Rename all property macros to VP_PROPERTY_<name>.  Use BEGIN/END scope macros to attach properties to vp intrinsics and SDNodes (instead of specifying either directly with the property macro).
A follow-up patch has documentation on how the macros are (intended) to be used.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D114144
2021-11-23 16:51:11 +01:00
Kazu Hirata 99d5cbbd7e [CodeGen] Use SDNode::uses (NFC) 2021-11-12 07:33:29 -08:00
Simon Pilgrim 098ea29641 [DAG] FoldConstantArithmetic - fold intop(bitcast(buildvector(c1)),bitcast(buildvector(c1))) -> bitcast(intop(buildvector(c1'),buildvector(c2')))
Enable FoldConstantArithmetic to constant fold bitcasted constant build vectors. These have typically been bitcasted for type legalization purposes.

By extracting the raw constant bit data, performing the constant fold, and then casting the constant bit data back to the (legalized) type, we can perform constant folding on integer types after legalization.

This in particular helps 32-bit targets which need to handle vXi64 build vectors - during legalization the (unsupported) i64 elements are split to create a bitcasted v2Xi32 build vector.

Addresses some regressions in D113192.

Differential Revision: https://reviews.llvm.org/D113564
2021-11-11 11:35:18 +00:00
Simon Pilgrim ed80761b50 [DAG] Split BuildVectorSDNode::getConstantRawBits into BuildVectorSDNode::recastRawBits helper. NFC.
NFC refactor of D113351, pulling out the APInt split/merge code from the BuildVectorSDNode bits extraction into a BuildVectorSDNode::recastRawBits helper. This is to allow us to reuse the code when we're packing constant folded APInt data back together.
2021-11-10 13:06:19 +00:00
Simon Pilgrim 58c01ef270 [SelectionDAG] Merge FoldConstantVectorArithmetic into FoldConstantArithmetic (PR36544)
This patch merges FoldConstantVectorArithmetic back into FoldConstantArithmetic.

Like FoldConstantVectorArithmetic we now handle vector ops with any operand count, but we currently still only handle binops for scalar types - this can be improved in future patches - in particular some common unary/trinary ops still have poor constant folding.

There's one change in functionality causing test changes - FoldConstantVectorArithmetic bails early if the build/splat vector isn't all constant (with some undefs) elements, but FoldConstantArithmetic doesn't - it instead attempts to fold the scalar nodes and bails if they fail to regenerate a constant/undef result, allowing some additional identity/undef patterns to be handled.

Differential Revision: https://reviews.llvm.org/D113300
2021-11-09 11:31:01 +00:00
Simon Pilgrim f059b04f7b [DAG] Add SelectionDAG::ComputeMinSignedBits helper
As suggested on D113371, this adds a wrapper to SelectionDAG::ComputeNumSignBits, similar to the llvm::ComputeMinSignedBits wrapper.

I've included some usage, its not exhaustive, just the more obvious cases where the intention is obvious.

Differential Revision: https://reviews.llvm.org/D113396
2021-11-08 14:12:45 +00:00
Simon Pilgrim f60d3ec0c7 [DAG] Add BuildVectorSDNode::getConstantRawBits helper
We have several places where we need to extract the raw bits data from a BUILD_VECTOR node, so consolidate this to a single helper function that handles Undefs and Integer/FP constants, including implicit truncation.

This should make it easier to extend D113202 to handle more constant folding of bitcasted constant data.

Differential Revision: https://reviews.llvm.org/D113351
2021-11-08 12:07:38 +00:00
Kazu Hirata 87e53a0ad8 [llvm] Use make_early_inc_range (NFC) 2021-11-05 19:39:07 -07:00
Simon Pilgrim 9e6506299a [DAG] FoldConstantVectorArithmetic - remove SDNodeFlags argument
Another minor step towards merging FoldConstantVectorArithmetic into FoldConstantArithmetic.

We don't use SDNodeFlags in any constant folding inside DAG, so passing the Flags argument is a waste of time - an alternative would be to wire up FoldConstantArithmetic to take SDNodeFlags just-in-case we someday start using it, but we don't have any way to test it and I'd prefer to avoid dead code.

Differential Revision: https://reviews.llvm.org/D113276
2021-11-05 14:36:17 +00:00
Simon Pilgrim f2703c3c33 [DAG] FoldConstantArithmetic - rename NumOps -> NumElts. NFC.
NumOps represents the number of elements for vector constant folding, rename this NumElts so in future we can the consistently use NumOps to represent the number of operands of the opcode.

Minor cleanup before trying to begin generalizing FoldConstantArithmetic to support opcodes other than binops.
2021-11-05 13:32:34 +00:00
Simon Pilgrim c1e7911c3b [DAG] FoldConstantArithmetic - fold bitlogic(bitcast(x),bitcast(y)) -> bitcast(bitlogic(x,y))
To constant fold bitwise logic ops where we've legalized constant build vectors to a different type (e.g. v2i64 -> v4i32), this patch adds a basic ability to peek through the bitcasts and perform the constant fold on the inner operands.

The MVE predicate v2i64 regressions will be addressed by future support for basic v2i64 type support.

One of the yak shaving fixes for D113192....

Differential Revision: https://reviews.llvm.org/D113202
2021-11-05 12:00:59 +00:00
Roman Lebedev 25043c8276
[NFCI] Introduce `ICmpInst::compare()` and use it where appropriate
As noted in https://reviews.llvm.org/D90924#inline-1076197
apparently this is a pretty common pattern,
let's not repeat it yet again, but have it in a common place.

There may be some more places where it could be used,
but these are the most obvious ones.
2021-10-30 17:50:06 +03:00
Kerry McLaughlin 0d153df69e [SVE] Fix selection failure when splitting extended masked loads
When splitting a masked load, `GetDependentSplitDestVTs` is used to get the
MemVTs of the high and low parts. If the masked load is extended, this
may return VTs with different element types which are used to create the
high & low masked load instructions.
This patch changes `GetDependentSplitDestVTs` to ensure we return VTs with
the same element type.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D111996
2021-10-21 13:04:38 +01:00
Kerry McLaughlin 1a2e90199f [SVE][CodeGen] Add patterns for ADD/SUB + element count
This patch adds patterns to match the following with INC/DEC:
 - @llvm.aarch64.sve.cnt[b|h|w|d] intrinsics + ADD/SUB
 - vscale + ADD/SUB

For some implementations of SVE, INC/DEC VL is not as cheap as ADD/SUB and
so this behaviour is guarded by the "use-scalar-inc-vl" feature flag, which for SVE
is off by default. There are no known issues with SVE2, so this feature is
enabled by default when targeting SVE2.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D111441
2021-10-13 11:36:15 +01:00
Bradley Smith 5be266db7a [AArch64][SVE] Improve VECTOR_SPLICE codegen for VL > 128-bit
Differential Revision: https://reviews.llvm.org/D111135
2021-10-07 15:28:55 +00:00
Simon Pilgrim 2e5daac217 [llvm] Update report_fatal_error calls from raw_string_ostream to use Twine(OS.str())
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.

We can use the raw_string_ostream::str() method to perform the implicit flush() and return a reference to the std::string container that we can then wrap inside Twine().
2021-10-05 18:42:12 +01:00
Jay Foad a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Fraser Cormack e2b46e336b [DAGCombiner][VP] Fold zero-length or false-masked VP ops
This patch adds a generic DAGCombine for vector-predicated (VP) nodes.
Those for which we can determine that no vector element is active can be
replaced by either undef or, for reductions, the start value.

This is tested rather trivially at the IR level, where it's possible
that we want to teach instcombine to perform this optimization.

However, we can also see the zero-evl case arise during SelectionDAG
legalization, when wide VP operations can be split into two and the
upper operation emerges as trivially false.

It's possible that we could perform this optimization "proactively"
(both on legal vectors and before splitting) and reduce the width of an
operation and insert it into a larger undef vector:

```
v8i32 vp_add x, y, mask, 4
->
v8i32 insert_subvector (v8i32 undef), (v4i32 vp_add xsub, ysub, mask, 4), i32 0
```

This is somewhat analogous to similar vector narrow/widening
optimizations, but it's unclear at this point whether that's beneficial
to do this for VP ops for any/all targets.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D109148
2021-09-27 11:30:09 +01:00
Simon Pilgrim 9db20822f7 [APInt] Add APIntOps::ScaleBitMask helper
APInt is used to describe a bit mask in a variety of value tracking and demanded bits/elts functions.

When traversing through dst/src operands, we have a number of places where these masks need to widened/narrowed to translate through bitcasts, reductions etc. to a different type.

This patch add a APIntOps::ScaleBitMask common helper, adds unit test coverage, and updates a number of cases to use the the helper instead of their own implementation.

This came up on D109065 where we currently have to add yet another implementation of the same code.

Differential Revision: https://reviews.llvm.org/D109683
2021-09-13 16:27:12 +01:00
vnalamot 0fc3ebb70a [SelectionDAG][NFC] Fix typo in VerifyDAGDiverence() function name
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D109674
2021-09-13 20:48:04 +05:30
Craig Topper 9af8f1b18e [SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode.
Soft deprecrate isNullValue/isAllOnesValue and update in tree
callers. This matches the changes to the APInt interface from
D109483.

Reviewed By: lattner

Differential Revision: https://reviews.llvm.org/D109535
2021-09-09 13:28:30 -07:00
Chris Lattner d51da74889 [CodeGen] Use DAG.getAllOnesConstant where possible to simplify code. NFC. 2021-09-09 10:22:51 -07:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Hussain Kadhem 524ded7d01 [VP] implementation of sdag support for VP memory intrinsics
Followup to D99355: SDAG support for vector-predicated load/store/gather/scatter.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D105871
2021-08-31 17:01:50 +02:00
jacquesguan a7ebc4d145 [DAGCombiner] Teach isKnownToBeAPowerOfTwo handle SPLAT_VECTOR
Make DAGCombine turn mul by power of 2 into shl for scalable vector.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D107883
2021-08-18 10:10:40 +08:00
Simon Pilgrim 3a7c82efb8 [DAG] isGuaranteedNotToBeUndefOrPoison - handle ISD::BUILD_VECTOR nodes
If all demanded elements of the BUILD_VECTOR pass a isGuaranteedNotToBeUndefOrPoison check, then we can treat this specific demanded use of the BUILD_VECTOR as guaranteed not to be undef or poison either.

Differential Revision: https://reviews.llvm.org/D107174
2021-07-31 15:08:25 +01:00
Fraser Cormack f924a3d474 [SelectionDAG] Support scalable-vector splats in yet more cases
This patch extends support for (scalable-vector) splats in the
DAGCombiner via the `ISD::matchBinaryPredicate` function, which enable a
variety of simple combines of constants.

Users of this function may now have to distinguish between
`BUILD_VECTOR` and `SPLAT_VECTOR` vector operands. The way of dealing
with this in-tree follows the approach added for
`ISD::matchUnaryPredicate` implemented in D94501.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D106575
2021-07-26 10:15:08 +01:00
Simon Pilgrim c261a06b7a [DAG] Add initial SelectionDAG::isGuaranteedNotToBeUndefOrPoison framework (PR51129)
I've setup the basic framework for the isGuaranteedNotToBeUndefOrPoison call and updated DAGCombiner::visitFREEZE to use it, further Opcodes can be handled when we have test coverage.

I'm not aware of any vector test freeze coverage so the DemandedElts (and the Depth) args are not being used yet - but they are in place.

SelectionDAG::isGuaranteedNotToBePoison wrappers have also been added.

Differential Revision: https://reviews.llvm.org/D106668
2021-07-24 11:36:35 +01:00
Eli Friedman 0ca46a1757 [SelectionDAG] Fix the representation of ISD::STEP_VECTOR.
The existing rule about the operand type is strange.  Instead, just say
the operand is a TargetConstant with the right width.  (Legalization
ignores TargetConstants, so it doesn't matter if that width is legal.)

Highlights:

1. I had to substantially rewrite the AArch64 isel patterns to expect a
TargetConstant.  Nothing too exotic, but maybe a little hairy. Maybe
worth considering a target-specific node with some dagcombines instead
of this complicated nest of isel patterns.
2. Our behavior on RV32 for vectors of i64 has changed slightly. In
particular, we correctly preserve the width of the arithmetic through
legalization.  This changes the DAG a bit. Maybe room for
improvement here.
3. I explicitly defined the behavior around overflow. This is necessary
to make the DAGCombine transforms legal, and I don't think it causes any
practical issues.

Differential Revision: https://reviews.llvm.org/D105673
2021-07-21 10:58:40 -07:00
Craig Topper 50302feb1d [SelectionDAG][RISCV] Use isSExtCheaperThanZExt to control whether sext or zext is used for constant folding any_extend.
RISCV would prefer a sign extended constant since that works better
with our constant materialization. We have an existing TLI hook we
use to control sign extension of setcc operands in type legalization.
That hook happens to do the right check we need here, but might be
straying from its original purpose. With only RISCV defining this
hook in tree, I wasn't sure if it was worth adding another hook
with identical behavior.

This is an alternative to D105785 where I tried to handle this in
the RISCV backend by not creating ANY_EXTENDs in some places.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D105918
2021-07-19 09:25:28 -07:00
Eli Friedman 6601be4419 [X86] Remove incorrect use of known bits in shuffle simplification.
This reverts commit 2a419a0b99.

The result of a shufflevector must not propagate poison from any element
other than the one noted in the shuffle mask.

The regressions outside of fptoui-may-overflow.ll can probably be
recovered some other way; for example, using isGuaranteedNotToBePoison.

See discussion on https://reviews.llvm.org/D106053 for more background.

Differential Revision: https://reviews.llvm.org/D106222
2021-07-18 18:13:11 -07:00
Simon Pilgrim 95995673d1 [DAG] SelectionDAG::MaskedElementsAreZero - assert we're calling with a vector. NFCI.
Add an assertion that we've calling MaskedElementsAreZero with a vector op and that the DemandedElts arg is a matching width.

Makes the error a lot easier to grok when something else accidentally gets used.
2021-07-16 17:43:35 +01:00
Eli Friedman 1e30bf8621 [SelectionDAG] Add an overload of getStepVector that assumes step 1.
This is mostly a minor convenience, but the pattern seems frequent
enough to be worthwhile (and we'll probably add more uses in the
future).

Differential Revision: https://reviews.llvm.org/D105850
2021-07-14 11:37:01 -07:00
David Stuttard 83cb9632a1 [DAGCombiner] Add support for mulhi const folding in DAGCombiner
Differential Revision: https://reviews.llvm.org/D103323

Change-Id: I4ffaaa32301795ba8a339567a68e77fe0862b869
2021-07-05 12:01:26 +01:00
Paul Walker 287d39dd5a [NFC] Fix a few whitespace issues and typos. 2021-07-04 11:49:58 +01:00
Simon Pilgrim 80dd591610 [SelectionDAG] Replace APInt.lshr().trunc() with APInt.extractBits() where possible. NFC.
This also allows us to use KnownBits::extractBits in one case.
2021-07-03 16:33:00 +01:00
Simon Pilgrim e2e44c3da9 [SelectionDAG] Use KnownBits::insertBits instead of separate APInt::insertBits calls. NFC. 2021-07-03 16:32:59 +01:00
Craig Topper af331e8284 [SelectionDAG] Rename memory VT argument for getMaskedGather/getMaskedScatter from VT to MemVT.
Use getMemoryVT() in MGATHER/MSCATTER DAG combines instead of
using the passthru or store value VT for this argument.
2021-07-02 17:37:40 -07:00
Simon Pilgrim 7353beda4a [DAG] SelectionDAG::computeKnownBits - use APInt::insertBits to merge subvector knownbits. NFCI.
As noticed on D104472 we can use APInt::insertBits which will avoid a lot of temporary APInt creations
2021-06-18 14:59:01 +01:00
David Green b889c6ee99 [DAG] Allow isNullOrNullSplat to see truncated zeroes
This sets the AllowTruncation flag on isConstOrConstSplat in
isNullOrNullSplat, allowing it to see truncated constant zeroes on
architectures such as AArch64, where only a i32.i64 are legal. As a
truncation of 0 is always 0, this should always be valid, allowing some
extra folding to happen including some of the cases from D103755.

Differential Revision: https://reviews.llvm.org/D103756
2021-06-08 10:18:58 +01:00
Guillaume Chatelet 1da2c7d25c [NFC] Fix semantic discrepancy for MVT::LAST_VALUETYPE
Differential Revision: https://reviews.llvm.org/D103251
2021-06-07 10:04:16 +00:00
Fraser Cormack aec9cbbeb8 [SelectionDAG] Extend FoldConstantVectorArithmetic to SPLAT_VECTOR
This patch extends the SelectionDAG's ability to constant-fold vector
arithmetic to include support for SPLAT_VECTOR. This is not only for
scalable-vector types but also for fixed-length vector types, which
helps Hexagon in a couple of cases.

The original RISC-V test case was in fact an infinite DAGCombine loop.
The pattern `and (truncate v1), (truncate v2)` can be combined to
`truncate (and v1, v2)` but the truncate can similarly be combined back
to `truncate (and v1, v2)` (but, crucially, only when one of `v1` or
`v2` is a constant vector).

It wasn't exposed in on fixed-length types because a TRUNCATE of a
constant BUILD_VECTOR was folded into the BUILD_VECTOR itself, whereas
this did not happen for the equivalent (scalable-vector) SPLAT_VECTOR.

Reviewed By: RKSimon, craig.topper

Differential Revision: https://reviews.llvm.org/D103246
2021-06-04 09:53:15 +01:00
Craig Topper d24d2447cd [SelectionDAG] Fix typo in assert. NFC 2021-05-28 10:37:11 -07:00
Michael Liao c9dd29925f [SelectionDAG] Propagate scoped AA metadata when lowering mem intrinsics.
- When memory intrinsics, such as memcpy, the attached scoped AA
  metadata is not passed down to the backend. As a result, the backend
  cannot schedule relevant memory operations around them following that
  hint. In this patch, SelectionDAG is enhanced to propagate that
  metadata (scoped AA only) when they are lowered into loads and stores.

Differential Revision: https://reviews.llvm.org/D102215
2021-05-25 14:42:26 -04:00
Jessica Clarke e10958c807 [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.

Re-landed again after D102819 fixed PowerPC to correctly zero-extend all
of its atomics as it claimed to do, since the combination of that bug
and this optimisation caused buildbot regressions.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-20 20:34:23 +01:00
Fraser Cormack 26bd2250c1 [RISCV] Ensure shuffle splat operands are type-legal
The use of `SelectionDAG::getSplatValue` isn't guaranteed to return a
type-legal splat value as it may implicitly extract a vector element
from another shuffle. It is not permitted to introduce an illegal type
when lowering shuffles.

This patch addresses the crash by adding a boolean flag to
`getSplatValue`, defaulting to false, which when set will ensure a
type-legal return value. If it is unable to do that it will fail to
return a splat value.

I've been through the existing uses of `getSplatValue` in other targets
and was unable to find a need or test cases showing a need to update
their uses. In some cases, the call is made during `LegalizeVectorOps`
which may still produce illegal scalar types. In other situations, the
illegally-typed splat value may be quickly patched up to a legal type
(such as any-extending the returned `extract_vector_elt` up to a legal
type) before `LegalizeDAG` notices.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D102687
2021-05-20 18:00:03 +01:00
Stefan Pintilie 8d37411e48 Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"
This reverts commit 6c80361b84.
Breaks PowerPC Big Endian buildbots.
2021-05-12 09:46:18 -05:00
Jessica Clarke 6c80361b84 [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-06 04:01:20 +01:00
Jessica Clarke 897d7bceb9 Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"
This seems to have broken sanitizers, giving lots of

  Assertion `NumBits <= MAX_INT_BITS && "bitwidth too large"' failed.

failures across multiple targets (currently X86 and PowerPC). Reverting
until I have a chance to reproduce and debug.

This reverts commit 6e876f9ded.
2021-05-05 17:02:05 +01:00
Jessica Clarke 6e876f9ded [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-05 16:34:45 +01:00
Craig Topper 3067520bf4 [SelectionDAG] Use a VTSDNode to store the saturation width for FP_TO_SINT_SAT/FP_TO_UINT_SAT
Previously we used an i32 constant to store the saturation width, but i32 isn't
legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the
type legalizer.

This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes
it opaque to the type legalizer.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101262
2021-04-27 14:38:42 -07:00
Jun Ma 978eb3f168 [DAGCombiner] Allow operand of step_vector to be negative.
It is proper to relax non-negative limitation of step_vector.
Also this patch adds more combines for step_vector:
(sub X, step_vector(C)) -> (add X,  step_vector(-C))

Differential Revision: https://reviews.llvm.org/D100812
2021-04-22 20:58:03 +08:00
Simon Pilgrim 2a419a0b99 [X86][SSE] combineX86ShuffleChain - check if we're blending with zero into already zero elements
Add a SelectionDAG::MaskedElementsAreZero helper that wraps SelectionDAG::MaskedValueIsZero testing for entirely zero vector elements
2021-04-20 17:09:49 +01:00
Simon Pilgrim e156f2515c [DAG] SelectionDAG.cpp - breakup if-else chains where each block returns. NFCI.
Match style guide that requests that if+return blocks are separate.
2021-04-20 12:37:00 +01:00
Fraser Cormack 457da7f298 [SelectionDAG] Relax constraints on STEP_VECTOR step operand
This patch relaxes the requirement that the STEP_VECTOR step constant
must be of a type at least as large as the vector element type. This
does not permit its use on targets which have legal vector element types
larger than the largest legal scalar type, such as i64 vectors on RV32.

As such, the requirement has been loosened so that the step operand must
be any scalar type so long as the constant immediate is non-negative and
the value fits inside the vector element type.

This limits combining optimizations in certain circumstances but in
practice it's unlikely to be a hindrance.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D100660
2021-04-20 08:41:42 +01:00
Simon Pilgrim 37a4621fb6 [DAG] SelectionDAG::isSplatValue - early out if binop is not splat. NFCI.
Just return false if we fail to match splats - the remainder of the code is for (fixed)vector operations - shuffles/insertions etc.
2021-04-16 18:26:33 +01:00
Tim Northover 6401b78ab3 SDAG: constant fold bf16 -> i16 casts
This direction is particularly useful because i16 constants are much more
likely to be legal than bf16.
2021-04-14 11:27:46 +01:00
Stephen Tozer aa3e78a59f Reapply "[DebugInfo] Correctly track SDNode dependencies for list debug values"
Fixed memory leak error by using BumpAllocator for SDDbgValue arrays.

This reverts commit 1b589172bd.
2021-04-12 12:51:29 +01:00
Stephen Tozer 1b589172bd Revert "[DebugInfo] Correctly track SDNode dependencies for list debug values"
Reverted due to failure on the sanitizer-x86_64-linux-fast bot.

This reverts commit e10493eb50.
2021-04-08 17:55:45 +01:00
Stephen Tozer e10493eb50 [DebugInfo] Correctly track SDNode dependencies for list debug values
During SelectionDAG, we must track the SDNodes that each SDDbgValue depends on
to compute its value. These are ultimately derived from the location operands to
the SDDbgValue, but were stored in a separate vector prior to this patch. This
resulted in cases where one of the lists was updated incorrectly, resulting in
crashes during compilation. This patch fixes the issue by directly recomputing
the dependency list from the SDDbgOperands in getDependencies().

Differential Revision: https://reviews.llvm.org/D99423
2021-04-08 17:01:45 +01:00
Craig Topper 67953311e2 [SelectionDAG] Teach SelectionDAG::FoldConstantArithmetic to handle SPLAT_VECTOR
This allows FoldConstantArithmetic to handle SPLAT_VECTOR in
addition to BUILD_VECTOR. This allows it to support scalable
vectors. I'm also allowing fixed length SPLAT_VECTOR which is
used by some targets, but I'm not familiar enough to write tests
for those targets.

I had to block this function from running on CONCAT_VECTORS to
avoid calling getNode for a CONCAT_VECTORS of 2 scalars.
This can happen because the 2 operand getNode calls this
function for any opcode. Previously we were protected because
CONCAT_VECTORs of BUILD_VECTOR is folded to a larger BUILD_VECTOR
before that call. But it's not always possible to fold a CONCAT_VECTORS
of SPLAT_VECTORs, and we don't even try.

This fixes PR49781 where DAG combine thought constant folding
should be possible, but FoldConstantArithmetic couldn't do it.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D99682
2021-04-07 10:03:33 -07:00
Simon Pilgrim ddbb58736a [KnownBits] Rename KnownBits::computeForMul to KnownBits::mul. NFCI.
As promised in D98866
2021-04-06 10:11:41 +01:00
Simon Pilgrim 4ea5475a3f [KnownBits] Add KnownBits::haveNoCommonBitsSet helper. NFCI.
Include exhaustive test coverage.
2021-04-02 21:44:33 +01:00
Jun Ma 274ac9d40e [AArch64][SVE] Lowering sve.dot to DOT node
Differential Revision: https://reviews.llvm.org/D99699
2021-04-02 20:05:17 +08:00
Sander de Smalen 0f7bbbc481 Always emit error for wrong interfaces to scalable vectors, unless cmdline flag is passed.
In order to bring up scalable vector support in LLVM incrementally,
we introduced behaviour to emit a warning, instead of an error, when
asking the wrong question of a scalable vector, like asking for the
fixed number of elements.

This patch puts that behaviour under a flag. The default behaviour is
that the compiler will always error, which means that all LLVM unit
tests and regression tests will now fail when a code-path is taken that
still uses the wrong interface.

The behaviour to demote an error to a warning can be individually enabled
for tools that want to support experimental use of scalable vectors.
This patch enables that behaviour when driving compilation from Clang.
This means that for users who want to try out scalable-vector support,
fixed-width codegen support, or build user-code with scalable vector
intrinsics, Clang will not crash and burn when the compiler encounters
such a case.

This allows us to do away with the following pattern in many of the SVE tests:
  RUN: .... 2>%t
  RUN: cat %t | FileCheck --check-prefix=WARN
  WARN-NOT: warning: ...

The behaviour to emit warnings is only temporary and we expect this flag
to be removed in the future when scalable vector support is more stable.

This patch also has fixes the following tests:
 unittests:
   ScalableVectorMVTsTest.SizeQueries
   SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects
   AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG

 regression tests:
   Transforms/InstCombine/vscale_gep.ll

Reviewed By: paulwalker-arm, ctetreau

Differential Revision: https://reviews.llvm.org/D98856
2021-04-02 10:55:22 +01:00
Craig Topper 9e00b6660d [SelectionDAG] Remove unneeded vector resize from the end of FoldConstantArithmetic. NFC
There's an assert right before that makes sure the size already matches.
Earlier in this function's life, scalars and vectors shared more
code.
2021-03-31 12:33:10 -07:00
Tomas Matheson a9968c0a33 [NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions
Currently needsStackRealignment returns false if canRealignStack returns false.
This means that the behavior of needsStackRealignment does not correspond to
it's name and description; a function might need stack realignment, but if it
is not possible then this function returns false. Furthermore,
needsStackRealignment is not virtual and therefore some backends have made use
of canRealignStack to indicate whether a function needs stack realignment.

This patch attempts to clarify the situation by separating them and introducing
new names:

 - shouldRealignStack - true if there is any reason the stack should be
   realigned

 - canRealignStack - true if we are still able to realign the stack (e.g. we
   can still reserve/have reserved a frame pointer)

 - hasStackRealignment = shouldRealignStack && canRealignStack (not target
   customisable)

Targets can now override shouldRealignStack to indicate that stack realignment
is required.

This change will make it easier in a future change to handle the case where we
need to realign the stack but can't do so (for example when the register
allocator creates an aligned spill after the frame pointer has been
eliminated).

Differential Revision: https://reviews.llvm.org/D98716

Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87
2021-03-30 17:31:39 +01:00
David Sherwood 748ae5281d [IR][SVE] Add new llvm.experimental.stepvector intrinsic
This patch adds a new llvm.experimental.stepvector intrinsic,
which takes no arguments and returns a linear integer sequence of
values of the form <0, 1, ...>. It is primarily intended for
scalable vectors, although it will work for fixed width vectors
too. It is intended that later patches will make use of this
new intrinsic when vectorising induction variables, currently only
supported for fixed width. I've added a new CreateStepVector
method to the IRBuilder, which will generate a call to this
intrinsic for scalable vectors and fall back on creating a
ConstantVector for fixed width.

For scalable vectors this intrinsic is lowered to a new ISD node
called STEP_VECTOR, which takes a single constant integer argument
as the step. During lowering this argument is set to a value of 1.
The reason for this additional argument at the codegen level is
because in future patches we will introduce various generic DAG
combines such as

  mul step_vector(1), 2 -> step_vector(2)
  add step_vector(1), step_vector(1) -> step_vector(2)
  shl step_vector(1), 1 -> step_vector(2)
  etc.

that encourage a canonical format for all targets. This hopefully
means all other targets supporting scalable vectors can benefit
from this too.

I've added cost model tests for both fixed width and scalable
vectors:

  llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll
  llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll

as well as codegen lowering tests for fixed width and scalable
vectors:

  llvm/test/CodeGen/AArch64/neon-stepvector.ll
  llvm/test/CodeGen/AArch64/sve-stepvector.ll

See this thread for discussion of the intrinsic:
https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html
2021-03-23 10:43:35 +00:00
Simon Pilgrim 9d2df96407 [DAG] computeKnownBits - add ISD::MULHS/MULHU/SMUL_LOHI/UMUL_LOHI handling
Reuse the existing KnownBits multiplication code to handle the 'extend + multiply + extract high bits' pattern for multiply-high ops.

Noticed while looking at the codegen for D88785 / D98587 - the patch helps division-by-constant expansion code in particular, which suggests that we might have some further KnownBits div/rem cases we could handle - but this was far easier to implement.

Differential Revision: https://reviews.llvm.org/D98857
2021-03-19 16:02:31 +00:00
Simon Pilgrim b1afa187c8 [DAG] SelectionDAG::isSplatValue - add ISD::ABS handling
Add ISD::ABS to the existing unary instructions handling for splat detection

This is similar to D83605, but doesn't appear to need to touch any of the wasm refactoring.

Differential Revision: https://reviews.llvm.org/D98778
2021-03-18 10:28:29 +00:00
Fraser Cormack 0035decae7 [CodeGen] Fix issues with scalable-vector INSERT/EXTRACT_SUBVECTORs
This patch addresses a few issues when dealing with scalable-vector
INSERT_SUBVECTOR and EXTRACT_SUBVECTOR nodes.

When legalizing in DAGTypeLegalizer::SplitVecRes_INSERT_SUBVECTOR, we
store the low and high halves to the stack separately. The offset for
the high half was calculated incorrectly.

Additionally, we can optimize this process when we can detect that the
subvector is contained entirely within the low/high split vector type.
While this optimization is valid on scalable vectors, when performing
the 'high' optimization, the subvector must also be a scalable vector.
Note that the 'low' optimization is still conservative: it may be
possible to insert v2i32 into the low half of a split nxv1i32/nxv1i32,
but we can't guarantee it. It is always possible to insert v2i32 into
nxv2i32 or v2i32 into nxv4i32+2 as we know vscale is at least 1.

Lastly, in SelectionDAG::isSplatValue, we early-exit on the extracted subvector value
type being a scalable vector, forgetting that we can also extract a
fixed-length vector from a scalable one.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98495
2021-03-15 17:04:21 +00:00