Commit Graph

3052 Commits

Author SHA1 Message Date
David Truby c305557acd [llvm][sve] Lowering for VLS truncating stores
This adds custom lowering for truncating stores when operating on
fixed length vectors in SVE. It also includes a DAG combine to
fold extends followed by truncating stores into non-truncating
stores in order to prevent this pattern appearing once truncating
stores are supported.

Currently truncating stores are not used in certain cases where
the size of the vector is larger than the target vector width.

Differential Revision: https://reviews.llvm.org/D104471
2021-07-12 11:14:17 +01:00
David Green 4ce26deac2 [DAG] Reassociate Add with Or
We already have reassociation code for Adds and Ors separately in DAG
combiner, this adds it for the combination of the two where Ors act like
Adds. It reassociates (add (or (x, c), y) -> (add (add (x, y), c)) where
we know that the Ors operands have no common bits set, and the Or has
one use.

Differential Revision: https://reviews.llvm.org/D104765
2021-07-07 10:21:07 +01:00
David Stuttard 83cb9632a1 [DAGCombiner] Add support for mulhi const folding in DAGCombiner
Differential Revision: https://reviews.llvm.org/D103323

Change-Id: I4ffaaa32301795ba8a339567a68e77fe0862b869
2021-07-05 12:01:26 +01:00
Paul Walker 287d39dd5a [NFC] Fix a few whitespace issues and typos. 2021-07-04 11:49:58 +01:00
Craig Topper af331e8284 [SelectionDAG] Rename memory VT argument for getMaskedGather/getMaskedScatter from VT to MemVT.
Use getMemoryVT() in MGATHER/MSCATTER DAG combines instead of
using the passthru or store value VT for this argument.
2021-07-02 17:37:40 -07:00
Roman Lebedev c2c0d3ea89
Revert "[WebAssembly] Implementation of global.get/set for reftypes in LLVM IR"
This reverts commit 4facbf213c.

```
********************
FAIL: LLVM :: CodeGen/WebAssembly/funcref-call.ll (44466 of 44468)
******************** TEST 'LLVM :: CodeGen/WebAssembly/funcref-call.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /builddirs/llvm-project/build-Clang12/bin/llc < /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll --mtriple=wasm32-unknown-unknown -asm-verbose=false -mattr=+reference-types | /builddirs/llvm-project/build-Clang12/bin/FileCheck /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll
--
Exit Code: 2

Command Output (stderr):
--
llc: /repositories/llvm-project/llvm/include/llvm/Support/LowLevelTypeImpl.h:44: static llvm::LLT llvm::LLT::scalar(unsigned int): Assertion `SizeInBits > 0 && "invalid scalar size"' failed.

```
2021-07-02 11:49:51 +03:00
Paulo Matos 4facbf213c [WebAssembly] Implementation of global.get/set for reftypes in LLVM IR
Reland of 31859f896.

This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.

Differential Revision: https://reviews.llvm.org/D104797
2021-07-02 09:46:28 +02:00
David Green 2887f14639 [ISel] Port AArch64 SABD and UABD to DAGCombine
This ports the AArch64 SABD and USBD over to DAG Combine, where they can be
used by more backends (notably MVE in a follow-up patch). The matching code
has changed very little, just to handle legal operations and types
differently. It selects from (ABS (SUB (EXTEND a), (EXTEND b))), producing
a ubds/abdu which is zexted to the original type.

Differential Revision: https://reviews.llvm.org/D91937
2021-06-26 19:34:16 +01:00
David Green b8c8bb0769 [DAG] Fold neg(splat(neg(x)) -> splat(x)
This add as a fold of sub(0, splat(sub(0, x))) -> splat(x). This can
come up in the lowering of right shifts under AArch64, where we generate
a shift left of a negated number.

Differential Revision: https://reviews.llvm.org/D103755
2021-06-25 19:53:29 +01:00
Jinsong Ji c125af82a5 [DAGCombine] Check reassoc flags in aggressive fsub fusion
The is from discussion in https://reviews.llvm.org/D104247#inline-993387

The contract and reassoc flags shouldn't imply each other .

All the aggressive fsub fusion reassociate operations,
we should guard them with reassoc flag check.

Reviewed By: mcberg2017

Differential Revision: https://reviews.llvm.org/D104723
2021-06-23 13:59:40 +00:00
Jinsong Ji 3996311ee1 [DAGCombine] reassoc flag shouldn't enable contract
According to IR LangRef, the FMF flag:

contract
Allow floating-point contraction (e.g. fusing a multiply followed by an
addition into a fused multiply-and-add).

reassoc
Allow reassociation transformations for floating-point instructions.
This may dramatically change results in floating-point.

My understanding is that these two flags shouldn't imply each other,
as we might have a SDNode that can be reassociated with others, but
not contractble.

eg: We may want following fmul/fad/fsub to freely reassoc, but don't
want fma being generated here.

   %F = fmul reassoc double %A, %B         ; <double> [#uses=1]
   %G = fmul reassoc double %C, %D         ; <double> [#uses=1]
   %H = fadd reassoc double %F, %G         ; <double> [#uses=1]
   %I = fsub reassoc double %H, %E         ; <double> [#uses=1]

Before https://reviews.llvm.org/D45710, `reassoc` flag actually
did not imply isContratable either.

The current implementation also only check the flag in fadd node,
ignoring fmul node, this patch update that as well.

Reviewed By: spatel, qiucf

Differential Revision: https://reviews.llvm.org/D104247
2021-06-21 21:15:43 +00:00
Saleem Abdulrasool 5b5833b9e0 SelectionDAG: repair the Windows build
6e5628354e regressed the Windows build as
the return type no longer matched in both branches for the return value
type deduction.  This uses a bit more compiler magic to deal with that.
2021-06-14 08:25:36 -07:00
Roman Lebedev 0f94c3c80d
[NFC][DAGCombine] Extract getFirstIndexOf() lambda back into a function
Not all supported compilers like such lambdas, at least one buildbot is unhappy.
2021-06-14 16:25:59 +03:00
Roman Lebedev 6e5628354e
[DAGCombine] reduceBuildVecToShuffle(): sort input vectors by decreasing size
The sorting, obviously, must be stable, else we will have random assembly fluctuations.

Apparently there was no test coverage that would benefit from that,
so i've added one test.

The sorting consists of two parts - just sort the input vectors,
and recompute the shuffle mask -> input vector mapping.
I don't believe we need to do anything else.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D104187
2021-06-14 16:18:37 +03:00
Carl Ritson cfbb92441f [SDAG] Fix pow2 assumption when splitting vectors
When reducing vector builds to shuffles it possible that
the DAG combiner may try to extract invalid subvectors.

This happens as the existing code assumes vectors will be power
of 2 sizes, which is already untrue, but becomes more noticable
with v6 and v7 types.
Specifically the existing code assumes that half PowerOf2Ceil of
a given vector index will fit twice into a given vector.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D103880
2021-06-11 08:58:16 +09:00
David Spickett 64de8763aa Revert "Implementation of global.get/set for reftypes in LLVM IR"
This reverts commit 31859f896c.

Causing SVE and RISCV-V test failures on bots.
2021-06-10 10:11:17 +00:00
Paulo Matos 31859f896c Implementation of global.get/set for reftypes in LLVM IR
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D95425
2021-06-10 10:07:45 +02:00
Sanjay Patel dd763ac791 [SDAG] fix miscompile from merging stores of different sizes
As shown in:
https://llvm.org/PR50623
...and the similar tests here, we were not accounting for
store merging of different sizes that do not cover the
entire range of the wide value to be stored.

This is the easy fix: just make sure that all of the
original stores are the same size, so when we calculate
the wide width, it's a simple N * M check.

This still allows all of the motivating optimizations from:
D86420 / 54a5dd485c
D87112 / 7a06b166b1

We could enhance this code to track individual bytes and
allow merging multiple sizes.
2021-06-09 09:51:39 -04:00
Simon Pilgrim 61a2d6bfe4 [DAG] foldShuffleOfConcatUndefs - ensure shuffles of upper (undef) subvector elements is undef (PR50609)
shuffle(concat(x,undef),concat(y,undef)) -> concat(shuffle(x,y),shuffle(x,y))

If the original shuffle references any of the upper (undef) subvector elements, ensure the split shuffle masks uses undef instead of an out-of-bounds value.

Fixes PR50609
2021-06-08 15:49:41 +01:00
Sanjay Patel 0718ac706d [SDAG] allow cast folding for vector sext-of-setcc with signed compare
This extends 434c8e013a and ede3982792 to handle signed
predicates by sign-extending the setcc operands.

This is not shown directly in https://llvm.org/PR50055 ,
but the pattern is visible by changing the unsigned convert
to signed in the source code.
2021-06-02 15:05:02 -04:00
Sanjay Patel ede3982792 [SDAG] allow more cast folding for vector sext-of-setcc
This is a follow-up to D103280 that eases the use restrictions,
so we can handle the motivating case from:
https://llvm.org/PR50055

The loop code is adapted from similar use checks in
ExtendUsesToFormExtLoad() and SliceUpLoad(). I did not see an
easier way to filter out non-chain uses of load values.

Differential Revision: https://reviews.llvm.org/D103462
2021-06-02 13:14:49 -04:00
Sanjay Patel 1b14f3951a [SDAG] add helper function for sext-of-setcc folds; NFC
Try to make this easier to read as noted in D103280
2021-06-01 08:07:17 -04:00
Sanjay Patel 63fe4cb082 [SDAG] add check to sext-of-setcc fold to bypass changing a legal op
I accidentaly pushed a draft of D103280 that was discussed
during the review, but it was not supposed to be the final
version.

Rather than revert and recommit, I'm updating the existing
code. This way we have a record of the codegen diff that
would result if we decide to remove this predicate in the
future.
2021-05-31 08:58:11 -04:00
Sanjay Patel 434c8e013a [SDAG] try harder to fold casts into vector compare
sext (vsetcc X, Y) --> vsetcc (zext X), (zext Y) --
(when the zexts are free and a bunch of other conditions)

We have a couple of similar folds to this already for vector selects,
but this pattern slips through because it is only a setcc.

The tests are based on the motivating case from:
https://llvm.org/PR50055
...but we need extra logic to get that example, so I've left that as
a TODO for now.

Differential Revision: https://reviews.llvm.org/D103280
2021-05-31 07:14:01 -04:00
Florian Hahn 126f90b252
[DAGCombine] Poison-prove scalarizeExtractedVectorLoad.
extractelement is poison if the index is out-of-bounds, so just
scalarizing the load may introduce an out-of-bounds load, which is UB.

To avoid introducing new UB, we can mask the index so it only contains
valid indices.

Fixes PR50382.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D103077
2021-05-30 11:40:55 +01:00
Fraser Cormack b7101e218c [DAGCombine][RISCV] Don't try to trunc-store combined vector stores
DAGCombine's `mergeStoresOfConstantsOrVecElts` optimization is told
whether it's to use vector types and also whether it's to issue a
truncating store. However, the truncating store code path assumes a
scalar integer `ConstantSDNode`, and when using vector types it creates
either a `BUILD_VECTOR` or `CONCAT_VECTORS` to store: neither of which
is a constant.

The `riscv64` target is able to expose a crash here because it switches
on both code paths at the same time. The `f32` is stored as `i32` which
must be promoted to `i64`, necessitating a truncating store.
It also decides later that it prefers a vector store of `v2f32`.

While vector truncating stores are legal, this combine is not able to
emit them. We also don't have a test case. This patch adds an assert to
catch this case more gracefully, and updates one of the caller functions
to the function to turn off the use of truncating stores when preferring
vectors.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D103173
2021-05-27 14:16:32 +01:00
Fraser Cormack 85e31eddf2 [DAGCombiner] Relax an assertion to an early return
The select-of-constants transform was asserting that its constant vector
inputs did not implicitly truncate their input without that as an
explicit precondition to the function. This patch relaxes that assertion
into an early return to skip the optimization.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D102393
2021-05-17 09:15:55 +01:00
Sanjay Patel 9dfd7f9b67 [SDAG] reduce code duplication for extend_vec_inreg combines; NFC
These are identical so far, and I was looking at adding a fold
for a pattern with scalar_to_vector which would also nd up duplicated.
2021-05-14 08:29:57 -04:00
Hendrik Greving 762ac725bf [DAGCombiner] Fix DAG combine store elimination, different address space.
Fixes a bug in the DAG combiner that eliminates the stores because it missed
to inspect the address space of the pointers.

%v = load %ptr_as1
// no chain side effect
store %v, %ptr_as2

As well as

store %v, %ptr_as1
store %v, %ptr_as2

Fixes a test for above in X86.

Differential Revision: https://reviews.llvm.org/D102096
2021-05-12 07:14:22 -07:00
Bradley Smith 635164b95a [AArch64][SVE] Improve SVE codegen for fixed length BITCAST
Expanding a fixed length operation involves wrapping the operation in an
insert/extract subvector pair, as such, when this is done to bitcast we
end up with an extract_subvector of a bitcast. DAGCombine tries to
convert this into a bitcast of an extract_subvector which restores the
initial fixed length bitcast, causing an infinite loop of legalization.

As part of this patch, we must make sure the above DAGCombine does not
trigger after legalization if the created bitcast would not be legal.

Differential Revision: https://reviews.llvm.org/D101990
2021-05-10 14:43:53 +01:00
Craig Topper 0c330afdfa [RISCV] Enable SPLAT_VECTOR for fixed vXi64 types on RV32.
This replaces D98479.

This allows type legalization to form SPLAT_VECTOR_PARTS so we don't
lose the splattedness when the scalar type is split.

I'm handling SPLAT_VECTOR_PARTS for fixed vectors separately so
we can continue using non-VL nodes for scalable vectors.

I limited to RV32+vXi64 because DAGCombiner::visitBUILD_VECTOR likes
to form SPLAT_VECTOR before seeing if it can replace the BUILD_VECTOR
with other operations. Especially interesting is a splat BUILD_VECTOR of
the extract_vector_elt which can become a splat shuffle, but won't if
we form SPLAT_VECTOR first. We either need to reorder visitBUILD_VECTOR
or add visitSPLAT_VECTOR.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D100803
2021-04-29 08:20:09 -07:00
Jun Ma 978eb3f168 [DAGCombiner] Allow operand of step_vector to be negative.
It is proper to relax non-negative limitation of step_vector.
Also this patch adds more combines for step_vector:
(sub X, step_vector(C)) -> (add X,  step_vector(-C))

Differential Revision: https://reviews.llvm.org/D100812
2021-04-22 20:58:03 +08:00
Fraser Cormack c141bd3cf9 [DAGCombiner] Support all-ones/all-zeros SPLAT_VECTOR in more combines
This patch adds incrementally-better support for SPLAT_VECTOR in a
handful of vector combines by changing a few more
isBuildVectorAllOnes/isBuildVectorAllZeros to the equivalent
isConstantSplatVectorAllOnes/Zeros calls.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D100851
2021-04-21 11:05:37 +01:00
Jun Ma 1ef5699d1a [DAGCombiner] Support fold zero scalar vector.
This patch changes ISD::isBuildVectorAllZeros to
ISD::isConstantSplatVectorAllZeros which handles zero sclar vector.

TestPlan: check-llvm

Differential Revision: https://reviews.llvm.org/D100813
2021-04-20 16:28:43 +08:00
David Sherwood 83f5fa519e [CodeGen] Improve code generation for clamping of constant indices with scalable vectors
When trying to clamp a constant index into a scalable vector we can
test if the index is less than the minimum number of elements in the
vector. If so, we can simply return the index because we know it is
guaranteed to fit inside the vector.

Differential Revision: https://reviews.llvm.org/D100639
2021-04-19 08:34:17 +01:00
Jun Ma 7e1422c1e4 [DAGCombiner] Fold step_vector with add/mul/shl
This patch implements some DAG combines for STEP_VECTOR:
add step_vector(C1), step_vector(C2) -> step_vector(C1+C2)
add (add X step_vector(C1)), step_vector(C2) -> add X step_vector(C1+C2)
mul step_vector(C1), C2 -> step_vector(C1*C2)
shl step_vector(C1), C2 -> step_vector(C1<<C2)

TestPlan: check-llvm

Differential Revision: https://reviews.llvm.org/D100088
2021-04-15 18:06:35 +08:00
dfukalov d066079728 [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.
Main reason is preparation to transform AliasResult to class that contains
offset for PartialAlias case.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D98027
2021-04-09 12:54:22 +03:00
Simon Pilgrim 77d625f8d8 [DAG] MergeInnerShuffle with BinOps - sometimes accept undef mask elements
If the inner shuffle already contains undef elements, then accept them in the merged shuffle as well.

This helps some X86 HADD/SUB patterns where slow targets were ending up with HADD/SUB because the (un)merged shuffles were stuck either side of the ADD/SUB - meaning we ended up with a total cost much higher than the "2*shuffle+add" that a slow target usually expands a HADD/SUB to.
2021-04-01 14:33:00 +01:00
Florian Hahn eb3d9f2eb6
[SelDag] Add isIntOrFPConstant helper function.
This patch adds a new isIntOrFPConstant  helper function to check if a
SDValue is a integer of FP constant. This pattern is used in various
places.

There also are places that incorrectly just check for integer constants,
e.g. D99384, so hopefully this helper will help people avoid that issue.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D99428
2021-03-28 12:48:58 +01:00
Craig Topper 30080b003e [DAGCombiner] Minor compile time improvement to (sext_in_reg (sign_extend_vector_inreg x)) optimization.
Don't bother calling ComputeNumSignBits if N00Bits < ExtVTBits. No
matter what answer we get back this will be true:
(N00Bits - DAG.ComputeNumSignBits(N00, DemandedSrcElts)) < ExtVTBits)

So we might as well save the computation. This makes the code more
consistent with the similar (sext_in_reg (sext x)) handling above.
2021-03-21 11:16:41 -07:00
Simon Pilgrim 64c2641c89 [DAG] Limit (sext_in_reg (zero_extend_vector_inreg x)) to exact sign extension
As commented by @craig.topper on rG1ba5c550d418, we can't guarantee that we'll be extending zero bits, just sign bit. So, revert to the old code for zero_extend_vector_inreg cases.
2021-03-21 14:01:37 +00:00
Simon Pilgrim ffb2887103 [DAG] Fold shuffle(bop(shuffle(x,y),shuffle(z,w)),undef) -> bop(shuffle'(x,y),shuffle'(z,w))
Followup to D96345, handle unary shuffles of binops (as well as binary shuffles) if we can merge the shuffle with inner operand shuffles.

Differential Revision: https://reviews.llvm.org/D98646
2021-03-19 14:14:56 +00:00
Craig Topper 182b831aeb [DAGCombiner][RISCV] Teach visitMGATHER/MSCATTER to remove gather/scatters with all zeros masks that use SPLAT_VECTOR.
Previously only all zeros BUILD_VECTOR was recognized.
2021-03-18 15:34:14 -07:00
Simon Pilgrim 1ba5c550d4 [DAG] Improve folding (sext_in_reg (*_extend_vector_inreg x)) -> (sext_vector_inreg x)
Extend this to support ComputeNumSignBits of the (used) source vector elements so that we can handle more than just the case where we're sext_in_reg from the source element signbit.

Noticed while investigating the poor codegen in D98587.
2021-03-18 15:34:53 +00:00
Craig Topper 5b825433d7 [DAGCombiner] Optimize 1-bit smulo to AND+SETNE.
A 1-bit smulo overflows is both inputs are -1 since the result
should be +1 which can't be represented in a signed 1 bit value.

We can detect this with an AND and a setcc. The multiply result
can also use the same AND.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D97634
2021-03-13 09:39:36 -08:00
Craig Topper 2ea7014089 [DAGCombiner] Use isConstantSplatVectorAllZeros/Ones instead of isBuildVectorAllZeros/Ones in visitMSTORE and visitMLOAD.
This allows us to optimize when the mask is a splat_vector in
addition to build_vector.
2021-03-12 12:14:56 -08:00
Sanjay Patel 415c67ba4c [SDAG] allow partial undef vector constants with select->logic folds
This is an enhancement suggested in the original review/commit:
D97730 / 7fce3322a2
2021-03-02 14:29:15 -05:00
Sanjay Patel 7fce3322a2 [SDAG] allow vector types for select->logic folds
This prepares codegen for a change that will remove the identical
folds from IR because they are not poison-safe. See
D93065 / D97360
for details.

We already generically support scalar types, and there are various
target-specific transforms that overlap the vector folds. For example,
x86 recognizes the and patterns, but not or. We can end up with 1
extra instruction there, but I think that is still preferred over the
blendv alternative that loads a constant vector.

If this is not optimal, then it should be fixed with a later transform
(this change is not expected to result in any regressions because
InstCombine currently does the same thing).

Removing custom code and supporting undefs in constant-pattern-matching
can be follow-up changes.

Differential Revision: https://reviews.llvm.org/D97730
2021-03-02 09:25:10 -05:00
Simon Pilgrim c0d4b44e6a [DAG] DAGCombiner::tryStoreMergeOfLoads - remove unused StartAddress variable. NFCI.
Noticed in "initialization is never read" clang-tidy warning - the only StartAddress set/used is inside the load combine loop.
2021-03-02 13:29:31 +00:00
Sanjay Patel 154c47dc06 [SDAG] add helper for select->logic folds; NFC
This set of transforms should be extended to handle vector types.
2021-03-01 16:24:15 -05:00