Commit Graph

117 Commits

Author SHA1 Message Date
Florian Hahn e2f6290e06
[VectorCombine] Discard ScalarizationResult state in early exit.
ScalarizationResult's destructor makes sure ToFreeze is not ignored if
set. Currently, scalarizeLoadExtract has an early exit if the index is
not safe directly. But when it is SafeWithFreeze, we need to discard the
state first, otherwise we hit the assert in the destructor.

Fixes PR51992.
2021-09-28 12:52:16 +01:00
Florian Hahn 300870a95c
[VectorCombine] Switch to using a worklist.
This patch updates VectorCombine to use a worklist to allow iterative
simplifications where a combine enables other combines.

Suggested in D100302.

The main use case at the moment is foldSingleElementStore and
scalarizeLoadExtract working together to improve scalarization.

Note that we now also do not run SimplifyInstructionsInBlock on the
whole function if there have been changes. This means we fail to
remove/simplify instructions not related to any of the vector combines.
IMO this is fine, as simplifying the whole function seems more like a
workaround for not tracking the changed instructions.

Compile-time impact looks neutral:
NewPM-O3: +0.02%
NewPM-ReleaseThinLTO: -0.00%
NewPM-ReleaseLTO-g: -0.02%

http://llvm-compile-time-tracker.com/compare.php?from=52832cd917af00e2b9c6a9d1476ba79754dcabff&to=e66520a4637290550a945d528e3e59573485dd40&stat=instructions

Reviewed By: spatel, lebedev.ri

Differential Revision: https://reviews.llvm.org/D110171
2021-09-22 09:54:58 +01:00
Florian Hahn 5131037ea9
[ValueTracking,VectorCombine] Allow passing DT to computeConstantRange.
isValidAssumeForContext can provide better results with access to the
dominator tree in some cases. This patch adjusts computeConstantRange to
allow passing through a dominator tree.

The use VectorCombine is updated to pass through the DT to enable
additional scalarization.

Note that similar APIs like computeKnownBits already accept optional dominator
tree arguments.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D110175
2021-09-21 16:54:47 +01:00
Florian Hahn ea27dd7497
[VectorCombine] Add tests which require DT to use info from assumes. 2021-09-21 13:07:06 +01:00
Florian Hahn c24fc37e47
[VectorCombine] Support AND/UREM indices that require freezing.
38b098be66 limited scalarization to indices that are known non-poison.
For certain patterns that restrict the range of an index, we can insert
a freeze of the original value, to prevent propagation of poison.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D107580
2021-09-13 11:21:45 +01:00
Florian Hahn 38b098be66
[VectorCombine] Limit scalarization known non-poison indices.
We can only trust the range of the index if it is guaranteed
non-poison.

Fixes PR50949.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D107364
2021-08-05 15:36:31 +01:00
Florian Hahn 8d2a8ced00
[VectorCombine] Add additional tests with freeze combinations.
Suggested in D107364.
2021-08-05 14:08:07 +01:00
Florian Hahn ccf1038a92
[VectorCombine] Add tests where the index is guaranteed non-poison.
Tests for PR50949.
2021-08-03 16:42:30 +01:00
Simon Pilgrim f10d4cfc23 [VectorCombine] Fix PR30986 poison test case
Thanks @xbolva00!
2021-08-02 14:16:32 +01:00
Simon Pilgrim 7942e20fc8 [VectorCombine] Add PR30986 test case 2021-08-02 13:05:47 +01:00
Eli Friedman bdd55b2f18 Fix the default alignment of i1 vectors.
Currently, the default alignment is much larger than the actual size of
the vector in memory.  Fix this to use a sane default.

For SVE, temporarily remove lowering of load/store operations for
predicates with less than 16 elements. The layout the backend was
assuming for SVE predicates with less than 16 elements doesn't agree
with the frontend. More work probably needs to be done here.

This change is, strictly speaking, not backwards-compatible at the
bitcode level. But probably nobody is actually depending on that; i1
vectors in memory are rare, and the code that does use them probably
ends up forcing the alignment to something sane anyway.  If we think
this is a concern, I can restrict this to scalable vectors for now
(where it's actually causing issues for me at the moment).

Differential Revision: https://reviews.llvm.org/D88994
2021-07-31 14:09:59 -07:00
Roman Lebedev 48e9602c40
[NFC][VectorCombine] Load widening: add a few more negative tests 2021-07-21 15:21:37 +03:00
Roman Lebedev a0217bda38
[NFC][VectorCombine] Add tests for widening of partial vector load 2021-07-21 00:24:47 +03:00
Philip Reames e75a2dfe20 [tests] Stablize tests for possible change in deref semantics
There's a potential change in dereferenceability attribute semantics in the nearish future.  See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context.

This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics.  Note that for many of these cases, O3 would infer exactly these attributes on the test IR.

This change handles the idiomatic pattern of a dereferenceable object being passed to a call which can not free that memory.  There's a couple other tests which need more one-off attention, they'll be handled in another change.
2021-07-14 13:05:43 -07:00
Florian Hahn 96ca03493a
[VectorCombine] Limit scalarization to non-poison indices for now.
As Eli mentioned post-commit in D103378, the result of the freeze may
still be out-of-range according to Alive2. So for now, just limit the
transform to indices that are non-poison.
2021-06-14 16:40:14 +01:00
Roman Lebedev 20542b47d6
[VectorCombine] scalarizeLoadExtract(): use computeAlignmentAfterScalarization() helper
This results in slightly more optimistic alignments in some cases
2021-06-11 12:47:10 +03:00
Qiu Chaofan 2670c7dd5b [VectorCombine] Fix alignment in single element store
This fixes the concern in single element store scalarization that the
alignment of new store may be larger than it should be. It calculates
the largest alignment if index is constant, and a safe one if not.

Reviewed By: lebedev.ri, spatel

Differential Revision: https://reviews.llvm.org/D103419
2021-06-11 10:28:15 +08:00
Qiu Chaofan a115c5247f [NFC] Pre-commit tests for VectorCombine scalarize 2021-06-10 14:29:18 +08:00
Kerry McLaughlin 5db52751a5 [CostModel] Return an invalid cost for memory ops with unsupported types
Fixes getTypeConversion to return `TypeScalarizeScalableVector` when a scalable vector
type cannot be legalized by widening/splitting. When this is the method of legalization
found, getTypeLegalizationCost will return an Invalid cost.

The getMemoryOpCost, getMaskedMemoryOpCost & getGatherScatterOpCost functions already call
getTypeLegalizationCost and will now also return an Invalid cost for unsupported types.

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D102515
2021-06-08 12:07:36 +01:00
Florian Hahn d4c070d801
[VectorCombine] Freeze index unless it is known to be non-poison.
If the index itself is already poison, the poison propagates through
instructions clamping the index to a valid range. This still causes
introducing a load of poison, as flagged by Alive2 and pointed out
at 575e2aff55.

This patch updates the code to freeze the index, unless it is proven to
not be poison.

Reviewed By: nlopes

Differential Revision: https://reviews.llvm.org/D103378
2021-06-01 10:40:57 +01:00
Florian Hahn f000c4cfb6
[VectorCombine] Add tests with multiple noundef indices for scalarization. 2021-06-01 10:17:50 +01:00
Florian Hahn 829978744d
[VectorCombine] Add tests with noundef index for load scalarization. 2021-05-30 12:15:41 +01:00
Florian Hahn 007f268c35
[VectorCombine] Check indices for all extracts we scalarize.
We need to make sure that the indices of all extracts we scalarize are
valid.
2021-05-28 18:35:29 +01:00
Florian Hahn f01df9805c
[VectorCombine] Add variants of multi-extract tests with assumes. 2021-05-28 18:35:24 +01:00
Florian Hahn a92376d297
[VectorCombine] Add test that combines load & store scalarization. 2021-05-25 14:28:37 +01:00
Florian Hahn 575e2aff55
[VectorCombine] Use constant range info for index scalarization legality.
We can only scalarize memory accesses if we know the index is valid.

This patch adjusts canScalarizeAcceess to fall back to
computeConstantRange to check if the index is known to be valid.

Reviewed By: nlopes

Differential Revision: https://reviews.llvm.org/D102476
2021-05-25 13:58:42 +01:00
Simon Pilgrim 1ad4f887bd [CostModel][X86] Improve accuracy of vector non-uniform shift costs on XOP/AVX2 targets
By llvm-mca analysis, Haswell/Broadwell has a non-uniform vector shift recip-throughput cost of the AVX2 targets at 2 for both 128 and 256-bit vectors - XOP capable targets have better 128-bit vector shifts so improve the fallback in those cases.
2021-05-24 14:18:21 +01:00
Florian Hahn d251d6f812
[VectorCombine] Fix load extract scalarization tests with assumes.
The input IR for @load_extract_idx_var_i64_known_valid_by_assume
and @load_extract_idx_var_i64_not_known_valid_by_assume_after_load
has been swapped.

This patch fixes the test so that @load_extract_idx_var_i64_known_valid_by_assume
has the assume before the load and the other test has it after.
2021-05-24 13:14:13 +01:00
Florian Hahn 4e8c28b6fb
Recommit "[VectorCombine] Scalarize vector load/extract."
This reverts commit 94d54155e2.

This fixes a sanitizer failure by moving scalarizeLoadExtract(I)
before foldSingleElementStore(I), which may remove instructions.
2021-05-24 11:35:07 +01:00
Florian Hahn 94d54155e2
Revert "[VectorCombine] Scalarize vector load/extract."
This reverts commit 86497785d5.

One of the tests causes an ASAN failure.
https://lab.llvm.org/buildbot/#/builders/5/builds/7927/steps/12/logs/stdio
2021-05-24 10:11:00 +01:00
Florian Hahn 86497785d5
[VectorCombine] Scalarize vector load/extract.
This patch adds a new combine that tries to scalarize chains of
`extractelement (load %ptr), %idx` to `load (gep %ptr, %idx)`. This is
profitable when extracting only a few elements out of a large vector.

At the moment, `store (extractelement (load %ptr), %idx), %ptr`
operations on large vectors result in huge code in the backend.

This can easily be triggered by using the matrix extension, e.g.
https://clang.godbolt.org/z/qsccPdPf4

This should complement D98240.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D100273
2021-05-24 09:29:08 +01:00
Florian Hahn 4efb4f674c [VectorCombine] Add positive test for scalarizing multiple extracts.
As suggested in D100273. Also adds an out-of-bound access test
2021-05-21 13:39:37 +01:00
Florian Hahn 2f69b78a57
[VectorCombine] Add tests with and & urem guaranteeing idx is valid. 2021-05-16 12:51:53 +01:00
Florian Hahn 7ba0e99aec
[VectorCombine] Add tests with assumes involvind variable index.
Add test cases with variable indices together with assumes guaranteeing
that the indices are valid.
2021-05-14 11:20:08 +01:00
Juneyoung Lee 395607af3c Reapply [ConstantFold] Fold more operations to poison
This was reverted to mitigate mitigate miscompiles caused by
the logical and/or to bitwise and/or fold. Reapply it now that
the underlying issue has been fixed by D101191.

-----

This patch folds more operations to poison.

Alive2 proof: https://alive2.llvm.org/ce/z/mxcb9G (it does not contain tests about div/rem because they fold to poison when raising UB)

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D92270
2021-05-13 16:04:12 +02:00
Qiu Chaofan 6d2df18163 [VectorComine] Restrict single-element-store index to inbounds constant
Vector single element update optimization is landed in 2db4979. But the
scope needs restriction. This patch restricts the index to inbounds and
vector must be fixed sized. In future, we may use value tracking to
relax constant restrictions.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D102146
2021-05-12 13:18:20 +08:00
Qiu Chaofan 2db4979c0f [VectorCombine] Simplify to scalar store if only one element updated
This patch simplifies load-insertelt-store pattern into
getelementptr-store.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D98240
2021-05-08 18:14:51 +08:00
Florian Hahn 3219d981d4
[VectorCombine] Add tests for load/extract scalarization.
Add tests where scalarizing a vector load + extract is profitable.
2021-04-11 21:39:48 +01:00
Juneyoung Lee 06829034ca Revert "[ConstantFold] Fold more operations to poison"
This reverts commit 53040a968d due to its
bad interaction with select i1 -> and/or i1 transformation.

This fixes:
https://bugs.llvm.org/show_bug.cgi?id=49005
https://bugs.llvm.org/show_bug.cgi?id=48435
2021-02-04 00:24:02 +09:00
Juneyoung Lee 8871a4b4ca [Constant] Update ConstantVector::get to return poison if all input elems are poison
The diff was reviewed at D93994
2021-01-07 09:26:07 +09:00
Juneyoung Lee 278aa65cc4 [IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93793
2020-12-30 04:21:04 +09:00
Juneyoung Lee ae6e89327b Precommit tests that have poison as shufflevector's placeholder
This commit copies existing tests at llvm/Transforms containing
'shufflevector X, undef' and replaces them with 'shufflevector X, poison'.
The new copied tests have *-inseltpoison.ll suffix at its file name
(as db7a2f347f did)
See https://reviews.llvm.org/D93793

Test files listed using

grep -R -E "^[^;]*shufflevector <.*> .*, <.*> undef" | cut -d":" -f1 | uniq

Test files copied & updated using

file_org=llvm/test/Transforms/$1
if [[ "$file_org" = *-inseltpoison.ll ]]; then
  file=$file_org
else
  file=${file_org%.ll}-inseltpoison.ll
  if [ ! -f $file ]; then
    cp $file_org $file
  fi
fi
sed -i -E 's/^([^;]*)shufflevector <(.*)> (.*), <(.*)> undef/\1shufflevector <\2> \3, <\4> poison/g' $file
head -1 $file | grep "Assertions have been autogenerated by utils/update_test_checks.py" -q
if [ "$?" == 1 ]; then
  echo "$file : should be manually updated"
  # The test is manually updated
  exit 1
fi
python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file
2020-12-29 17:09:31 +09:00
Juneyoung Lee db7a2f347f Precommit transform tests that have poison as insertelement's placeholder
This commit copies existing tests at llvm/Transforms and replaces
'insertelement undef' in those files with 'insertelement poison'.
(see https://reviews.llvm.org/D93586)

Tests listed using this script:

grep -R -E '^[^;]*insertelement <.*> undef,' . | cut -d":" -f1 | uniq |
wc -l

Tests updated:

file_org=llvm/test/Transforms/$1
file=${file_org%.ll}-inseltpoison.ll
cp $file_org $file
sed -i -E 's/^([^;]*)insertelement <(.*)> undef/\1insertelement <\2> poison/g' $file
head -1 $file | grep "Assertions have been autogenerated by utils/update_test_checks.py" -q
if [ "$?" == 1 ]; then
  echo "$file : should be manually updated"
  # I manually updated the script
  exit 1
fi
python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file
2020-12-24 11:46:17 +09:00
Sanjay Patel 47aaa99c0e [VectorCombine] allow peeking through GEPs when creating a vector load
This is an enhancement motivated by https://llvm.org/PR16739
(see D92858 for another).

We can look through a GEP to find a base pointer that may be
safe to use for a vector load. If so, then we shuffle (shift)
the necessary vector element over to index 0.

Alive2 proof based on 1 of the regression tests:
https://alive2.llvm.org/ce/z/yPJLkh

The vector translation is independent of endian (verify by
changing to leading 'E' in the datalayout string).

Differential Revision: https://reviews.llvm.org/D93229
2020-12-18 09:25:03 -05:00
Sanjay Patel 71a1b9fe76 [VectorCombine] add tests for gep load with cast; NFC 2020-12-17 16:40:55 -05:00
Sanjay Patel 46c331bf26 [VectorCombine] adjust test alignments for better coverage; NFC 2020-12-16 16:30:45 -05:00
Sanjay Patel 38ebc1a13d [VectorCombine] optimize alignment for load transform
Here's another minimal step suggested by D93229 / D93397 .
(I'm trying to be extra careful in these changes because
load transforms are easy to get wrong.)

We can optimistically choose the greater alignment of a
load and its pointer operand. As the test diffs show, this
can improve what would have been unaligned vector loads
into aligned loads.

When we enhance with gep offsets, we will need to adjust
the alignment calculation to include that offset.

Differential Revision: https://reviews.llvm.org/D93406
2020-12-16 15:25:45 -05:00
Sanjay Patel aaaf0ec72b [VectorCombine] loosen alignment constraint for load transform
As discussed in D93229, we only need a minimal alignment constraint
when querying whether a hypothetical vector load is safe. We still
pass/use the potentially stronger alignment attribute when checking
costs and creating the new load.

There's already a test that changes with the minimum code change,
so splitting this off as a preliminary commit independent of any
gep/offset enhancements.

Differential Revision: https://reviews.llvm.org/D93397
2020-12-16 12:25:18 -05:00
Sanjay Patel 8593e197bc [VectorCombine] add alignment test for gep load; NFC 2020-12-14 18:31:19 -05:00
Sanjay Patel d399f870b5 [VectorCombine] make load transform poison-safe
As noted in D93229, the transform from scalar load to vector load
potentially leaks poison from the extra vector elements that are
being loaded.

We could use freeze here (and x86 codegen at least appears to be
the same either way), but we already have a shuffle in this logic
to optionally change the vector size, so let's allow that
instruction to serve both purposes.

Differential Revision: https://reviews.llvm.org/D93238
2020-12-14 17:42:01 -05:00