Commit Graph

1274 Commits

Author SHA1 Message Date
Kazu Hirata 343de6856e [Transforms] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 21:11:37 -08:00
Alexey Bataev 0cc15050a4 [SLP]Fix PR59230: Use actual vector factor when sorting entries.
When we sort entries for attempting to reorder scalars, need to use
actual vectorization factor, not the number of scalars. Otherwise the
compiler crashes, if the scalars has to be reordered.

Differential Revision: https://reviews.llvm.org/D138819
2022-11-29 06:46:06 -08:00
Qiongsi Wu f946c70130 [SLPVectorizer] Do Not Move Loads/Stores Beyond Stacksave/Stackrestore Boundaries
If left unchecked, the SLPVecrtorizer can move loads/stores below a stackrestore. The move can cause issues if the loads/stores have pointer operands from `alloca`s that are reset by the stackrestores. This patch adds the dependency check.

The check is conservative, in that it does not check if the pointer operands of the loads/stores are actually from `alloca`s that may be reset. We did not observe any SPECCPU2017 performance degradation so this simple fix seems sufficient.

The test could have been added to `llvm/test/Transforms/SLPVectorizer/X86/stacksave-dependence.ll`, but that test has not been updated to use opaque pointers. I am not inclined to add tests that still use typed pointers, or to refactor `llvm/test/Transforms/SLPVectorizer/X86/stacksave-dependence.ll` to use opaque pointers in this patch. If desired, I will open a different patch to refactor and consolidate the tests.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D138585
2022-11-28 10:00:29 -05:00
Kazu Hirata 5fc8f6c37c [Vectorize] Use std::optional in SLPVectorizer.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 18:03:49 -08:00
Alexey Bataev ac93b61165 [SLP]Fix PR59098: check if the vector type is scalarized for
extractelements.

If the resulting type is going to be scalarized, no need to adjust the
cost of removed extractelement and insert/extract subvector costs.
Otherwise, the compiler can crash because of the wrong type sizes.
2022-11-21 10:26:01 -08:00
Alexey Bataev 07015e12f0 [SLP]Fix PR59053: trying to erase instruction with users.
Need to count the reduced values, vectorized in the tree but not in the top node. Such scalars still must be extracted out of the vector node instead of the original scalar.
2022-11-17 17:23:48 -08:00
Alexey Bataev 9f9fdab9f1 [SLP]Fix PR58766: deleted value used after vectorization.
If same instruction is reduced several times, but in one graph is part
of buildvector sequence and in another it is vectorized, we may loose
information that it was part of buildvector and must be extracted from
later vectorized value.
2022-11-16 10:57:03 -08:00
Alexey Bataev 2f8f17c157 [SLP]Fix PR58956: fix insertpoint for reduced buildvector graphs.
If the graph is only the buildvector node without main operation, need
to inherit insrtpoint from the redution instruction. Otherwise the
compiler crashes trying to insert instruction at the entry block.
2022-11-16 07:38:49 -08:00
Alexey Bataev 0a33ceee01 [SLP]Fix a crash on analysis of the vectorized node.
Need to use advanced check for the same vectorized node to avoid
possible compiler crash. We may have 2 similar nodes (vector one and
gather) after graph nodes rotation, need to do extra checks for the
exact match.
2022-11-15 13:40:28 -08:00
OCHyams 139e08efc5 [Assignment Tracking][23/*] Account for assignment tracking in SLP Vectorizer
The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

The SLP-Vectorizer can merge a set of scalar stores into a single vectorized
store. Merge DIAssignID intrinsics from the scalar stores onto the new
vectorized store.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133320
2022-11-15 15:20:18 +00:00
Alexey Bataev b505fd559d [SLP]Redesign vectorization of the gather nodes.
Gather nodes are vectorized as simply vector of the scalars instead of
relying on the actual node. It leads to the fact that in some cases
we may miss incorrect transformation (non-matching set of scalars is
just ended as a gather node instead of possible vector/gather node).
Better to rely on the actual nodes, it allows to improve stability and
better detect missed cases.

Differential Revision: https://reviews.llvm.org/D135174
2022-11-10 10:59:54 -08:00
Alexey Bataev b5d91ab73e [SLP]Fix PR58863: Mask index beyond mask size for non-power-2 insertelement analysis.
Need to check if the insertelement mask size is reached during cost analysis to avoid compiler crash.

Differential Revision: https://reviews.llvm.org/D137639
2022-11-08 07:54:57 -08:00
skc7 42bce72536 Reapply "[SLP] Extend reordering data of tree entry to support PHInodes".
Reapplies 87a2086 (which was reverted in 656f1d8).
Fix for scalable vectors in getInsertIndex merged in 46d53f4.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D137537
2022-11-08 21:21:28 +05:30
Nathan James 6aa050a690 Reland "[llvm][NFC] Use c++17 style variable type traits"
This reverts commit 632a389f96.

This relands commit
1834a310d0.

Differential Revision: https://reviews.llvm.org/D137493
2022-11-08 14:15:15 +00:00
Nathan James 632a389f96 Revert "[llvm][NFC] Use c++17 style variable type traits"
This reverts commit 1834a310d0.
2022-11-08 13:11:41 +00:00
skc7 46d53f45d8 [SLP][NFC] Restructure getInsertIndex
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D137567
2022-11-08 18:07:50 +05:30
Nathan James 1834a310d0
[llvm][NFC] Use c++17 style variable type traits
This was done as a test for D137302 and it makes sense to push these changes

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D137493
2022-11-08 12:22:52 +00:00
skc7 9d96feb19b [SLP][NFC] Restructure areTwoInsertFromSameBuildVector
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D137569
2022-11-08 09:32:19 +05:30
Alexey Bataev ecd0b5a532 Revert "[SLP]Redesign vectorization of the gather nodes."
This reverts commit 8ddd1ccdf8 to fix
buildbots failures reported in https://lab.llvm.org/buildbot#builders/74/builds/14839
2022-11-07 08:35:21 -08:00
Alexey Bataev 8ddd1ccdf8 [SLP]Redesign vectorization of the gather nodes.
Gather nodes are vectorized as simply vector of the scalars instead of
relying on the actual node. It leads to the fact that in some cases
we may miss incorrect transformation (non-matching set of scalars is
just ended as a gather node instead of possible vector/gather node).
Better to rely on the actual nodes, it allows to improve stability and
better detect missed cases.

Differential Revision: https://reviews.llvm.org/D135174
2022-11-07 07:04:38 -08:00
David Green 656f1d8b74 Revert "[SLP] Extend reordering data of tree entry to support PHI nodes"
This reverts commit 87a20868eb as it has
problems with scalable vectors and use-list orders. Test to follow.
2022-11-06 11:43:51 +00:00
Alexey Bataev f090e3c00f [SLP]Fix write after bounds.
Need to use comma instead of + symbol to prevent writing after bounds.
2022-11-03 05:30:41 -07:00
Alexey Bataev 8b015b2078 [SLP][NFC]Formatting and reduce number of iterations, NFC. 2022-11-03 05:30:13 -07:00
skc7 87a20868eb [SLP] Extend reordering data of tree entry to support PHI nodes
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D136757
2022-11-01 04:50:04 +00:00
Alexey Bataev 99f9bd4807 [SLP]Fix a crash in the analysis of the compatible cmp operands.
We can skip the analysis of the operands opcodes, can compare directly
them in some cases.
2022-10-31 09:47:25 -07:00
Alexey Bataev 2ec51f1c75 [SLP]Improve analysis of same/alternate code ops and scheduling.
Should improve compile time for analysis and vectorization.

Metric: SLP.NumVectorInstructions

Program                                                                                       SLP.NumVectorInstructions
test-suite :: External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test  6380.00                   6378.00 -0.0%
test-suite :: External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test   6380.00                   6378.00 -0.0%
test-suite :: External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test           2023.00                   2022.00 -0.0%
test-suite :: External/SPEC/CINT2006/471.omnetpp/471.omnetpp.test               148.00                    146.00 -1.4%

Generated more vector instructions.

Differential Revision: https://reviews.llvm.org/D127531
2022-10-27 16:29:16 -07:00
Alexey Bataev 8ce0c7b1c9 Revert "[SLP]Improve analysis of same/alternate code ops and scheduling."
This reverts commit dad64448c6 to fix
a crash in https://lab.llvm.org/buildbot/#/builders/74/builds/14584
2022-10-27 15:21:35 -07:00
Alexey Bataev dad64448c6 [SLP]Improve analysis of same/alternate code ops and scheduling.
Should improve compile time for analysis and vectorization.

Metric: SLP.NumVectorInstructions

Program                                                                                       SLP.NumVectorInstructions
test-suite :: External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test  6380.00                   6378.00 -0.0%
test-suite :: External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test   6380.00                   6378.00 -0.0%
test-suite :: External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test           2023.00                   2022.00 -0.0%
test-suite :: External/SPEC/CINT2006/471.omnetpp/471.omnetpp.test               148.00                    146.00 -1.4%

Generated more vector instructions.

Differential Revision: https://reviews.llvm.org/D127531
2022-10-27 11:31:18 -07:00
Alexey Bataev da4e0f7ac5 [SLP][NFC]Fix PR58476: Fix compile time for reductions, NFC.
Improve O(N^2) to O(N) in some cases, reduce number of allocations by
reserving memory.
Also, improve analysis of loads reduction values to avoid analysis
of not vectorizable cases.
2022-10-24 10:13:24 -07:00
Alexey Bataev b8b740c834 [SLP][NFC]Remove unused variable, NFC. 2022-10-19 12:35:27 -07:00
Alexey Bataev 087dadfd37 [SLP]Generalize cost model.
Generalized the cost model estimation. Improved cost model estimation
for repeated scalars (no need to count their cost anymore), improved
  cost model for extractelement instructions.

cpu2017
   511.povray_r             0.57
   520.omnetpp_r           -0.98
   521.wrf_r               -0.01
   525.x264_r               3.59 <+
   526.blender_r           -0.12
   531.deepsjeng_r         -0.07
   538.imagick_r           -1.42
Geometric mean:  0.21

Differential Revision: https://reviews.llvm.org/D115757
2022-10-18 11:55:59 -07:00
Alexey Bataev 62267e8de0 Revert "[SLP]Generalize cost model."
This reverts commit f12fb91188 and
f5c747bfbe to fix detected non-initialized
var use.
2022-10-18 11:25:59 -07:00
Alexey Bataev f5c747bfbe [SLP][NFC]Fix a warning for ?: with enum/unsigned, NFC. 2022-10-18 10:08:05 -07:00
Alexey Bataev f12fb91188 [SLP]Generalize cost model.
Generalized the cost model estimation. Improved cost model estimation
for repeated scalars (no need to count their cost anymore), improved
  cost model for extractelement instructions.

cpu2017
   511.povray_r             0.57
   520.omnetpp_r           -0.98
   521.wrf_r               -0.01
   525.x264_r               3.59 <+
   526.blender_r           -0.12
   531.deepsjeng_r         -0.07
   538.imagick_r           -1.42
Geometric mean:  0.21

Differential Revision: https://reviews.llvm.org/D115757
2022-10-18 08:49:32 -07:00
Alexey Bataev e79532d28c [SLP][NFC]Try to fix MSVC buildbots with a workaround, NFC. 2022-10-18 07:50:10 -07:00
Alexey Bataev 6a6fc4890d [SLP][NFC]Formatting of the getEntryCost function, NFC. 2022-10-18 07:18:26 -07:00
Kazu Hirata b2f41e9ac1 [Vectorize] Use std::conditional_t (NFC) 2022-10-15 14:52:25 -07:00
Alexey Bataev c787986cdd [SLP]Improve costs of vectorized loads/stores by analyzing GEPs.
When generating masked gathers nodes, SLP vectorizer accounts the cost
of the GEPs for loads as part of the scalar-vector transformation cost
estimation. But it does not do it for vectorized loads/stores, while it
may completely remove some of the GEPs completely. Because of this in
some cases masked gather operation can be much more profitable rather
than regular vectorization (masked-gather cost + vector GEP - scalar
loads + GEPs comparing to vectorized loads - scalar loads).
Added the analysis of the removed scalarGEPs for vectorized load/store nodes for better cost estimation.

Differential Revision: https://reviews.llvm.org/D135282
2022-10-13 07:20:41 -07:00
Alexey Bataev d71ad41080 [SLP]Fix insertpoint of the extractellements instructions to avoid reshuffle crash.
Need to set the insertpoint for extractelement to point to the first
instruction in the node to avoid possible crash during external uses
combine  process. Without it we may endup with the incorrect
transformation.

Differential Revision: https://reviews.llvm.org/D135591
2022-10-12 08:18:30 -07:00
Jordan Rupprecht cbae57c0e1 [NFC] Ignore unused var in no-asserts builds 2022-10-12 08:11:10 -07:00
Alexey Bataev 1be3428ea0 [SLP]Fix PR58177: Improve isUndefVector function to avoid extra freeze.
Freeze instruction in some cases makes codegen worse, so need to be very
careful when emitting it. Instead improve analysis in isUndefVector
function to generate mask of unused elements and use it in the analysis.

Differential Revision: https://reviews.llvm.org/D135382
2022-10-12 07:32:54 -07:00
Alexey Bataev 323ed2308a [SLP]Improve/fix CSE analysis of the blocks/instructions.
Added analysis for invariant extractelement instructions and improved
detection of the CSE blocks for generated extractelement instructions.

Differential Revision: https://reviews.llvm.org/D135279
2022-10-06 12:08:48 -07:00
Alexey Bataev ab9a81f736 [SLP]Try to emit canonical shuffle with undef operand.
In the canonical form of the shuffle the poison/undef operand is the
second operand, the patch tries to emit canonical form for partial
vectorization of the buildvector sequence.
Also, this patch starts emitting freeze instruction for shuffles with undef indices if the second shuffle operan is undef, not poison. It is an initial step to D93818, where undef mask element are treated as returning poison value.

Differential Revision: https://reviews.llvm.org/D134377
2022-10-04 08:16:07 -07:00
Simon Pilgrim 5849fcb635 Revert rG1b7089fe67b924bdd5ecef786a34bdba7a88778f "[SLP] Add ScalarizationOverheadBuilder helper to track vector extractions"
Revert rGef89409a59f3b79ae143b33b7d8e6ee6285aa42f "Fix 'unused-lambda-capture' gcc warning. NFCI."
Revert rG926ccfef032d206dcbcdf74ca1e3a9ebf4d1be45 "[SLP] ScalarizationOverheadBuilder - demand all elements for scalarization if the extraction index is unknown / out of bounds"

Revert ScalarizationOverheadBuilder sequence from D134605 - when accumulating extraction costs by Type (instead of specific Value), we are not distinguishing enough when they are coming from the same source or not, and we always just count the cost once. This needs addressing before we can use getScalarizationOverhead properly.
2022-09-30 11:22:48 +01:00
Simon Pilgrim 926ccfef03 [SLP] ScalarizationOverheadBuilder - demand all elements for scalarization if the extraction index is unknown / out of bounds
Workaround for a chromium bug reported on D134605 - test case will be added later
2022-09-28 11:03:37 +01:00
Simon Pilgrim ef89409a59 Fix 'unused-lambda-capture' gcc warning. NFCI. 2022-09-27 15:15:43 +01:00
Simon Pilgrim 1b7089fe67 [SLP] Add ScalarizationOverheadBuilder helper to track vector extractions
Instead of accumulating all extraction costs separately and then adjusting for repeated subvector extractions, this patch collects all the extractions and then converts to calls to getScalarizationOverhead to improve the accuracy of the costs.

I'm not entirely satisfied with the getExtractWithExtendCost handling yet - this still just adds all the getExtractWithExtendCost costs together - it really needs to be replaced with a "getScalarizationOverheadWithExtend", but that will require further refactoring first.

This replaces my initial attempt in D124769.

Differential Revision: https://reviews.llvm.org/D134605
2022-09-27 14:49:07 +01:00
Philip Reames 954c1ed009 [SLP] Adjust debug output for store vectorization failure
When store vectorization is infeasible, it's helpful to have a debug logging indication of why.  A case I've hit a couple times now is accidentally using -march instead of -mtriple and getting the default TTI results.  This causes max-vf to become 1, and thus hits the added logging line.
2022-09-23 09:58:15 -07:00
Philip Reames 42ef572049 [SLP] Fix cost model w.r.t. operand properties
We allow the target to report different costs depending on properties of the operands; given this, we have to make sure we pass the right set of operands and account for the fact that different scalar instructions can have operands with different properties.

As a motivating example, consider a set of multiplies which each multiply by a constant (but not all the same constant). Most of the constants are power of two (but not all).

If the target doesn't have support for non-uniform constant immediates, this will likely require constant materialization and a non-uniform multiply. However, depending on the balance of target costs for constant scalar multiplies vs a single vector multiply, this might or might not be a profitable vectorization.

This ends up basically being a rewrite of the existing code. Normally, I'd scope the change more narrowly, but I kept noticing things which seemed highly suspicious, and none of the existing code appears to have any test coverage at all. I think this is a case where simply throwing out the existing code and starting from scratch is reasonable.

This is a follow on to Alexey's D126885, but also handles the arithmetic instruction case since the existing code appears to have the same problem.

Differential Revision: https://reviews.llvm.org/D132566
2022-09-23 08:40:23 -07:00
Simon Pilgrim a6e9141505 [TTI] Add OperandValueProperties::OP_NegatedPowerOf2 enum (PR51436)
The mul by constant costmodels handle power-of-2 constants, but not negated-power-of-2, despite the backends handling both.

This patch adds the OperandValueProperties::OP_NegatedPowerOf2 enum and wires it for use for basic mul cost analysis and SLP handling.

Fixes #50778

Differential Revision: https://reviews.llvm.org/D111968
2022-09-23 14:03:18 +01:00