Commit Graph

1216 Commits

Author SHA1 Message Date
Florian Hahn 3fd1cc2574
[SLP] Add Preheader to CSE blocks after hoisting CSE-able instrs.
Adding the pre-header to CSEBlocks ensures instructions are CSE'd even
after hoisting.

This was original discovered by @atrick a while ago.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D133649
2022-09-12 15:53:31 +01:00
Alexey Bataev dfe1e9dd79 [SLP]Improve reordering of clustered reused scalars.
If the reused scalars are clustered, i.e. each part of the reused mask
contains all elements of the original scalars exactly once, we can
reorder those clusters to improve the whole ordering of of the clustered
vectors.

Differential Revision: https://reviews.llvm.org/D133524
2022-09-12 06:52:25 -07:00
Kazu Hirata 9eca5ed790 [llvm] Use std::enable_if_t (NFC) 2022-09-03 11:17:44 -07:00
Alexey Bataev 982d9ef1c1 [SLP]Fix PR55734: SLP vectorizer's reduce_and formation introduces poison.
Need either follow the original order of the operands for bool logical
ops, or emit freeze instruction to avoid poison propagation.

Differential Revision: https://reviews.llvm.org/D126877
2022-09-01 05:34:45 -07:00
Alexey Bataev 588115c117 [SLP][NFC]Add a check for SelectInst to match description, NFC. 2022-08-31 13:04:21 -07:00
Alexey Bataev d8d9ee10bb [SLP][NFC]Fix comment and make function following naming standard, NFC. 2022-08-31 12:37:55 -07:00
Philip Reames 8524622bdc [SLP] Simplify getOperandInfo implementation and be consistent
This is NOT nfc.  Specifically, the following behavior changes:
* Pointers are now allowed.  Both uniform, and constants.
* FP uniform non-constants can now be recognized.
* FP undefs are no longer considered constant.  This matches int behavior which we had tests for.  FP behavior was untested.  Its not clear to me int behavior is reasonable, but it's what tests seem to expect, so go with minimum impact for now.
2022-08-31 12:24:05 -07:00
Fangrui Song 13f0795425 [SLPVectorizer] Fix -Wunused-lambda-capture in -DLLVM_ENABLE_ASSERTIONS=off build 2022-08-30 23:01:22 -07:00
Alexey Bataev ec06df9459 [SLP]Fix PR57447: Assertion `!getTreeEntry(V) && "Scalar already in tree!"' failed.
The pointer operands for the ScatterVectorize node may contain
non-instruction values and they are not checked for "already being
vectorized". Need to check that such pointers are already vectorized and
gather them instead of trying to build vectorize node to avoid compiler
crash.

Differential Revision: https://reviews.llvm.org/D132949
2022-08-30 12:30:14 -07:00
Alexey Bataev afbf5466ba [SLP]Improve operands kind analaysis for constants.
Removed EnableFP parameter in getOperandInfo function since it is not
needed, the operands kinds also controlled by the operation code, which
allows to remove extra check for the type of the operands. Also, added
analysis for uniform constant float values.

This change currently does not trigger any changes in the code since TTI
does not do analysis for constant floats, so it can be considered NFC.
Tested with llvm-test-suite + SPEC2017, no changes.

Differential Revision: https://reviews.llvm.org/D132886
2022-08-30 06:35:39 -07:00
Valery N Dmitriev 329b972d41 [SLP] Try to match reductions before trying to vectorize a vector build sequence.
This patch changes order of searching for reductions vs other vectorization possibilities.
The idea is if we do not match a reduction it won't be harmful for further attempts to
find vectorizable operations on a vector build sequences. But doing it in the opposite
order we have good chance to ruin opportunity to match a reduction later.
We also don't want to try vectorizing binary operations too early as 2-way vectorization
may effectively prohibit wider ones leading to producing less effective code.

Differential Revision: https://reviews.llvm.org/D132590
2022-08-29 13:32:14 -07:00
Alexey Bataev beacf9bd9e [SLP]Fix PR57322: vectorize constant float stores.
Stores for constant floats must be vectorized, improve analysis in SLP
vectorizer for stores.

Differential Revision: https://reviews.llvm.org/D132750
2022-08-29 11:02:53 -07:00
Alexey Bataev e6345bf644 [SLP]Improve lookup of the buildvector top insertelement instruction.
When estimating the cost of the in-tree vectorized scalars in
buildvector sequences, need to take into account the vectorized
insertelement instruction. The top of the buildvector seuences is the
topmost vectorized insertelement instruction, because it will have
> than 1 use after the vectorization.

For the affected test case improves througput from 21 to 16 (per
llvm-mca).

Differential Revision: https://reviews.llvm.org/D132740
2022-08-29 08:19:52 -07:00
Kazu Hirata 56ea4f9bd3 [Transforms] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-27 21:21:02 -07:00
Valery N Dmitriev a4c8fb9d1f [SLP][NFC] Refactor SLPVectorizerPass::vectorizeRootInstruction method.
The goal is to separate collecting items for post-processing
and processing them. Post processing also outlined as
dedicated method.

Differential Revision: https://reviews.llvm.org/D132603
2022-08-24 17:07:53 -07:00
Philip Reames 49547b2241 [slp] Pull out a getOperandInfo variant helper [nfc] 2022-08-23 13:46:05 -07:00
Philip Reames 27d3321c4f [TTI] Use OperandValueInfo in getMemoryOpCost client api [nfc]
This removes the last use of OperandValueKind from the client side API, and (once this is fully plumbed through TTI implementation) allow use of the same properties in store costing as arithmetic costing.
2022-08-22 11:26:31 -07:00
Philip Reames 274f86e7a6 [TTI] Remove OperandValueKind/Properties from getArithmeticInstrCost interface [nfc]
This completes the client side transition to the OperandValueInfo version of this routine.  Backend TTI implementations still use the prior versions for now.
2022-08-22 11:06:32 -07:00
Philip Reames c42a5f1cc2 [TTI] Migrate getOperandInfo to OperandVaueInfo [nfc]
This is part of merging OperandValueKind and OperandValueProperties.
2022-08-22 10:19:02 -07:00
Philip Reames 5cd427106d [TTI] Start process of merging OperandValueKind and OperandValueProperties [nfc]
OperandValueKind and OperandValueProperties both provide facts about the operands of an instruction for purposes of cost modeling.  We've discussed merging them several times; before I plumb through more flags, let's go ahead and do so.

This change only adds the client side interface for getArithmeticInstrCost and makes a couple of minor changes in client code to prove that it works.  Target TTI implementations still use the split flags.  I'm deliberately splitting what could be one big change into a series of smaller ones so that I can lean on the compiler to catch errors along the way.
2022-08-22 09:48:15 -07:00
Simon Pilgrim 5263155d5b [CostModel] Add CostKind argument to getShuffleCost
Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or was just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future.

Differential Revision: https://reviews.llvm.org/D132287
2022-08-21 10:54:51 +01:00
Kazu Hirata 8b1b0d1d81 Revert "Use std::is_same_v instead of std::is_same (NFC)"
This reverts commit c5da37e42d.

This patch seems to break builds with some versions of MSVC.
2022-08-20 23:00:39 -07:00
Kazu Hirata c5da37e42d Use std::is_same_v instead of std::is_same (NFC) 2022-08-20 22:36:26 -07:00
Kazu Hirata 258531b7ac Remove redundant initialization of Optional (NFC) 2022-08-20 21:18:28 -07:00
Philip Reames b0a2c48e9f [tti] Consolidate getOperandInfo without OperandValueProperties copies [nfc] 2022-08-19 16:22:22 -07:00
Alexey Bataev c167028684 [SLP]Delay vectorization of postponable values for instructions with no users.
SLP vectorizer tries to find the reductions starting the operands of the
instructions with no-users/void returns/etc. But such operands can be
postponable instructions, like Cmp, InsertElement or InsertValue. Such
operands still must be postponed, vectorizer should not try to vectorize
them immediately.

Differential Revision: https://reviews.llvm.org/D131965
2022-08-19 08:39:16 -07:00
Alexey Bataev 0e7ed32c71 [SLP]Cost for a constant buildvector.
In many cases constant buildvector results in a vector load from a
constant/data pool. Need to consider this cost too.

Differential Revision: https://reviews.llvm.org/D126885
2022-08-19 08:02:42 -07:00
Alexey Bataev d53e245951 [COST][NFC]Introduce OperandValueKind in getMemoryOpCost, NFC.
Added OperandValueKind OpdInfo parameter to getMemoryOpCost functions to
better estimate cost with immediate values.

Part of D126885.
2022-08-19 07:33:00 -07:00
Simon Pilgrim 594c5b1a42 [SLP] Update TODO comment about shuffle mask decoding
This is handled in ShuffleVectorInst/getShuffleCost - getInstructionThroughput is (slowly) being removed.
2022-08-17 11:41:46 +01:00
Alexey Bataev 65c7cecb13 [SLP]Fix PR51320: Try to vectorize single store operands.
Currently, we try to vectorize values, feeding into stores, only if
slp-vectorize-hor-store option is provided. We can safely enable
vectorization of the value operand of a single store in the basic block,
if the operand value is used only in store.
It should enable extra vectorization and should not increase compile
time significantly.
Fixes https://github.com/llvm/llvm-project/issues/51320

Differential Revision: https://reviews.llvm.org/D131894
2022-08-16 07:25:21 -07:00
Philip Reames e792a353b5 [slp] adjust debug output to include final computed cost 2022-08-15 13:51:39 -07:00
Alexey Bataev 2819126d0c [SLP][NFC]Replace multiple isa calls with single one where possible,
NFC.
2022-08-15 11:56:58 -07:00
Fangrui Song de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Kazu Hirata 0e37ef0186 [Transforms] Fix comment typos (NFC) 2022-08-07 23:55:24 -07:00
Dawid Jurczak 1bd31a6898 [NFC] Add SmallVector constructor to allow creation of SmallVector<T> from ArrayRef of items convertible to type T
Extracted from https://reviews.llvm.org/D129781 and address comment:
https://reviews.llvm.org/D129781#3655571

Differential Revision: https://reviews.llvm.org/D130268
2022-08-05 13:35:41 +02:00
Fangrui Song 7d6017fd31 [TTI] Change new getVectorInstrCost overload to use const reference after D131114
A const reference is preferred over a non-null const pointer.
`Type *` is kept as is to match the other overload.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D131197
2022-08-04 15:16:51 -07:00
Mingming Liu bc8f2f3649 [AArch64][TTI][NFC] Overload method 'getVectorInstrCost' to provide vector instruction itself, as a context information for cost estimation.
1) Overloaded (instruction-based) method is a wrapper around the current (opcode-based) method.
2) This patch also changes a few callsites (VectorCombine.cpp,
   SLPVectorizer.cpp, CodeGenPrepare.cpp) to call the overloaded method.
3) This is a split of D128302.

Differential Revision: https://reviews.llvm.org/D131114
2022-08-04 12:58:25 -07:00
Kazu Hirata acf648b5e9 Use llvm::less_first and llvm::less_second (NFC) 2022-07-24 16:21:29 -07:00
William Schmidt bccc9aa81c Don't vectorize PHIs in catchswitch blocks
We currently assert in vectorizeTree(TreeEntry*) when processing a PHI
bundle in a block containing a catchswitch.  We attempt to set the
IRBuilder insertion point following the catchswitch, which is invalid.
This is done so that ShuffleBuilder.finalize() knows where to insert
a shuffle if one is needed.

To avoid this occurring, watch out for catchswitch blocks during
buildTree_rec() processing, and avoid adding PHIs in such blocks to
the vectorizable tree.  It is unlikely that constraining vectorization
over an exception path will cause a noticeable performance loss, so
this seems preferable to trying to anticipate when a shuffle will and
will not be required.
2022-07-19 06:10:17 -07:00
Kazu Hirata 7094ab4ee7 [llvm] Modernize bool literals (NFC)
Identified with modernize-use-bool-literals.
2022-07-17 18:08:51 -07:00
Kazu Hirata 611ffcf4e4 [llvm] Use value instead of getValue (NFC) 2022-07-13 23:11:56 -07:00
Craig Topper 0266773464 [SLP] Add missing space to optimization remark.
Reviewed By: vporpo

Differential Revision: https://reviews.llvm.org/D129330
2022-07-07 23:29:11 -07:00
David Green 2de05afc19 [SLP] Peek into loads when hitting the RecursionMaxDepth
This patch slightly extends the limit on the RecursionMaxDepth inside
the SLP vectorizer. It does it only when it hits a load (or zext/sext of
a load), which allows it to peek through in the places where it will be
the most valuable, without ballooning out the O(..) by any 2^n factors.

Differential Revision: https://reviews.llvm.org/D122148
2022-07-04 14:22:50 +01:00
Alexey Bataev 4be3fc35aa [SLP][NFC]Cleanup up operands of the removed insertelements, NFC.
Replace all operands of the insertelement instruction, replaced by
shuffles, by poisons to avoid false-positive reports about incorrect function.
2022-06-30 17:51:43 -07:00
Alexey Bataev bf4dcbd2df [SLP]Fix PR56251: Do not remove the reordering from the root node, being used as an operand.
If the root order itself does not require reordering, we can just
remove its reorder mask safely (e.g., if the root node is a vector of
phis). But if this node is used as an operand in the graph, we cannot
delete the reordering, need to keep it. Otherwise the graph nodes are
not synchronized with the operands. It may cause an extra gather
instruction(s) or a compiler crash.
Also, need to be very careful when selecting the gather nodes for
reordering since there might several gather nodes with the same scalars
and we can try to reorder just the same node many times instead of
different nodes.

Differential Revision: https://reviews.llvm.org/D128680
2022-06-28 13:42:05 -07:00
Guillaume Chatelet 3c126d5fe4 [Alignment] Replace commonAlignment with std::min
`commonAlignment` is a shortcut to pick the smallest of two `Align`
objects. As-is it doesn't bring much value compared to `std::min`.

Differential Revision: https://reviews.llvm.org/D128345
2022-06-28 07:15:02 +00:00
Kazu Hirata a7938c74f1 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 21:42:52 -07:00
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Alexey Bataev 2faacf61a5 [SLP]Improve shuffles cost estimation where possible.
Improved/fixed cost modeling for shuffles by providing masks, improved
cost model for non-identity insertelements.

Differential Revision: https://reviews.llvm.org/D115462
2022-06-24 09:28:01 -07:00