Commit Graph

19 Commits

Author SHA1 Message Date
Simon Pilgrim 4178e33470 [CostModel] Update RUN -passes=* to double quotes to appease update scripts on windows
DOS really doesn't like `` quotes to be used in command lines

Some prep work as I'm intending to resurrect D79483 soon
2022-08-10 17:54:06 +01:00
David Green 53be6ab25c [ARM] Fix MVE getShuffleCost legalized type check
The MVE shuffle costing for VREV instructions was making incorrect
assumptions as to legalized vector types remaining as vectors. Add a
quick check to ensure they are indeed vectors before attempting to get
the number of elements.
2022-06-07 14:36:04 +01:00
Arthur Eubanks 15ba588d6d [test] Migrate '-analyze -cost-model' to '-passes=print<cost-model>' 2022-02-09 15:42:16 -08:00
David Green 309f1e4ac8 [ARM] Add datalayout to costmodel tests. NFC
This adds a sensible datalayout to the ARM cost model tests, to prevent
the costs reported being incorrect for the size of pointers.
2021-11-16 09:49:42 +00:00
Simon Pilgrim 7397dcb403 [TTI] Add basic SK_InsertSubvector shuffle mask recognition
This patch adds an initial ShuffleVectorInst::isInsertSubvectorMask helper to recognize 2-op shuffles where the lowest elements of one of the sources are being inserted into the "in-place" other operand, this includes "concat_vectors" patterns as can be seen in the Arm shuffle cost changes. This also helped fix a x86 issue with irregular/length-changing SK_InsertSubvector costs - I'm hoping this will help with D107188

This doesn't currently attempt to work with 1-op shuffles that could either be a "widening" shuffle or a self-insertion.

The self-insertion case is tricky, but we currently always match this with the existing SK_PermuteSingleSrc logic.

The widening case will be addressed in a follow up patch that treats the cost as 0.

Masks with a high number of undef elts will still struggle to match optimal subvector widths - its currently bounded by minimum-width possible insertion, whilst some cases would benefit from wider (pow2?) subvectors.

Differential Revision: https://reviews.llvm.org/D107228
2021-08-02 11:23:44 +01:00
Sander de Smalen 4ca860742d [InstructionCost] Don't conflate Invalid costs with Unknown costs.
We previously made a change to getUserCost to return a Invalid cost
when one of the TTI costs returned '-1' (meaning 'unknown' or
'infinitely expensive'). It makes no sense to say that:

  shufflevector <2 x i8> %x, <2 x i8> %y, <4 x i32> <i32 0, i32 1, i32 2, i32 3>

has an invalid cost. Perhaps the cost is not known, but the IR is valid
and can be code-generated. Invalid should only be used for IR that
cannot possibly be code-generated and where a cost is nonsensical.

With more passes now asserting that the cost must be valid, it is possible
that those assertions will fail for perfectly valid IR. An incomplete
cost-model probably shouldn't be a reason for the compiler to break.

It's better to consider these costs as 'very expensive' and ignore them
for other reasons. At some point, we should consider replacing -1 with
some other mechanism.

Reviewed By: paulwalker-arm, dmgreen

Differential Revision: https://reviews.llvm.org/D99502
2021-03-30 09:29:42 +01:00
David Green a2e0312cda [ARM] Tone down the MVE scalarization overhead
The scalarization overhead was set deliberately high for MVE, whilst the
codegen was new. It helps protect us against the negative ramifications
of mixing scalar and vector instructions. This decreases that,
especially for floating point where the cost of extracting/inserting
lane elements can be low. For integer the cost is still fairly high due
to the cross-register-bank copy, but is no longer n^2 in the length of
the vector.

In general, this will decrease the cost of scalarizing floats and long
integer vectors. i64 increase in cost, having a high cost before and
after this patch. For floats this allows up to start doing things like
vectorizing fdiv instructions, even if they are scalarized.

Differential Revision: https://reviews.llvm.org/D98245
2021-03-19 18:30:11 +00:00
David Green 35e0567d58 [ARM] Add VREV MVE shuffle costs
This uses the shuffle mask cost from D98206 to give a better cost of MVE
VREV instructions. This helps especially in VectorCombine where the cost
of shuffles is used to reorder bitcasts, which this helps keep the phase
ordering test for fp16 reductions producing optimal code. The isVREVMask
has been moved to a header file to allow it to be used across target
transform and isel lowering.

Differential Revision: https://reviews.llvm.org/D98210
2021-03-17 21:21:43 +00:00
David Green 7c2e061188 [ARM] Extra vector shuffle tests of various kinds. NFC 2021-02-13 15:03:10 +00:00
Sam Parker f2675ab45f [ARM][CostModel] Implement getCFInstrCost
As with other targets, set the throughput cost of control-flow
instructions to free so that we don't miss out of vectorization
opportunities.

Differential Revision: https://reviews.llvm.org/D85283
2020-08-05 12:44:51 +01:00
Sam Parker 2596da3174 [CostModel] getCFInstrCost in getUserCost.
Have BasicTTI call the base implementation so that both agree on the
default behaviour, which the default being a cost of '1'. This has
required an X86 specific implementation as it seems to be very
reliant on those instructions being free. Changes are also made to
AMDGPU so that their implementations distinguish between cost kinds,
so that the unrolling isn't affected. PowerPC also has its own
implementation to prevent changes to the reg-usage vectorizer test.

The cost model test changes now reflect that ret instructions are not
generally free.

Differential Revision: https://reviews.llvm.org/D79164
2020-06-15 09:28:46 +01:00
David Green 587feec07e [ARM] Change all tests from "thumbv8.1-m.main" to "thumbv8.1m.main". NFC 2020-03-04 13:47:35 +00:00
David Green a655393f17 [ARM] Add MVE beats vector cost model
The MVE architecture has the idea of "beats", where a vector instruction can be
executed over several ticks of the architecture. This adds a similar system
into the Arm backend cost model, multiplying the cost of all vector
instructions by a factor.

This factor essentially becomes the expected difference between scalar code
and vector code, on average. MVE Vector instructions can also overlap so the a
true cost of them is often lower. But equally scalar instructions can in some
situations be dual issued, or have other optimisations such as unrolling or
make use of dsp instructions. The default is chosen as 2. This should not
prevent vectorisation is a most cases (as the vector instructions will still be
doing at least 4 times the work), but it will help prevent over vectorising in
cases where the benefits are less likely.

This adds things so far to the obvious places in ARMTargetTransformInfo, and
updates a few related costs like not treating float instructions as cost 2 just
because they are floats.

Differential Revision: https://reviews.llvm.org/D66005

llvm-svn: 368733
2019-08-13 18:12:08 +00:00
David Green 3e39f39ad9 [ARM] MVE shuffle broadcast costs
A VDUP will perform a vector broadcast in a single instruction. Update the cost
model for MVE accordingly.

Code originally by David Sherwood.

Differential Revision: https://reviews.llvm.org/D63448

llvm-svn: 368589
2019-08-12 16:54:07 +00:00
David Green 83bbfaa5e4 [ARM] Put some of the TTI costmodel behind hasNeon calls.
This puts some of the calls in ARMTargetTransformInfo.cpp behind hasNeon()
checks, now that we have MVE, and updates all the tests accordingly.

Differential Revision: https://reviews.llvm.org/D63447

llvm-svn: 368587
2019-08-12 15:59:52 +00:00
David Green 84cb4b2b53 [ARM] Add or update a number of costmodel tests. NFC
This adds a number of cost model tests for ARM, useful for MVE. It also re-jigs
some of the existing tests to make them easier to update and read.

llvm-svn: 368586
2019-08-12 15:40:27 +00:00
Simon Pilgrim 051feee5c2 Add BROADCAST shuffle cost tests.
Part of a lot of cleanup necessary before PR39368.

llvm-svn: 345023
2018-10-23 13:00:22 +00:00
Simon Pilgrim 8c3d87b8cf [ARM] Regenerate reverse shuffle costs
Came about while cleaning up general shuffle costs for PR39368

llvm-svn: 344966
2018-10-22 22:26:00 +00:00
Arnold Schwaighofer 89aef93841 ARM cost model: Add vector reverse shuffle costs
A reverse shuffle is lowered to a vrev and possibly a vext instruction (quad
word).

radar://13171406

llvm-svn: 174933
2013-02-12 02:40:39 +00:00