750 Commits

Author SHA1 Message Date
Matthias Gehre c1502425ba Move TargetTransformInfo::maxLegalDivRemBitWidth -> TargetLowering::maxSupportedDivRemBitWidth
Also remove new-pass-manager version of ExpandLargeDivRem because there is no way
yet to access TargetLowering in the new pass manager.

Differential Revision: https://reviews.llvm.org/D133691
2022-09-12 17:06:16 +01:00
Simon Pilgrim 20ad05f9b4 [CostModel][X86] Add CostKinds handling for abs ops
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695
2022-09-12 16:34:37 +01:00
Simon Pilgrim bd0109f392 [CostModel][X86] Move AVX512/AVX2 uniform shift costs into the generic uniform cost tables
They shouldn't be handled after the XOP shift costs - AVX2 shift support takes precedence over XOP for everything but vXi8 shifts. The improvement is pretty limited as it only affects bdver4 targets, but it does help clean up a fraction of the messy shift cost logic.
2022-09-12 12:08:42 +01:00
Simon Pilgrim a931dbfbd3 [CostModel][X86] Merge AVX512BW vXi8/vXi16 shifts into default AVX512BW cost table
We only need to handle the uniform cases early
2022-09-10 18:18:42 +01:00
Simon Pilgrim 10edf88458 [CostModel][X86] Update CTPOP costs
With the bdver2 model updates, many of the AVX1 costs were far too high - it also helped expose some cost mismatches for Atom/Silvermont
2022-09-10 17:57:20 +01:00
Simon Pilgrim 55b78e28d8 [CostModel][X86] Add missing i8 throughput cost 2022-09-09 10:58:51 +01:00
Simon Pilgrim e74102a963 [CostModel][X86] Merge getTypeBasedIntrinsicInstrCost into getIntrinsicInstrCost
For the few non type based intrinsic cases we can just check for !isTypeBasedOnly() to access the args directly.

I don't think we have a need to keep getTypeBasedIntrinsicInstrCost in BasicTTIImpl.h any more and can do a similar merge there as well - but it's a messier refactor and will take a while.
2022-09-07 12:04:09 +01:00
Simon Pilgrim 648e182d92 [CostModel][X86] getIntrinsicInstrCost - convert to CostKindTblEntry
Begin the refactoring to use CostKindTblEntry and return real latency/codesize/sizelatency costs instead of reusing the throughput numbers

This should allow us to merge getTypeBasedIntrinsicInstrCost into getIntrinsicInstrCost and remove all remaining references
2022-09-06 22:05:32 +01:00
Simon Pilgrim 10e0f3e948 [CostModel][X86] Add CostKinds handling for ctpop ops
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695 (although it still struggles with avx512 predicate numbers which had to be done manually)

Some of the pre-AVX values still aren't great - atom/slm worst case numbers for ctpop expansion really affect these (especially throughput/latency), so we need to clean them up in a more consistent way - it's a pity we don't have models for more of the older CPUs (merom/nehalem etc.) as other examples.
2022-09-06 17:27:24 +01:00
Matthias Gehre 2090e85fee [llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64
This adds the ExpandLargeDivRem to the default pass pipeline.
The limit at which it expands div/rem instructions is configured
via a new TargetTransformInfo hook (default: no expansion)
X86, Arm and AArch64 backends implement this hook to expand div/rem
instructions with more than 128 bits.

Differential Revision: https://reviews.llvm.org/D130076
2022-09-06 15:32:04 +01:00
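
The threshold logic described in this commit can be sketched as follows. This is a standalone illustration, not LLVM's code: `TargetHooks`, `X86LikeHooks`, `maxDivRemBitWidth` and `needsDivRemExpansion` are hypothetical names, and using the largest `unsigned` value to mean "no expansion" is an assumption made for the sketch.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for the target hook; not LLVM's real class.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  // Widest integer for which div/rem is lowered without expansion.
  // The default is "no limit", i.e. the pass never expands anything.
  virtual unsigned maxDivRemBitWidth() const {
    return std::numeric_limits<unsigned>::max();
  }
};

// Mirrors the commit text: expand div/rem with more than 128 bits.
struct X86LikeHooks : TargetHooks {
  unsigned maxDivRemBitWidth() const override { return 128; }
};

// True if a div/rem on a BitWidth-bit integer should be expanded.
inline bool needsDivRemExpansion(const TargetHooks &Hooks, unsigned BitWidth) {
  assert(BitWidth > 0 && "integer types have at least one bit");
  return BitWidth > Hooks.maxDivRemBitWidth();
}
```

With these definitions a 128-bit division is left alone on the X86-like target, a 129-bit one is expanded, and the default hook expands nothing.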
Simon Pilgrim 83552e8c72 [CostModel][X86] Add CostKinds handling for SSE FCMP_ONE/FCMP_UEQ predicates
These require special handling to account for their expansion in lowering.

I'm trying very hard not to have to add predicate specific costs - but it might be inevitable.....
2022-09-06 12:05:22 +01:00
Simon Pilgrim c1b5e36d74 [CostModel][X86] Add CostKinds handling for fcmp ops
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695 (although it still struggles with avx512 predicate numbers which had to be done manually)

SSE numbers are still too low for FCMP_ONE/FCMP_UEQ cases which expand to a more complex sequence than the existing 'ExtraCost' system can manage.
2022-09-06 10:34:53 +01:00
Simon Pilgrim 8534f51474 [CostModel][X86] Add CostKinds handling for sqrt intrinsics
This was achieved using the 'cost-tables vs llvm-mca' script from D103695

Some of the znver1/znver2 latency/throughput numbers were really weird (some copy+paste afaict) - I've used the numbers from the AMD SoG, which roughly match the 'worst case' range value from Agner
2022-09-04 18:39:21 +01:00
Simon Pilgrim 626a84db47 [CostModel][X86] getTypeBasedIntrinsicInstrCost - convert to CostKindTblEntry
Begin the refactoring to use CostKindTblEntry and return real latency/codesize/sizelatency costs instead of reusing the throughput numbers
2022-09-04 17:59:08 +01:00
Simon Pilgrim 80d4b3a275 Revert rG06e73626cf0fc33b025a0f98f1eee4a302279982 "[CostModel][X86] getTypeBasedIntrinsicInstrCost - convert to CostKindTblEntry"
Some arm buildbots are complaining about a phase ordering test failure in unsigned-multiply-overflow-check.ll - I guess this test needs making x86 specific first
2022-09-04 17:51:11 +01:00
Simon Pilgrim 06e73626cf [CostModel][X86] getTypeBasedIntrinsicInstrCost - convert to CostKindTblEntry
Begin the refactoring to use CostKindTblEntry and return real latency/codesize/sizelatency costs instead of reusing the throughput numbers
2022-09-04 17:28:45 +01:00
Simon Pilgrim 59dbd6a0cf [CostModel][X86] Remove redundant AVX512 v64i8 shift costs
These are handled earlier (and more accurately) in AVX512BWShiftCostTable
2022-09-04 14:06:26 +01:00
Simon Pilgrim c444af1c20 [CostModel][X86] Add CostKinds handling for mul ops
This was achieved using the 'cost-tables vs llvm-mca' script D103695

Also fix a missing pmullw v16i16 half-rate throughput as znver1 double-pumps - matches numbers from AMD SoG + Agner
2022-09-04 11:59:05 +01:00
Simon Pilgrim 444685de06 [CostModel][X86] Adjust mul v4i32/v8i32 throughput cost
Based off the numbers from AMD SoG + Agner - vXi32 are both half-rate, and znver1 double-pumps the v8i32 op

We should have caught this earlier as many Intel models have half-rate pmulld already :-(
2022-09-03 18:45:08 +01:00
Simon Pilgrim 114b7762a9 [CostModel][X86] Add CostKinds handling for add/sub ops
This was achieved using the 'cost-tables vs llvm-mca' script D103695
2022-09-03 18:45:08 +01:00
Simon Pilgrim 5aee2726d8 [CostModel][X86] Add CostKinds handling for fdiv ops
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695

As we're using 'typical' worst case values, not all cost entries come from a single CPU - e.g. the latency/throughput from haswell but the size-latency(uops) from zen1/alderlake-e due to 'double pumping'

As the uop count (used for TCK_SizeAndLatency) for divss/divps is typically so low, we need to override isExpensiveToSpeculativelyExecute to ensure we keep fdiv calls behind branches - although for some very recent cpu targets it might not be necessary any more and could be relaxed.
2022-09-03 15:48:39 +01:00
Simon Pilgrim 1c12e12111 [CostModel][X86] Add fdiv(double) throughput x87 costs for 2022-09-03 14:08:25 +01:00
Simon Pilgrim 0735200e3f [CostModel][X86] Add CostKinds handling for fmul ops
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695

As we're using 'typical' worst case values, not all cost entries come from a single CPU - e.g. the latency/throughput from haswell but the size-latency(uops) from zen1/alderlake-e due to 'double pumping'
2022-09-03 10:42:20 +01:00
Simon Pilgrim 82090cb85e [CostModel][X86] Remove unused float x87 costs
We only need the double costs for SSE1 fallback
2022-09-03 09:59:20 +01:00
Simon Pilgrim 116d8f8cf0 Revert rG11765b77be84d793ebedc5b5436c463490746131 "[CostModel][X86] Add CostKinds handling for fmul ops"
I need to address some x87 codegen changes before re-committing this.
2022-09-02 17:21:25 +01:00
Simon Pilgrim 11765b77be [CostModel][X86] Add CostKinds handling for fmul ops
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695

As we're using 'typical' worst case values, not all cost entries come from a single CPU - e.g. the latency/throughput from haswell but the size-latency(uops) from zen1/alderlake-e due to 'double pumping'
2022-09-02 16:57:23 +01:00
Simon Pilgrim 6941b1f6c1 [CostModel][X86] Add CostKinds to SSE42 fadd/fsub/fneg ops
These were missed in an earlier commit, the latency/codesize/size-latency numbers aren't different from the SSE2 values that it was falling through to, hence no test change, but it did mean we were wasting a lookup.
2022-09-02 16:32:44 +01:00
Simon Pilgrim ad16f3e413 [CostModel][X86] Add CostKinds handling for fadd/fsub/fneg ops
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695 which I'll update shortly

As we're using 'typical' worst case values, not all cost entries come from a single CPU - e.g. the latency/throughput from haswell but the size-latency(uops) from zen1/alderlake-e due to 'double pumping'
2022-09-02 11:50:01 +01:00
Simon Pilgrim ad8e4dd2ad [CostModel][X86] Add and/or/xor general cost kinds support
Account for double-pumping on early AVX1/AVX2 targets
2022-08-31 17:26:05 +01:00
Simon Pilgrim 1209b9c2c2 [CostModel][X86] Replace CostKindCosts constructor with default values.
This improves static initialization of the cost tables and significantly speeds up MSVC compile time.
2022-08-31 10:44:44 +01:00
Simon Pilgrim 7830445086 [CostModel][X86] Account for add/sub 512-bit vector splitting costs on non-AVX512BW targets 2022-08-30 16:54:06 +01:00
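
As a rough illustration of what a splitting cost means here: an operation on a vector wider than the widest legal register is handled in pieces, so its cost scales with the number of pieces. The sketch below assumes a simple model that ignores any extract/insert overhead the real tables may account for; `splitCount` and `splitOpCost` are hypothetical helper names.

```cpp
#include <cassert>

// Number of legal-width pieces a VectorBits-wide vector is split into.
inline unsigned splitCount(unsigned VectorBits, unsigned LegalBits) {
  assert(LegalBits > 0 && "legal register width must be non-zero");
  return (VectorBits + LegalBits - 1) / LegalBits; // round up
}

// Cost of an op on a wide vector: per-piece cost times number of pieces.
inline unsigned splitOpCost(unsigned VectorBits, unsigned LegalBits,
                            unsigned PerPieceCost) {
  return splitCount(VectorBits, LegalBits) * PerPieceCost;
}
```

Under this model a 512-bit add handled as 256-bit halves costs twice the 256-bit add.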
Simon Pilgrim 0d0dc4e6ab [CostModel][X86] Add CodeSize handling for and/or/xor ops
Eventually this will be part of the cost table lookup
2022-08-26 18:42:52 +01:00
Simon Pilgrim f9445ae75c [CostModel][X86] Add CodeSize handling for fneg ops
Eventually this will be part of the cost table lookup
2022-08-26 17:34:52 +01:00
Simon Pilgrim 7790518c1f [CostModel][X86] getArithmeticInstrCost - use the cost tables for all cost kinds
The tables currently only have TCK_RecipThroughput costs, but we should now be able to add individual entries without any further refactoring
2022-08-26 17:34:52 +01:00
Simon Pilgrim fef3eeef48 [CostModel][X86] Convert AVX2 SRA by uniform constant to cost table
When adding cost kind support it will be easier to maintain these if we're not calculating on the fly
2022-08-26 16:14:13 +01:00
Simon Pilgrim f3590b6440 [CostModel][X86] getArithmeticInstrCost - move SLM reduceVMULWidth cost handling into the generic MUL handling
This is still SLM specific atm, but converting this to more closely match the codegen from reduceVMULWidth should be straightforward
2022-08-26 16:14:12 +01:00
Simon Pilgrim 9c29b4a0ac [CostModel][X86] Convert AVX1/SSE41 SREM/SDIV by constants to cost tables
When adding cost kind support it will be easier to maintain these if we're not calculating on the fly
2022-08-26 16:14:12 +01:00
Simon Pilgrim 9f94240fe1 [CostModel][X86] getArithmeticInstrCost - use cost kind specific look up tables
Building on D132216, use CostKindTblEntry cost tables to simplify the transition to supporting cost kinds other than recip-throughput

Adding full cost kinds support is going to take a while, but by converting to CostKindTblEntry first it will make it easier to support the costs on a per-ISD basis.
2022-08-26 14:28:35 +01:00
Simon Pilgrim 1736f76948 [CostModel][X86] getTypeBasedIntrinsicInstrCost - adjustTableCost - split CostTblEntry into ISD/Cost pair. NFC
This will be necessary to allow us to reuse this for other cost kind types
2022-08-26 11:17:22 +01:00
Simon Pilgrim 3edec9ba60 [CostModel][X86] Support cost kind specific look up tables (REAPPLIED)
Most of our cost model tables have been created assuming cost kind == recip-throughput. But we're starting to see passes wanting to get accurate costs for the other kinds as well. Some of these can be determined procedurally (e.g. codesize by default could just be the split count after type legalization), but others are going to need to be handled in cost tables - this is especially true for x86 which has so many ISA combinations.

I've created a 'CostKindCosts' struct which can hold cost values for the 4 cost kinds, defaulting to -1U for unknown cost, this can be used with the existing CostTblEntryT/CostTableLookup template code. I've also added a [TargetCostKind] accessor to make it much easier to look up individual <Optional> costs.

This just changes the ISD::SELECT costs to check the effect (and also to check that the ISD::SETCC are correctly handled for default/None cost kinds) - the plan would be to slowly extend this and move the CostKindTblEntry type somewhere generic to allow other targets to use it once it's matured.

I'm also going to resurrect D103695 so that it can help with latency/codesize/sizelatency coverage testing.

For sizelatency - IIRC the definition was vague to let it be target specific - I've tried to use typical uop counts so they're comparable to MicroOpBufferSize etc.

REAPPLIED: Added early out to prevent getCmpSelInstrCost being used for anything but generic integer/float scalar/vector types - getTypeLegalizationCost can't handle the "exotic" TypeID enums that some passes attempt to get costs for (aggregates etc.).

Differential Revision: https://reviews.llvm.org/D132216
2022-08-25 16:49:17 +01:00
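
A minimal standalone sketch of the struct described in this message: four per-kind costs that default to an "unknown" sentinel, plus an indexing accessor that yields an optional cost. It uses `std::optional` and a local `CostKind` enum as stand-ins for LLVM's own types. The opcode/type ids, the `lookup` and `costOr` helpers, and the cost values in `ExampleTable` are hypothetical placeholders, not real X86 numbers.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Stand-in for the four target cost kinds.
enum class CostKind { RecipThroughput, Latency, CodeSize, SizeAndLatency };

struct CostKindCosts {
  // ~0U (i.e. -1U) marks a cost that the table does not know.
  unsigned RecipThroughputCost = ~0U;
  unsigned LatencyCost = ~0U;
  unsigned CodeSizeCost = ~0U;
  unsigned SizeAndLatencyCost = ~0U;

  // Returns the cost for Kind, or std::nullopt if it is unknown.
  std::optional<unsigned> operator[](CostKind Kind) const {
    unsigned Cost = ~0U;
    switch (Kind) {
    case CostKind::RecipThroughput: Cost = RecipThroughputCost; break;
    case CostKind::Latency:         Cost = LatencyCost; break;
    case CostKind::CodeSize:        Cost = CodeSizeCost; break;
    case CostKind::SizeAndLatency:  Cost = SizeAndLatencyCost; break;
    }
    if (Cost == ~0U)
      return std::nullopt;
    return Cost;
  }
};

struct CostKindTblEntry {
  int Opcode;
  int Type;
  CostKindCosts Cost;
};

// Linear search for the entry matching (Opcode, Type); nullptr if absent.
template <std::size_t N>
const CostKindTblEntry *lookup(const CostKindTblEntry (&Table)[N], int Opcode,
                               int Type) {
  for (const CostKindTblEntry &E : Table)
    if (E.Opcode == Opcode && E.Type == Type)
      return &E;
  return nullptr;
}

// Use the table cost when it is known, otherwise fall back.
inline unsigned costOr(const CostKindTblEntry *Entry, CostKind Kind,
                       unsigned Fallback) {
  assert(Fallback != ~0U && "fallback must be a known cost");
  if (Entry)
    if (std::optional<unsigned> KindCost = Entry->Cost[Kind])
      return *KindCost;
  return Fallback;
}

// Hypothetical ids and placeholder costs, for illustration only.
constexpr int OpSelect = 0;
constexpr int TyV4I32 = 0;
static const CostKindTblEntry ExampleTable[] = {
    {OpSelect, TyV4I32, {1, 3}}, // throughput and latency known, rest unknown
};
```

Because trailing members keep their defaults, an entry can list only the kinds that have been measured, and callers fall back to a procedural estimate for the rest.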
Benjamin Kramer ab85996e47 Revert "[CostModel][X86] Support cost kind specific look up tables"
This reverts commit 45846854a2.

This triggers an assertion failure during Clang selfhost

Unknown type!
UNREACHABLE executed at llvm/lib/CodeGen/ValueTypes.cpp:548!
*** SIGABRT received by PID 6107 (TID 6107) on cpu 218 from PID 6107; stack trace: ***
    @     0x556c8827c2d1         64  llvm::llvm_unreachable_internal()
    @     0x556c82a5542a         32  llvm::MVT::getVT()
    @     0x556c82a54a28         80  llvm::EVT::getEVT()
    @     0x556c7dda1526         80  llvm::TargetLoweringBase::getValueType()
    @     0x556c8174dd38        112  llvm::BasicTTIImplBase<>::getTypeLegalizationCost()
    @     0x556c81755e72        144  llvm::X86TTIImpl::getCmpSelInstrCost()
    @     0x556c8174cadf        512  llvm::TargetTransformInfoImplCRTPBase<>::getInstructionCost()
    @     0x556c84ab4dd2         32  llvm::TargetTransformInfo::getInstructionCost()
    @     0x556c82ead283       1968  llvm::sinkRegion()
2022-08-25 15:42:44 +02:00
Simon Pilgrim 2e5f16516a [CostModel][X86] Add CodeSize handling for fdiv ops
Eventually this will be part of the cost table lookup
2022-08-25 14:08:03 +01:00
Simon Pilgrim 45846854a2 [CostModel][X86] Support cost kind specific look up tables
Most of our cost model tables have been created assuming cost kind == recip-throughput. But we're starting to see passes wanting to get accurate costs for the other kinds as well. Some of these can be determined procedurally (e.g. codesize by default could just be the split count after type legalization), but others are going to need to be handled in cost tables - this is especially true for x86 which has so many ISA combinations.

I've created a 'CostKindCosts' struct which can hold cost values for the 4 cost kinds, defaulting to -1U for unknown cost, this can be used with the existing CostTblEntryT/CostTableLookup template code. I've also added a [TargetCostKind] accessor to make it much easier to look up individual <Optional> costs.

This just changes the ISD::SELECT costs to check the effect (and also to check that the ISD::SETCC are correctly handled for default/None cost kinds) - the plan would be to slowly extend this and move the CostKindTblEntry type somewhere generic to allow other targets to use it once it's matured.

I'm also going to resurrect D103695 so that it can help with latency/codesize/sizelatency coverage testing.

For sizelatency - IIRC the definition was vague to let it be target specific - I've tried to use typical uop counts so they're comparable to MicroOpBufferSize etc.

Differential Revision: https://reviews.llvm.org/D132216
2022-08-25 12:23:36 +01:00
Simon Pilgrim 9317e6311f [TTI] Add SK_Splice shuffle mask detection and X86 costs
Enables fixed sized vectors to detect SK_Splice shuffle patterns and provides basic X86 cost support

Differential Revision: https://reviews.llvm.org/D132374
2022-08-23 20:07:30 +01:00
Philip Reames df20ff9ae2 [TTI] Kill last couple uses of OperandValueKind in targets [nfc]
Use the accessor methods on the containing class instead so that we can change the representation.
2022-08-23 08:54:41 -07:00
Philip Reames c9608d57b8 [TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC]
This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet.  The main point of this change is simple consistency with the recently changed getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.
2022-08-23 07:55:42 -07:00
Philip Reames 104fa367ee [TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC]
This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both.

This is the change which motivated the whole sequence which preceded it.  In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as a result, we silently passed additional information through a callsite which previously dropped the power-of-two fact.  This might be harmless in most cases, but at least a couple clearly depended for correctness on not passing that property through.

I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance.  For instance, every parameter which changes type in this change also changes name.  This was intentional to make sure that every call site possibly affected must show up in the diff.  This let me audit each one closely.
2022-08-22 15:16:39 -07:00
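
The container described in this message can be sketched in isolation as below. The enum and accessor names follow the ones mentioned in these commits, but this is a simplified standalone version rather than LLVM's definition; `toyDivCost`, `toyDivCostIgnoringProps`, `checkExample` and the cost values 1 and 20 are made up for illustration.

```cpp
#include <cassert>

enum OperandValueKind {
  OK_AnyValue,
  OK_UniformValue,
  OK_UniformConstantValue,
  OK_NonUniformConstantValue
};
enum OperandValueProperties { OP_None, OP_PowerOf2 };

// Single container replacing the two independently passed enums.
struct OperandValueInfo {
  OperandValueKind Kind = OK_AnyValue;
  OperandValueProperties Properties = OP_None;

  bool isConstant() const {
    return Kind == OK_UniformConstantValue ||
           Kind == OK_NonUniformConstantValue;
  }
  bool isUniform() const {
    return Kind == OK_UniformConstantValue || Kind == OK_UniformValue;
  }
  bool isPowerOf2() const { return Properties == OP_PowerOf2; }
  // Same kind, with every property dropped.
  OperandValueInfo getNoProps() const { return {Kind, OP_None}; }
};

// Toy cost: dividing by a uniform power-of-two constant is treated as cheap.
inline unsigned toyDivCost(OperandValueInfo Op2Info) {
  if (Op2Info.isConstant() && Op2Info.isUniform() && Op2Info.isPowerOf2())
    return 1;
  return 20;
}

// A call site that must not rely on the power-of-two fact drops it explicitly.
inline unsigned toyDivCostIgnoringProps(OperandValueInfo Op2Info) {
  return toyDivCost(Op2Info.getNoProps());
}

// Usage example: the same operand costs differently once props are stripped.
inline void checkExample() {
  OperandValueInfo Pow2Const{OK_UniformConstantValue, OP_PowerOf2};
  assert(toyDivCost(Pow2Const) == 1);
  assert(toyDivCostIgnoringProps(Pow2Const) == 20);
}
```

Bundling both facts in one value makes it easy to forward a property by accident, which is why a call site that previously dropped the property now has to do so visibly, as `getNoProps()` does here.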
Philip Reames 478cf94378 [X86][AArch64][WebAsm][RISCV] Query operand properties instead of using enums directly [nfc]
This is part of an ongoing transition to use OperandValueInfo which combines OperandValueKind and OperandValueProperties.  This change adds some accessor methods and uses them to simplify backend code.  The primary motivation of doing so is removing uses of the parameters so that an upcoming api change is less error prone.
2022-08-22 13:37:59 -07:00
Philip Reames 5e87a020a5 [X86][TTI] Rename OpNInfo to OpNKind [nfc]
Both are reasonable names; this is solely that an upcoming change can use the OpNInfo name, and the compiler can tell me if I forgot to update something (instead of silently passing along properties that might not hold.)
2022-08-22 13:37:59 -07:00
Simon Pilgrim dd5b48976c [CostModel][X86] getShuffleCost - treat SK_Splice as SK_PermuteTwoSrc
SK_Splice should be equivalent to a PALIGNR instruction etc. - but as discussed on D132308, until full fixed vector support for SK_Splice is in place, just assume it's a SK_PermuteTwoSrc.
2022-08-22 10:51:08 +01:00