Commit Graph

17 Commits

Author SHA1 Message Date
Simon Pilgrim cbfa857346 [CostModel][X86] Adjust 128-bit select costs to account for slow BLENDV op
Based off the script from D103695 - Jaguar, Bulldozer, Silvermont (et al) and Haswell all have slow BLENDV ops, so adjust the worse case cost values
2022-05-06 13:07:34 +01:00
Simon Pilgrim f0e8c1d6d9 [CostModel][X86] Adjust 256-bit select costs to account for slow BLENDV op
Based off the script from D103695, on AVX1, Jaguar/Bulldozer both have low throughput for ymm select patterns (BLENDV + OR(AND,ANDN))), and even on AVX2 Haswell still struggles with BLENDV ops
2022-05-06 11:27:37 +01:00
Simon Pilgrim c2964746e3 [CostModel][X86] Reduce cost of vector selects on SSE2/AVX1 targets
Based off the script from D103695, we were exaggerating the cost of the OR(AND(X,M),AND(Y,~M)) expansion using instruction count instead of effective throughput
2022-05-01 09:32:14 +01:00
Simon Pilgrim d663166acb [CostModel][X86] Reduce cost of v2i64 icmp base cost on SSE2 targets
Based off the script from D103695, we were exaggerating the cost of the v2i64 comparison expansion using instruction count instead of effective throughput
2022-03-30 09:11:55 +01:00
Simon Pilgrim 4455c5cdea [CostModel][X86] Update RUN -passes=* to double quotes to appease update scripts on windows 2022-03-18 11:44:18 +00:00
Arthur Eubanks 15ba588d6d [test] Migrate '-analyze -cost-model' to '-passes=print<cost-model>' 2022-02-09 15:42:16 -08:00
Simon Pilgrim 716883736b [CostModel][TTI] Replace BAD_ICMP_PREDICATE with ICMP_SGT for generic sadd/ssub sat cost expansion
The comparison always checks for negative values so know the icmp predicate will be ICMP_SGT
2021-10-07 15:42:45 +01:00
Simon Pilgrim 0776924a17 [CostModel][X86] getCmpSelInstrCost - treat BAD_PREDICATEs the same as the worst case cost predicates for ICMP/FCMP instructions
As suggested on D111024, we should treat getCmpSelInstrCost calls without a specific predicate as matching the worst case predicate cost.

These regressions will be addressed with a mixture of D111024 and fixing other specific getCmpSelInstrCost calls to have realistic predicates.
2021-10-06 10:14:56 +01:00
Craig Topper 765348298c [CostModel] Update default cost model for sadd/ssub overflow to match TargetLowering
The expansion for these was updated in https://reviews.llvm.org/D47927 but the cost model was not adjusted.

I believe the cost model was also incorrect for the old expansion.
The expansion prior to D47927 used 3 icmps using LHS, RHS, and Result
to calculate theirs signs. Then 2 icmps to compare the signs. Followed
by an And. The previous cost model was using 3 icmps and 2 selects.
Digging back through git blame, those 2 selects in the cost model used to
be 2 icmps, but were changed in https://reviews.llvm.org/D90681

Differential Revision: https://reviews.llvm.org/D110739
2021-09-30 09:41:14 -07:00
Simon Pilgrim e11195d0a9 [CostModel][X86] Remove unused CHECK prefixes
Allows us to remove the "CHECK: {{^}}" hack and help simplify D91275
2020-11-13 17:31:48 +00:00
Sanjay Patel 3c050a597c [CostModel] fix cost calc bug for sadd/ssub with overflow
As noted in D90554, there's an opcode typo in using an easily
misused cost model API: getCmpSelInstrCost(). Beyond that, the
assumed sequence of ops is questionable, but that would be
another patch.

My guess is that the x86 test diffs show that we are probably
wrong both before and after this change, so there will be no
practical difference.
As an example, I tried this test which shows a cost of '7'
either way:

  define <4 x i32> @sadd(<4 x i32> %va, <4 x i32> %vb) {
    %V4I32  = call {<4 x i32>, <4 x i1>}  @llvm.sadd.with.overflow.v4i32(<4 x i32> %va, <4 x i32> %vb)
    %ov = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 1
    %r = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 0
    %z = select <4 x i1> %ov, <4 x i32> <i32 42, i32 42, i32 42, i32 42>, <4 x i32> %r
    ret <4 x i32> %z
  }

  $ llc -o - sadd.ll -mattr=avx
        vpaddd  %xmm1, %xmm0, %xmm2
        vpcmpgtd        %xmm2, %xmm0, %xmm0
        vpxor   %xmm0, %xmm1, %xmm0
        vblendvps       %xmm0, LCPI0_0(%rip), %xmm2, %xmm0a

Differential Revision: https://reviews.llvm.org/D90681
2020-11-03 11:03:47 -05:00
Fangrui Song 7979f24954 [test] Fix some unused check prefixes in test/Analysis/CostModel/X86 2020-10-31 23:29:57 -07:00
Simon Pilgrim eaa41e103c [CostModel][X86] Try to check against common prefixes before using target-specific cpu checks
SLM/GLM is still a mess so not all of them have been updated yet.
2020-02-24 11:59:07 +00:00
Simon Pilgrim d7f0207d73 [CostModel][X86] Fix SLM <2 x i64> icmp costs
SLM is 2 x slower for <2 x i64> comparison ops than other vector types, we should account for this like we do for SLM <2 x i64> add/sub/mul costs.

This should remove some of the SLM codegen diffs in D43582

llvm-svn: 372954
2019-09-26 10:14:38 +00:00
Simon Pilgrim adca820927 [TTI] Add generic SADDSAT/SSUBSAT costs
Add generic costs calculation for SADDSAT/SSUBSAT intrinsics, this uses generic costs for sadd_with_overflow/ssub_with_overflow, an extra sign comparison + a selects based on the sign/overflow.

This completes PR40316

Differential Revision: https://reviews.llvm.org/D57239

llvm-svn: 352315
2019-01-27 13:51:59 +00:00
Simon Pilgrim d824f99a6c [X86] Add ADD/SUB SSAT/USAT vector costs (PR40123)
Costs for real SSE2 instructions

llvm-svn: 350295
2019-01-03 11:38:42 +00:00
Simon Pilgrim 55ea89305d [X86] Add ADD/SUB SSAT/USAT cost tests (PR40123)
llvm-svn: 350293
2019-01-03 11:29:24 +00:00