Commit Graph

15 Commits

Author SHA1 Message Date
Craig Topper ff66919020 [X86][CostModel] Bump the cost of vpermw/vpermt2b/vperm2w
vpermw is 2 uops. vpermt2b/vpermt2w are two shuffle uops and a port 015 uop. Weirdly vpermb is a single uop.

This patch bumps the cost to 2 for these operations. Maybe should go to 3 for the vpermt2*, but I've started conservative.

I've also removed a few entries that were now the same as earlier subtargets or that I didn't think we really did. Like I don't think we extend v32i8 to v32i16, shuffle, and then truncate.

Differential Revision: https://reviews.llvm.org/D79148
2020-04-30 11:32:25 -07:00
Craig Topper 8dfb9627b7 [X86] Make v32i16/v64i8 legal types without avx512bw. Use custom splitting instead.
This moves v32i16/v64i8 to a model consistent with how we
treat integer types with avx1.

This does change the ABI for types vXi16/vXi8 vectors larger than
512 bits to pass in multiple zmms instead of multiple ymms. We'd
already hacked some code to make v64i8/v32i16 pass in zmm.

Cost model is still a bit of a mess. In some place I tried to
match existing behavior. But really we need to account for
splitting and concating costs. Cost model for shuffles is
especially pessimistic.

Differential Revision: https://reviews.llvm.org/D76212
2020-04-15 12:17:18 -07:00
Simon Pilgrim 12c629ec6c [CostModel][X86] Add shuffle costs for some common sub-128bit vectors
v2i8/v4i8/v8i8 + v2i16/v4i16 all show up in vectorizer code and by just using the legalized types (v16i8/v8i16) we're highly exaggerating the actual cost of the shuffle.
2020-04-09 19:57:06 +01:00
Simon Pilgrim 6a57ba17c0 [CostModel][X86] Add shuffle cost tests for sub-128bit vectors 2020-04-04 13:08:25 +01:00
Simon Pilgrim eaa41e103c [CostModel][X86] Try to check against common prefixes before using target-specific cpu checks
SLM/GLM is still a mess so not all of them have been updated yet.
2020-02-24 11:59:07 +00:00
Simon Pilgrim 2a9cde026c [X86][AVX] Reduce v4f64/v4i64 shuffle costs (PR37882)
These were being over cautious for costs for one/two op general shuffles - VSHUFPD doesn't have to replicate the same shuffle in both lanes like VSHUFPS does. 

llvm-svn: 335216
2018-06-21 11:37:13 +00:00
Simon Pilgrim cd9ccf8824 [CostModel][X86] Split off BtVer2 cost checks
llvm-svn: 330433
2018-04-20 13:50:33 +00:00
Simon Pilgrim 34b397a318 [CostModel][X86] Add some specific cpu targets to the cost models
We're mostly testing with generic isa attributes, but PR36550 will require testing of specific target's scheduler models as well.

llvm-svn: 330056
2018-04-13 19:30:15 +00:00
Simon Pilgrim 63ae5579e7 [CostModel][X86] Regenerate vector shuffle cost tests with update_analyze_test_checks.py
llvm-svn: 329410
2018-04-06 16:00:28 +00:00
Simon Pilgrim c63f93a197 [CostModel][X86][XOP] Improve costs for XOP shuffles
VPPERM/VPERMIL2PD/VPERMIL2PS all provide more effective 2-input shuffles than regular AVX instructions

llvm-svn: 311005
2017-08-16 13:50:20 +00:00
Simon Pilgrim b59c2d9d73 [CostModel][X86] Add SSE2 two-src shuffle costs
llvm-svn: 310654
2017-08-10 19:32:35 +00:00
Simon Pilgrim 7354531b82 [CostModel][X86] Add avx1 two-src shuffle costs
llvm-svn: 310650
2017-08-10 19:02:51 +00:00
Simon Pilgrim ac2e50a4ca [CostModel][X86] Add avx2 two-src shuffle costs
llvm-svn: 310645
2017-08-10 18:29:34 +00:00
Simon Pilgrim 2f529412e1 [CostModel][X86] Extend two src shuffle cost tests
Cover most 128/256/512/1024-bit cases for vXf64/vXi64, vXf32/vXi32, vXi16 + vXi8

llvm-svn: 310641
2017-08-10 18:02:45 +00:00
Elena Demikhovsky 21706cbd24 AVX-512 Loop Vectorizer: Cost calculation for interleave load/store patterns.
X86 target does not provide any target specific cost calculation for interleave patterns.It uses the common target-independent calculation, which gives very high numbers. As a result, the scalar version is chosen in many cases. The situation on AVX-512 is even worse, since we have 3-src shuffles that significantly reduce the cost.

In this patch I calculate the cost on AVX-512. It will allow to compare interleave pattern with gather/scatter and choose a better solution (PR31426).

* Shiffle-broadcast cost will be changed in Simon's upcoming patch.

Differential Revision: https://reviews.llvm.org/D28118

llvm-svn: 290810
2017-01-02 10:37:52 +00:00