Commit Graph

16676 Commits

Author SHA1 Message Date
Nirav Dave 8c5f47ac40 [DAG, X86] Fix ISel-time node insertion ids
As in SystemZ backend, correctly propagate node ids when inserting new
unselected nodes into the DAG during instruction Seleciton for X86
target.

Fixes PR36865.

Reviewers: jyknight, craig.topper

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D44797

llvm-svn: 328233
2018-03-22 19:32:07 +00:00
Craig Topper 4a3be6e578 [X86] Correct the scheduling data for some of the 32 and 64 bit multiplies to as best as I understand how they are implemented.
llvm-svn: 328231
2018-03-22 19:22:51 +00:00
Simon Pilgrim bcb86bb927 [X86][Btver2] Conversion, MaskedLoad/MaskedStore and NTStores all are scheduled through the JFPU1 pipe
llvm-svn: 328226
2018-03-22 18:29:16 +00:00
Simon Pilgrim 0e031afa95 [X86][Btver2] FCMP (inc FMAX/FMIN) instructions use the JFPA functional pipe
The ymm instructions are double pumped as well.

llvm-svn: 328222
2018-03-22 17:43:12 +00:00
Simon Pilgrim e5b51f6786 [X86][Btver2] FMUL ymm instructions are double pumped on the JFPM functional pipe
llvm-svn: 328217
2018-03-22 17:25:38 +00:00
Simon Pilgrim 53b2c3329a [X86][SSE42] Use the default PCMPEST/PCMPIST scheduler classes directly. NFCI.
Models were completely overriding all SSE42 strins instructions when the default classes could be used for exactly the same coverage.

llvm-svn: 328203
2018-03-22 14:56:18 +00:00
Simon Pilgrim 3b2ff1faa9 [X86][CLMUL] Use the default CLMUL scheduler classes directly. NFCI.
Models were completely overriding all CLMUL instructions when the WriteCLMUL default classes could be used for exactly the same coverage.

llvm-svn: 328194
2018-03-22 13:37:30 +00:00
Simon Pilgrim 6bdd6b32fd [X86][CLMUL] Fix/add missing itinerary tags to (V)PCLMULQDQ instructions
PCLMULQDQrm was using the rr itinerary.

Difference in itineraries between PCLMULQDQ/VPCLMULQDQ variants was causing an unnecessary duplication of scheduler class entries.

llvm-svn: 328193
2018-03-22 13:36:06 +00:00
Simon Pilgrim 7684e055b3 [X86] Use the default AES scheduler classes directly. NFCI.
Models were completely overriding all AES instructions when the WriteAES default classes could be used for exactly the same coverage.

Removes 6 unnecessary scheduler classes from every model.

Note: Still looking for a way for tblgen to warn when this is happening - often the override is more complete than the default.
llvm-svn: 328192
2018-03-22 13:18:08 +00:00
Craig Topper df7855fc8d [X86] Remove unused SchedWriteRes classes. NFC
llvm-svn: 328181
2018-03-22 04:52:08 +00:00
Craig Topper fc179c6dd5 [X86][Skylake] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
I've also merged some VEX/non-VEX instregex strings with a (V?) prefix or (Y?) ymm variant - there are still a lot more of these to do.

This reduces the size of the optimized llc binary on my computer by 400K. Presumably because we went from 5000+ scheduler classes per CPU to ~2000.

llvm-svn: 328179
2018-03-22 04:23:41 +00:00
Craig Topper 2854dc93e1 [X86] Rewrite getOperandBias in X86BaseInfo.h to be a little more structured and update comments to be more clear about what it does. NFC
llvm-svn: 328136
2018-03-21 19:30:28 +00:00
Simon Pilgrim ec2f878779 [X86][Haswell] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
I've also merged some VEX/non-VEX instregex strings with a (V?) prefix or (Y?) ymm variant - there are still a lot more of these to do.

llvm-svn: 328111
2018-03-21 16:19:03 +00:00
Simon Pilgrim 96b605d34c [X86][SandyBridge] Merge more VEX/non-VEX instregex patterns (NFCI) (PR35955)
llvm-svn: 328110
2018-03-21 16:05:58 +00:00
Craig Topper 5a69a0011b [X86][Broadwell] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
llvm-svn: 328076
2018-03-21 06:28:42 +00:00
Craig Topper 137a4dd84d [X86] Fix the SchedRW for XOP vpcom register form instructions to not be marked as loads.
llvm-svn: 328071
2018-03-21 03:41:33 +00:00
Craig Topper d25f1acf67 [X86] Change PMULLD to 10 cycles on Skylake per Agner's tables and llvm-exegesis.
Also restrict to port 0 and 1 for SkylakeClient. It looks like the scheduler models don't account for client not having a full vector ALU on port 5 like server.

Fixes PR36808.

llvm-svn: 328061
2018-03-20 23:39:48 +00:00
Simon Pilgrim 572bfa562a [X86] Drop unnecessary InstRW overrides for WriteFMA
As noticed on D44687, these already match the WriteFMA def so can be removed.

llvm-svn: 328045
2018-03-20 21:15:23 +00:00
Martin Storsjo 07589fc496 [X86] Don't use the MSVC stack protector names on mingw
Mingw uses the same stack protector functions as GCC provides
on other platforms as well.

Patch by Valentin Churavy!

Differential Revision: https://reviews.llvm.org/D27296

llvm-svn: 328039
2018-03-20 20:37:51 +00:00
Nirav Dave ce71989188 [MC,X86] Cleanup some X86 parser functions to use MCParser helpers. NFCI.
llvm-svn: 328019
2018-03-20 19:12:41 +00:00
Krzysztof Parzyszek eb0c510ecd [X86] Add phony registers for high halves of regs with low halves
Registers E[A-D]X, E[SD]I, E[BS]P, and EIP have 16-bit subregisters
that cover the low halves of these registers. This change adds artificial
subregisters for the high halves in order to differentiate (in terms of
register units) between the 32- and the low 16-bit registers.

This patch contains parts that aim to preserve the calculated register
pressure. This is in order to preserve the current codegen (minimize the
impact of this patch). The approach of having artificial subregisters
could be used to fix PR23423, but the pressure calculation would need
to be changed.

Differential Revision: https://reviews.llvm.org/D43353

llvm-svn: 328016
2018-03-20 18:46:55 +00:00
Simon Pilgrim 62690e9d0e [X86][Haswell][Znver1] Fix typo in fldl instregexs
Missing comma was casing 2 instregex entries to be concatenated together by mistake.

Found while investigating PR35548

llvm-svn: 327992
2018-03-20 15:44:47 +00:00
Simon Pilgrim 4a83f802cc [X86][SandyBridge] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
I've also merged some VEX/non-VEX instregex strings with a (V?) prefix - there are still a lot more of these to do.

llvm-svn: 327974
2018-03-20 12:26:55 +00:00
Martin Storsjo 802b434156 [X86] Properly implement the calling convention for f80 for mingw/x86_64
In these cases, both parameters and return values are passed
as a pointer to a stack allocation.

MSVC doesn't use the f80 data type at all, while it is used
for long doubles on mingw.

Normally, this part of the calling convention is handled
within clang, but for intrinsics that are lowered to libcalls,
it may need to be handled within llvm as well.

Differential Revision: https://reviews.llvm.org/D44592

llvm-svn: 327957
2018-03-20 06:19:38 +00:00
Craig Topper ad7c685791 [X86] Rename MOVSX32_NOREXrr8 to MOVSX32rr8_NOREX so that the scheduler model regular expressions will pick it up with the regular version.
Do the same for MOVSX32_NOREXrm8, MOVZX32_NOREXrr8, and MOVZX32_NOREXrm8

llvm-svn: 327948
2018-03-20 05:00:20 +00:00
Craig Topper 4778fa7e8a [X86] Fix the SchedRW for memory forms of CMP and TEST.
They were incorrectly marked as RMW operations. Some of the CMP instrucions worked, but the ones that use a similar encoding as RMW form of ADD ended up marked as RMW.

TEST used the same tablegen class as some of the CMPs.

llvm-svn: 327947
2018-03-20 03:55:17 +00:00
Craig Topper 3e9462607e [X86] Add TEST16mi/TEST32mi/TEST64mi32 to the Sandybridge/Haswell/Broadwell/Skylake scheduler models.
Move it from a load+store group on SNB to a load only group, the same group as CMP.

llvm-svn: 327944
2018-03-20 03:02:03 +00:00
Craig Topper 7c90e29cf8 [X86] Add ROR/ROL/SHR/SAR by 1 instructions to the Sandy Bridge scheduler model.
I assume these match the generic immediate version like they do in the other models.

llvm-svn: 327943
2018-03-20 03:01:59 +00:00
Craig Topper 2330d6cd55 [X86] Fix the SNB scheduler for BLENDVB.
PBLENDVBrr0 was with the memory version of VBLENDVB and PBLENDVBrm0 was missing.

llvm-svn: 327937
2018-03-20 01:30:21 +00:00
Craig Topper ab6076514d [X86] Simplify the AVX512 code in LowerTruncate a little.
We don't need to create an ISD::TRUNCATE node to return, we started with one and can return it. Also remove the call to getExtendInVec, the result is just going to be a getNode of that value passed in.

llvm-svn: 327914
2018-03-19 21:58:02 +00:00
Craig Topper 3b967466d5 [X86] Replace a couple calls to getExtendInVec with getNode and the appropriate target independent EXTEND_VECTOR_INREG opcode.
llvm-svn: 327899
2018-03-19 20:20:22 +00:00
Nirav Dave 3264c1bdf6 [DAG, X86] Revert r327197 "Revert r327170, r327171, r327172"
Reland ISel cycle checking improvements after simplifying node id
invariant traversal and correcting typo.

llvm-svn: 327898
2018-03-19 20:19:46 +00:00
Craig Topper 9770107b5f [X86] Add JMP16r and JMP32r to Sandybridge scheduler model.
Fixes PR36010

llvm-svn: 327883
2018-03-19 19:00:37 +00:00
Craig Topper 5e65996fac [X86] Remove OUT32rr/OUT8rr/OUT32ri/OUT8ri from Sandybridge scheduler model.
PR35590 was already filed for this information being wrong. It's probably better to default to WriteSystem behavior instead of using something completely wrong.

llvm-svn: 327882
2018-03-19 19:00:35 +00:00
Craig Topper b4c7873f8c [X86] Add JCXZ/JECXZ to Sandybridge/Haswell/Broadwell/Skylake scheduler models.
JRCXZ was already present, but not the others.

We never codegen this instruction so this doesn't affect much just trying to get them all into a single generated scheduler class in the output.

llvm-svn: 327881
2018-03-19 19:00:32 +00:00
Craig Topper afabf36505 [X86] Correct regular expression in Zen scheduler model that was excluding JECXZ instruction.
The regex was looking for JECXZ_32 or JECXZ_64, but their is just one instruction called JECXZ. They used to exist as separate instructions, but were merged over 3 years ago.

llvm-svn: 327880
2018-03-19 19:00:29 +00:00
Craig Topper 591f44df54 [X86] Correct the SchedRW on (V)MOVAPSrr_REV and similar to match their non _REV counterparts.
llvm-svn: 327879
2018-03-19 19:00:26 +00:00
Craig Topper 836cfb3a4c [X86] Add the rest of the TEST with immediate instructions to the scheduler models to match their 8-bit counterpart.
llvm-svn: 327874
2018-03-19 17:58:41 +00:00
Craig Topper 645e531a69 [X86] Add MOV16ri*/MOV32ri*/MOV64ri* to scheduler models to match MOV8ri. Correct SchedRW and itinerary for MOV32ri64.
llvm-svn: 327872
2018-03-19 17:46:59 +00:00
Craig Topper 259eaa6e7c [X86] Remove sse41 specific code from lowering v16i8 multiply
With the SRAs removed from the SSE2 code in D44267, then there doesn't appear to be any advantage to the sse41 code. The punpcklbw instruction and pmovsx seem to have the same latency and throughput on most CPUs. And the SSE41 code requires moving the upper 64-bits into the lower 64-bit before the sign extend can be done. The unpckhbw in sse2 code can do better than that.

llvm-svn: 327869
2018-03-19 17:31:41 +00:00
Craig Topper 5ccd87233f [X86] Make the multiply and divide itineraries more consistent.
Sometimes we used the same itinerary for MEM and REG forms, but that seems inconsistent with our usual usage.

We also used the MUL8 itinerary for MULX32/64 which was also weird.

The test changes are because we were using IIC_IMUL32_RR and IIC_IMUL64_RR instead of IIC_IMUL32_REG/IIC_IMUL64_REG for the 32 and 64 bit multiplies that produce double width result.

llvm-svn: 327866
2018-03-19 16:38:33 +00:00
Simon Pilgrim 30c38c3849 [X86] Generalize schedule classes to support multiple stages
Currently the WriteResPair style multi-classes take a single pipeline stage and latency, this patch generalizes this to make it easier to create complex schedules with ResourceCycles and NumMicroOps be overriden from their defaults.

This has already been done for the Jaguar scheduler to remove a number of custom schedule classes and adding it to the other x86 targets will make it much tidier as we add additional classes in the future to try and replace so many custom cases.

I've converted some instructions but a lot of the models need a bit of cleanup after the patch has been committed - memory latencies not being consistent, the class not actually being used when we could remove some/all customs, etc. I'd prefer to keep this as NFC as possible so later patches can be smaller and target specific.

Differential Revision: https://reviews.llvm.org/D44612

llvm-svn: 327855
2018-03-19 14:46:07 +00:00
Sanjay Patel 05daae75ad [x86] put nops into the WriteNop class and customize for Jaguar
1. Given that we already have a classification bucket with 'nop' in the name, 
   that's where 'nop' belongs. Right now, it's only used for prefix bytes and 'pause'.
2. Make the latency of this class '1' for Jaguar to tell the scheduler (and presumably 
   llvm-mca) how to model the resource requirements better even though a nop has no 
   dependencies.

Differential Revision: https://reviews.llvm.org/D44608

llvm-svn: 327853
2018-03-19 14:26:50 +00:00
Craig Topper e18fbab988 [X86] Merge XADD8rr regular expression with XADD16rr/XADD32rr/XADD64rr in a couple scheduler models.
llvm-svn: 327821
2018-03-19 04:21:42 +00:00
Craig Topper d10ceffa5f [X86] Add ADD16i16/ADD32i32/ADD64i32 and similar to the scheduler models to match ADD8i8.
Also move ADC8i8 and SBB8i8 in the Sandy Bridge model to the same class as ADC8ri and SBB8ri. That seems more accurate since its the 8i8 is just the register forced to AL instead of coming from modrm.

llvm-svn: 327820
2018-03-19 04:21:40 +00:00
Craig Topper e9c99d32b3 [X6] Remove two unused InstrItinClass
llvm-svn: 327819
2018-03-19 02:07:32 +00:00
Craig Topper 793733a6c8 [X86] Use IIC_CMOV64_RR/RM on 64-bit cmov instructions.
llvm-svn: 327817
2018-03-19 00:56:12 +00:00
Craig Topper 9b60dcb29b [X86] Merge 32 and 64-bit RORX/SHLX/SARX/SHRX into single regular expressions in scheduler models.
llvm-svn: 327816
2018-03-19 00:56:11 +00:00
Craig Topper 13a1650d8a [X86] Merge 8-bit instructions into instregex with 16/32/64 instructions in the scheduler models as much as possible. NFCI
This reduces the total number of generated scheduler classes from 5404 to 5316.

llvm-svn: 327815
2018-03-19 00:56:09 +00:00
Simon Pilgrim 203876f104 [X86][Btver2] Fix crc32 schedule costs
The default is currently FAdd for some reason

llvm-svn: 327807
2018-03-18 19:54:42 +00:00