Commit Graph

16109 Commits

Author SHA1 Message Date
Craig Topper c6a4a97260 [X86] Add VCOMISDZrr, VCOMISSZrr, VUCOMISDZrr, and VUCOMISSZrr to the skylake server sheduler model
llvm-svn: 320326
2017-12-10 19:47:57 +00:00
Craig Topper a0be5a06c1 [X86] Rename some instructions that start with Int_ to have the _Int at the end.
This matches AVX512 version and is more consistent overall. And improves our scheduler models.

In some cases this adds _Int to instructions that didn't have any Int_ before. It's a side effect of the adjustments made to some of the multiclasses.

llvm-svn: 320325
2017-12-10 19:47:56 +00:00
Simon Pilgrim c493d4f5b9 [X86][X87] Fix typo in znver1 FIST/FISTT schedule patterns
llvm-svn: 320322
2017-12-10 19:19:22 +00:00
Craig Topper 1de942b2d1 [X86] Rename some instructions from 'rb' to 'rrb' to make 'b' a proper suffix. Fix the scheduling information for some of them.
Some of the scheduling information was only present for the 'rb' version' and not the 'rr' version. Now we match 'rr(b?)'

llvm-svn: 320320
2017-12-10 17:42:44 +00:00
Craig Topper c7445f2cdc [X86] Add VCVTQQ2PS to the skylake server scheduler models.
llvm-svn: 320319
2017-12-10 17:42:43 +00:00
Craig Topper c268527b2f [X86] Add VPMULLWZ256 to the skylake server scheduler model
llvm-svn: 320318
2017-12-10 17:42:42 +00:00
Craig Topper 4ec397cbd3 [X86] Add 256/512-bit EVEX VPSADBW instructions to skylake server scheduler model.
llvm-svn: 320317
2017-12-10 17:42:41 +00:00
Craig Topper aa904d5ab6 [X86] Fix a few instructions that were named Z512 instead of just Z.
This makes things consistent with our normal instruction naming.

llvm-svn: 320316
2017-12-10 17:42:39 +00:00
Craig Topper 7c89de1760 [X86] Add VPSRLWZrr to skylake server scheduler model.
llvm-svn: 320315
2017-12-10 17:42:38 +00:00
Craig Topper 1d7760db49 [X86] Add VPUNPCKLWDZrr to skylake server scheduler model.
llvm-svn: 320314
2017-12-10 17:42:37 +00:00
Craig Topper 57c2815cbe [X86] Adjust tablegen includes so we can use Instructions in scheduler models instead of just instregexs.
This separates the CPU specific scheduler model includes to occur after the instructions. Moves the instruction includes between the basic scheduler information and the CPU specific scheduler models.

llvm-svn: 320313
2017-12-10 17:42:36 +00:00
Simon Pilgrim 1f8cfba0bb [X86] Flag BroadWell scheduler model as complete
Locally tag COPY as WriteMove, which has caused some reg-reg + reg-mem instruction tests to reorder.

llvm-svn: 320308
2017-12-10 13:49:51 +00:00
Simon Pilgrim 49c74934dd Strip trailing whitespace. NFCI.
llvm-svn: 320306
2017-12-10 13:00:37 +00:00
Simon Pilgrim 320996576d [X86] Flag ZNVER1 scheduler model as complete
We just have to locally tag COPY as WriteMove

llvm-svn: 320304
2017-12-10 12:43:53 +00:00
Simon Pilgrim 8547645948 [X86] Flag SLM scheduler model as complete
We just have to locally tag COPY as WriteMove

llvm-svn: 320303
2017-12-10 12:36:29 +00:00
Simon Pilgrim 91c159d841 [X86][AVX[ Tag VZEROALL/VZEROUPPER instructions scheduler classes
llvm-svn: 320302
2017-12-10 12:26:35 +00:00
Simon Pilgrim 6de94a1adc [X86] Tag SSE4A instructions as SSE INTALU scheduler classes
llvm-svn: 320301
2017-12-10 12:08:04 +00:00
Simon Pilgrim cd58171110 [X86] Flag BTVER2 scheduler model as complete
We just have to locally tag COPY as WriteMove

llvm-svn: 320300
2017-12-10 11:51:29 +00:00
Simon Pilgrim b7fb2e2fa1 [X86] Tag ADJSTACK instructions as INTALU scheduler class
llvm-svn: 320299
2017-12-10 11:34:08 +00:00
Simon Pilgrim 1a030016a6 [X86] Tag MORESTACK instructions as ret scheduler class
llvm-svn: 320296
2017-12-10 10:08:21 +00:00
Craig Topper 253562eb81 [X86] Fix duplicate entries in skylake server scheduler model by changing Z128 to Z256
Based on the fact that the 'Y' version of the instruction is next to this, I assume Z256 is the intended value.

llvm-svn: 320295
2017-12-10 09:14:45 +00:00
Craig Topper 90c9c15936 [X86] Add MOVQI2PQIrm, MOVSDmr, and MOVSDrm to scheduler information
The VEX versions were present but not the legacy SSE versions.

llvm-svn: 320294
2017-12-10 09:14:44 +00:00
Craig Topper 28e55386ac [X86] Add LEA64_32r to scheduler models for Sandybridge,Haswell,Broadwell,Skylake
llvm-svn: 320293
2017-12-10 09:14:42 +00:00
Craig Topper 8ade4640f3 [X86] Add IN16/OUT16 to scheduling information for Haswell,Broadwell,Skylake
Sandy Bridge is also missing it, but it has other issues. See PR35590.

llvm-svn: 320292
2017-12-10 09:14:41 +00:00
Craig Topper 1a88c50fd7 [X86] Fix scheduler models to support ADD32ri in addition to ADD32ri8. Similar for all sizes of AND/OR/XOR/SUB/ADC/SBB/CMP.
llvm-svn: 320291
2017-12-10 09:14:39 +00:00
Craig Topper c89e282f7d [X86] Rename some instructions so that 'b' is added as a suffix instead of replacing an 'r'
llvm-svn: 320290
2017-12-10 09:14:38 +00:00
Craig Topper 6c65910160 [X86] Add CMPSDrr/rm to the scheduler models.
Somehow CMPSSrr/rm was there and the VEX version was there, but this was consistently missing.

llvm-svn: 320289
2017-12-10 09:14:37 +00:00
Craig Topper da7e78e18c [X86] Rename the rb form of scalar ADD/SUB/MUL/DIV to include _Int since they can only be selected by intrinsics.
llvm-svn: 320283
2017-12-10 04:07:28 +00:00
Craig Topper 4e57776fb2 [X86] Correct the _Int part of more scheduler model instrexes. Put _b in the correct order relative to _Int
llvm-svn: 320282
2017-12-10 03:16:38 +00:00
Craig Topper a2f5528084 [X86] Remove ReadAfterLd from several several rb instructions
This affects CVTSD2SS, FMA, RCP28, RSQRT28, and SQRT scalar instructions

'b' here refers to 'sae' not broadcast. These aren't memory instructions.

llvm-svn: 320281
2017-12-10 03:16:36 +00:00
Craig Topper 391c6f9507 [X86] Fix bad regular expressions in the scheduler models. Question marks should be outside of multicharacter parenthesized expressions
If the question mark is inside the parentheses it only applies to the single character proceeding it.

I had to make a few additional cleanups to fix some duplicate warnings that were exposed by fixing this.

llvm-svn: 320279
2017-12-10 01:24:08 +00:00
Craig Topper 8ee98d0b51 [X86] Make the _Int part of some instregex sheduler patterns optional
llvm-svn: 320278
2017-12-10 01:24:06 +00:00
Craig Topper 5ffe80103e [X86] Add the commutable floating point min/max pseudo instructions to sandybridge,haswell,broadwell,skylakeclient scheduler models.
llvm-svn: 320277
2017-12-10 01:24:05 +00:00
Simon Pilgrim 6655eef1b4 [X86] Tag PIC setup instruction as jump scheduler class
llvm-svn: 320276
2017-12-10 00:40:37 +00:00
Simon Pilgrim 5d74949e5f [X86] Tag ACQUIRE/RELEASE atomic instructions as microcoded scheduler classes
Note: We may be too pessimistic here and should possibly use something closer to the LOCK arithmetic instructions
llvm-svn: 320275
2017-12-10 00:30:57 +00:00
Simon Pilgrim dcbe723d28 [X86] Tag TLS instructions as system scheduler classes
llvm-svn: 320274
2017-12-10 00:12:57 +00:00
Simon Pilgrim 3508a09455 [X86] Tag ALLOCA/VAARG instructions as system scheduler classes
llvm-svn: 320273
2017-12-10 00:03:16 +00:00
Craig Topper f4e3044db9 [X86] Use KMOV instructions to zero upper bits of vectors when possible.
llvm-svn: 320268
2017-12-09 23:10:59 +00:00
Craig Topper 5ac75d5628 [X86] Improve lowering of vXi1 insert_subvectors to better utilize (insert_subvector zero, vec, 0) for zeroing upper bits.
This can be better recognized during isel when the producer already zeroed the upper bits.

llvm-svn: 320267
2017-12-09 22:44:42 +00:00
Simon Pilgrim e049038692 [X86] Tag LOCK/REX64/DATA16/DATA32 instruction prefix scheduler classes
llvm-svn: 320266
2017-12-09 21:27:03 +00:00
Simon Pilgrim b2b93f6204 Strip trailing whitespace. NFCI.
llvm-svn: 320265
2017-12-09 20:44:51 +00:00
Simon Pilgrim 7e636cc419 [X86] Tag FS/GS BASE R/W instruction scheduler classes
llvm-svn: 320264
2017-12-09 20:42:27 +00:00
Simon Pilgrim 231fab072f [X86] Tag REP/REPNE prefix instructions as microcoded scheduler classes
llvm-svn: 320263
2017-12-09 20:16:37 +00:00
Simon Pilgrim 2e7314eb2f [X86] Tag missing EH pseudo instruction scheduler classes
llvm-svn: 320262
2017-12-09 20:04:02 +00:00
Simon Pilgrim cb71e72707 [X86] Tag frame pointer XORs instruction scheduler classes
llvm-svn: 320261
2017-12-09 19:56:39 +00:00
Craig Topper 504534514c [X86] Don't use getTargetConstant for all 0s and all 1s mask vector.
llvm-svn: 320260
2017-12-09 19:18:30 +00:00
Simon Pilgrim df702104d3 [X86] Tag segment prefixes as NOP instruction scheduling classes
llvm-svn: 320257
2017-12-09 16:58:34 +00:00
Simon Pilgrim d3e21c6b79 [X86][AVX512] Drop a default NoItinerary argument that isn't used any more. NFCI.
Requires re-ordering of AVX512_maskable_custom arguments.

llvm-svn: 320255
2017-12-09 16:20:54 +00:00
Craig Topper 6504a8f888 [X86] When inserting into the upper bits of a vXi1 vector, make sure we shift enough bits if we widened the vector.
We may need to widen the vector to make the shifts legal, but if we do that we need to make sure we shift left/right after accounting for the new size. If not we can't guarantee we are shifting in zeros.

The test cases affected actually show cases where we should move the shifts all together, but that's another problem.

llvm-svn: 320248
2017-12-09 08:19:07 +00:00
Craig Topper b3e14ce90c [X86] Improve lowering of concats of mask vectors to better optimize zero vector inputs.
We were previously using kunpck with zero inputs unnecessarily. And we had cases where we would insert into a zero vector and then insert into larger zero vector incurring two sets of shifts.

llvm-svn: 320244
2017-12-09 07:02:19 +00:00
Craig Topper e29f50da4d [X86][Mips] Remove unused method declaration from the X86 and Mips AsmPrinters.
Both had a declaration of EmitXRayTable, but there is no method defined in either with that name. There is a emitXRayTable in the base class with a lower case 'e' and they both call that.

llvm-svn: 320213
2017-12-08 23:30:03 +00:00
Adrian Prantl d13170174c Generalize llvm::replaceDbgDeclare and actually support the use-case that
is mentioned in the documentation (inserting a deref before the plus_uconst).

llvm-svn: 320203
2017-12-08 21:58:18 +00:00
Simon Pilgrim 5f7fcb2ea9 [X86] CMOV pseudo instructions shouldn't need scheduling info as they should be lowered early
llvm-svn: 320193
2017-12-08 20:42:35 +00:00
Simon Pilgrim f621dcf8d7 [X86][X87] Tag x87 load/store instructions scheduler classes
llvm-svn: 320192
2017-12-08 20:31:48 +00:00
Craig Topper 7f0d456ef8 [X86] Teach lowering to only let through (insert_subvector (vXi1 zeros), subvec, 0) for vector sizes that have native KSHIFT support.
For narrow sizes we'll widen the zero vector and widen the insert. Then do an extract_subvector to get back down to correct size.

This allows us to remove some patterns from the isel table that had to COPY_TO_REGCLASS to an oversized register, do the shift and then COPY_TO_REGCLASS back to the narrow register. Now this is represented explicitly in the DAG.

This seems to have perturbed the register allocation in one of the tests, but the number of instructions didn't change.

llvm-svn: 320190
2017-12-08 20:10:33 +00:00
Simon Pilgrim 6415f56c79 [X86][X87] Tag x87 float compare instructions scheduler classes
llvm-svn: 320189
2017-12-08 20:10:31 +00:00
Simon Pilgrim 2db2851378 [X86][MPX] Tag TSX/HLE/SGX instructions scheduler classes
Currently tagged these as system instructions.

llvm-svn: 320177
2017-12-08 19:26:22 +00:00
Simon Pilgrim 42fcda9a6c [X86][MPX] Tag MPX instructions scheduler classes
Currently tagged these as system instructions, once we have uses for them (ASAN?) and they are faster we will need to improve on this.

llvm-svn: 320173
2017-12-08 19:03:42 +00:00
Sanjay Patel d4468912b0 [x86] use hasAVX2() rather than hasInt256(); NFC
These are aliases, but the thing we're checking here is that the target has
vpsllv*, not that the data type is 256-bit. Those instructions exist for
128-bit vectors too...but sadly, not for all element sizes.

llvm-svn: 320170
2017-12-08 18:35:51 +00:00
Simon Pilgrim 8e39dc36b8 [X86] Tag move immediate instructions scheduler classes
llvm-svn: 320169
2017-12-08 18:35:40 +00:00
Simon Pilgrim 19d460b066 [X86][SHA] Tag SHA instructions scheduler classes
Put these under VecIMul itinerary classes for now - seems to be a good average value

llvm-svn: 320161
2017-12-08 16:38:41 +00:00
Simon Pilgrim 4ba3314d55 [X86] Tag VIA PadLock crypto instructions scheduler classes
llvm-svn: 320159
2017-12-08 16:06:40 +00:00
Simon Pilgrim 1ddcae665e [X86] Tag PKU/INVPCID/RDPID/SMAP/SMX/PTWRITE system instructions scheduler classes
llvm-svn: 320158
2017-12-08 15:48:37 +00:00
Simon Pilgrim 83708cabc0 [X86][AVX512] Tag CLWB instruction to CLFLUSH/PREFETCH scheduler class
llvm-svn: 320156
2017-12-08 15:19:10 +00:00
Simon Pilgrim 26f106fda4 [X86][AVX512] Tag AVX512_512_SEXT_MASK_* instructions scheduler classes
Match VPTERNLOG which these pseudos will eventually alias to

llvm-svn: 320154
2017-12-08 15:17:32 +00:00
Gadi Haber 2cf601f28f [X86][Haswell]: Updating the scheduling information for the Haswell subtarget.
Updated the scheduling information for the Haswell subtarget with the following changes:

Regrouped the instructions after adding appropriate load + store latencies.
Added scheduling for missing instructions such as the GATHER instrs.
The changes were made after revisiting the latencies impact of all memory uOps.

Reviewers: RKSimon, zvi, craig.topper, apilipenko
Differential Revision: https://reviews.llvm.org/D40021

Change-Id: Iaf6c1f5169add1552845a8a566af4e5a359217a7
llvm-svn: 320137
2017-12-08 09:48:44 +00:00
Craig Topper 037115c29f [X86] Always consider inserting a vXi1 vector into the lsbs of a zero vector to be legal during lowering. Add isel patterns to emit shifts.
Previously we only allowed these through if the subvector came from a compare or test instruction which we would again check for during isel.

With this change we only check for the compare and test instructions during isel and have fallback patterns that emit the shifts if needed.

I noticed that in a lot of cases we don't actually see the compare during lowering and rely on an odd legalization of concat_vectors with a zero vector as the second argument. This keeps the concat_vectors around long enough for a later dag combine to expose the compare then we re-legalize the concat_vectors and catch the compare.

llvm-svn: 320134
2017-12-08 08:10:58 +00:00
Craig Topper 323ba39f10 [X86] Handle alls version of vXi1 insert_vector_elt with a constant index without falling back to shuffles.
We previously only supported inserting to the LSB or MSB where it was easy to zero to perform an OR to insert.

This change effectively extracts the old value and the new value, xors them together and then xors that single bit with the correct location in the original vector. This will cancel out the old value in the first xor leaving the new value in the position.

The way I've implemented this uses 3 shifts and two xors and uses an additional register. We can avoid the additional register at the cost of another shift.

llvm-svn: 320120
2017-12-08 00:16:09 +00:00
Craig Topper fd86b3cf22 [X86] Fix indentation. NFC
llvm-svn: 320119
2017-12-08 00:15:57 +00:00
Craig Topper dfc79c7c33 [X86] Fix InsertBitToMaskVector to only issue KSHIFTS of native size so that upper bits are properly zeroed.
There's no v2i1 or v4i1 kshift, and v8i1 is only supported with AVXDQ. Isel has fake patterns to extend these types to native shifts, but makes no guarantees about the value of any bits shifted in when shifting right.

This patch promotes the vector to a type that supports a native shift first and only allows inserting into the msb of a native sized shift.

I've constructed this in a way that doesn't do the promotion if we're going to fallback to using a xmm/ymm/zmm shuffle. I think I have a plan to remove the shuffle fall back entirely. In which case we this can be simplified, but I wanted to fix the correctness issue first.

llvm-svn: 320081
2017-12-07 20:10:04 +00:00
Craig Topper 7b8fa5f782 [X86] Fix typo in variable name. NFC
llvm-svn: 320080
2017-12-07 20:10:01 +00:00
Craig Topper b67e5da89b [X86] Make a couple helper lowering methods static.
llvm-svn: 320079
2017-12-07 20:09:55 +00:00
Simon Pilgrim 6d9ac1b1eb [X86] Replace tabs with spaces. NFCI.
llvm-svn: 320065
2017-12-07 17:55:19 +00:00
Simon Pilgrim 386b23f1fa [X86] Tag BMI/BMI2/TBM instructions scheduler classes
Put these under UNARY/BINOP ALU itinerary classes for now - seems to be a good average value

llvm-svn: 320064
2017-12-07 17:37:39 +00:00
Simon Pilgrim 2983b46973 [X86] Tag SALC instructions scheduler class
Treat these the same as LAHF/SAHF (although its not a x86_64 instruction)

llvm-svn: 320055
2017-12-07 16:07:06 +00:00
Simon Pilgrim a13271bcba [X86][VMX] Tag VMX instructions scheduler classes
Tagged all as system instructions

llvm-svn: 320053
2017-12-07 15:57:32 +00:00
Simon Pilgrim f1d599adb2 [X86] Tag LZCNT/TZCNT instructions scheduler classes
Tagged as IMUL instructions for a reasonable approximation (ALU tends to be a lot faster) - POPCNT is currently tagged as FAdd which I think should be replaced with IMUL as well

llvm-svn: 320051
2017-12-07 15:24:14 +00:00
Simon Pilgrim 6b7cd86ca7 [X86][SVM] Tag SVM instructions scheduler classes
Tagged all as system instructions

llvm-svn: 320047
2017-12-07 14:35:17 +00:00
Simon Pilgrim 60411d9a8c [X86] Tag RDRAND/RDSEED instruction scheduler classes
llvm-svn: 320045
2017-12-07 14:18:48 +00:00
Simon Pilgrim bd5f7455a2 [X86][X87] X87 math binop pseudo instructions don't need scheduling info
llvm-svn: 320044
2017-12-07 14:07:18 +00:00
Simon Pilgrim ca63dcce7f [X86][SSE42] SSE42 string pseudo instructions don't need scheduling info
llvm-svn: 320043
2017-12-07 13:52:07 +00:00
Andrew V. Tischenko 44cfc51415 Add proper BTVER2 sched support for MOV instr.
Differential Revision: https://reviews.llvm.org/D40345

llvm-svn: 320034
2017-12-07 11:19:49 +00:00
Francis Visoiu Mistrih a8a83d150f [CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the
interfaces.

For MachineOperand::print, keep a simple version that can be easily called
from `dump()`, and a more complex one which will be called from both the
MIRPrinter and MachineInstr::print.

Add extra checks inside MachineOperand for detached operands (operands
with getParent() == nullptr).

https://reviews.llvm.org/D40836

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'

llvm-svn: 320022
2017-12-07 10:40:31 +00:00
Simon Pilgrim 9afbe77a91 [X86][AVX512] Tag mask reg op instruction scheduler classes
llvm-svn: 319945
2017-12-06 19:36:00 +00:00
Simon Pilgrim d255a6201d [X86][AVX512] Tag scalar insert/extract instruction scheduler classes
Classes don't look great but match what we're doing on SSE/AVX

llvm-svn: 319920
2017-12-06 18:46:06 +00:00
Craig Topper 8b0f185c31 [X86] Simplify the TTI code for getInterleavedMemoryOpCost around for AVX512BW. NFCI
Previously the lambda for AVX512 passed out a flag that indicated whether AVX512BW was required and that was checked against the AVX512BW subtarget flag outside.

This patch changes the interface to pass the AVX512BW subtarget bit in and return its value if we detect 16 or 8 bit types.

llvm-svn: 319919
2017-12-06 18:40:46 +00:00
Simon Pilgrim 809c024b3d [X86][AVX2] Tag MASKMOV instruction scheduler classes
llvm-svn: 319915
2017-12-06 18:24:48 +00:00
Simon Pilgrim df05251921 [X86][AVX512] Tag aligned/unaligned move instruction scheduler classes
llvm-svn: 319913
2017-12-06 17:59:26 +00:00
Simon Pilgrim aa902be158 [X86][AVX512] Tag BROADCAST instruction scheduler classes
llvm-svn: 319900
2017-12-06 15:48:40 +00:00
Benjamin Kramer 1e9bf765a1 [X86] Avoid unused variable warning in Release builds. NFCI.
llvm-svn: 319891
2017-12-06 13:32:36 +00:00
Simon Pilgrim 07dc6d6975 [X86][AVX512] Drop default NoItinerary arguments that aren't needed
Requires reordering of AVX512_maskable_common arguments, but helps track what is still missing itinerary tags

llvm-svn: 319890
2017-12-06 13:14:44 +00:00
Simon Pilgrim bfe969c49b [X86][AVX512] Tag Mask<->Vector instructions scheduler classes
llvm-svn: 319887
2017-12-06 11:59:05 +00:00
Simon Pilgrim 75673946fc [X86][AVX512] Cleanup scalar move scheduler classes
llvm-svn: 319884
2017-12-06 11:23:13 +00:00
Craig Topper 3275eb7a68 [X86] Split 512-bit vector extends from types other than vXi1 out of LowerZERO_EXTEND_AVX512/LowerSIGN_EXTEND_AVX512. NFCI
Most of the code in these routines is for handling extends from vXi1 types. The 512-bit handling for other extends is very much like the AVX2 code. So make the special routines just do vXi1 types and move the other 512-bit handling to the place that handles AVX2.

llvm-svn: 319878
2017-12-06 07:37:20 +00:00
Craig Topper 647e4f590f [X86] Update to getSetCCResultType to be more robust to EVT types.
Attempt to determine what the type will be legalized to and then analyze that to see if we will be able to use a vXi1 compare.

llvm-svn: 319861
2017-12-06 00:15:17 +00:00
Simon Pilgrim d495301414 [X86][AVX512] Tag BLENDM instruction scheduler classes
llvm-svn: 319833
2017-12-05 21:05:25 +00:00
Simon Pilgrim b69dae42e3 [X86][AVX512] Tag GATHER/SCATTER instruction scheduler classes
NOTE: At the moment these use the WriteLoad/WriteStore classes, which severely underestimates the costs. This needs to be reviewed.
llvm-svn: 319829
2017-12-05 20:47:11 +00:00
Hans Wennborg 5df9f0878b Re-commit r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
The patch originally broke Chromium (crbug.com/791714) due to its failing to
specify that the new pseudo instructions clobber EFLAGS. This commit fixes
that.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319824
2017-12-05 20:22:20 +00:00
Simon Pilgrim 13d449d3e2 [X86][AVX512] Tag VPSLLDQ/VPSRLDQ instruction scheduler classes
llvm-svn: 319822
2017-12-05 20:16:22 +00:00
Simon Pilgrim 833c260a4b [X86][AVX512] Tag VPTRUNC/VPMOVSX/VPMOVZX instruction scheduler classes
llvm-svn: 319815
2017-12-05 19:21:28 +00:00
Simon Pilgrim 65f805fe30 [X86][X87] Tag FCMOV instruction scheduler classes
llvm-svn: 319804
2017-12-05 18:01:26 +00:00
Simon Pilgrim d9f1ae3266 [X86][AVX512] Tag VNNIW instruction scheduler classes
llvm-svn: 319784
2017-12-05 16:17:21 +00:00
Simon Pilgrim 4a9b1e1273 [X86][AVX512] Drop some default NoItinerary arguments that aren't needed any more
llvm-svn: 319782
2017-12-05 16:10:57 +00:00
Jina Nahias 51c1a627c2 [x86][AVX512] Lowering kunpack intrinsics to LLVM IR
This patch, together with a matching clang patch (https://reviews.llvm.org/D39719), implements the lowering of X86 kunpack intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D39720

Change-Id: I4088d9428478f9457f6afddc90bd3d66b3daf0a1
llvm-svn: 319778
2017-12-05 15:42:56 +00:00
Simon Pilgrim 4d08aedba3 [X86][AVX512] Tag VPMADD52/VPSADBW instruction scheduler classes
llvm-svn: 319772
2017-12-05 14:59:40 +00:00
Simon Pilgrim 71660c61e6 [X86][AVX512] Add missing scalar CMPSS/CMPSD logic scheduler classes
llvm-svn: 319770
2017-12-05 14:34:42 +00:00
Simon Pilgrim b9b46394e3 [X86][AVX512] Cleanup bit logic scheduler classes
llvm-svn: 319767
2017-12-05 14:04:23 +00:00
Simon Pilgrim fd3a2632e5 [X86][AVX512] Tag scalar CVT and CMP instruction scheduler classes
llvm-svn: 319765
2017-12-05 13:49:44 +00:00
Simon Pilgrim aa91155960 [X86][AVX512] Tag VPCMP/VPCMPU instruction scheduler classes
Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now.

llvm-svn: 319760
2017-12-05 12:14:36 +00:00
Simon Pilgrim a2b5862641 [X86][AVX512] Cleanup VPCMP scheduler classes
Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now.

llvm-svn: 319758
2017-12-05 12:02:22 +00:00
Simon Pilgrim 54b8aa2bb2 [X86][AVX512] Tag VFIXUPIMM instructions scheduler classes
llvm-svn: 319757
2017-12-05 11:46:57 +00:00
Guy Blank f3cefdd350 [X86] Fix a bug in handling GRXX subclasses in Domain Reassignment pass
When trying to determine the correct Mask register class corresponding
to a GPR register class, not all register classes were handled.
This caused an assertion to be raised on some scenarios.

Differential Revision:
https://reviews.llvm.org/D40290

llvm-svn: 319745
2017-12-05 09:08:24 +00:00
Craig Topper a404ce955a [X86] Use vector widening to support sign extend from i1 when the dest type is not 512-bits and vlx is not enabled.
Previously we used a wider element type and truncated. But its more efficient to keep the element type and drop unused elements.

If BWI isn't supported and we have a i16 or i8 type, we'll extend it to be i32 and still use a truncate.

llvm-svn: 319740
2017-12-05 06:37:21 +00:00
Daniel Sanders 3c1c4c0ee0 Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
Some concerns were raised with the direction. Revert while we discuss it and look into an alternative

llvm-svn: 319739
2017-12-05 05:52:07 +00:00
Craig Topper e1ba2450c2 [X86] Fix a crash if avx512bw and xop are both enabled when the IR contrains a v32i8 bitreverse.
llvm-svn: 319737
2017-12-05 04:47:12 +00:00
Craig Topper 276c770e57 [X86] Use vector widening to support zero extend from i1 when the dest type is not 512-bits and vlx is not enabled.
Previously we used a wider element type and truncated. But its more efficient to keep the element type and drop unused elements.

If BWI isn't supported and we have a i16 or i8 type, we'll extend it to be i32 and still use a truncate.

llvm-svn: 319728
2017-12-05 01:45:46 +00:00
Craig Topper 913b42b0e1 [X86] Don't use kunpck for vXi1 concat_vectors if the upper bits are undef.
This can be efficiently selected by a COPY_TO_REGCLASS without the need for an extra instruction.

llvm-svn: 319726
2017-12-05 01:28:06 +00:00
Craig Topper 6302012442 [X86] Use getZeroVector and remove an unnecessary creation of an APInt before calling getConstant. NFCI
The getConstant function can take care of creating the APInt internally.

getZeroVector will take care of using the correct type for the build vector to avoid re-lowering.

The test change here is because execution domain constraints apparently pass through undef inputs of a zeroing xor. So the different ordering of register allocation here caused the dependency to change.

llvm-svn: 319725
2017-12-05 01:28:04 +00:00
Craig Topper adadaae586 [X86] Rearrange some of the code around AVX512 sign/zero extends. NFCI
Move the AVX512 code out of LowerAVXExtend. LowerAVXExtend has two callers but one of them pre-checks for AVX-512 so the code is only live from the other caller. So move the AVX-512 checks up to that caller for symmetry.

Move all of the i1 input type code in Lower_AVX512ZeroExend together.

llvm-svn: 319724
2017-12-05 01:28:00 +00:00
Hans Wennborg 361d4392cf Revert r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
This broke the Chromium build (crbug.com/791714). Reverting while investigating.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-svn: 319706
2017-12-04 22:21:15 +00:00
Daniel Sanders 04e4f47e93 [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
This patch splits atomics out of the generic G_LOAD/G_STORE and into their own
G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a
necessary one. Atomic load/store has little in implementation in common with
non-atomic load/store. They tend to be handled very differently throughout the
backend. It also has the nice side-effect of slightly improving the common-case
performance at ISel since there's no longer a need for an atomicity check in the
matcher table.

All targets have been updated to remove the atomic load/store check from the
G_LOAD/G_STORE path. AArch64 has also been updated to mark
G_ATOMIC_LOAD/G_ATOMIC_STORE legal.

There is one issue with this patch though which also affects the extending loads
and truncating stores. The rules only match when an appropriate G_ANYEXT is
present in the MIR. For example,
  (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X))))
will match but:
  (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X))
will not. This shouldn't be a problem at the moment, but as we get better at
eliminating extends/truncates we'll likely start failing to match in some
cases. The current plan is to fix this in a patch that changes the
representation of extending-load/truncating-store to allow the MMO to describe
a different type to the operation.

llvm-svn: 319691
2017-12-04 20:39:32 +00:00
Francis Visoiu Mistrih 25528d6de7 [CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665
2017-12-04 17:18:51 +00:00
Craig Topper 4520d4f8ad [X86] Allow VPMAXUQ/VPMAXSQ/VPMINUQ/VPMINSQ to be used with 128/256 bit vectors when AVX512 is enabled.
These instructions can be used by widening to 512-bits and extracting back to 128/256. We do similar to several other instructions already.

llvm-svn: 319641
2017-12-04 07:21:01 +00:00
Craig Topper 1151facf76 [X86] Don't turn UINT_TO_FP into SINT_TO_FP during lowering.
We already do this as a DAG combine. The version during lowering can only trigger if known bits changes something that improves known bits analysis. But this means we should be improving known bits analysis to work on the unlowered form instead.

llvm-svn: 319640
2017-12-04 05:38:44 +00:00
Simon Pilgrim 569e53b0f6 [X86][AVX512] Tag PH2PS/PS2PH conversion instructions scheduler classes
llvm-svn: 319637
2017-12-03 21:43:54 +00:00
Simon Pilgrim 465a88bb92 [X86][AVX512] Tag packed F2I/I2F/F2F conversion instructions scheduler class
llvm-svn: 319636
2017-12-03 21:16:12 +00:00
Simon Pilgrim bc8d0223fb [X86][SSE] Remove unused IIC_SSE_CVT_PI2PS_RR/IIC_SSE_CVT_PI2PS_RM itineraries
llvm-svn: 319634
2017-12-03 20:57:04 +00:00
Simon Pilgrim 299a54c5b9 [X86][SSE] Cleanup float/int conversion scheduler itinerary classes
Makes it easier to grok where each is supposed to be used, mainly useful for adding to the AVX512 instructions but hopefully can be used more in SSE/AVX as well.

llvm-svn: 319614
2017-12-02 12:27:44 +00:00
Craig Topper 7d9a3b82c6 [X86] Teach the assembler to support %db8-%db15 as aliases for %dr8-%dr15.
llvm-svn: 319612
2017-12-02 08:27:46 +00:00
Craig Topper 3e846ecb5b [X86] Support %dr8-%dr15 in the assembler.
Apparently I failed to make this work when I fixed it in the disassembler way back in r224862.

llvm-svn: 319611
2017-12-02 08:27:45 +00:00
Matt Morehouse 9e658c974b Revert "[X86] Improvement in CodeGen instruction selection for LEAs."
This reverts r319543, due to ASan bot breakage.

llvm-svn: 319591
2017-12-01 22:20:26 +00:00
Simon Pilgrim 031d8b71b3 [X86][AVX512] Tag subvector extract/insert instructions scheduler classes
llvm-svn: 319568
2017-12-01 18:40:32 +00:00
Simon Pilgrim 8d5e469c32 Fix line endings. NFCI.
llvm-svn: 319559
2017-12-01 17:24:15 +00:00
Simon Pilgrim fb01cb1b0c [X86][AVX512] Tag VPERM2I/VPERM2T instructions scheduler class
llvm-svn: 319558
2017-12-01 17:23:06 +00:00
Simon Pilgrim 54c6083fb1 [X86][AVX512] Tag VFPCLASS instructions scheduler class
llvm-svn: 319554
2017-12-01 16:51:48 +00:00
Simon Pilgrim 07b4c5917e [X86][AVX512] Tag VPSHUFBITQMB instructions scheduler class
llvm-svn: 319553
2017-12-01 16:35:57 +00:00
Simon Pilgrim 904d1a895c [X86][AVX512] Tag VPCOMRESS/VPEXPAND instructions scheduler classes
llvm-svn: 319551
2017-12-01 16:20:03 +00:00
Jatin Bhateja 328199ec26 [X86] Improvement in CodeGen instruction selection for LEAs.
Summary:
1/  Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to
     accommodate similar operand appearing in the DAG  e.g.
                 T1 = A + B
                 T2 = T1 + 10
                 T3 = T2 + A
    For above DAG rooted at T3, X86AddressMode will now look like
                Base = B , Index = A , Scale = 2 , Disp = 10

2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity
     then complex LEAs (having 3 operands) could be factored out  e.g.
                 leal 1(%rax,%rcx,1), %rdx
                 leal 1(%rax,%rcx,2), %rcx
     will be factored as following
                 leal 1(%rax,%rcx,1), %rdx
                 leal (%rdx,%rcx)   , %edx

3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop.

4/ Simplify LEA converts (lea (BASE,1,INDEX,0)  --> add (BASE, INDEX) which offers better through put.

PR32755 will be taken care of by this pathc.

Previous patch revisions : r313343 , r314886

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy, jbhateja

Reviewed By: lsaba, RKSimon, jbhateja

Subscribers: jmolloy, spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 319543
2017-12-01 14:07:38 +00:00
Simon Pilgrim 2dc4ff1cde [X86][AVX512] Tag vshift/vpermv/pshufd/pshufb instructions scheduler classes
llvm-svn: 319540
2017-12-01 13:25:54 +00:00
Volkan Keles a32ff00b00 GlobalISel: Enable the legalization of G_MERGE_VALUES and G_UNMERGE_VALUES
Summary: LegalizerInfo assumes all G_MERGE_VALUES and G_UNMERGE_VALUES instructions are legal, so it is not possible to legalize vector operations on illegal vector types. This patch fixes the problem by removing the related check and adding default actions for G_MERGE_VALUES and G_UNMERGE_VALUES.

Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, t.p.northover, kristof.beyls

Reviewed By: dsanders

Subscribers: rovka, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39823

llvm-svn: 319524
2017-12-01 08:19:10 +00:00
Craig Topper f8470a6399 [X86] Custom legalize v2i32 gathers via widening rather than promoting.
The default legalization for v2i32 is promotion to v2i64. This results in a gather that reads 64-bit elements rather than 32. If one of the elements is near a page boundary this can cause an illegal access that can fault.

We also miscalculate the scale for the gather which is an even worse problem, but we probably could have found a separate way to fix that.

llvm-svn: 319521
2017-12-01 06:02:02 +00:00
Craig Topper 11f733df9b [X86] Add a DAG combine to simplify masks for AVX2 gather instructions.
AVX2 gathers only use the upper bit of the mask allowing us to simplify sign_extend_inreg to a shift left.

llvm-svn: 319514
2017-12-01 02:49:07 +00:00
Zachary Turner 8065f0b975 Mark all library options as hidden.
These command line options are not intended for public use, and often
don't even make sense in the context of a particular tool anyway. About
90% of them are already hidden, but when people add new options they
forget to hide them, so if you were to make a brand new tool today, link
against one of LLVM's libraries, and run tool -help you would get a
bunch of junk that doesn't make sense for the tool you're writing.

This patch hides these options. The real solution is to not have
libraries defining command line options, but that's a much larger effort
and not something I'm prepared to take on.

Differential Revision: https://reviews.llvm.org/D40674

llvm-svn: 319505
2017-12-01 00:53:10 +00:00
Reid Kleckner ba4014e9dc XOR the frame pointer with the stack cookie when protecting the stack
Summary: This strengthens the guard and matches MSVC.

Reviewers: hans, etienneb

Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits

Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319490
2017-11-30 22:41:21 +00:00
Craig Topper d4257565cf [X86] Promote i8 CTPOP to i32 instead of i16 when we have the POPCNT instruction.
The 32-bit version is shorter to encode and the zext we emit for the promotion is likely going to be a 32-bit zero extend anyway.

llvm-svn: 319468
2017-11-30 20:15:31 +00:00
Simon Pilgrim bb791b3dbd [X86][AVX512] Tag fcmp/ptest/ternlog instructions scheduler classes
llvm-svn: 319433
2017-11-30 13:18:06 +00:00
Francis Visoiu Mistrih 93ef145862 [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).

Basically:

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed

Differential Revision: https://reviews.llvm.org/D40420

llvm-svn: 319427
2017-11-30 12:12:19 +00:00
Simon Pilgrim d1a7d0c3f1 [X86][AVX512] Tag binop/rounding/sae instructions scheduler classes
llvm-svn: 319424
2017-11-30 12:01:52 +00:00
Simon Pilgrim 3e5987cf8d [X86][AVX512] Tag RCP/RSQRT/GETEXP instructions scheduler classes
llvm-svn: 319418
2017-11-30 10:48:47 +00:00
Craig Topper a495744d2c [X86] Optimize avx2 vgatherqps for v2f32 with v2i64 index type.
Normal type legalization will widen everything. This requires forcing 0s into the mask register. We can instead choose the form that only reads 2 elements without zeroing the mask.

llvm-svn: 319406
2017-11-30 07:01:40 +00:00