Commit Graph

16109 Commits

Author SHA1 Message Date
Craig Topper 1c7d07c601 [X86] Remove unneeded code for handling the old kunpck intrinsics.
llvm-svn: 320917
2017-12-16 06:58:30 +00:00
Craig Topper c08960597c [X86] Add 128 and 256-bit VPOPCNTDQ instructions. Adjust some tablegen classes LZCNT/POPCNT.
I think when this instruction was first published it was only for a Knights CPU and thus VLX version was missing.

llvm-svn: 320910
2017-12-16 02:40:28 +00:00
Craig Topper 6b129fde5a [X86] Add back the assert from r320830 that was reverted in r320850
Hopefully r320864 has fixed the offending case that failed the assert.

llvm-svn: 320898
2017-12-16 00:33:16 +00:00
Matthias Braun f1caa2833f MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Craig Topper 6b8ac481f1 [X86] Use AND32ri8 instead of AND64ri8 in Asan code in EmitCallAsanReport for 32-bit mode.
This seemed to work due to a quirk in the X86 MC encoder that didn't emit a REX byte that the AND64ri8 implies when in 32-bit mode. This made the encoding the same as AND32ri8. I tried to add an assert to catch the dropped REX prefix that caught this.

llvm-svn: 320864
2017-12-15 21:18:06 +00:00
Craig Topper 422ed23298 [X86] In LowerVectorCTPOP use ISD::ZERO_EXTEND/ISD::TRUNCATE instead of the target specific nodes.
The target independent nodes will get legalized to the target specific nodes by their own legalization process. Someday I'd like to stop using a target specific for zero extends and truncates of legal types so the less places we reference the target specific opcode the better.

llvm-svn: 320863
2017-12-15 21:18:05 +00:00
Craig Topper f08ab74ae3 [X86] Remove unnecessary TODO.
When I wrote it I thought we were missing a potential optimization for KNL. But investigating further shows that for KNL we still do the optimal thing by widening to v4f32 and then using special isel patterns to widen again to zmm a register.

llvm-svn: 320862
2017-12-15 20:57:18 +00:00
Craig Topper df2521a638 [X86] Remove assert in X86MCCodeEmitter.cpp that was added in r320830.
It seems to be failing real code which is concerning, but we were silently getting away with it. I'll investigate further.

llvm-svn: 320850
2017-12-15 19:38:14 +00:00
Craig Topper 3fb8386685 [SelectionDAG][X86] Fix insert_vector_elt lowering for v32i1/v64i1 with non-constant index
Summary:
Currently we don't handle v32i1/v64i1 insert_vector_elt correctly as we fail to look at the number of elements closely and assume it can only be v16i1 or v8i1.

We also can't type legalize v64i1 insert_vector_elt correctly on KNL due to the type not being byte addressable as required by the legalizing through memory accesses path requires.

For the first issue, the patch now tries to pick a 512-bit register with the correct number of elements and promotes to that.

For the second issue, we now extend the vector to a byte addressable type, do the stores to memory, load the two halves, and then truncate the halves back to the original type. Technically since we changed the type, we may not need two loads, but actually checking that is more work and for the v64i1 case we do need them.

Reviewers: RKSimon, delena, spatel, zvi

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40942

llvm-svn: 320849
2017-12-15 19:35:22 +00:00
Craig Topper 23c348850f [X86] Add 'Requires<[In64BitMode]>' to a bunch of instructions that only have memory and immediate operands.
The asm parser wasn't preventing these from being accepted in 32-bit mode. Instructions that use a GR64 register are protected by the parser rejecting the register in 32-bit mode.

llvm-svn: 320846
2017-12-15 19:01:51 +00:00
Craig Topper 914b1d524c [X86] Change BNDLDX to use anymem instead of i64mem for itsmemory operand.
This instruction doesn't access memory. It juse use a similar looking memory encoding. Don't require Intel syntax to put "qword ptr" in front of it.

llvm-svn: 320845
2017-12-15 19:01:50 +00:00
Craig Topper 446f3e2084 [X86] Remove the 'Requires' In64BitMode/Not64BitMode from the LWP instructions.
These aren't doing anything due to a top level "let Predicates =". I think the GR32/GR64 register class protects these anyway.

llvm-svn: 320844
2017-12-15 19:01:49 +00:00
Craig Topper 365e8aa5d5 [X86] Remove the 'Requires<[In64BitMode]>' from SHSTK instructions.
This has no effect due to a top level "let Predicates =" around the instructions. But its also not required because the GR64 usage in the instruction guarantees it can never match.

llvm-svn: 320843
2017-12-15 19:01:48 +00:00
Andrew V. Tischenko 22f0742dda Fix for bug PR35549 - Repeated schedule comments.
Differential Revision: https://reviews.llvm.org/D40960

llvm-svn: 320837
2017-12-15 18:13:05 +00:00
Craig Topper a16395008c [X86] Fix XSAVE64 and similar instructions to not be allowed by the assembler in 32-bit mode.
There was a top level "let Predicates =" in the .td file that was overriding the Requires on each instruction.

I've added an assert to the code emitter to catch more cases like this. I'm sure this isn't the only place where the right predicates aren't being applied. This assert already found that we don't block btq/btsq/btrq in 32-bit mode.

llvm-svn: 320830
2017-12-15 17:22:58 +00:00
Francis Visoiu Mistrih 0b5bdceabf [CodeGen] Print stack object references as %(fixed-)stack.0 in both MIR and debug output
Work towards the unification of MIR and debug output by printing
`%stack.0` instead of `<fi#0>`, and `%fixed-stack.0` instead of
`<fi#-4>` (supposing there are 4 fixed stack objects).

Only debug syntax is affected.

Differential Revision: https://reviews.llvm.org/D41027

llvm-svn: 320827
2017-12-15 16:33:45 +00:00
Craig Topper ad9221d684 [X86] Widen (v2i32 (fp_to_uint v2f64)) to (v8i32 (fp_to_uint v8f64)) during legalization if we have AVX512F, but not VLX. NFC
Previously we widened it using isel patterns.

llvm-svn: 320824
2017-12-15 16:22:20 +00:00
Craig Topper 7cfacbf6ea [X86] Fix a couple bugs in my recent changes to vXi1 insert_subvector lowering.
A couple places didn't use the same SDValue variables to connect everything all the way through.

I don't have a test case for a bug in insert into the lower bits of a non-zero, non-undef vector. Not sure the best way to create that. We don't create the case when lowering concat_vectors which is the main way to get insert_subvectors.

llvm-svn: 320790
2017-12-15 07:16:41 +00:00
Craig Topper 1a1e6d6cf6 [X86] Add a TODO about v8i1 CONCAT_VECTORS.
llvm-svn: 320784
2017-12-15 01:03:46 +00:00
Craig Topper 5ebf3ac9c2 [X86] Further rearrange the setOperationAction calls to separate the ones that require 512-bit registers OR VLX into separate sections. NFCI
We have several instructions that were introduced in AVX512F that are only available in 512-bit form on KNL. We still make use of them for 128/256 by artificially widening and extracting during isel.

This commit separates these operations from the true 512-bit operations. This way we can qualify the normal 512-bit operations with needing 512-bit register support. And these special operations will get qualified with needing 512-bit registers OR VLX.

The 512-bit register qualification will be introduced in a future patch this just gets everything grouped to minimize deltas on that patch.

llvm-svn: 320782
2017-12-15 01:03:43 +00:00
Craig Topper 07a28f777e [X86] Group setOperationActions related to vXi1 masks together. NFCI
Previously they were sort of interleaved in with XMM/YMM/ZMM action related code.

Trying to separate things so its easier to split 512-bit vectors later.

llvm-svn: 320781
2017-12-15 01:03:42 +00:00
Craig Topper b89bc20a64 [X86] Make ISD::INSERT_SUBVECTOR v8i1 legal with AVX512F because we should be custom lowering inserting v1i1 into v8i1 under this.
I don't have a test case at the moment. Just noticed while auditing things.

llvm-svn: 320780
2017-12-15 01:03:40 +00:00
Craig Topper 212070486d [X86] Move some of the hasVLX qualified code out of the main hasAVX512 block in the X86ISelLowering constructor. NFCI
Move it into the separate hasVLX block later in the constructor.

I'm trying to separate 128/256 and 512-bit related code so we can eventually qualify the hasAVX512 block with support for 512-bit vectors required by the prefer-vector-width feature support being talked about in D41096.

llvm-svn: 320779
2017-12-15 01:03:38 +00:00
Saleem Abdulrasool 05e285bcc5 FastISel: support no-PLT PIC calls on ELF x86_64
Add support for properly handling PIC code with no-PLT.  This equates to
`-fpic -fno-plt -O0` with the clang frontend.  External functions are
marked with nonlazybind, which must then be indirected through the GOT.
This allows code to be built without optimizations in PIC mode without
going through the PLT.  Addresses PR35653!

llvm-svn: 320776
2017-12-15 00:32:09 +00:00
Craig Topper 4341a7b08c [X86] Remove an unnecessary SmallVector that was collecting chains for two SDNode's we're still holding SDValues for. NFCI
We can just get the chains from those SDValues to create the TokenFactor.

llvm-svn: 320757
2017-12-14 22:50:10 +00:00
Matt Arsenault 7d7adf4f2e TLI: Allow using PSV for intrinsic mem operands
llvm-svn: 320756
2017-12-14 22:34:10 +00:00
Zachary Turner 260fe3eca6 Fix many -Wsign-compare and -Wtautological-constant-compare warnings.
Most of the -Wsign-compare warnings are due to the fact that
enums are signed by default in the MS ABI, while the
tautological comparison warnings trigger on x86 builds where
sizeof(size_t) is 4 bytes, so N > numeric_limits<unsigned>::max()
is always false.

Differential Revision: https://reviews.llvm.org/D41256

llvm-svn: 320750
2017-12-14 22:07:03 +00:00
Matt Arsenault 1117133687 DAG: Expose all MMO flags in getTgtMemIntrinsic
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. Also fixes using a bunch of bool arguments to
getMemIntrinsicNode.

On AMDGPU, buffer and image intrinsics should always
have MODereferencable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.

llvm-svn: 320746
2017-12-14 21:39:51 +00:00
Craig Topper 600f1ba333 [X86] Don't zero the upper bits of the k-register before extracting a single bit from a vXi1.
This doesn't match the semantics of the extract_vector_elt operation. Nothing downstream knows the bits were zeroed so they still get masked or sign extended after the extrat anyway.

llvm-svn: 320723
2017-12-14 18:35:25 +00:00
Andrew V. Tischenko 070d5e3054 Any Target Asm comments should start from MachineInstr::TAsmComments value.
llvm-svn: 320693
2017-12-14 12:07:11 +00:00
Michael Zuckerman 19fd217eaa [AVX512] Adding support for load truncate store of I1
store operation on a truncated memory (load) of vXi1 is poorly supported by LLVM and most of the time end with an assertion.
This patch fixes this issue.

Differential Revision: https://reviews.llvm.org/D39547

Change-Id: Ida5523dd09c1ad384acc0a27e9e59273d28cbdc9
llvm-svn: 320691
2017-12-14 11:55:50 +00:00
Craig Topper 8cdf7c0e68 [X86] Make ANY_EXTEND from vXi1 Custom for more types.
We should be able to support ANY_EXTEND for any types we support ZERO_EXTEND for.

llvm-svn: 320675
2017-12-14 08:26:00 +00:00
Craig Topper 271a5c72a0 [X86] Remove redundant setOperationAction calls.
These calls already exist earlier under AVX2 feature.

llvm-svn: 320673
2017-12-14 08:25:53 +00:00
Craig Topper f82867c95a Recommit r320461 "[X86] Use regular expressions more aggressively to reduce the number of scheduler entries needed for FMA3 instructions."
I've hopefully sidestepped the MSVC issue that caused it to be reverted. We no longer include the Sched enum from X86GenInstrInfo.inc on the X86 target. So hopefully MSVC's preprocessor will skip over it and nothing will notice the 11000 character enum name.

Original commit message:

When the scheduler tables are generated by tablegen, the instructions are divided up into groups based on their default scheduling information and how they are referenced by groups for each processor. For any set of instructions that are matched by a specific InstRW line, that group of instructions is guaranteed to not be in a group with any other instructions. So in general, the more InstRW class definitions are created, the more groups we end up with in the generated files. Particularly if a lot of the InstRW lines only match to single instructions, which is true of a large number of the Intel scheduler models.

This change alone reduces the number of instructions groups from ~6000 to ~5500. And there's lots more we could do.

llvm-svn: 320655
2017-12-13 23:11:30 +00:00
Michael Zolotukhin 67b04bd8ac Recover some overzealously removed includes.
llvm-svn: 320648
2017-12-13 22:21:02 +00:00
Michael Zolotukhin ad24af7f58 Remove redundant includes from lib/Target/X86.
llvm-svn: 320636
2017-12-13 21:31:19 +00:00
Simon Pilgrim f00ea1b4cd [X86] Add RDMSR/WRMSR, RDPMC + RDTSC/RDTSCP schedule tests
Add missing RDTSCP itinerary

llvm-svn: 320581
2017-12-13 14:22:04 +00:00
Simon Pilgrim f51f4d3623 [X86][SSE] MOVMSK only uses the sign bit from each vector element
Pass the input vector through SimplifyDemandedBits as we only need the sign bit from each vector element of MOVMSK

We'd probably get more hits if SimplifyDemandedBits was better at handling vectors...

Differential Revision: https://reviews.llvm.org/D41119

llvm-svn: 320570
2017-12-13 11:43:14 +00:00
Sanjoy Das 1074eb225b Reapply "[X86] Flag BroadWell scheduler model as complete"
This reverts commit r320508, in effect re-applying r320308.  Simon has already
reverted the parts that caused the crash that motivated the revert in r320492.

llvm-svn: 320512
2017-12-12 19:11:31 +00:00
Sanjoy Das 81a4a02cbc Revert "[X86] Flag BroadWell scheduler model as complete"
This reverts commit r320308.  r320308 crashes LLC, please see the llvm-commits
thread for a reproducer.

llvm-svn: 320508
2017-12-12 18:40:58 +00:00
Craig Topper 712a209db9 [X86] Add a couple TODOs about missing coverage/features motivated by D40335
D40335 was wanting to add FMSUBADD support, but it discovered that there are two pieces of code to make FMADDSUB and only one of those is tested. So I've asked that review to implement the one path until we get tests that test the existing code.

llvm-svn: 320507
2017-12-12 18:39:04 +00:00
Nirav Dave 674d053d18 [X86] Cleanup type conversion of 64-bit load-store pairs.
Summary:
Simplify and generalize chain handling and search for 64-bit load-store pairs.
Nontemporal test now converts 64-bit integer load-store into f64 which it realizes directly instead of splitting into two i32 pairs.

Reviewers: craig.topper, spatel

Reviewed By: craig.topper

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40918

llvm-svn: 320505
2017-12-12 18:25:48 +00:00
Simon Pilgrim 68f9accf51 [X86] Remove CompleteModel tags from CPU targets until we have better error checking (PR35636)
The checks we have for complete models are not great and miss many cases - e.g. in PR35636 it failed to recognise that only the first output (of 2) was actually tagged by the InstRW

Raised PR35639 and PR35643 as examples

llvm-svn: 320492
2017-12-12 16:12:53 +00:00
Ayman Musa c2eed926b0 [X86] Recognize constant arrays with special values and replace loads from it with subtract and shift instructions, which then will be replaced by X86 BZHI machine instruction.
Recognize constant arrays with the following values:
  0x0, 0x1, 0x3, 0x7, 0xF, 0x1F, .... , 2^(size - 1) -1
where //size// is the size of the array.

the result of a load with index //idx// from this array is equivalent to the result of the following:
  (0xFFFFFFFF >> (sub 32, idx))             (assuming the array of type 32-bit integer).

And the result of an 'AND' operation on the returned value of such a load and another input, is exactly equivalent to the X86 BZHI instruction behavior.

See test cases in the LIT test for better understanding.

Differential Revision: https://reviews.llvm.org/D34141

llvm-svn: 320481
2017-12-12 14:13:51 +00:00
Simon Pilgrim 0f8a5a41cf Revert r320461 - causing ICE in windows buildss
[X86] Use regular expressions more aggressively to reduce the number of scheduler entries needed for FMA3 instructions.

When the scheduler tables are generated by tablegen, the instructions are divided up into groups based on their default scheduling information and how they are referenced by groups for each processor. For any set of instructions that are matched by a specific InstRW line, that group of instructions is guaranteed to not be in a group with any other instructions. So in general, the more InstRW class definitions are created, the more groups we end up with in the generated files. Particularly if a lot of the InstRW lines only match to single instructions, which is true of a large number of the Intel scheduler models.

This change alone reduces the number of instructions groups from ~6000 to ~5500. And there's lots more we could do.

llvm-svn: 320470
2017-12-12 11:34:25 +00:00
Craig Topper c8e64ab539 [X86] Use regular expressions more aggressively to reduce the number of scheduler entries needed for FMA3 instructions.
When the scheduler tables are generated by tablegen, the instructions are divided up into groups based on their default scheduling information and how they are referenced by groups for each processor. For any set of instructions that are matched by a specific InstRW line, that group of instructions is guaranteed to not be in a group with any other instructions. So in general, the more InstRW class definitions are created, the more groups we end up with in the generated files. Particularly if a lot of the InstRW lines only match to single instructions, which is true of a large number of the Intel scheduler models.

This change alone reduces the number of instructions groups from ~6000 to ~5500. And there's lots more we could do.

llvm-svn: 320461
2017-12-12 08:17:04 +00:00
Craig Topper 468a813315 [X86] Use Ld scheduler classes for instructions with folded loads.
llvm-svn: 320459
2017-12-12 07:06:35 +00:00
Craig Topper c1e72c019d [X86] Correct the FMA3 regular expressions in the znver1 scheduler model.
llvm-svn: 320458
2017-12-12 07:06:32 +00:00
Simon Pilgrim 6d89f407db Normalize line endings. NFCI.
llvm-svn: 320389
2017-12-11 17:01:21 +00:00
Simon Pilgrim fabe354b42 [X86] Add LWP schedule tests
Tag LWP instructions as WriteSystem

llvm-svn: 320387
2017-12-11 16:47:21 +00:00
Craig Topper c6a4a97260 [X86] Add VCOMISDZrr, VCOMISSZrr, VUCOMISDZrr, and VUCOMISSZrr to the skylake server sheduler model
llvm-svn: 320326
2017-12-10 19:47:57 +00:00
Craig Topper a0be5a06c1 [X86] Rename some instructions that start with Int_ to have the _Int at the end.
This matches AVX512 version and is more consistent overall. And improves our scheduler models.

In some cases this adds _Int to instructions that didn't have any Int_ before. It's a side effect of the adjustments made to some of the multiclasses.

llvm-svn: 320325
2017-12-10 19:47:56 +00:00
Simon Pilgrim c493d4f5b9 [X86][X87] Fix typo in znver1 FIST/FISTT schedule patterns
llvm-svn: 320322
2017-12-10 19:19:22 +00:00
Craig Topper 1de942b2d1 [X86] Rename some instructions from 'rb' to 'rrb' to make 'b' a proper suffix. Fix the scheduling information for some of them.
Some of the scheduling information was only present for the 'rb' version' and not the 'rr' version. Now we match 'rr(b?)'

llvm-svn: 320320
2017-12-10 17:42:44 +00:00
Craig Topper c7445f2cdc [X86] Add VCVTQQ2PS to the skylake server scheduler models.
llvm-svn: 320319
2017-12-10 17:42:43 +00:00
Craig Topper c268527b2f [X86] Add VPMULLWZ256 to the skylake server scheduler model
llvm-svn: 320318
2017-12-10 17:42:42 +00:00
Craig Topper 4ec397cbd3 [X86] Add 256/512-bit EVEX VPSADBW instructions to skylake server scheduler model.
llvm-svn: 320317
2017-12-10 17:42:41 +00:00
Craig Topper aa904d5ab6 [X86] Fix a few instructions that were named Z512 instead of just Z.
This makes things consistent with our normal instruction naming.

llvm-svn: 320316
2017-12-10 17:42:39 +00:00
Craig Topper 7c89de1760 [X86] Add VPSRLWZrr to skylake server scheduler model.
llvm-svn: 320315
2017-12-10 17:42:38 +00:00
Craig Topper 1d7760db49 [X86] Add VPUNPCKLWDZrr to skylake server scheduler model.
llvm-svn: 320314
2017-12-10 17:42:37 +00:00
Craig Topper 57c2815cbe [X86] Adjust tablegen includes so we can use Instructions in scheduler models instead of just instregexs.
This separates the CPU specific scheduler model includes to occur after the instructions. Moves the instruction includes between the basic scheduler information and the CPU specific scheduler models.

llvm-svn: 320313
2017-12-10 17:42:36 +00:00
Simon Pilgrim 1f8cfba0bb [X86] Flag BroadWell scheduler model as complete
Locally tag COPY as WriteMove, which has caused some reg-reg + reg-mem instruction tests to reorder.

llvm-svn: 320308
2017-12-10 13:49:51 +00:00
Simon Pilgrim 49c74934dd Strip trailing whitespace. NFCI.
llvm-svn: 320306
2017-12-10 13:00:37 +00:00
Simon Pilgrim 320996576d [X86] Flag ZNVER1 scheduler model as complete
We just have to locally tag COPY as WriteMove

llvm-svn: 320304
2017-12-10 12:43:53 +00:00
Simon Pilgrim 8547645948 [X86] Flag SLM scheduler model as complete
We just have to locally tag COPY as WriteMove

llvm-svn: 320303
2017-12-10 12:36:29 +00:00
Simon Pilgrim 91c159d841 [X86][AVX[ Tag VZEROALL/VZEROUPPER instructions scheduler classes
llvm-svn: 320302
2017-12-10 12:26:35 +00:00
Simon Pilgrim 6de94a1adc [X86] Tag SSE4A instructions as SSE INTALU scheduler classes
llvm-svn: 320301
2017-12-10 12:08:04 +00:00
Simon Pilgrim cd58171110 [X86] Flag BTVER2 scheduler model as complete
We just have to locally tag COPY as WriteMove

llvm-svn: 320300
2017-12-10 11:51:29 +00:00
Simon Pilgrim b7fb2e2fa1 [X86] Tag ADJSTACK instructions as INTALU scheduler class
llvm-svn: 320299
2017-12-10 11:34:08 +00:00
Simon Pilgrim 1a030016a6 [X86] Tag MORESTACK instructions as ret scheduler class
llvm-svn: 320296
2017-12-10 10:08:21 +00:00
Craig Topper 253562eb81 [X86] Fix duplicate entries in skylake server scheduler model by changing Z128 to Z256
Based on the fact that the 'Y' version of the instruction is next to this, I assume Z256 is the intended value.

llvm-svn: 320295
2017-12-10 09:14:45 +00:00
Craig Topper 90c9c15936 [X86] Add MOVQI2PQIrm, MOVSDmr, and MOVSDrm to scheduler information
The VEX versions were present but not the legacy SSE versions.

llvm-svn: 320294
2017-12-10 09:14:44 +00:00
Craig Topper 28e55386ac [X86] Add LEA64_32r to scheduler models for Sandybridge,Haswell,Broadwell,Skylake
llvm-svn: 320293
2017-12-10 09:14:42 +00:00
Craig Topper 8ade4640f3 [X86] Add IN16/OUT16 to scheduling information for Haswell,Broadwell,Skylake
Sandy Bridge is also missing it, but it has other issues. See PR35590.

llvm-svn: 320292
2017-12-10 09:14:41 +00:00
Craig Topper 1a88c50fd7 [X86] Fix scheduler models to support ADD32ri in addition to ADD32ri8. Similar for all sizes of AND/OR/XOR/SUB/ADC/SBB/CMP.
llvm-svn: 320291
2017-12-10 09:14:39 +00:00
Craig Topper c89e282f7d [X86] Rename some instructions so that 'b' is added as a suffix instead of replacing an 'r'
llvm-svn: 320290
2017-12-10 09:14:38 +00:00
Craig Topper 6c65910160 [X86] Add CMPSDrr/rm to the scheduler models.
Somehow CMPSSrr/rm was there and the VEX version was there, but this was consistently missing.

llvm-svn: 320289
2017-12-10 09:14:37 +00:00
Craig Topper da7e78e18c [X86] Rename the rb form of scalar ADD/SUB/MUL/DIV to include _Int since they can only be selected by intrinsics.
llvm-svn: 320283
2017-12-10 04:07:28 +00:00
Craig Topper 4e57776fb2 [X86] Correct the _Int part of more scheduler model instrexes. Put _b in the correct order relative to _Int
llvm-svn: 320282
2017-12-10 03:16:38 +00:00
Craig Topper a2f5528084 [X86] Remove ReadAfterLd from several several rb instructions
This affects CVTSD2SS, FMA, RCP28, RSQRT28, and SQRT scalar instructions

'b' here refers to 'sae' not broadcast. These aren't memory instructions.

llvm-svn: 320281
2017-12-10 03:16:36 +00:00
Craig Topper 391c6f9507 [X86] Fix bad regular expressions in the scheduler models. Question marks should be outside of multicharacter parenthesized expressions
If the question mark is inside the parentheses it only applies to the single character proceeding it.

I had to make a few additional cleanups to fix some duplicate warnings that were exposed by fixing this.

llvm-svn: 320279
2017-12-10 01:24:08 +00:00
Craig Topper 8ee98d0b51 [X86] Make the _Int part of some instregex sheduler patterns optional
llvm-svn: 320278
2017-12-10 01:24:06 +00:00
Craig Topper 5ffe80103e [X86] Add the commutable floating point min/max pseudo instructions to sandybridge,haswell,broadwell,skylakeclient scheduler models.
llvm-svn: 320277
2017-12-10 01:24:05 +00:00
Simon Pilgrim 6655eef1b4 [X86] Tag PIC setup instruction as jump scheduler class
llvm-svn: 320276
2017-12-10 00:40:37 +00:00
Simon Pilgrim 5d74949e5f [X86] Tag ACQUIRE/RELEASE atomic instructions as microcoded scheduler classes
Note: We may be too pessimistic here and should possibly use something closer to the LOCK arithmetic instructions
llvm-svn: 320275
2017-12-10 00:30:57 +00:00
Simon Pilgrim dcbe723d28 [X86] Tag TLS instructions as system scheduler classes
llvm-svn: 320274
2017-12-10 00:12:57 +00:00
Simon Pilgrim 3508a09455 [X86] Tag ALLOCA/VAARG instructions as system scheduler classes
llvm-svn: 320273
2017-12-10 00:03:16 +00:00
Craig Topper f4e3044db9 [X86] Use KMOV instructions to zero upper bits of vectors when possible.
llvm-svn: 320268
2017-12-09 23:10:59 +00:00
Craig Topper 5ac75d5628 [X86] Improve lowering of vXi1 insert_subvectors to better utilize (insert_subvector zero, vec, 0) for zeroing upper bits.
This can be better recognized during isel when the producer already zeroed the upper bits.

llvm-svn: 320267
2017-12-09 22:44:42 +00:00
Simon Pilgrim e049038692 [X86] Tag LOCK/REX64/DATA16/DATA32 instruction prefix scheduler classes
llvm-svn: 320266
2017-12-09 21:27:03 +00:00
Simon Pilgrim b2b93f6204 Strip trailing whitespace. NFCI.
llvm-svn: 320265
2017-12-09 20:44:51 +00:00
Simon Pilgrim 7e636cc419 [X86] Tag FS/GS BASE R/W instruction scheduler classes
llvm-svn: 320264
2017-12-09 20:42:27 +00:00
Simon Pilgrim 231fab072f [X86] Tag REP/REPNE prefix instructions as microcoded scheduler classes
llvm-svn: 320263
2017-12-09 20:16:37 +00:00
Simon Pilgrim 2e7314eb2f [X86] Tag missing EH pseudo instruction scheduler classes
llvm-svn: 320262
2017-12-09 20:04:02 +00:00
Simon Pilgrim cb71e72707 [X86] Tag frame pointer XORs instruction scheduler classes
llvm-svn: 320261
2017-12-09 19:56:39 +00:00
Craig Topper 504534514c [X86] Don't use getTargetConstant for all 0s and all 1s mask vector.
llvm-svn: 320260
2017-12-09 19:18:30 +00:00
Simon Pilgrim df702104d3 [X86] Tag segment prefixes as NOP instruction scheduling classes
llvm-svn: 320257
2017-12-09 16:58:34 +00:00
Simon Pilgrim d3e21c6b79 [X86][AVX512] Drop a default NoItinerary argument that isn't used any more. NFCI.
Requires re-ordering of AVX512_maskable_custom arguments.

llvm-svn: 320255
2017-12-09 16:20:54 +00:00
Craig Topper 6504a8f888 [X86] When inserting into the upper bits of a vXi1 vector, make sure we shift enough bits if we widened the vector.
We may need to widen the vector to make the shifts legal, but if we do that we need to make sure we shift left/right after accounting for the new size. If not we can't guarantee we are shifting in zeros.

The test cases affected actually show cases where we should move the shifts all together, but that's another problem.

llvm-svn: 320248
2017-12-09 08:19:07 +00:00
Craig Topper b3e14ce90c [X86] Improve lowering of concats of mask vectors to better optimize zero vector inputs.
We were previously using kunpck with zero inputs unnecessarily. And we had cases where we would insert into a zero vector and then insert into larger zero vector incurring two sets of shifts.

llvm-svn: 320244
2017-12-09 07:02:19 +00:00