llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	1c7d07c601	[X86] Remove unneeded code for handling the old kunpck intrinsics. llvm-svn: 320917	2017-12-16 06:58:30 +00:00
Craig Topper	c08960597c	[X86] Add 128 and 256-bit VPOPCNTDQ instructions. Adjust some tablegen classes LZCNT/POPCNT. I think when this instruction was first published it was only for a Knights CPU and thus VLX version was missing. llvm-svn: 320910	2017-12-16 02:40:28 +00:00
Craig Topper	6b129fde5a	[X86] Add back the assert from r320830 that was reverted in r320850 Hopefully r320864 has fixed the offending case that failed the assert. llvm-svn: 320898	2017-12-16 00:33:16 +00:00
Matthias Braun	f1caa2833f	MachineFunction: Return reference from getFunction(); NFC The Function can never be nullptr so we can return a reference. llvm-svn: 320884	2017-12-15 22:22:58 +00:00
Craig Topper	6b8ac481f1	[X86] Use AND32ri8 instead of AND64ri8 in Asan code in EmitCallAsanReport for 32-bit mode. This seemed to work due to a quirk in the X86 MC encoder that didn't emit a REX byte that the AND64ri8 implies when in 32-bit mode. This made the encoding the same as AND32ri8. I tried to add an assert to catch the dropped REX prefix that caught this. llvm-svn: 320864	2017-12-15 21:18:06 +00:00
Craig Topper	422ed23298	[X86] In LowerVectorCTPOP use ISD::ZERO_EXTEND/ISD::TRUNCATE instead of the target specific nodes. The target independent nodes will get legalized to the target specific nodes by their own legalization process. Someday I'd like to stop using a target specific for zero extends and truncates of legal types so the less places we reference the target specific opcode the better. llvm-svn: 320863	2017-12-15 21:18:05 +00:00
Craig Topper	f08ab74ae3	[X86] Remove unnecessary TODO. When I wrote it I thought we were missing a potential optimization for KNL. But investigating further shows that for KNL we still do the optimal thing by widening to v4f32 and then using special isel patterns to widen again to a zmm register. llvm-svn: 320862	2017-12-15 20:57:18 +00:00
Craig Topper	df2521a638	[X86] Remove assert in X86MCCodeEmitter.cpp that was added in r320830. It seems to be failing real code which is concerning, but we were silently getting away with it. I'll investigate further. llvm-svn: 320850	2017-12-15 19:38:14 +00:00
Craig Topper	3fb8386685	[SelectionDAG][X86] Fix insert_vector_elt lowering for v32i1/v64i1 with non-constant index Summary: Currently we don't handle v32i1/v64i1 insert_vector_elt correctly as we fail to look at the number of elements closely and assume it can only be v16i1 or v8i1. We also can't type legalize v64i1 insert_vector_elt correctly on KNL due to the type not being byte addressable, as required by the path that legalizes through memory accesses. For the first issue, the patch now tries to pick a 512-bit register with the correct number of elements and promotes to that. For the second issue, we now extend the vector to a byte addressable type, do the stores to memory, load the two halves, and then truncate the halves back to the original type. Technically since we changed the type, we may not need two loads, but actually checking that is more work and for the v64i1 case we do need them. Reviewers: RKSimon, delena, spatel, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40942 llvm-svn: 320849	2017-12-15 19:35:22 +00:00
Craig Topper	23c348850f	[X86] Add 'Requires<[In64BitMode]>' to a bunch of instructions that only have memory and immediate operands. The asm parser wasn't preventing these from being accepted in 32-bit mode. Instructions that use a GR64 register are protected by the parser rejecting the register in 32-bit mode. llvm-svn: 320846	2017-12-15 19:01:51 +00:00
Craig Topper	914b1d524c	[X86] Change BNDLDX to use anymem instead of i64mem for its memory operand. This instruction doesn't access memory. It just uses a similar looking memory encoding. Don't require Intel syntax to put "qword ptr" in front of it. llvm-svn: 320845	2017-12-15 19:01:50 +00:00
Craig Topper	446f3e2084	[X86] Remove the 'Requires' In64BitMode/Not64BitMode from the LWP instructions. These aren't doing anything due to a top level "let Predicates =". I think the GR32/GR64 register class protects these anyway. llvm-svn: 320844	2017-12-15 19:01:49 +00:00
Craig Topper	365e8aa5d5	[X86] Remove the 'Requires<[In64BitMode]>' from SHSTK instructions. This has no effect due to a top level "let Predicates =" around the instructions. But it's also not required because the GR64 usage in the instruction guarantees it can never match. llvm-svn: 320843	2017-12-15 19:01:48 +00:00
Andrew V. Tischenko	22f0742dda	Fix for bug PR35549 - Repeated schedule comments. Differential Revision: https://reviews.llvm.org/D40960 llvm-svn: 320837	2017-12-15 18:13:05 +00:00
Craig Topper	a16395008c	[X86] Fix XSAVE64 and similar instructions to not be allowed by the assembler in 32-bit mode. There was a top level "let Predicates =" in the .td file that was overriding the Requires on each instruction. I've added an assert to the code emitter to catch more cases like this. I'm sure this isn't the only place where the right predicates aren't being applied. This assert already found that we don't block btq/btsq/btrq in 32-bit mode. llvm-svn: 320830	2017-12-15 17:22:58 +00:00
Francis Visoiu Mistrih	0b5bdceabf	[CodeGen] Print stack object references as %(fixed-)stack.0 in both MIR and debug output Work towards the unification of MIR and debug output by printing `%stack.0` instead of `<fi#0>`, and `%fixed-stack.0` instead of `<fi#-4>` (supposing there are 4 fixed stack objects). Only debug syntax is affected. Differential Revision: https://reviews.llvm.org/D41027 llvm-svn: 320827	2017-12-15 16:33:45 +00:00
Craig Topper	ad9221d684	[X86] Widen (v2i32 (fp_to_uint v2f64)) to (v8i32 (fp_to_uint v8f64)) during legalization if we have AVX512F, but not VLX. NFC Previously we widened it using isel patterns. llvm-svn: 320824	2017-12-15 16:22:20 +00:00
Craig Topper	7cfacbf6ea	[X86] Fix a couple bugs in my recent changes to vXi1 insert_subvector lowering. A couple places didn't use the same SDValue variables to connect everything all the way through. I don't have a test case for a bug in insert into the lower bits of a non-zero, non-undef vector. Not sure the best way to create that. We don't create the case when lowering concat_vectors which is the main way to get insert_subvectors. llvm-svn: 320790	2017-12-15 07:16:41 +00:00
Craig Topper	1a1e6d6cf6	[X86] Add a TODO about v8i1 CONCAT_VECTORS. llvm-svn: 320784	2017-12-15 01:03:46 +00:00
Craig Topper	5ebf3ac9c2	[X86] Further rearrange the setOperationAction calls to separate the ones that require 512-bit registers OR VLX into separate sections. NFCI We have several instructions that were introduced in AVX512F that are only available in 512-bit form on KNL. We still make use of them for 128/256 by artificially widening and extracting during isel. This commit separates these operations from the true 512-bit operations. This way we can qualify the normal 512-bit operations with needing 512-bit register support. And these special operations will get qualified with needing 512-bit registers OR VLX. The 512-bit register qualification will be introduced in a future patch this just gets everything grouped to minimize deltas on that patch. llvm-svn: 320782	2017-12-15 01:03:43 +00:00
Craig Topper	07a28f777e	[X86] Group setOperationActions related to vXi1 masks together. NFCI Previously they were sort of interleaved in with XMM/YMM/ZMM action related code. Trying to separate things so it's easier to split 512-bit vectors later. llvm-svn: 320781	2017-12-15 01:03:42 +00:00
Craig Topper	b89bc20a64	[X86] Make ISD::INSERT_SUBVECTOR v8i1 legal with AVX512F because we should be custom lowering inserting v1i1 into v8i1 under this. I don't have a test case at the moment. Just noticed while auditing things. llvm-svn: 320780	2017-12-15 01:03:40 +00:00
Craig Topper	212070486d	[X86] Move some of the hasVLX qualified code out of the main hasAVX512 block in the X86ISelLowering constructor. NFCI Move it into the separate hasVLX block later in the constructor. I'm trying to separate 128/256 and 512-bit related code so we can eventually qualify the hasAVX512 block with support for 512-bit vectors required by the prefer-vector-width feature support being talked about in D41096. llvm-svn: 320779	2017-12-15 01:03:38 +00:00
Saleem Abdulrasool	05e285bcc5	FastISel: support no-PLT PIC calls on ELF x86_64 Add support for properly handling PIC code with no-PLT. This equates to `-fpic -fno-plt -O0` with the clang frontend. External functions are marked with nonlazybind, which must then be indirected through the GOT. This allows code to be built without optimizations in PIC mode without going through the PLT. Addresses PR35653! llvm-svn: 320776	2017-12-15 00:32:09 +00:00
Craig Topper	4341a7b08c	[X86] Remove an unnecessary SmallVector that was collecting chains for two SDNode's we're still holding SDValues for. NFCI We can just get the chains from those SDValues to create the TokenFactor. llvm-svn: 320757	2017-12-14 22:50:10 +00:00
Matt Arsenault	7d7adf4f2e	TLI: Allow using PSV for intrinsic mem operands llvm-svn: 320756	2017-12-14 22:34:10 +00:00
Zachary Turner	260fe3eca6	Fix many -Wsign-compare and -Wtautological-constant-compare warnings. Most of the -Wsign-compare warnings are due to the fact that enums are signed by default in the MS ABI, while the tautological comparison warnings trigger on x86 builds where sizeof(size_t) is 4 bytes, so N > numeric_limits<unsigned>::max() is always false. Differential Revision: https://reviews.llvm.org/D41256 llvm-svn: 320750	2017-12-14 22:07:03 +00:00
Matt Arsenault	1117133687	DAG: Expose all MMO flags in getTgtMemIntrinsic Rather than adding more bits to express every MMO flag you could want, just directly use the MMO flags. Also fixes using a bunch of bool arguments to getMemIntrinsicNode. On AMDGPU, buffer and image intrinsics should always have MODereferencable set, but currently there is no way to do that directly during the initial intrinsic lowering. llvm-svn: 320746	2017-12-14 21:39:51 +00:00
Craig Topper	600f1ba333	[X86] Don't zero the upper bits of the k-register before extracting a single bit from a vXi1. This doesn't match the semantics of the extract_vector_elt operation. Nothing downstream knows the bits were zeroed so they still get masked or sign extended after the extract anyway. llvm-svn: 320723	2017-12-14 18:35:25 +00:00
Andrew V. Tischenko	070d5e3054	Any Target Asm comments should start from MachineInstr::TAsmComments value. llvm-svn: 320693	2017-12-14 12:07:11 +00:00
Michael Zuckerman	19fd217eaa	[AVX512] Adding support for load truncate store of I1. Store operation on a truncated memory (load) of vXi1 is poorly supported by LLVM and most of the time ends with an assertion. This patch fixes this issue. Differential Revision: https://reviews.llvm.org/D39547 Change-Id: Ida5523dd09c1ad384acc0a27e9e59273d28cbdc9 llvm-svn: 320691	2017-12-14 11:55:50 +00:00
Craig Topper	8cdf7c0e68	[X86] Make ANY_EXTEND from vXi1 Custom for more types. We should be able to support ANY_EXTEND for any types we support ZERO_EXTEND for. llvm-svn: 320675	2017-12-14 08:26:00 +00:00
Craig Topper	271a5c72a0	[X86] Remove redundant setOperationAction calls. These calls already exist earlier under AVX2 feature. llvm-svn: 320673	2017-12-14 08:25:53 +00:00
Craig Topper	f82867c95a	Recommit r320461 "[X86] Use regular expressions more aggressively to reduce the number of scheduler entries needed for FMA3 instructions." I've hopefully sidestepped the MSVC issue that caused it to be reverted. We no longer include the Sched enum from X86GenInstrInfo.inc on the X86 target. So hopefully MSVC's preprocessor will skip over it and nothing will notice the 11000 character enum name. Original commit message: When the scheduler tables are generated by tablegen, the instructions are divided up into groups based on their default scheduling information and how they are referenced by groups for each processor. For any set of instructions that are matched by a specific InstRW line, that group of instructions is guaranteed to not be in a group with any other instructions. So in general, the more InstRW class definitions are created, the more groups we end up with in the generated files. Particularly if a lot of the InstRW lines only match to single instructions, which is true of a large number of the Intel scheduler models. This change alone reduces the number of instructions groups from ~6000 to ~5500. And there's lots more we could do. llvm-svn: 320655	2017-12-13 23:11:30 +00:00
Michael Zolotukhin	67b04bd8ac	Recover some overzealously removed includes. llvm-svn: 320648	2017-12-13 22:21:02 +00:00
Michael Zolotukhin	ad24af7f58	Remove redundant includes from lib/Target/X86. llvm-svn: 320636	2017-12-13 21:31:19 +00:00
Simon Pilgrim	f00ea1b4cd	[X86] Add RDMSR/WRMSR, RDPMC + RDTSC/RDTSCP schedule tests Add missing RDTSCP itinerary llvm-svn: 320581	2017-12-13 14:22:04 +00:00
Simon Pilgrim	f51f4d3623	[X86][SSE] MOVMSK only uses the sign bit from each vector element Pass the input vector through SimplifyDemandedBits as we only need the sign bit from each vector element of MOVMSK We'd probably get more hits if SimplifyDemandedBits was better at handling vectors... Differential Revision: https://reviews.llvm.org/D41119 llvm-svn: 320570	2017-12-13 11:43:14 +00:00
Sanjoy Das	1074eb225b	Reapply "[X86] Flag BroadWell scheduler model as complete" This reverts commit r320508, in effect re-applying r320308. Simon has already reverted the parts that caused the crash that motivated the revert in r320492. llvm-svn: 320512	2017-12-12 19:11:31 +00:00
Sanjoy Das	81a4a02cbc	Revert "[X86] Flag BroadWell scheduler model as complete" This reverts commit r320308. r320308 crashes LLC, please see the llvm-commits thread for a reproducer. llvm-svn: 320508	2017-12-12 18:40:58 +00:00
Craig Topper	712a209db9	[X86] Add a couple TODOs about missing coverage/features motivated by D40335 D40335 was wanting to add FMSUBADD support, but it discovered that there are two pieces of code to make FMADDSUB and only one of those is tested. So I've asked that review to implement the one path until we get tests that test the existing code. llvm-svn: 320507	2017-12-12 18:39:04 +00:00
Nirav Dave	674d053d18	[X86] Cleanup type conversion of 64-bit load-store pairs. Summary: Simplify and generalize chain handling and search for 64-bit load-store pairs. Nontemporal test now converts 64-bit integer load-store into f64 which it realizes directly instead of splitting into two i32 pairs. Reviewers: craig.topper, spatel Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40918 llvm-svn: 320505	2017-12-12 18:25:48 +00:00
Simon Pilgrim	68f9accf51	[X86] Remove CompleteModel tags from CPU targets until we have better error checking (PR35636) The checks we have for complete models are not great and miss many cases - e.g. in PR35636 it failed to recognise that only the first output (of 2) was actually tagged by the InstRW Raised PR35639 and PR35643 as examples llvm-svn: 320492	2017-12-12 16:12:53 +00:00
Ayman Musa	c2eed926b0	[X86] Recognize constant arrays with special values and replace loads from it with subtract and shift instructions, which then will be replaced by X86 BZHI machine instruction. Recognize constant arrays with the following values: 0x0, 0x1, 0x3, 0x7, 0xF, 0x1F, .... , 2^(size - 1) -1 where //size// is the size of the array. the result of a load with index //idx// from this array is equivalent to the result of the following: (0xFFFFFFFF >> (sub 32, idx)) (assuming the array of type 32-bit integer). And the result of an 'AND' operation on the returned value of such a load and another input, is exactly equivalent to the X86 BZHI instruction behavior. See test cases in the LIT test for better understanding. Differential Revision: https://reviews.llvm.org/D34141 llvm-svn: 320481	2017-12-12 14:13:51 +00:00
Simon Pilgrim	0f8a5a41cf	Revert r320461 - causing ICE in Windows builds [X86] Use regular expressions more aggressively to reduce the number of scheduler entries needed for FMA3 instructions. When the scheduler tables are generated by tablegen, the instructions are divided up into groups based on their default scheduling information and how they are referenced by groups for each processor. For any set of instructions that are matched by a specific InstRW line, that group of instructions is guaranteed to not be in a group with any other instructions. So in general, the more InstRW class definitions are created, the more groups we end up with in the generated files. Particularly if a lot of the InstRW lines only match to single instructions, which is true of a large number of the Intel scheduler models. This change alone reduces the number of instructions groups from ~6000 to ~5500. And there's lots more we could do. llvm-svn: 320470	2017-12-12 11:34:25 +00:00
Craig Topper	c8e64ab539	[X86] Use regular expressions more aggressively to reduce the number of scheduler entries needed for FMA3 instructions. When the scheduler tables are generated by tablegen, the instructions are divided up into groups based on their default scheduling information and how they are referenced by groups for each processor. For any set of instructions that are matched by a specific InstRW line, that group of instructions is guaranteed to not be in a group with any other instructions. So in general, the more InstRW class definitions are created, the more groups we end up with in the generated files. Particularly if a lot of the InstRW lines only match to single instructions, which is true of a large number of the Intel scheduler models. This change alone reduces the number of instructions groups from ~6000 to ~5500. And there's lots more we could do. llvm-svn: 320461	2017-12-12 08:17:04 +00:00
Craig Topper	468a813315	[X86] Use Ld scheduler classes for instructions with folded loads. llvm-svn: 320459	2017-12-12 07:06:35 +00:00
Craig Topper	c1e72c019d	[X86] Correct the FMA3 regular expressions in the znver1 scheduler model. llvm-svn: 320458	2017-12-12 07:06:32 +00:00
Simon Pilgrim	6d89f407db	Normalize line endings. NFCI. llvm-svn: 320389	2017-12-11 17:01:21 +00:00
Simon Pilgrim	fabe354b42	[X86] Add LWP schedule tests Tag LWP instructions as WriteSystem llvm-svn: 320387	2017-12-11 16:47:21 +00:00
Craig Topper	c6a4a97260	[X86] Add VCOMISDZrr, VCOMISSZrr, VUCOMISDZrr, and VUCOMISSZrr to the skylake server scheduler model llvm-svn: 320326	2017-12-10 19:47:57 +00:00
Craig Topper	a0be5a06c1	[X86] Rename some instructions that start with Int_ to have the _Int at the end. This matches AVX512 version and is more consistent overall. And improves our scheduler models. In some cases this adds _Int to instructions that didn't have any Int_ before. It's a side effect of the adjustments made to some of the multiclasses. llvm-svn: 320325	2017-12-10 19:47:56 +00:00
Simon Pilgrim	c493d4f5b9	[X86][X87] Fix typo in znver1 FIST/FISTT schedule patterns llvm-svn: 320322	2017-12-10 19:19:22 +00:00
Craig Topper	1de942b2d1	[X86] Rename some instructions from 'rb' to 'rrb' to make 'b' a proper suffix. Fix the scheduling information for some of them. Some of the scheduling information was only present for the 'rb' version and not the 'rr' version. Now we match 'rr(b?)' llvm-svn: 320320	2017-12-10 17:42:44 +00:00
Craig Topper	c7445f2cdc	[X86] Add VCVTQQ2PS to the skylake server scheduler models. llvm-svn: 320319	2017-12-10 17:42:43 +00:00
Craig Topper	c268527b2f	[X86] Add VPMULLWZ256 to the skylake server scheduler model llvm-svn: 320318	2017-12-10 17:42:42 +00:00
Craig Topper	4ec397cbd3	[X86] Add 256/512-bit EVEX VPSADBW instructions to skylake server scheduler model. llvm-svn: 320317	2017-12-10 17:42:41 +00:00
Craig Topper	aa904d5ab6	[X86] Fix a few instructions that were named Z512 instead of just Z. This makes things consistent with our normal instruction naming. llvm-svn: 320316	2017-12-10 17:42:39 +00:00
Craig Topper	7c89de1760	[X86] Add VPSRLWZrr to skylake server scheduler model. llvm-svn: 320315	2017-12-10 17:42:38 +00:00
Craig Topper	1d7760db49	[X86] Add VPUNPCKLWDZrr to skylake server scheduler model. llvm-svn: 320314	2017-12-10 17:42:37 +00:00
Craig Topper	57c2815cbe	[X86] Adjust tablegen includes so we can use Instructions in scheduler models instead of just instregexs. This separates the CPU specific scheduler model includes to occur after the instructions. Moves the instruction includes between the basic scheduler information and the CPU specific scheduler models. llvm-svn: 320313	2017-12-10 17:42:36 +00:00
Simon Pilgrim	1f8cfba0bb	[X86] Flag BroadWell scheduler model as complete Locally tag COPY as WriteMove, which has caused some reg-reg + reg-mem instruction tests to reorder. llvm-svn: 320308	2017-12-10 13:49:51 +00:00
Simon Pilgrim	49c74934dd	Strip trailing whitespace. NFCI. llvm-svn: 320306	2017-12-10 13:00:37 +00:00
Simon Pilgrim	320996576d	[X86] Flag ZNVER1 scheduler model as complete We just have to locally tag COPY as WriteMove llvm-svn: 320304	2017-12-10 12:43:53 +00:00
Simon Pilgrim	8547645948	[X86] Flag SLM scheduler model as complete We just have to locally tag COPY as WriteMove llvm-svn: 320303	2017-12-10 12:36:29 +00:00
Simon Pilgrim	91c159d841	[X86][AVX] Tag VZEROALL/VZEROUPPER instructions scheduler classes llvm-svn: 320302	2017-12-10 12:26:35 +00:00
Simon Pilgrim	6de94a1adc	[X86] Tag SSE4A instructions as SSE INTALU scheduler classes llvm-svn: 320301	2017-12-10 12:08:04 +00:00
Simon Pilgrim	cd58171110	[X86] Flag BTVER2 scheduler model as complete We just have to locally tag COPY as WriteMove llvm-svn: 320300	2017-12-10 11:51:29 +00:00
Simon Pilgrim	b7fb2e2fa1	[X86] Tag ADJSTACK instructions as INTALU scheduler class llvm-svn: 320299	2017-12-10 11:34:08 +00:00
Simon Pilgrim	1a030016a6	[X86] Tag MORESTACK instructions as ret scheduler class llvm-svn: 320296	2017-12-10 10:08:21 +00:00
Craig Topper	253562eb81	[X86] Fix duplicate entries in skylake server scheduler model by changing Z128 to Z256 Based on the fact that the 'Y' version of the instruction is next to this, I assume Z256 is the intended value. llvm-svn: 320295	2017-12-10 09:14:45 +00:00
Craig Topper	90c9c15936	[X86] Add MOVQI2PQIrm, MOVSDmr, and MOVSDrm to scheduler information The VEX versions were present but not the legacy SSE versions. llvm-svn: 320294	2017-12-10 09:14:44 +00:00
Craig Topper	28e55386ac	[X86] Add LEA64_32r to scheduler models for Sandybridge,Haswell,Broadwell,Skylake llvm-svn: 320293	2017-12-10 09:14:42 +00:00
Craig Topper	8ade4640f3	[X86] Add IN16/OUT16 to scheduling information for Haswell,Broadwell,Skylake Sandy Bridge is also missing it, but it has other issues. See PR35590. llvm-svn: 320292	2017-12-10 09:14:41 +00:00
Craig Topper	1a88c50fd7	[X86] Fix scheduler models to support ADD32ri in addition to ADD32ri8. Similar for all sizes of AND/OR/XOR/SUB/ADC/SBB/CMP. llvm-svn: 320291	2017-12-10 09:14:39 +00:00
Craig Topper	c89e282f7d	[X86] Rename some instructions so that 'b' is added as a suffix instead of replacing an 'r' llvm-svn: 320290	2017-12-10 09:14:38 +00:00
Craig Topper	6c65910160	[X86] Add CMPSDrr/rm to the scheduler models. Somehow CMPSSrr/rm was there and the VEX version was there, but this was consistently missing. llvm-svn: 320289	2017-12-10 09:14:37 +00:00
Craig Topper	da7e78e18c	[X86] Rename the rb form of scalar ADD/SUB/MUL/DIV to include _Int since they can only be selected by intrinsics. llvm-svn: 320283	2017-12-10 04:07:28 +00:00
Craig Topper	4e57776fb2	[X86] Correct the _Int part of more scheduler model instrexes. Put _b in the correct order relative to _Int llvm-svn: 320282	2017-12-10 03:16:38 +00:00
Craig Topper	a2f5528084	[X86] Remove ReadAfterLd from several rb instructions This affects CVTSD2SS, FMA, RCP28, RSQRT28, and SQRT scalar instructions 'b' here refers to 'sae' not broadcast. These aren't memory instructions. llvm-svn: 320281	2017-12-10 03:16:36 +00:00
Craig Topper	391c6f9507	[X86] Fix bad regular expressions in the scheduler models. Question marks should be outside of multicharacter parenthesized expressions If the question mark is inside the parentheses it only applies to the single character preceding it. I had to make a few additional cleanups to fix some duplicate warnings that were exposed by fixing this. llvm-svn: 320279	2017-12-10 01:24:08 +00:00
Craig Topper	8ee98d0b51	[X86] Make the _Int part of some instregex scheduler patterns optional llvm-svn: 320278	2017-12-10 01:24:06 +00:00
Craig Topper	5ffe80103e	[X86] Add the commutable floating point min/max pseudo instructions to sandybridge,haswell,broadwell,skylakeclient scheduler models. llvm-svn: 320277	2017-12-10 01:24:05 +00:00
Simon Pilgrim	6655eef1b4	[X86] Tag PIC setup instruction as jump scheduler class llvm-svn: 320276	2017-12-10 00:40:37 +00:00
Simon Pilgrim	5d74949e5f	[X86] Tag ACQUIRE/RELEASE atomic instructions as microcoded scheduler classes Note: We may be too pessimistic here and should possibly use something closer to the LOCK arithmetic instructions llvm-svn: 320275	2017-12-10 00:30:57 +00:00
Simon Pilgrim	dcbe723d28	[X86] Tag TLS instructions as system scheduler classes llvm-svn: 320274	2017-12-10 00:12:57 +00:00
Simon Pilgrim	3508a09455	[X86] Tag ALLOCA/VAARG instructions as system scheduler classes llvm-svn: 320273	2017-12-10 00:03:16 +00:00
Craig Topper	f4e3044db9	[X86] Use KMOV instructions to zero upper bits of vectors when possible. llvm-svn: 320268	2017-12-09 23:10:59 +00:00
Craig Topper	5ac75d5628	[X86] Improve lowering of vXi1 insert_subvectors to better utilize (insert_subvector zero, vec, 0) for zeroing upper bits. This can be better recognized during isel when the producer already zeroed the upper bits. llvm-svn: 320267	2017-12-09 22:44:42 +00:00
Simon Pilgrim	e049038692	[X86] Tag LOCK/REX64/DATA16/DATA32 instruction prefix scheduler classes llvm-svn: 320266	2017-12-09 21:27:03 +00:00
Simon Pilgrim	b2b93f6204	Strip trailing whitespace. NFCI. llvm-svn: 320265	2017-12-09 20:44:51 +00:00
Simon Pilgrim	7e636cc419	[X86] Tag FS/GS BASE R/W instruction scheduler classes llvm-svn: 320264	2017-12-09 20:42:27 +00:00
Simon Pilgrim	231fab072f	[X86] Tag REP/REPNE prefix instructions as microcoded scheduler classes llvm-svn: 320263	2017-12-09 20:16:37 +00:00
Simon Pilgrim	2e7314eb2f	[X86] Tag missing EH pseudo instruction scheduler classes llvm-svn: 320262	2017-12-09 20:04:02 +00:00
Simon Pilgrim	cb71e72707	[X86] Tag frame pointer XORs instruction scheduler classes llvm-svn: 320261	2017-12-09 19:56:39 +00:00
Craig Topper	504534514c	[X86] Don't use getTargetConstant for all 0s and all 1s mask vector. llvm-svn: 320260	2017-12-09 19:18:30 +00:00
Simon Pilgrim	df702104d3	[X86] Tag segment prefixes as NOP instruction scheduling classes llvm-svn: 320257	2017-12-09 16:58:34 +00:00
Simon Pilgrim	d3e21c6b79	[X86][AVX512] Drop a default NoItinerary argument that isn't used any more. NFCI. Requires re-ordering of AVX512_maskable_custom arguments. llvm-svn: 320255	2017-12-09 16:20:54 +00:00
Craig Topper	6504a8f888	[X86] When inserting into the upper bits of a vXi1 vector, make sure we shift enough bits if we widened the vector. We may need to widen the vector to make the shifts legal, but if we do that we need to make sure we shift left/right after accounting for the new size. If not we can't guarantee we are shifting in zeros. The test cases affected actually show cases where we should move the shifts all together, but that's another problem. llvm-svn: 320248	2017-12-09 08:19:07 +00:00
Craig Topper	b3e14ce90c	[X86] Improve lowering of concats of mask vectors to better optimize zero vector inputs. We were previously using kunpck with zero inputs unnecessarily. And we had cases where we would insert into a zero vector and then insert into larger zero vector incurring two sets of shifts. llvm-svn: 320244	2017-12-09 07:02:19 +00:00

16109 Commits