llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	add9cc697a	[AVX-512] Use EVEX encoded XOR instruction for zeroing scalar registers when DQI and VLX instructions are available. This can give the register allocator more registers to use. llvm-svn: 290057	2016-12-18 06:23:14 +00:00
Simon Pilgrim	7522f54feb	[X86][SSE] Fix domains for scalar store instructions As discussed on D27692 llvm-svn: 289834	2016-12-15 17:09:24 +00:00
Simon Pilgrim	ba46422694	[X86][AVX512] Moved instruction domain lookups to the right table. NFCI. Avoid duplicating instructions in the int32/int64 domains. llvm-svn: 289830	2016-12-15 16:38:51 +00:00
Simon Pilgrim	d7518896ff	[X86][SSE] Fix domains for VZEXT_LOAD type instructions Add the missing domain equivalences for movss, movsd, movd and movq zero extending loading instructions. Differential Revision: https://reviews.llvm.org/D27684 llvm-svn: 289825	2016-12-15 16:05:29 +00:00
Philip Reames	1f1bbac8da	[peephole] Enhance folding logic to work for STATEPOINTs The general idea here is to get enough of the existing restrictions out of the way that the already existing folding logic in foldMemoryOperand can kick in for STATEPOINTs and fold references to immutable stack slots. The key changes are: Support for folding multiple operands at once which reference the same load; Support for folding multiple loads into a single instruction; Walk all the operands of the instruction for variadic instructions (this is a bug fix!). Once this lands, I'll post another patch which refactors the TII interface here. There's nothing actually x86 specific about the x86 code used here. Differential Revision: https://reviews.llvm.org/D24103 llvm-svn: 289510	2016-12-13 01:38:41 +00:00
Craig Topper	081c0e2864	[X86] Remove some intrinsic instructions from hasPartialRegUpdate Summary: These intrinsic instructions are all selected from intrinsics that have well defined behavior for where the upper bits come from. It's not the same place as the lower bits. As you can see we were suppressing load folding for these instructions in some cases. In none of the cases was the separate load helping avoid a partial dependency on the destination register. So we should just go ahead and allow the load to be folded. Only foldMemoryOperand was suppressing folding for these. They all have patterns for folding sse_load_f32/f64 that aren't gated with OptForSize, but sse_load_f32/f64 doesn't allow 128-bit vector loads. It only allows scalar_to_vector and vzmovl of scalar loads to match. There's no reason we can't allow a 128-bit vector load to be narrowed so I would like to fix sse_load_f32/f64 to allow that. And if I do that it changes some of these same test cases to fold the load too. Reviewers: spatel, zvi, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27611 llvm-svn: 289419	2016-12-12 05:07:17 +00:00
Craig Topper	c4f2b0996d	[X86] Add masked versions of VPERMT2* and VPERMI2* to load folding tables. llvm-svn: 289186	2016-12-09 05:20:11 +00:00
Craig Topper	2aeb456425	[AVX-512] Add vpermilps/pd to load folding tables. llvm-svn: 289173	2016-12-09 02:18:11 +00:00
Michael Kuperstein	18092cf2c3	[X86] Do not assume "ri" instructions always have an immediate operand The second operand of an "ri" instruction may be an immediate, but it may also be a global variable, so we should not make any assumptions. This fixes PR31271. Differential Revision: https://reviews.llvm.org/D27481 llvm-svn: 288964	2016-12-07 19:29:18 +00:00
Craig Topper	6413f8a8f2	[X86] Remove scalar logical op alias instructions. Just use COPY_FROM/TO_REGCLASS and the normal packed instructions instead Summary: This patch removes the scalar logical operation alias instructions. We can just use reg class copies and use the normal packed instructions instead. This removes the need for putting these instructions in the execution domain fixing tables as was done recently. I removed the loadf64_128 and loadf32_128 patterns as DAG combine creates a narrower load for (extractelt (loadv4f32)) before we ever get to isel. I plan to add similar patterns for AVX512DQ in a future commit to allow use of the larger register class when available. Reviewers: spatel, delena, zvi, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27401 llvm-svn: 288771	2016-12-06 04:58:39 +00:00
Michael Kuperstein	e3036abcf9	[X86] Fix non-intrinsic roundss/roundsd to not read the destination register This changes the scalar non-intrinsic non-avx roundss/sd instruction definitions not to read their destination register - allowing partial dependency breaking. This fixes PR31143. Differential Revision: https://reviews.llvm.org/D27323 llvm-svn: 288703	2016-12-05 20:57:37 +00:00
Craig Topper	9d16bfa0f5	[AVX-512] Add many of the VPERM instructions to the load folding table. Move VPERMPDZri to the correct table. llvm-svn: 288591	2016-12-03 19:37:39 +00:00
Craig Topper	c210827b53	[AVX-512] Add EVEX VPMADDUBSW and VPMADDWD to the load folding tables. llvm-svn: 288587	2016-12-03 17:19:15 +00:00
Craig Topper	4961fa9bba	[AVX-512] Add EVEX vpshuflw/vpshufhw/vpshufd instructions to load folding tables. llvm-svn: 288484	2016-12-02 07:57:11 +00:00
Craig Topper	17ddb521ef	[AVX-512] Add EVEX PSHUFB instructions to load folding tables. llvm-svn: 288482	2016-12-02 07:06:30 +00:00
Craig Topper	f7866fad54	[AVX-512] Add masked VINSERTF/VINSERTI instructions to load folding tables. llvm-svn: 288481	2016-12-02 06:24:38 +00:00
Matthias Braun	115efcd3d1	MachineScheduler: Export function to construct "default" scheduler. This makes the createGenericSchedLive() function that constructs the default scheduler available for the public API. This should help when you want to get a scheduler and the default list of DAG mutations. This also shrinks the list of default DAG mutations: {Load\|Store}ClusterDAGMutation and MacroFusionDAGMutation are no longer added by default. Targets can easily add them if they need them. It also makes it easier for targets to add alternative/custom macrofusion or clustering mutations while staying with the default createGenericSchedLive(). It also saves the callback back and forth in TargetInstrInfo::enableClusterLoads()/enableClusterStores(). Differential Revision: https://reviews.llvm.org/D26986 llvm-svn: 288057	2016-11-28 20:11:54 +00:00
Craig Topper	ff9d45875a	[X86][FMA4] Add load folding support for FMA4 scalar intrinsic instructions. llvm-svn: 288009	2016-11-27 21:37:00 +00:00
Craig Topper	3674f44e40	[X86] Add SHL by 1 to the load folding tables. I don't think isel selects these today, favoring adding the register to itself instead. But the load folding tables shouldn't be so concerned with what isel will use and just represent the relationships. llvm-svn: 288007	2016-11-27 21:36:54 +00:00
Craig Topper	4fab487265	[AVX-512] Add integer and fp unpck instructions to load folding tables. llvm-svn: 288004	2016-11-27 19:51:41 +00:00
Craig Topper	7ad961cc70	[X86] Add TB_NO_REVERSE to entries in the load folding table where the instruction's load size is smaller than the register size. If we were to unfold these, the load size would be increased to the register size. This is not safe to do since the enlarged load can do things like cross a page boundary into a page that doesn't exist. I probably missed some instructions, but this should be a large portion of them. llvm-svn: 288001	2016-11-27 18:51:13 +00:00
Craig Topper	c3b3926f8b	[AVX-512] Add masked EVEX vpmovzx/sx instructions to load folding tables. llvm-svn: 287995	2016-11-27 08:55:31 +00:00
Craig Topper	fb64a25ba1	[X86] Remove alignment restrictions from load folding table for some instructions that don't have a restriction. Most of these are the SSE4.1 PMOVZX/PMOVSX instructions which all read less than 128-bits. The only other was PMOVUPD which by definition is an unaligned load. llvm-svn: 287991	2016-11-27 01:52:51 +00:00
Craig Topper	10d5eec1a1	[AVX-512] Add unmasked EVEX vpmovzx/sx instructions to load folding tables. llvm-svn: 287975	2016-11-26 08:21:52 +00:00
Craig Topper	97169ea5f9	[AVX-512] Add masked 128/256-bit integer add/sub instructions to load folding tables. llvm-svn: 287974	2016-11-26 08:21:48 +00:00
Craig Topper	53b33de1e3	[AVX-512] Add masked 512-bit integer add/sub instructions to load folding tables. llvm-svn: 287972	2016-11-26 07:21:00 +00:00
Craig Topper	39265bb1ce	[AVX-512] Add VLX versions of VDIVPD/PS and VMULPD/PS to load folding tables. llvm-svn: 287970	2016-11-26 07:20:53 +00:00
Craig Topper	516fd7abfe	[X86] Add SSE, AVX, and AVX2 version of MOVDQU to the load/store folding tables for consistency. Not sure this is truly needed but we had the floating point equivalents, the aligned equivalents, and the EVEX equivalents. So this just makes it complete. llvm-svn: 287960	2016-11-26 02:13:58 +00:00
Craig Topper	a363d42973	[AVX-512] Put the AVX-512 sections of the load folding tables into mostly alphabetical order. This is consistent with the older sections of the table. NFC llvm-svn: 287956	2016-11-25 23:21:34 +00:00
Craig Topper	1e48829747	[AVX-512] Add VPERMT2* and VPERMI2* instructions to load folding tables. llvm-svn: 287937	2016-11-25 16:33:53 +00:00
Michael Kuperstein	47eb85a003	[X86] Allow folding of stack reloads when loading a subreg of the spilled reg We did not support subregs in InlineSpiller:foldMemoryOperand() because targets may not deal with them correctly. This adds a target hook to let the spiller know that a target can handle subregs, and actually enables it for x86 for the case of stack slot reloads. This fixes PR30832. Differential Revision: https://reviews.llvm.org/D26521 llvm-svn: 287792	2016-11-23 18:33:49 +00:00
Craig Topper	3dcf45f08d	[X86] Remove alternate CodeGenOnly version of (v)movq that declared the load size as i128mem. Change all uses to use the i64mem version. I'm sure this caused the load size to misprint in Intel syntax output. We were also inconsistent about which patterns used which instruction between VEX and EVEX. There are two different reg/reg versions of movq, one from a GPR and one from the lower 64-bits of an XMM register. This changes the load folding table to use the single i64mem memory form for folding both cases. But we need to use TB_NO_REVERSE to prevent a duplicate entry in the unfolding table. llvm-svn: 287622	2016-11-22 05:31:43 +00:00
Craig Topper	cada9f2275	[AVX-512] Add support for commuting VPERMT2(B/W/D/Q/PS/PD) to/from VPERMI2(B/W/D/Q/PS/PD). Summary: The index and one of the table operands can be swapped by changing the opcode to the other version. Neither of these operands are the one that can load from memory so this can't be used to increase memory folding opportunities. We need to handle the unmasked forms and the kz forms. Since the load operand isn't being commuted we can commute the load and broadcast instructions too. Reviewers: igorb, delena, Ayal, Farhana, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25652 llvm-svn: 287621	2016-11-22 04:57:34 +00:00
Michael Zuckerman	8462faeaba	Fixing a small typo (A->U). This seems to fix PR30992. - HasAVX512 ? X86::VMOVAPSZ128rm_NOVLX + HasAVX512 ? X86::VMOVUPSZ128rm_NOVLX llvm-svn: 287532	2016-11-21 11:52:11 +00:00
Craig Topper	9f2d632ee7	[AVX-512] Add EVEX form of VMOVZPQILo2PQIZrm to load folding tables to match SSE and AVX. llvm-svn: 287523	2016-11-21 07:51:31 +00:00
Sanjay Patel	7f3d51f840	[x86] add fake scalar FP logic instructions to ReplaceableInstrs to save some bytes We can replace "scalar" FP-bitwise-logic with other forms of bitwise-logic instructions. Scalar SSE/AVX FP-logic instructions only exist in your imagination and/or the bowels of compilers, but logically equivalent int, float, and double variants of bitwise-logic instructions are reality in x86, and the float variant may be a shorter instruction depending on which flavor (SSE or AVX) of vector ISA you have...so just prefer float all the time. This is a preliminary step towards solving PR6137: https://llvm.org/bugs/show_bug.cgi?id=6137 Differential Revision: https://reviews.llvm.org/D26712 llvm-svn: 287122	2016-11-16 17:42:40 +00:00
Craig Topper	b8596e4d1d	[X86] Cleanup 'x' and 'y' mnemonic suffixes for vcvtpd2dq/vcvttpd2dq/vcvtpd2ps and similar instructions. -Don't print the 'x' suffix for the 128-bit reg/mem VEX encoded instructions in Intel syntax. This is consistent with the EVEX versions. -Don't print the 'y' suffix for the 256-bit reg/reg VEX encoded instructions in Intel or AT&T syntax. This is consistent with the EVEX versions. -Allow the 'x' and 'y' suffixes to be used for the reg/mem forms when we're assembling using Intel syntax. -Allow the 'x' and 'y' suffixes on the reg/reg EVEX encoded instructions in Intel or AT&T syntax. This is consistent with what VEX was already allowing. This should fix at least some of PR28850. llvm-svn: 286787	2016-11-14 01:53:29 +00:00
Peter Collingbourne	32ab3a817d	Re-apply r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate.", with a fix for 32-bit x86. Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions that take a global address operand. llvm-svn: 286420	2016-11-09 23:53:43 +00:00
Zvi Rackover	85bc64c734	[X86] Broadcast from memory instructions aren't unfoldable Broadcast from memory instructions should be treated as moves. They can't be unfolded. Fixes pr30693. llvm-svn: 285998	2016-11-04 15:15:19 +00:00
Craig Topper	b7781a95fd	[X86] Use intrinsics table for PMADDUBSW and PMADDWD so that we can use the legacy intrinsics to select EVEX encoded instructions when available. This removes a couple tablegen classes that become unused after this change. Another class gained an additional parameter to allow PMADDUBSW to specify a different result type from its input type. llvm-svn: 285515	2016-10-30 06:56:16 +00:00
Craig Topper	defe9ffbb5	[X86] Use intrinsics table for VPMULHRSW intrinsics so that the legacy intrinsics can select EVEX encoded instructions when available. This requires a minor rename of the instructions due to the use of different tablegen classes and how the names are concatenated. llvm-svn: 285501	2016-10-29 18:41:45 +00:00
Peter Collingbourne	5c924d7117	Target: Remove unused entities. llvm-svn: 283690	2016-10-09 04:38:57 +00:00
Craig Topper	e30cb00dc0	[AVX-512] Add subvector insert and extract to load/store folding tables. llvm-svn: 283689	2016-10-09 03:54:13 +00:00
Craig Topper	4262d53024	[AVX-512] Add the vector down convert instructions to the store folding tables. llvm-svn: 283687	2016-10-09 03:54:05 +00:00
Simon Pilgrim	a5d019ee95	[X86][SSE] Update register class during MOVSD/MOVSS - BLENDPD/BLENDPS commutation MOVSD/MOVSS take a 128-bit register and a FR32/FR64 register input, the commutation code wasn't taking this into account leading to verification errors. This patch inserts a vreg copy mi to ensure that the registers are correct. Fix for PR30607 Differential Revision: https://reviews.llvm.org/D25280 llvm-svn: 283539	2016-10-07 11:18:38 +00:00
Hans Wennborg	c26c03d911	Revert r282920 "X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302)" This is suspected to cause a miscompile in Chromium. Reverting while investigating. llvm-svn: 283329	2016-10-05 15:39:27 +00:00
Craig Topper	ee2d995661	[X86] Add MOV8rm_NOREX to switch in isReallyTriviallyReMaterializable to match MOV8rm. llvm-svn: 283184	2016-10-04 03:11:44 +00:00
Craig Topper	4e7b888ea4	[X86] Mark all sizes of (V)MOVUPD as trivially rematerializable. I don't know for sure that we truly need this, but it's the only vector load that isn't rematerializable. Making it consistent allows it to not be a special case in the td files. llvm-svn: 283083	2016-10-03 02:00:29 +00:00
Simon Pilgrim	ccdd1ff49b	[X86][SSE] Enable commutation from MOVSD/MOVSS to BLENDPD/BLENDPS on SSE41+ targets Instead of selecting between MOVSD/MOVSS and BLENDPD/BLENDPS at shuffle lowering by subtarget, this will help us select the instruction based on actual commutation requirements. We could possibly add BLENDPD/BLENDPS -> MOVSD/MOVSS commutation and MOVSD/MOVSS memory folding using a similar approach if it proves useful. I avoided adding AVX512 handling as I'm not sure when we should be making use of VBLENDPD/VBLENDPS on EVEX targets. llvm-svn: 283037	2016-10-01 14:26:11 +00:00
Mehdi Amini	117296c0a0	Use StringRef in Pass/PassManager APIs (NFC) llvm-svn: 283004	2016-10-01 02:56:57 +00:00


1074 Commits