Commit Graph

782 Commits

Author SHA1 Message Date
Sanjay Patel e951a3839a rename variables again because these tables also deal with stores; NFC
Suggestion by Simon Pilgrim

llvm-svn: 229574
2015-02-17 22:38:06 +00:00
Sanjay Patel 1a20fdf36f Add comment to explain a non-obvious setting; NFC.
This is paraphrased from Simon Pilgrim's comment in:
http://reviews.llvm.org/D7492

llvm-svn: 229566
2015-02-17 22:09:54 +00:00
Sanjay Patel 203ee500e9 remove function names from comments; NFC
llvm-svn: 229558
2015-02-17 21:55:20 +00:00
Sanjay Patel 52f9f7c0f3 replace meaningless variable names; NFCI
llvm-svn: 229549
2015-02-17 21:37:28 +00:00
Sanjay Patel b811c1d6a5 prevent folding a scalar FP load into a packed logical FP instruction (PR22371)
Change the memory operands in sse12_fp_packed_scalar_logical_alias from scalars to vectors. 
That's what the hardware packed logical FP instructions define: 128-bit memory operands.
There are no scalar versions of these instructions...because this is x86.

Generating the wrong code (folding a scalar load into a 128-bit load) is still possible
using the peephole optimization pass and the load folding tables. We won't completely
solve this bug until we either fix the lowering in fabs/fneg/fcopysign and any other
places where scalar FP logic is created or fix the load folding in foldMemoryOperandImpl()
to make sure it isn't changing the size of the load.

Differential Revision: http://reviews.llvm.org/D7474

llvm-svn: 229531
2015-02-17 20:08:21 +00:00
Simon Pilgrim 31457d54f7 [X86][XOP] Enable commutation for XOP instructions
Patch to allow XOP instructions (integer comparison and integer multiply-add) to be commuted. The comparison instructions sometimes require the compare mode to be flipped but the remaining instructions can use default commutation modes.

This patch also sets the SSE domains of all the XOP instructions.

Differential Revision: http://reviews.llvm.org/D7646

llvm-svn: 229267
2015-02-14 22:40:46 +00:00
Duncan P. N. Exon Smith 5975a703e6 X86: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229214
2015-02-14 01:59:52 +00:00
Simon Pilgrim 295eaad2b3 Relaxed over-zealous alignment requirement for VEX-encoded AES instructions
llvm-svn: 228953
2015-02-12 20:01:03 +00:00
Simon Pilgrim d142ab7d08 [X86][AVX2] Missing AVX2 memory folding instructions
Added most of the missing vector folding patterns for AVX2 (as well as fixing the vpermpd and verpmq patterns)

Differential Revision: http://reviews.llvm.org/D7492

llvm-svn: 228688
2015-02-10 13:22:57 +00:00
Simon Pilgrim cd32254a35 [X86][XOP] Added XOP memory folding patterns + tests
This patch adds the complete AMD Bulldozer XOP instruction set to the memory folding pattern tables for stack folding, etc.

Note: Many of the XOP instructions have multiple table entries as it can fold loads from different sources.

Differential Revision: http://reviews.llvm.org/D7484

llvm-svn: 228685
2015-02-10 12:57:17 +00:00
Craig Topper 9e71b82f40 [X86] Preserve mem refs on newly created 'Store' node instead of 'Load' node when handling store unfolding.
Bug spotted by Steve King.

I have no idea how to test this.

llvm-svn: 228672
2015-02-10 06:29:28 +00:00
Craig Topper f7e92f10b6 [X86] Remove unnecessary alignment checks from the load folding tables.
llvm-svn: 228671
2015-02-10 05:10:50 +00:00
Sanjay Patel a7b893d5c0 rename variable to give it some meaning; remove obvious comments; NFC
llvm-svn: 228579
2015-02-09 16:30:58 +00:00
Sanjay Patel fc54c61c56 fix comment that didn't match the code; remove unnecessary braces; NFC
llvm-svn: 228578
2015-02-09 16:04:52 +00:00
Simon Pilgrim d11b013623 Moved AVX2 vbroadcast (reg) instruction foldings under the correct grouping. NFC.
llvm-svn: 228526
2015-02-08 17:13:54 +00:00
Simon Pilgrim a2618679a8 [X86][AVX] Added missing stack folding support + test for vptest ymm instruction
llvm-svn: 228509
2015-02-07 21:44:06 +00:00
Eric Christopher 05b819718c Reuse a bunch of cached subtargets and remove getSubtarget calls
without a Function argument.

llvm-svn: 227814
2015-02-02 17:38:43 +00:00
Michael Kuperstein 13fbd45263 [X86] Convert esp-relative movs of function arguments to pushes, step 2
This moves the transformation introduced in r223757 into a separate MI pass.
This allows it to cover many more cases (not only cases where there must be a 
reserved call frame), and perform rudimentary call folding. It still doesn't 
have a heuristic, so it is enabled only for optsize/minsize, with stack 
alignment <= 8, where it ought to be a fairly clear win.

(Re-commit of r227728)

Differential Revision: http://reviews.llvm.org/D6789

llvm-svn: 227752
2015-02-01 16:56:04 +00:00
Michael Kuperstein e86aa9a8a4 Revert r227728 due to bad line endings.
llvm-svn: 227746
2015-02-01 16:15:07 +00:00
Michael Kuperstein bd57186c76 [X86] Convert esp-relative movs of function arguments to pushes, step 2
This moves the transformation introduced in r223757 into a separate MI pass.
This allows it to cover many more cases (not only cases where there must be a 
reserved call frame), and perform rudimentary call folding. It still doesn't 
have a heuristic, so it is enabled only for optsize/minsize, with stack 
alignment <= 8, where it ought to be a fairly clear win.

Differential Revision: http://reviews.llvm.org/D6789

llvm-svn: 227728
2015-02-01 11:44:44 +00:00
Simon Pilgrim 43fbaada8e Removed SSE lane blend findCommutedOpIndices overrides. NFCI.
The default op indices frmo TargetInstrInfo::findCommutedOpIndices are being commuted so we don't need to do this.

llvm-svn: 227689
2015-01-31 15:16:30 +00:00
Reid Kleckner a580b6ec67 Win64: Put a REX_W prefix on all TAILJMP* instructions
MSDN's x64 software conventions page says that this is one of the fixed
list of legal epilogues:
https://msdn.microsoft.com/en-us/library/tawsa7cb.aspx

Presumably this is how the unwinder distinguishes epilogue jumps from
in-function control flow.

Also normalize the way we place "## TAILCALL" comments on such jumps.

llvm-svn: 227611
2015-01-30 21:03:31 +00:00
Simon Pilgrim 0629ba1ad9 [X86][SSE] Float comparisons can sometimes be safely commuted
For ordered, unordered, equal and not-equal tests, packed float and double comparison instructions can be safely commuted without affecting the results. This patch checks the comparison mode of the (v)cmpps + (v)cmppd instructions and commutes the result if it can.

Differential Revision: http://reviews.llvm.org/D7178

llvm-svn: 227145
2015-01-26 22:29:24 +00:00
Simon Pilgrim 9b7c00352d [X86][PCLMUL] Enable commutation for PCLMUL instructions
Patch to allow (v)pclmulqdq to be commuted - swaps the src registers and inverts the immediate (low/high) src mask.

Differential Revision: http://reviews.llvm.org/D7180

llvm-svn: 227141
2015-01-26 22:00:18 +00:00
Simon Pilgrim 7e6d573e87 [X86][AVX] Added (V)MOVDDUP / (V)MOVSLDUP / (V)MOVSHDUP memory folding + tests.
Minor tweak now that D7042 is complete, we can enable stack folding for (V)MOVDDUP and do proper testing.

Added missing AVX ymm folding patterns and fixed alignment for AVX VMOVSLDUP / VMOVSHDUP.

llvm-svn: 226873
2015-01-22 22:39:59 +00:00
Simon Pilgrim 5fa0fb23ca [X86][SSE] Missing SSE/AVX1 memory folding integer instructions
Added most of the missing integer vector folding patterns for SSE (to SSE42) and AVX1.

The most useful of these are probably the i32/i64 extraction, i8/i16/i32/i64 insertions, zero/sign extension, unsigned saturation subtractions, i64 subtractions and the variable mask blends (pblendvb) - others include CLMUL, SSE42 string comparisons and bit tests.

Differential Revision: http://reviews.llvm.org/D7094

llvm-svn: 226745
2015-01-21 23:43:30 +00:00
Simon Pilgrim 20bc37c7db [X86][AVX] Missing AVX1 memory folding float instructions
Now that we can create much more exhaustive X86 memory folding tests, this patch adds the missing AVX1/F16C floating point instruction stack foldings we can easily test for including the scalar intrinsics (add, div, max, min, mul, sub), conversions float/int to double, half precision conversions, rounding, dot product and bit test. The patch also adds a couple of obviously missing SSE instructions (more to follow once we have full SSE testing).

Now that scalar folding is working it broke a very old test (2006-10-07-ScalarSSEMiscompile.ll) - this test appears to make no sense as its trying to ensure that a scalar subtraction isn't folded as it 'would zero the top elts of the loaded vector' - this test just appears to be wrong to me.

Differential Revision: http://reviews.llvm.org/D7055

llvm-svn: 226513
2015-01-19 22:40:45 +00:00
JF Bastien eeea8970b4 Revert "Insert random noops to increase security against ROP attacks (llvm)"
This reverts commit:
http://reviews.llvm.org/D3392

llvm-svn: 225948
2015-01-14 05:24:33 +00:00
JF Bastien dcdd5ad252 Insert random noops to increase security against ROP attacks (llvm)
A pass that adds random noops to X86 binaries to introduce diversity with the goal of increasing security against most return-oriented programming attacks.

Command line options:
  -noop-insertion // Enable noop insertion.
  -noop-insertion-percentage=X // X% of assembly instructions will have a noop prepended (default: 50%, requires -noop-insertion)
  -max-noops-per-instruction=X // Randomly generate X noops per instruction. ie. roll the dice X times with probability set above (default: 1). This doesn't guarantee X noop instructions.

In addition, the following 'quick switch' in clang enables basic diversity using default settings (currently: noop insertion and schedule randomization; it is intended to be extended in the future).
  -fdiversify

This is the llvm part of the patch.
clang part: D3393

http://reviews.llvm.org/D3392
Patch by Stephen Crane (@rinon)

llvm-svn: 225908
2015-01-14 01:07:26 +00:00
Craig Topper 39354e1b1a [X86] Merge a switch statement inside a default case of another switch statement on the same variable. There was no additional code in the default so this should be no functional change.
llvm-svn: 225345
2015-01-07 08:10:38 +00:00
Craig Topper ddbf51f904 [X86] Make isel select the 2-byte register form of INC/DEC even in non-64-bit mode. Convert to the 1-byte form in non-64-bit mode as part of MCInst lowering.
Overall this seems simpler. It reduces duplication of patterns between both modes and it simplifies the memory folding/unfolding tables as they don't need to create fake instructions just to keep track of 64-bitness.

llvm-svn: 225252
2015-01-06 07:35:50 +00:00
Craig Topper 49758aab94 [X86] Make isel select the shorter form of jump instructions instead of the long form.
The assembler backend will relax to the long form if necessary. This removes a swap from long form to short form in the MCInstLowering code. Selecting the long form used to be required by the old JIT.

llvm-svn: 225242
2015-01-06 04:23:53 +00:00
Michael Kuperstein 683c3cde43 [X86] Add missing memory variants to AVX false dependency breaking
Adds missing memory instruction variants to AVX false dependency breaking handling. (SSE was handled in r224246)

Differential Revision: http://reviews.llvm.org/D6780

llvm-svn: 224900
2014-12-28 13:15:05 +00:00
Robert Khasanov 79fb7292d7 [AVX512] Enable FP arithmetic lowering for AVX512VL subsets.
Added RegOp2MemOpTable4 to transform 4th operand from register to memory in merge-masked versions of instructions. 
Added lowering tests.

llvm-svn: 224516
2014-12-18 12:28:22 +00:00
Simon Pilgrim bf1e079005 [X86][SSE] Vector double -> float conversion memory folding (cvtpd2ps)
Added a missing memory folding relationship for the (V)CVTPD2PS instruction - we can safely fold these for stack reloads.

Differential Revision: http://reviews.llvm.org/D6663

llvm-svn: 224383
2014-12-16 22:30:10 +00:00
Michael Kuperstein 47c97157ef [X86] Break false dependencies before partial register updates when the source operand is in memory
Adds the various "rm" instruction variants into the list of instructions that have a partial register update. Also adds all variants of SQRTSD that were missing in the original list.

Differential Revision: http://reviews.llvm.org/D6620

llvm-svn: 224246
2014-12-15 13:18:21 +00:00
Robert Khasanov 8e8c39963d [AVX512] Added lowering for VBROADCASTSS/SD instructions.
Lowering patterns were written through avx512_broadcast_pat multiclass as pattern generates VBROADCAST and COPY_TO_REGCLASS nodes.
Added lowering tests.

llvm-svn: 223804
2014-12-09 18:45:30 +00:00
Michael Liao 5bf9578ce4 [X86] Clean up whitespace as well as minor coding style
llvm-svn: 223339
2014-12-04 05:20:33 +00:00
Simon Pilgrim 9c1e4123f8 [X86][AVX] 256-bit vector stack unaligned load/stores identification
Under many circumstances the stack is not 32-byte aligned, resulting in the use of the vmovups/vmovupd/vmovdqu instructions when inserting ymm reloads/spills.

This minor patch adds these instructions to the isFrameLoadOpcode/isFrameStoreOpcode helpers so that they can be correctly identified and not be treated as folded reloads/spills.

This has also been noticed by http://llvm.org/bugs/show_bug.cgi?id=18846 where it was causing redundant spills - I've added a reduced test case at test/CodeGen/X86/pr18846.ll

Differential Revision: http://reviews.llvm.org/D6252

llvm-svn: 222281
2014-11-18 23:38:19 +00:00
Tom Roeder eb7a303d1b Add Forward Control-Flow Integrity.
This commit adds a new pass that can inject checks before indirect calls to
make sure that these calls target known locations. It supports three types of
checks and, at compile time, it can take the name of a custom function to call
when an indirect call check fails. The default failure function ignores the
error and continues.

This pass incidentally moves the function JumpInstrTables::transformType from
private to public and makes it static (with a new argument that specifies the
table type to use); this is so that the CFI code can transform function types
at call sites to determine which jump-instruction table to use for the check at
that site.

Also, this removes support for jumptables in ARM, pending further performance
analysis and discussion.

Review: http://reviews.llvm.org/D4167
llvm-svn: 221708
2014-11-11 21:08:02 +00:00
Simon Pilgrim 615ab8e721 [X86][SSE] Vector integer/float conversion memory folding (cvttps2dq / cvttpd2dq)
Fixed an issue with the (v)cvttps2dq and (v)cvttpd2dq instructions being incorrectly put in the 2 source operand folding tables instead of the 1 source operand and added the missing SSE/AVX versions.

Also added missing (v)cvtps2dq and (v)cvtpd2dq instructions to the folding tables.

Differential Revision: http://reviews.llvm.org/D6001

llvm-svn: 221489
2014-11-06 22:15:41 +00:00
Andrea Di Biagio 7ecd22ca4a [X86] When commuting SSE immediate blend, make sure that the new blend mask is a valid imm8.
Example:
define <4 x i32> @test(<4 x i32> %a, <4 x i32> %b) {
  %shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 4, i32 5, i32 6, i32 3>
  ret <4 x i32> %shuffle
}

Before llc (-mattr=+sse4.1), produced the following assembly instruction:
  pblendw $4294967103, %xmm1, %xmm0

After
  pblendw $63, %xmm1, %xmm0

llvm-svn: 221455
2014-11-06 14:36:45 +00:00
Simon Pilgrim 1fc483d991 [X86][SSE] Vector integer to float conversion memory folding
Added missing memory folding for the (V)CVTDQ2PS instructions - we can safely fold these (but not the (V)CVTDQ2PD versions which have a register/memory size discrepancy in the source operand). I've added a test case demonstrating that stack folding now works.

Differential Revision: http://reviews.llvm.org/D5981

llvm-svn: 221407
2014-11-05 22:28:25 +00:00
Simon Pilgrim c9a0779309 [X86][SSE] Enable commutation for SSE immediate blend instructions
Patch to allow (v)blendps, (v)blendpd, (v)pblendw and vpblendd instructions to be commuted - swaps the src registers and inverts the blend mask.

This is primarily to improve memory folding (see new tests), but it also improves the quality of shuffles (see modified tests).

Differential Revision: http://reviews.llvm.org/D6015

llvm-svn: 221313
2014-11-04 23:25:08 +00:00
Reid Kleckner da00cf5f73 Work around bugs in MSVC "14" CTP 3's conversion logic
It appears to ignore or find ambiguous MachineInstrBuilder's conversion
operators that allow conversion to MachineInstr* and
MachineBasicBlock::bundle_iterator.

As a workaround, add an explicit way to get the MachineInstr.

llvm-svn: 221017
2014-10-31 23:19:46 +00:00
Robert Khasanov 1cf354c92f [AVX512] Fix VSQRT packed instructions internal names.
No functional change

llvm-svn: 220808
2014-10-28 18:22:41 +00:00
Simon Pilgrim a63672665f [X86][SSE] Vector integer/float conversion memory folding
Tidied up some entries in the folding tables so that they are under the correct comment section (they were categorised as AVX2 instructions when they're AVX1).

Minor patch agreed with qcolombet.

llvm-svn: 220613
2014-10-25 08:11:20 +00:00
Simon Pilgrim 2f9548a3ef [X86] Memory folding for commutative instructions (updated)
This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning.

Updated version of r219584 (reverted in r219595) - the commutation attempt now explicitly ensures that neither of the commuted source operands are tied to the destination operand / register, which was the source of all the regressions that occurred with the original patch attempt.

Added additional regression test case provided by Joerg Sonnenberger.

Differential Revision: http://reviews.llvm.org/D5818

llvm-svn: 220239
2014-10-20 22:14:22 +00:00
NAKAMURA Takumi 75a0240056 Revert r219584, "[X86] Memory folding for commutative instructions."
It broke i686 selfhosting.

llvm-svn: 219595
2014-10-13 04:17:34 +00:00
Simon Pilgrim 77ac26d279 [X86] Memory folding for commutative instructions.
This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning.

This mainly helps the stack inliner better fold reloads of 3 (or more) operand instructions (VEX encoded SSE etc.) but by performing this in the lowest foldMemoryOperandImpl implementation it also replaces the X86InstrInfo::optimizeLoadInstr version and is now used by FastISel too.

Differential Revision: http://reviews.llvm.org/D5701

llvm-svn: 219584
2014-10-12 10:52:55 +00:00