llvm-project

Commit Graph

Author	SHA1	Message	Date
Ahmed Bougacha	ed3c4d1a3d	[X86] Teach load folding to accept scalar _Int users of MOVSS/MOVSD. The _Int instructions are special, in that they operate on the full VR128 instead of FR32. The load folding then looks at MOVSS, at the user, and bails out when it sees a size mismatch. What we really know is that the rm_Int instructions don't load the higher lanes, so folding is fine. This happens for the straightforward intrinsic code, e.g.: _mm_add_ss(a, _mm_load_ss(p)); Fixes PR23349. Differential Revision: http://reviews.llvm.org/D10554 llvm-svn: 240326	2015-06-22 20:51:51 +00:00
Sanjay Patel	cfe0393b82	name change: hasPattern() -> getMachineCombinerPatterns() ; NFC This was suggested as part of D10460, but it's independent of any functional change. llvm-svn: 240192	2015-06-19 23:21:42 +00:00
Alexander Kornienko	70bc5f1398	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137	2015-06-19 15:57:42 +00:00
Sanjoy Das	6b34a46298	[TargetInstrInfo] Add new hook: AnalyzeBranchPredicate. Summary: NFC: no one uses AnalyzeBranchPredicate yet. Add TargetInstrInfo::AnalyzeBranchPredicate and implement for x86. A later change adding support for page-fault based implicit null checks depends on this. Reviewers: reames, ab, atrick Reviewed By: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10200 llvm-svn: 239742	2015-06-15 18:44:21 +00:00
Sanjoy Das	b666ea369c	[TargetInstrInfo] Rename getLdStBaseRegImmOfs and implement for x86. Summary: TargetInstrInfo::getLdStBaseRegImmOfs to TargetInstrInfo::getMemOpBaseRegImmOfs and implement for x86. The implementation only handles a few easy cases now and will be made more sophisticated in the future. This is NFCI: the only user of `getLdStBaseRegImmOfs` (now `getMemOpBaseRegImmOfs`) is `LoadClusterMotion` and `LoadClusterMotion` is disabled for x86. Reviewers: reames, ab, MatzeB, atrick Reviewed By: MatzeB, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10199 llvm-svn: 239741	2015-06-15 18:44:14 +00:00
Matthias Braun	88e213159a	MachineLICM: Use TargetSchedModel instead of just itineraries This will use Itineraries if available, but will also work if just an MCSchedModel is available. Differential Revision: http://reviews.llvm.org/D10428 llvm-svn: 239658	2015-06-13 03:42:11 +00:00
Ahmed Bougacha	c88bf54366	[CodeGen] ArrayRef'ize cond/pred in various TII APIs. NFC. llvm-svn: 239553	2015-06-11 19:30:37 +00:00
Sanjay Patel	1275a3c913	change assert that will never fire to llvm_unreachable llvm-svn: 239497	2015-06-10 23:27:33 +00:00
Sanjay Patel	08829bac81	[x86] Add a reassociation optimization to increase ILP via the MachineCombiner pass This is a reimplementation of D9780 at the machine instruction level rather than the DAG. Use the MachineCombiner pass to reassociate scalar single-precision AVX additions (just a starting point; see the TODO comments) to increase ILP when it's safe to do so. The code is closely based on the existing MachineCombiner optimization that is implemented for AArch64. This patch should not cause the kind of spilling tragedy that led to the reversion of r236031. Differential Revision: http://reviews.llvm.org/D10321 llvm-svn: 239486	2015-06-10 20:32:21 +00:00
Keno Fischer	e70b31fc1b	[InstrInfo] Refactor foldOperandImpl to thread through InsertPt. NFC Summary: This was a longstanding FIXME and is a necessary precursor to cases where foldOperandImpl may have to create more than one instruction (e.g. to constrain a register class). This is the split out NFC changes from D6262. Reviewers: pete, ributzka, uweigand, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, ted, llvm-commits Differential Revision: http://reviews.llvm.org/D10174 llvm-svn: 239336	2015-06-08 20:09:58 +00:00
Igor Breger	00d9f8457b	AVX-512: Implemented 256/128bit VALIGND/Q instructions for SKX and KNL Implemented DAG lowering for all these forms. Added tests for DAG lowering and encoding. Differential Revision: http://reviews.llvm.org/D10310 llvm-svn: 239300	2015-06-08 14:03:17 +00:00
Simon Pilgrim	3a7718038d	[X86] Added BitScanForward/BitScanReverse memory folding + tests llvm-svn: 239257	2015-06-07 18:34:25 +00:00
Matthias Braun	07066cca20	MachineInstr: Remove unused parameter. llvm-svn: 237726	2015-05-19 21:22:20 +00:00
Jim Grosbach	e9119e41ef	MC: Modernize MCOperand API naming. NFC. MCOperand::Create() methods renamed to MCOperand::create(). llvm-svn: 237275	2015-05-13 18:37:00 +00:00
Sanjay Patel	a9f6d3505d	[x86] eliminate unnecessary shuffling/moves with unary scalar math ops (PR21507) Finish the job that was abandoned in D6958 following the refactoring in http://reviews.llvm.org/rL230221: 1. Uncomment the intrinsic def for the AVX r_Int instruction. 2. Add missing r_Int entries to the load folding tables; there are already tests that check these in "test/CodeGen/X86/fold-load-unops.ll", so I haven't added any more in this patch. 3. Add patterns to solve PR21507 ( https://llvm.org/bugs/show_bug.cgi?id=21507 ). So instead of this: movaps %xmm0, %xmm1 rcpss %xmm1, %xmm1 movss %xmm1, %xmm0 We should now get: rcpss %xmm0, %xmm0 And instead of this: vsqrtss %xmm0, %xmm0, %xmm1 vblendps $1, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm1[0],xmm0[1,2,3] We should now get: vsqrtss %xmm0, %xmm0, %xmm0 Differential Revision: http://reviews.llvm.org/D9504 llvm-svn: 236740	2015-05-07 15:48:53 +00:00
Sanjay Patel	f75ee4dc07	[x86] remove RCPPS and RSQRTPS intrinsic instruction definitions We don't need codegen-only intrinsic instructions for the vector forms of these instructions. This makes the reciprocal estimate instruction lowering identical to how we handle normal square roots: (V)SQRTPS / (V)SQRTPD. No existing regression tests fail with this patch. Differential Revision: http://reviews.llvm.org/D9301 llvm-svn: 236013	2015-04-28 18:48:45 +00:00
Sanjay Patel	2161c49a4e	[X86, AVX] add an exedepfix entry for vmovq == vmovlps == vmovlpd This is the AVX extension of r235014: http://llvm.org/viewvc/llvm-project?view=revision&revision=235014 Review: http://reviews.llvm.org/D8691 llvm-svn: 235210	2015-04-17 17:02:37 +00:00
Sanjay Patel	c03d93baa0	[X86] add an exedepfix entry for movq == movlps == movlpd This is a 1-line patch (with a TODO for AVX because that will affect even more regression tests) that lets us substitute the appropriate 64-bit store for the float/double/int domains. It's not clear to me exactly what the difference is between the 0xD6 (MOVPQI2QImr) and 0x7E (MOVSDto64mr) opcodes, but this is apparently the right choice. Differential Revision: http://reviews.llvm.org/D8691 llvm-svn: 235014	2015-04-15 15:47:51 +00:00
Simon Pilgrim	0184622bbc	[X86] Added SSE4.2 CRC32 memory folding patterns + tests llvm-svn: 234013	2015-04-03 14:24:40 +00:00
Simon Pilgrim	8dba5da06d	[X86][3DNow] Added 3DNow! memory folding patterns + tests llvm-svn: 234008	2015-04-03 11:50:30 +00:00
Eric Christopher	ed6a446403	Remove the need to cache the subtarget in the X86 TargetRegisterInfo classes. Use a Triple instead and simplify a lot of the querying logic to use lookups on the Triple. llvm-svn: 232071	2015-03-12 17:54:19 +00:00
Benjamin Kramer	5fbfe2ffdc	Convert push_back loops into append calls. No functionality change intended. llvm-svn: 230849	2015-02-28 13:20:15 +00:00
Benjamin Kramer	f1362f6196	ArrayRefize memory operand folding. NFC. llvm-svn: 230846	2015-02-28 12:04:00 +00:00
Benjamin Kramer	4f6ac16292	Replace std::copy with a back inserter with vector append where feasible All of the cases were just appending from random access iterators to a vector. Using insert/append can grow the vector to the perfect size directly and moves the growing out of the loop. No intended functionality change. llvm-svn: 230845	2015-02-28 10:11:12 +00:00
Bruno Cardoso Lopes	ab7afa9144	[X86][MMX] Reapply: Add MMX instructions to foldable tables Reapply r230248. Teach the peephole optimizer to work with MMX instructions by adding entries into the foldable tables. This covers folding opportunities not handled during isel. llvm-svn: 230499	2015-02-25 15:14:02 +00:00
Bruno Cardoso Lopes	32173cdf06	Revert "[X86][MMX] Add MMX instructions to foldable tables" This reverts commit r230226 since it breaks win buildbots. llvm-svn: 230248	2015-02-23 19:53:37 +00:00
Bruno Cardoso Lopes	f488e2ae69	[X86][MMX] Add MMX instructions to foldable tables Teach the peephole optimizer to work with MMX instructions by adding entries into the foldable tables. This covers folding opportunities not handled during isel. llvm-svn: 230226	2015-02-23 15:23:22 +00:00
Sanjay Patel	e951a3839a	rename variables again because these tables also deal with stores; NFC Suggestion by Simon Pilgrim llvm-svn: 229574	2015-02-17 22:38:06 +00:00
Sanjay Patel	1a20fdf36f	Add comment to explain a non-obvious setting; NFC. This is paraphrased from Simon Pilgrim's comment in: http://reviews.llvm.org/D7492 llvm-svn: 229566	2015-02-17 22:09:54 +00:00
Sanjay Patel	203ee500e9	remove function names from comments; NFC llvm-svn: 229558	2015-02-17 21:55:20 +00:00
Sanjay Patel	52f9f7c0f3	replace meaningless variable names; NFCI llvm-svn: 229549	2015-02-17 21:37:28 +00:00
Sanjay Patel	b811c1d6a5	prevent folding a scalar FP load into a packed logical FP instruction (PR22371) Change the memory operands in sse12_fp_packed_scalar_logical_alias from scalars to vectors. That's what the hardware packed logical FP instructions define: 128-bit memory operands. There are no scalar versions of these instructions...because this is x86. Generating the wrong code (folding a scalar load into a 128-bit load) is still possible using the peephole optimization pass and the load folding tables. We won't completely solve this bug until we either fix the lowering in fabs/fneg/fcopysign and any other places where scalar FP logic is created or fix the load folding in foldMemoryOperandImpl() to make sure it isn't changing the size of the load. Differential Revision: http://reviews.llvm.org/D7474 llvm-svn: 229531	2015-02-17 20:08:21 +00:00
Simon Pilgrim	31457d54f7	[X86][XOP] Enable commutation for XOP instructions Patch to allow XOP instructions (integer comparison and integer multiply-add) to be commuted. The comparison instructions sometimes require the compare mode to be flipped but the remaining instructions can use default commutation modes. This patch also sets the SSE domains of all the XOP instructions. Differential Revision: http://reviews.llvm.org/D7646 llvm-svn: 229267	2015-02-14 22:40:46 +00:00
Duncan P. N. Exon Smith	5975a703e6	X86: Canonicalize access to function attributes, NFC Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) llvm-svn: 229214	2015-02-14 01:59:52 +00:00
Simon Pilgrim	295eaad2b3	Relaxed over-zealous alignment requirement for VEX-encoded AES instructions llvm-svn: 228953	2015-02-12 20:01:03 +00:00
Simon Pilgrim	d142ab7d08	[X86][AVX2] Missing AVX2 memory folding instructions Added most of the missing vector folding patterns for AVX2 (as well as fixing the vpermpd and vpermq patterns) Differential Revision: http://reviews.llvm.org/D7492 llvm-svn: 228688	2015-02-10 13:22:57 +00:00
Simon Pilgrim	cd32254a35	[X86][XOP] Added XOP memory folding patterns + tests This patch adds the complete AMD Bulldozer XOP instruction set to the memory folding pattern tables for stack folding, etc. Note: Many of the XOP instructions have multiple table entries as it can fold loads from different sources. Differential Revision: http://reviews.llvm.org/D7484 llvm-svn: 228685	2015-02-10 12:57:17 +00:00
Craig Topper	9e71b82f40	[X86] Preserve mem refs on newly created 'Store' node instead of 'Load' node when handling store unfolding. Bug spotted by Steve King. I have no idea how to test this. llvm-svn: 228672	2015-02-10 06:29:28 +00:00
Craig Topper	f7e92f10b6	[X86] Remove unnecessary alignment checks from the load folding tables. llvm-svn: 228671	2015-02-10 05:10:50 +00:00
Sanjay Patel	a7b893d5c0	rename variable to give it some meaning; remove obvious comments; NFC llvm-svn: 228579	2015-02-09 16:30:58 +00:00
Sanjay Patel	fc54c61c56	fix comment that didn't match the code; remove unnecessary braces; NFC llvm-svn: 228578	2015-02-09 16:04:52 +00:00
Simon Pilgrim	d11b013623	Moved AVX2 vbroadcast (reg) instruction foldings under the correct grouping. NFC. llvm-svn: 228526	2015-02-08 17:13:54 +00:00
Simon Pilgrim	a2618679a8	[X86][AVX] Added missing stack folding support + test for vptest ymm instruction llvm-svn: 228509	2015-02-07 21:44:06 +00:00
Eric Christopher	05b819718c	Reuse a bunch of cached subtargets and remove getSubtarget calls without a Function argument. llvm-svn: 227814	2015-02-02 17:38:43 +00:00
Michael Kuperstein	13fbd45263	[X86] Convert esp-relative movs of function arguments to pushes, step 2 This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win. (Re-commit of r227728) Differential Revision: http://reviews.llvm.org/D6789 llvm-svn: 227752	2015-02-01 16:56:04 +00:00
Michael Kuperstein	e86aa9a8a4	Revert r227728 due to bad line endings. llvm-svn: 227746	2015-02-01 16:15:07 +00:00
Michael Kuperstein	bd57186c76	[X86] Convert esp-relative movs of function arguments to pushes, step 2 This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win. Differential Revision: http://reviews.llvm.org/D6789 llvm-svn: 227728	2015-02-01 11:44:44 +00:00
Simon Pilgrim	43fbaada8e	Removed SSE lane blend findCommutedOpIndices overrides. NFCI. The default op indices from TargetInstrInfo::findCommutedOpIndices are being commuted so we don't need to do this. llvm-svn: 227689	2015-01-31 15:16:30 +00:00
Reid Kleckner	a580b6ec67	Win64: Put a REX_W prefix on all TAILJMP* instructions MSDN's x64 software conventions page says that this is one of the fixed list of legal epilogues: https://msdn.microsoft.com/en-us/library/tawsa7cb.aspx Presumably this is how the unwinder distinguishes epilogue jumps from in-function control flow. Also normalize the way we place "## TAILCALL" comments on such jumps. llvm-svn: 227611	2015-01-30 21:03:31 +00:00
Simon Pilgrim	0629ba1ad9	[X86][SSE] Float comparisons can sometimes be safely commuted For ordered, unordered, equal and not-equal tests, packed float and double comparison instructions can be safely commuted without affecting the results. This patch checks the comparison mode of the (v)cmpps + (v)cmppd instructions and commutes the result if it can. Differential Revision: http://reviews.llvm.org/D7178 llvm-svn: 227145	2015-01-26 22:29:24 +00:00
Simon Pilgrim	9b7c00352d	[X86][PCLMUL] Enable commutation for PCLMUL instructions Patch to allow (v)pclmulqdq to be commuted - swaps the src registers and inverts the immediate (low/high) src mask. Differential Revision: http://reviews.llvm.org/D7180 llvm-svn: 227141	2015-01-26 22:00:18 +00:00
Simon Pilgrim	7e6d573e87	[X86][AVX] Added (V)MOVDDUP / (V)MOVSLDUP / (V)MOVSHDUP memory folding + tests. Minor tweak now that D7042 is complete, we can enable stack folding for (V)MOVDDUP and do proper testing. Added missing AVX ymm folding patterns and fixed alignment for AVX VMOVSLDUP / VMOVSHDUP. llvm-svn: 226873	2015-01-22 22:39:59 +00:00
Simon Pilgrim	5fa0fb23ca	[X86][SSE] Missing SSE/AVX1 memory folding integer instructions Added most of the missing integer vector folding patterns for SSE (to SSE42) and AVX1. The most useful of these are probably the i32/i64 extraction, i8/i16/i32/i64 insertions, zero/sign extension, unsigned saturation subtractions, i64 subtractions and the variable mask blends (pblendvb) - others include CLMUL, SSE42 string comparisons and bit tests. Differential Revision: http://reviews.llvm.org/D7094 llvm-svn: 226745	2015-01-21 23:43:30 +00:00
Simon Pilgrim	20bc37c7db	[X86][AVX] Missing AVX1 memory folding float instructions Now that we can create much more exhaustive X86 memory folding tests, this patch adds the missing AVX1/F16C floating point instruction stack foldings we can easily test for including the scalar intrinsics (add, div, max, min, mul, sub), conversions float/int to double, half precision conversions, rounding, dot product and bit test. The patch also adds a couple of obviously missing SSE instructions (more to follow once we have full SSE testing). Now that scalar folding is working it broke a very old test (2006-10-07-ScalarSSEMiscompile.ll) - this test appears to make no sense as it's trying to ensure that a scalar subtraction isn't folded as it 'would zero the top elts of the loaded vector' - this test just appears to be wrong to me. Differential Revision: http://reviews.llvm.org/D7055 llvm-svn: 226513	2015-01-19 22:40:45 +00:00
JF Bastien	eeea8970b4	Revert "Insert random noops to increase security against ROP attacks (llvm)" This reverts commit: http://reviews.llvm.org/D3392 llvm-svn: 225948	2015-01-14 05:24:33 +00:00
JF Bastien	dcdd5ad252	Insert random noops to increase security against ROP attacks (llvm) A pass that adds random noops to X86 binaries to introduce diversity with the goal of increasing security against most return-oriented programming attacks. Command line options: -noop-insertion // Enable noop insertion. -noop-insertion-percentage=X // X% of assembly instructions will have a noop prepended (default: 50%, requires -noop-insertion) -max-noops-per-instruction=X // Randomly generate X noops per instruction. ie. roll the dice X times with probability set above (default: 1). This doesn't guarantee X noop instructions. In addition, the following 'quick switch' in clang enables basic diversity using default settings (currently: noop insertion and schedule randomization; it is intended to be extended in the future). -fdiversify This is the llvm part of the patch. clang part: D3393 http://reviews.llvm.org/D3392 Patch by Stephen Crane (@rinon) llvm-svn: 225908	2015-01-14 01:07:26 +00:00
Craig Topper	39354e1b1a	[X86] Merge a switch statement inside a default case of another switch statement on the same variable. There was no additional code in the default so this should be no functional change. llvm-svn: 225345	2015-01-07 08:10:38 +00:00
Craig Topper	ddbf51f904	[X86] Make isel select the 2-byte register form of INC/DEC even in non-64-bit mode. Convert to the 1-byte form in non-64-bit mode as part of MCInst lowering. Overall this seems simpler. It reduces duplication of patterns between both modes and it simplifies the memory folding/unfolding tables as they don't need to create fake instructions just to keep track of 64-bitness. llvm-svn: 225252	2015-01-06 07:35:50 +00:00
Craig Topper	49758aab94	[X86] Make isel select the shorter form of jump instructions instead of the long form. The assembler backend will relax to the long form if necessary. This removes a swap from long form to short form in the MCInstLowering code. Selecting the long form used to be required by the old JIT. llvm-svn: 225242	2015-01-06 04:23:53 +00:00
Michael Kuperstein	683c3cde43	[X86] Add missing memory variants to AVX false dependency breaking Adds missing memory instruction variants to AVX false dependency breaking handling. (SSE was handled in r224246) Differential Revision: http://reviews.llvm.org/D6780 llvm-svn: 224900	2014-12-28 13:15:05 +00:00
Robert Khasanov	79fb7292d7	[AVX512] Enable FP arithmetic lowering for AVX512VL subsets. Added RegOp2MemOpTable4 to transform 4th operand from register to memory in merge-masked versions of instructions. Added lowering tests. llvm-svn: 224516	2014-12-18 12:28:22 +00:00
Simon Pilgrim	bf1e079005	[X86][SSE] Vector double -> float conversion memory folding (cvtpd2ps) Added a missing memory folding relationship for the (V)CVTPD2PS instruction - we can safely fold these for stack reloads. Differential Revision: http://reviews.llvm.org/D6663 llvm-svn: 224383	2014-12-16 22:30:10 +00:00
Michael Kuperstein	47c97157ef	[X86] Break false dependencies before partial register updates when the source operand is in memory Adds the various "rm" instruction variants into the list of instructions that have a partial register update. Also adds all variants of SQRTSD that were missing in the original list. Differential Revision: http://reviews.llvm.org/D6620 llvm-svn: 224246	2014-12-15 13:18:21 +00:00
Robert Khasanov	8e8c39963d	[AVX512] Added lowering for VBROADCASTSS/SD instructions. Lowering patterns were written through avx512_broadcast_pat multiclass as pattern generates VBROADCAST and COPY_TO_REGCLASS nodes. Added lowering tests. llvm-svn: 223804	2014-12-09 18:45:30 +00:00
Michael Liao	5bf9578ce4	[X86] Clean up whitespace as well as minor coding style llvm-svn: 223339	2014-12-04 05:20:33 +00:00
Simon Pilgrim	9c1e4123f8	[X86][AVX] 256-bit vector stack unaligned load/stores identification Under many circumstances the stack is not 32-byte aligned, resulting in the use of the vmovups/vmovupd/vmovdqu instructions when inserting ymm reloads/spills. This minor patch adds these instructions to the isFrameLoadOpcode/isFrameStoreOpcode helpers so that they can be correctly identified and not be treated as folded reloads/spills. This has also been noticed by http://llvm.org/bugs/show_bug.cgi?id=18846 where it was causing redundant spills - I've added a reduced test case at test/CodeGen/X86/pr18846.ll Differential Revision: http://reviews.llvm.org/D6252 llvm-svn: 222281	2014-11-18 23:38:19 +00:00
Tom Roeder	eb7a303d1b	Add Forward Control-Flow Integrity. This commit adds a new pass that can inject checks before indirect calls to make sure that these calls target known locations. It supports three types of checks and, at compile time, it can take the name of a custom function to call when an indirect call check fails. The default failure function ignores the error and continues. This pass incidentally moves the function JumpInstrTables::transformType from private to public and makes it static (with a new argument that specifies the table type to use); this is so that the CFI code can transform function types at call sites to determine which jump-instruction table to use for the check at that site. Also, this removes support for jumptables in ARM, pending further performance analysis and discussion. Review: http://reviews.llvm.org/D4167 llvm-svn: 221708	2014-11-11 21:08:02 +00:00
Simon Pilgrim	615ab8e721	[X86][SSE] Vector integer/float conversion memory folding (cvttps2dq / cvttpd2dq) Fixed an issue with the (v)cvttps2dq and (v)cvttpd2dq instructions being incorrectly put in the 2 source operand folding tables instead of the 1 source operand and added the missing SSE/AVX versions. Also added missing (v)cvtps2dq and (v)cvtpd2dq instructions to the folding tables. Differential Revision: http://reviews.llvm.org/D6001 llvm-svn: 221489	2014-11-06 22:15:41 +00:00
Andrea Di Biagio	7ecd22ca4a	[X86] When commuting SSE immediate blend, make sure that the new blend mask is a valid imm8. Example: define <4 x i32> @test(<4 x i32> %a, <4 x i32> %b) { %shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 4, i32 5, i32 6, i32 3> ret <4 x i32> %shuffle } Before llc (-mattr=+sse4.1), produced the following assembly instruction: pblendw $4294967103, %xmm1, %xmm0 After pblendw $63, %xmm1, %xmm0 llvm-svn: 221455	2014-11-06 14:36:45 +00:00
Simon Pilgrim	1fc483d991	[X86][SSE] Vector integer to float conversion memory folding Added missing memory folding for the (V)CVTDQ2PS instructions - we can safely fold these (but not the (V)CVTDQ2PD versions which have a register/memory size discrepancy in the source operand). I've added a test case demonstrating that stack folding now works. Differential Revision: http://reviews.llvm.org/D5981 llvm-svn: 221407	2014-11-05 22:28:25 +00:00
Simon Pilgrim	c9a0779309	[X86][SSE] Enable commutation for SSE immediate blend instructions Patch to allow (v)blendps, (v)blendpd, (v)pblendw and vpblendd instructions to be commuted - swaps the src registers and inverts the blend mask. This is primarily to improve memory folding (see new tests), but it also improves the quality of shuffles (see modified tests). Differential Revision: http://reviews.llvm.org/D6015 llvm-svn: 221313	2014-11-04 23:25:08 +00:00
Reid Kleckner	da00cf5f73	Work around bugs in MSVC "14" CTP 3's conversion logic It appears to ignore or find ambiguous MachineInstrBuilder's conversion operators that allow conversion to MachineInstr* and MachineBasicBlock::bundle_iterator. As a workaround, add an explicit way to get the MachineInstr. llvm-svn: 221017	2014-10-31 23:19:46 +00:00
Robert Khasanov	1cf354c92f	[AVX512] Fix VSQRT packed instructions internal names. No functional change llvm-svn: 220808	2014-10-28 18:22:41 +00:00
Simon Pilgrim	a63672665f	[X86][SSE] Vector integer/float conversion memory folding Tidied up some entries in the folding tables so that they are under the correct comment section (they were categorised as AVX2 instructions when they're AVX1). Minor patch agreed with qcolombet. llvm-svn: 220613	2014-10-25 08:11:20 +00:00
Simon Pilgrim	2f9548a3ef	[X86] Memory folding for commutative instructions (updated) This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning. Updated version of r219584 (reverted in r219595) - the commutation attempt now explicitly ensures that neither of the commuted source operands are tied to the destination operand / register, which was the source of all the regressions that occurred with the original patch attempt. Added additional regression test case provided by Joerg Sonnenberger. Differential Revision: http://reviews.llvm.org/D5818 llvm-svn: 220239	2014-10-20 22:14:22 +00:00
NAKAMURA Takumi	75a0240056	Revert r219584, "[X86] Memory folding for commutative instructions." It broke i686 selfhosting. llvm-svn: 219595	2014-10-13 04:17:34 +00:00
Simon Pilgrim	77ac26d279	[X86] Memory folding for commutative instructions. This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning. This mainly helps the stack inliner better fold reloads of 3 (or more) operand instructions (VEX encoded SSE etc.) but by performing this in the lowest foldMemoryOperandImpl implementation it also replaces the X86InstrInfo::optimizeLoadInstr version and is now used by FastISel too. Differential Revision: http://reviews.llvm.org/D5701 llvm-svn: 219584	2014-10-12 10:52:55 +00:00
Chandler Carruth	0927da4583	[x86] Remove the 2-addr-to-3-addr "optimization" from shufps to pshufd. This trades a (register-renamer-friendly) movaps for a floating point / integer domain cross. That is a very bad trade, even on architectures where domain crossing is relatively fast. On any chip where there is even a cycle stall, this is a Very Bad Idea. It doesn't even seem likely to cause a spill to be introduced because the reason for the copy is to destructively shuffle in place. Thanks to Ben Kramer for fixing a bug in this code that my new shuffle lowering exposed and highlighting that perhaps it should just go away. =] llvm-svn: 219090	2014-10-05 22:57:31 +00:00
Benjamin Kramer	77b0e13aba	X86: Don't drop half of the mask when converting 2-address shufps into 3-address pshufd. It's debatable whether this transform is useful at all, but for now make sure we don't generate invalid asm. llvm-svn: 219084	2014-10-05 16:14:29 +00:00
Robert Khasanov	6d62c0202b	[AVX512] Added load/store from BW/VL subsets to Register2Memory opcode tables. Added lowering tests for these instructions. llvm-svn: 218508	2014-09-26 09:48:50 +00:00
Pavel Chupin	be9f12102f	[x32] Fix segmented stacks support Summary: Update segmented-stacks*.ll tests with x32 target case and make corresponding changes to make them pass. Test Plan: tests updated with x32 target Reviewers: nadav, rafael, dschuff Subscribers: llvm-commits, zinovy.nis Differential Revision: http://reviews.llvm.org/D5245 llvm-svn: 218247	2014-09-22 13:11:35 +00:00
Akira Hatanaka	760814a7e1	[X86] Fix a bug in X86's peephole optimization. Peephole optimization was folding MOVSDrm, which is a zero-extending double precision floating point load, into ADDPDrr, which is a SIMD add of two packed double precision floating point values. (before) %vreg21<def> = MOVSDrm <fi#0>, 1, %noreg, 0, %noreg; mem:LD8[%7](align=16)(tbaa=<badref>) VR128:%vreg21 %vreg23<def,tied1> = ADDPDrr %vreg20<tied0>, %vreg21; VR128:%vreg23,%vreg20,%vreg21 (after) %vreg23<def,tied1> = ADDPDrm %vreg20<tied0>, <fi#0>, 1, %noreg, 0, %noreg; mem:LD8[%7](align=16)(tbaa=<badref>) VR128:%vreg23,%vreg20 X86InstrInfo::foldMemoryOperandImpl already had the logic that prevented this from happening. However the check wasn't being conducted for loads from stack objects. This commit factors out the logic into a new function and uses it for checking loads from stack slots are not zero-extending loads. rdar://problem/18236850 llvm-svn: 217799	2014-09-15 18:23:52 +00:00
Eric Christopher	79cc1e3ae7	Reinstate "Nuke the old JIT." Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reinstates commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 216982	2014-09-02 22:28:02 +00:00
Eric Christopher	b9fd9ed37e	Temporarily Revert "Nuke the old JIT." as it's not quite ready to be deleted. This will be reapplied as soon as possible and before the 3.6 branch date at any rate. Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reverts commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 215154	2014-08-07 22:02:54 +00:00
Rafael Espindola	f8b27c41e8	Nuke the old JIT. I am sure we will be finding bits and pieces of dead code for years to come, but this is a good start. Thanks to Lang Hames for making MCJIT a good replacement! llvm-svn: 215111	2014-08-07 14:21:18 +00:00
Robert Khasanov	3c30c4bdec	[AVX512] Added load/store instructions to Register2Memory opcode tables. Added lowering tests for load/store. Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com> llvm-svn: 214972	2014-08-06 15:40:34 +00:00
JF Bastien	ac8b66b32c	Fix typos in comments and doc Committing http://reviews.llvm.org/D4798 for Robin Morisset (morisset@google.com) llvm-svn: 214934	2014-08-05 23:27:34 +00:00
Eric Christopher	d913448b38	Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. No functional change. llvm-svn: 214781	2014-08-04 21:25:23 +00:00
Robert Khasanov	7ca7df0bf9	[SKX] Enabling load/store instructions: encoding Instructions: VMOVAPD, VMOVAPS, VMOVDQA8, VMOVDQA16, VMOVDQA32,VMOVDQA64, VMOVDQU8, VMOVDQU16, VMOVDQU32,VMOVDQU64, VMOVUPD, VMOVUPS, Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com> llvm-svn: 214719	2014-08-04 14:35:15 +00:00
Akira Hatanaka	e5b6e0d231	[stack protector] Fix a potential security bug in stack protector where the address of the stack guard was being spilled to the stack. Previously the address of the stack guard would get spilled to the stack if it was impossible to keep it in a register. This patch introduces a new target independent node and pseudo instruction which gets expanded post-RA to a sequence of instructions that load the stack guard value. Register allocator can now just remat the value when it can't keep it in a register. <rdar://problem/12475629> llvm-svn: 213967	2014-07-25 19:31:34 +00:00
Robert Khasanov	74acbb7767	[SKX] Enabling mask instructions: encoding, lowering KMOVB, KMOVW, KMOVD, KMOVQ, KNOTB, KNOTW, KNOTD, KNOTQ Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com> llvm-svn: 213757	2014-07-23 14:49:42 +00:00
Akira Hatanaka	7cc27649a6	[X86] Mark pseudo instruction TEST8ri_NOREX as hasSideEffects=0. Also, add a case clause in X86InstrInfo::shouldScheduleAdjacent to enable macro-fusion. <rdar://problem/15680770> llvm-svn: 212747	2014-07-10 18:00:53 +00:00
Juergen Ributzka	6ef06f9159	[FastISel][X86] Optimize selects when the condition comes from a compare. Optimize the select instructions sequence to use the EFLAGS directly from a compare when possible. llvm-svn: 211543	2014-06-23 21:55:36 +00:00
Juergen Ributzka	2da1bbc113	[FastISel][X86] Refactor the code to get the X86 condition from a helper function. NFC. Make use of helper functions to simplify the branch and compare instruction selection in FastISel. Also add test cases for compare and conditional branch. llvm-svn: 211077	2014-06-16 23:58:24 +00:00
Eric Christopher	6c786a1dd1	Remove the use of TargetMachine from X86InstrInfo. llvm-svn: 210596	2014-06-10 22:34:31 +00:00
Eric Christopher	1f8ad4f4a7	Move X86RegisterInfo away from using the TargetMachine and only using the subtarget. llvm-svn: 210595	2014-06-10 22:34:28 +00:00
Tom Roeder	44cb65fff1	Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute. It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables. This also adds backend support for generating the jump-instruction tables on ARM and X86. Note that since the jumptable attribute creates a second function pointer for a function, any function marked with jumptable must also be marked with unnamed_addr. llvm-svn: 210280	2014-06-05 19:29:43 +00:00
Nick Lewycky	0a9a866ce1	Fix a use of uninitialized value. OldCC is set when IsCmpZero \|\| IsSwapped and read when ShouldUpdateCC \|\| IsSwapped, and ShouldUpdateCC is independent. Fixes PR19932, but no test since I wasn't able to get any symptoms to appear, not even with valgrind and the testcase from the PR. It's clear what happened from inspection of the code. llvm-svn: 210168	2014-06-04 07:45:54 +00:00
Eric Christopher	0d5c99eb08	Avoid using subtarget features when adding X86 specific passes to the pass pipeline. llvm-svn: 209382	2014-05-22 01:46:02 +00:00
Eric Christopher	463b84b48b	Rename createGlobalBaseRegPass -> createX86GlobalBaseRegPass to make it obvious that it's a target specific pass. llvm-svn: 209380	2014-05-22 01:45:57 +00:00
Alexey Volkov	6226de6721	[X86] Tune LEA usage for Silvermont According to Intel Software Optimization Manual on Silvermont in some cases LEA is better to be replaced with ADD instructions: "The rule of thumb for ADDs and LEAs is that it is justified to use LEA with a valid index and/or displacement for non-destructive destination purposes (especially useful for stack offset cases), or to use a SCALE. Otherwise, ADD(s) are preferable." Differential Revision: http://reviews.llvm.org/D3826 llvm-svn: 209198	2014-05-20 08:55:50 +00:00
Benjamin Kramer	594f963ea6	X86: If we have an instruction that sets a flag and a zero test on the input of that instruction try to eliminate the test. For example tzcntl %edi, %ebx testl %edi, %edi je .label can be rewritten into tzcntl %edi, %ebx jb .label A minor complication is that tzcnt sets CF instead of ZF when the input is zero, we have to rewrite users of the flags from ZF to CF. Currently we recognize patterns using lzcnt, tzcnt and popcnt. Differential Revision: http://reviews.llvm.org/D3454 llvm-svn: 208788	2014-05-14 16:14:45 +00:00
Craig Topper	646f64f04a	Use X86 memory operand enums instead of hardcoding. llvm-svn: 208064	2014-05-06 07:04:32 +00:00
Craig Topper	062a2baef0	[C++] Use 'nullptr'. Target edition. llvm-svn: 207197	2014-04-25 05:30:21 +00:00
Chandler Carruth	d174b72a28	[cleanup] Lift using directives, DEBUG_TYPE definitions, and even some system headers above the includes of generated '.inc' files that actually contain code. In a few targets this was already done pretty consistently, but it wasn't done really consistently anywhere. It is strictly cleaner IMO and necessary in a bunch of places where the DEBUG_TYPE is referenced from the generated code. Consistency with the necessary places trumps. Hopefully the build bots are OK with the movement of intrin.h... llvm-svn: 206838	2014-04-22 02:03:14 +00:00
Chandler Carruth	e96dd8975f	[Modules] Make Support/Debug.h modular. This requires it to not change behavior based on other files defining DEBUG_TYPE, which means it cannot define DEBUG_TYPE at all. This is actually better IMO as it forces folks to define relevant DEBUG_TYPEs for their files. However, it requires all files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't already. I've updated all such files in LLVM and will do the same for other upstream projects. This still leaves one important change in how LLVM uses the DEBUG_TYPE macro going forward: we need to only define the macro after header files have been #include-ed. Previously, this wasn't possible because Debug.h required the macro to be pre-defined. This commit removes that. By defining DEBUG_TYPE after the includes two things are fixed: - Header files that need to provide a DEBUG_TYPE for some inline code can do so by defining the macro before their inline code and undef-ing it afterward so the macro does not escape. - We no longer have rampant ODR violations due to including headers with different DEBUG_TYPE definitions. This may be mostly an academic violation today, but with modules these types of violations are easy to check for and potentially very relevant. Where necessary to support headers with DEBUG_TYPE, I have moved the definitions below the includes in this commit. I plan to move the rest of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big enough. The comments in Debug.h, which were hilariously out of date already, have been updated to reflect the recommended practice going forward. llvm-svn: 206822	2014-04-21 22:55:11 +00:00
Lang Hames	c59a2d0529	[X86] As per suggestion from Craig Topper and Hal Finkel, override TargetInstrInfo::findCommutedOpIndices to enable VFMA*231 commutation, rather than abusing commuteInstruction. Thanks very much for the suggestion guys! llvm-svn: 205489	2014-04-02 23:57:49 +00:00
Lang Hames	c2c751312e	[X86] Make the VFMA*231 variants commutable and relax the alignment restrictions on FMA3 memory operands. FMA3 instructions are VEX encoded, so they can load from unaligned memory. Testcase to follow, along with related patch. <rdar://problem/16478629> llvm-svn: 205472	2014-04-02 22:06:16 +00:00
Elena Demikhovsky	bb2f6b72d3	AVX-512: Implemented masking for integer arithmetic & logic instructions. By Robert Khasanov rob.khasanov@gmail.com llvm-svn: 204906	2014-03-27 09:45:08 +00:00
Quentin Colombet	6f12ae0d5c	[X86] Add broadcast instructions to the table used by ExeDepsFix pass. Adds the different broadcast instructions to the ReplaceableInstrsAVX2 table. That way the ExeDepsFix pass can take better decisions when AVX2 broadcasts are across domain (int <-> float). In particular, prior to this patch we were generating: vpbroadcastd LCPI1_0(%rip), %ymm2 vpand %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 ## <- domain change penalty Now, we generate the following nice sequence where everything is in the float domain: vbroadcastss LCPI1_0(%rip), %ymm2 vandps %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 <rdar://problem/16354675> llvm-svn: 204770	2014-03-26 00:10:22 +00:00
Owen Anderson	16c6bf49b7	Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changing operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost behave like normal iterators! Again, this requires making some existing loops more verbose, but should pave the way for the big range-based for-loop cleanups in the future. llvm-svn: 203865	2014-03-13 23:12:04 +00:00
Craig Topper	2d9361e325	[C++11] Add 'override' keyword to virtual methods that override their base class. llvm-svn: 203378	2014-03-09 07:44:38 +00:00
Elena Demikhovsky	8fae565f08	AVX-512: fixed compressed displacement - by Robert Khasanov llvm-svn: 203096	2014-03-06 08:15:35 +00:00
Benjamin Kramer	b6d0bd48bd	[C++11] Replace llvm::next and llvm::prior with std::next and std::prev. Remove the old functions. llvm-svn: 202636	2014-03-02 12:27:27 +00:00
Lang Hames	5ec150c967	Replace X86 FMA intrinsic pseudo-instructions with def pats. It looks like these pseudos were only used for pattern matching. Def pats are the appropriate way to do that. As a bonus, these intrinsics will now have memory operands folded properly, and better FMA3 variants selected where appropriate (see r199933). <rdar://problem/15611947> llvm-svn: 200577	2014-01-31 21:29:19 +00:00
Elena Demikhovsky	a5d38a39a0	AVX-512: added VPERM2D VPERM2Q VPERM2PS VPERM2PD instructions, they give better sequences than VPERMI llvm-svn: 199893	2014-01-23 14:27:26 +00:00
Elena Demikhovsky	172a27c750	AVX-512: Added more intrinsics for pmin/pmax, pabs, blend, pmuldq. llvm-svn: 198745	2014-01-08 10:54:22 +00:00
Craig Topper	854f644781	Handle MOV32r0 in expandPostRAPseudo instead of MCInst lowering. No functional change intended. llvm-svn: 198254	2013-12-31 03:05:38 +00:00
Elena Demikhovsky	47fc44e52e	AVX-512: Added legal type MVT::i1 and VK1 register for it. Added scalar compare VCMPSS, VCMPSD. Implemented LowerSELECT for scalar FP operations. I replaced FSETCCss, FSETCCsd with one node type FSETCCs. Node extract_vector_elt(v16i1/v8i1, idx) returns an element of type i1. llvm-svn: 197384	2013-12-16 13:52:35 +00:00
Elena Demikhovsky	6270b388c8	AVX-512: Changed intrinsics of VPCONFLICT to match GCC builtin form llvm-svn: 196914	2013-12-10 11:58:35 +00:00
Lang Hames	39609996d9	Refactor a lot of patchpoint/stackmap related code to simplify and make it target independent. Most of the x86 specific stackmap/patchpoint handling was necessitated by the use of the native address-mode format for frame index operands. PEI has now been modified to treat stackmap/patchpoint similarly to DEBUG_INFO, allowing us to use a simple, platform independent register/offset pair for frame indexes on stackmap/patchpoints. Notes: - Folding is now platform independent and automatically supported. - Emitting patchpoints with direct memory references now just involves calling the TargetLoweringBase::emitPatchPoint utility method from the target's XXXTargetLowering::EmitInstrWithCustomInserter method. (See X86TargetLowering for an example). - No more ugly platform-specific operand parsers. This patch shouldn't change the generated output for X86. llvm-svn: 195944	2013-11-29 03:07:54 +00:00
Andrew Trick	391dbadb51	StackMap: Implement support for DirectMemRefOp. A Direct stack map location records the address of frame index. This address is itself the value that the runtime requested. This differs from IndirectMemRefOp locations, which refer to stack locations from which the requested values must be loaded. Direct locations can directly communicate the address of an alloca, while IndirectMemRefOp handles register spills. For example: entry: %a = alloca i64... llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a) Since both the alloca and stackmap intrinsic are in the entry block, and the intrinsic takes the address of the alloca, the runtime can assume that LLVM will not substitute alloca with any intervening value. This must be verified by the runtime by checking that the stack map's location is a Direct location type. The runtime can then determine the alloca's relative location on the stack immediately after compilation, or at any time thereafter. This differs from Register and Indirect locations, because the runtime can only read the values in those locations when execution reaches the instruction address of the stack map. llvm-svn: 195712	2013-11-26 02:03:25 +00:00
Andrew Trick	0ab5ba8c35	Use symbolic operands in the patchpoint folding routine and fix a spilling bug. Fixes <rdar://15487687> [JS] AnyRegCC argument ends up being spilled llvm-svn: 195094	2013-11-19 03:29:59 +00:00
Juergen Ributzka	d12ccbd343	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. The memory leaks in this version have been fixed. Thanks Alexey for pointing them out. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 195064	2013-11-19 00:57:56 +00:00
Alexey Samsonov	49109a279c	Revert r194865 and r194874. This change is incorrect. If you delete virtual destructor of both a base class and a subclass, then the following code: Base *foo = new Child(); delete foo; will not call the destructor for members of Child class. As a result, I observe plenty of memory leaks. Notable examples I investigated are: ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl. llvm-svn: 194997	2013-11-18 09:31:53 +00:00
Andrew Trick	10d5be4e6e	Added a size field to the stack map record to handle subregister spills. Implementing this on bigendian platforms could get strange. I added a target hook, getStackSlotRange, per Jakob's recommendation to make this as explicit as possible. llvm-svn: 194942	2013-11-17 01:36:23 +00:00
Lang Hames	24e3954700	During folding for patchpoint/stackmap instructions, defer creation of new MIs until we know that folding will be successful. No functional change. llvm-svn: 194880	2013-11-15 23:13:21 +00:00
Juergen Ributzka	dbedae89b9	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 194865	2013-11-15 22:34:48 +00:00
Elena Demikhovsky	0a74b7da35	AVX-512: Handled extractelement from mask vector; Added VMOVSHDUP/VMOVSLDUP shuffle instructions. llvm-svn: 194691	2013-11-14 11:29:27 +00:00
Andrew Trick	0ef482ef02	Cleanup the stackmap operand folding code and fix a corner case. I still don't know how to refer to the fixed operands symbolically. I plan to look into it. llvm-svn: 194529	2013-11-12 22:58:39 +00:00
Andrew Trick	3112a5e4c0	Simplify operand folding when rematerializing a load. We already know how to fold a reload from a frameindex without analyzing the load instruction. Generalize this to handle any frameindex load. This streamlines the logic for rematerializing loads from stack arguments. As a side effect, it allows stackmaps to record a stack argument location without spilling it. Verified no effect on codegen for llvm test-suite. llvm-svn: 194497	2013-11-12 18:06:12 +00:00
Andrew Trick	a28099fdd4	Fix the recently added anyregcc convention to handle spilled operands. Fixes <rdar://15432754> [JS] Assertion: "Folded a def to a non-store!" The primary purpose of anyregcc is to prevent a patchpoint's call arguments and return value from being spilled. They must be available in a register, although the calling convention does not pin the register. It's up to the front end to avoid using this convention for calls with more arguments than allocatable registers. llvm-svn: 194428	2013-11-11 22:40:25 +00:00
Juergen Ributzka	9969d3e6e8	[Stackmap] Add AnyReg calling convention support for patchpoint intrinsic. The idea of the AnyReg Calling Convention is to provide the call arguments in registers, but not to force them to be placed in a particular order into a specified set of registers. Instead it is up to the register allocator to assign any register as it sees fit. The same applies to the return value (if applicable). Differential Revision: http://llvm-reviews.chandlerc.com/D2009 Reviewed by Andy llvm-svn: 194293	2013-11-08 23:28:16 +00:00
Andrew Trick	153ebe6d2a	Add support for stack map generation in the X86 backend. Originally implemented by Lang Hames. llvm-svn: 193811	2013-10-31 22:11:56 +00:00
Craig Topper	f7290f7194	Replace (V)MOVZDI2PDIrr/rm instructions with patterns that select (V)MOVDI2PDIrr/rm. llvm-svn: 193146	2013-10-22 04:35:20 +00:00
Andrew Trick	b6d56be69d	Fix the ExecutionDepsFix pass to handle AVX instructions. This pass is needed to break false dependencies. Without it, unlucky register assignment can result in wild (5x) swings in performance. This pass was trying to handle AVX but not getting it right. AVX doesn't have partial register defs, it has unused register reads in which the high bits of a source operand are copied into the unused bits of the dest. Fixing this requires conservative liveness analysis. This is awkward because the pass already has its own pseudo-liveness. However, proper liveness is expensive, and we would like to use a generic utility to compute it. The fix only invokes liveness on-demand. It is rare to detect a case that needs undef-read dependence breaking, but when it happens, it can be needed many times within a very large block. I think the existing heuristic which uses a register window of 16 is too conservative for loop-carried false dependencies. If the loop is a reduction, the out-of-order engine may be able to execute several loop iterations in parallel. However, I'll leave this tuning exercise for next time. llvm-svn: 192635	2013-10-14 22:19:03 +00:00
Andrew Trick	8460a3bfa1	whitespace llvm-svn: 192633	2013-10-14 22:18:56 +00:00
Craig Topper	68d2546ec6	Remove FsMOVAPSrr and friends. They have no patterns and are no longer selected anywhere. llvm-svn: 192089	2013-10-07 06:10:45 +00:00
Benjamin Kramer	858a3880d6	X86: Don't fold spills into SSE operations if the stack is unaligned. Regalloc can emit unaligned spills nowadays, but we can't fold the spills into SSE ops if we can't guarantee alignment. PR12250. llvm-svn: 192064	2013-10-06 13:48:22 +00:00
Elena Demikhovsky	2e408aefe0	AVX-512: added scalar convert instructions and intrinsics. Fixed load folding in VPERM2I instruction. llvm-svn: 192063	2013-10-06 13:11:09 +00:00
Craig Topper	c81e29435a	Add TBM instructions to load folding tables. llvm-svn: 192046	2013-10-05 20:20:51 +00:00
Elena Demikhovsky	34586e7d41	AVX-512: fixed a bug in getLoadStoreRegOpcode() for AVX-512 target llvm-svn: 191818	2013-10-02 12:20:42 +00:00
Craig Topper	514f02cc07	Add AES and SHA instructions to the load folding tables. llvm-svn: 190850	2013-09-17 06:50:11 +00:00
Craig Topper	684abc8236	Fix column alignment. No functional change. llvm-svn: 190849	2013-09-17 06:05:17 +00:00
Elena Demikhovsky	402ee64f13	AVX-512: updated the list of high-latency instructions. llvm-svn: 189740	2013-09-02 07:41:01 +00:00
Elena Demikhovsky	534015e550	AVX-512: gather-scatter tests; added foldable instructions; Specify GATHER/SCATTER as heavy instructions. llvm-svn: 189736	2013-09-02 07:12:29 +00:00
Elena Demikhovsky	f8f478b19d	AVX-512: added UNPACK instructions and tests for all-zero/all-ones vectors llvm-svn: 189189	2013-08-25 12:54:30 +00:00
Elena Demikhovsky	3ce8dbbac2	AVX-512: Added VMOVD, VMOVQ, VMOVSS, VMOVSD instructions. llvm-svn: 188637	2013-08-18 13:08:57 +00:00
Elena Demikhovsky	cf5b1458e6	AVX-512: Added VPERM* instructions and MOV* zmm-to-zmm instructions. Added a test for shuffles using VPERM. llvm-svn: 188147	2013-08-11 07:55:09 +00:00
Andrew Trick	47740deb26	Add MI-Sched support for x86 macro fusion. This is an awful implementation of the target hook. But we don't have abstractions yet for common machine ops, and I don't see any quick way to make it table-driven. llvm-svn: 184664	2013-06-23 09:00:28 +00:00
David Blaikie	b735b4d6db	DebugInfo: remove target-specific Frame Index handling for DBG_VALUE MachineInstrs Frame index handling is now target-agnostic, so delete the target hooks for creation & asm printing of target-specific addressing in DBG_VALUEs and any related functions. llvm-svn: 184067	2013-06-16 20:34:27 +00:00
Tim Northover	6833e3fd75	X86: Stop LEA64_32r doing unspeakable things to its arguments. Previously LEA64_32r went through virtually the entire backend thinking it was using 32-bit registers until its blissful illusions were cruelly snatched away by MCInstLower and 64-bit equivalents were substituted at the last minute. This patch makes it behave normally, and take 64-bit registers as sources all the way through. Previous uses (for 32-bit arithmetic) are accommodated via SUBREG_TO_REG instructions which make the types and classes agree properly. llvm-svn: 183693	2013-06-10 20:43:49 +00:00
Bill Wendling	8f26840c5a	Don't cache the instruction and register info from the TargetMachine, because the internals of TargetMachine could change. No functionality change intended. llvm-svn: 183571	2013-06-07 21:00:34 +00:00
Tim Northover	339bf154cc	Revert r183069: "TMP: LEA64_32r fixing" Very sorry, it was committed from the wrong branch by mistake. llvm-svn: 183070	2013-06-01 10:23:46 +00:00
Tim Northover	57954f04b3	TMP: LEA64_32r fixing llvm-svn: 183069	2013-06-01 10:21:54 +00:00
Tim Northover	64ec0ff433	X86: use sub-register sequences for MOV*r0 operations Instead of having a bunch of separate MOV8r0, MOV16r0, ... pseudo-instructions, it's better to use a single MOV32r0 (which will expand to "xorl %reg, %reg") and obtain other sizes with EXTRACT_SUBREG and SUBREG_TO_REG. The encoding is smaller and partial register updates can sometimes be avoided. Until recently, this sequence was a barrier to rematerialization though. That should now be fixed so it's an appropriate time to make the change. llvm-svn: 182928	2013-05-30 13:19:42 +00:00
Tim Northover	04eb4234fc	X86: change zext moves to use sub-register infrastructure. 32-bit writes on amd64 zero out the high bits of the corresponding 64-bit register. LLVM makes use of this for zero-extension, but until now relied on custom MCLowering and other code to fixup instructions. Now we have proper handling of sub-registers, this can be done by creating SUBREG_TO_REG instructions at selection-time. Should be no change in functionality. llvm-svn: 182921	2013-05-30 10:43:18 +00:00
Andrew Trick	ef9de2a739	Track IR ordering of SelectionDAG nodes 2/4. Change SelectionDAG::getXXXNode() interfaces as well as call sites of these functions to pass in SDLoc instead of DebugLoc. llvm-svn: 182703	2013-05-25 02:42:55 +00:00
David Majnemer	7ea2a52a0c	X86: Remove test instructions proceeding shift by immediate instructions Allow LLVM to take advantage of shift instructions that set the ZF flag, making instructions that test the destination superfluous. llvm-svn: 182454	2013-05-22 08:13:02 +00:00
David Majnemer	5ba473afb0	X86: Bad peephole interaction between adc, MOV32r0 The peephole tries to reorder MOV32r0 instructions such that they are before the instruction that modifies EFLAGS. The problem is that the peephole does not consider the case where the instruction that modifies EFLAGS also depends on the previous state of EFLAGS. Instead, walk backwards until we find an instruction that has a def for EFLAGS but does not have a use. If we find such an instruction, insert the MOV32r0 before it. If it cannot find such an instruction, skip the optimization. llvm-svn: 182184	2013-05-18 01:02:03 +00:00
David Majnemer	8f16974273	X86: Remove redundant test instructions Increase the number of instructions LLVM recognizes as setting the ZF flag. This allows us to remove test instructions that redundantly recalculate the flag. llvm-svn: 181937	2013-05-15 22:03:08 +00:00
Michael Liao	b53d8963ce	ArrayRefize getMachineNode(). No functionality change. llvm-svn: 179901	2013-04-19 22:22:57 +00:00
Preston Gurd	d6be4bf87f	This patch is a follow-up to r178171, which uses the register form of call in preference to memory indirect on Atom. In this case, the patch applies the optimization to the code for reloading spilled registers. The patch also includes changes to sibcall.ll and movgs.ll, which were failing on the Atom buildbot after the first patch was applied. This patch by Sriram Murali. llvm-svn: 178193	2013-03-27 23:16:18 +00:00
Chandler Carruth	9fb823bbd4	Move all of the header files which are involved in modelling the LLVM IR into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366	2013-01-02 11:36:10 +00:00
Bill Wendling	698e84fc4f	Remove the Function::getFnAttributes method in favor of using the AttributeSet directly. This is in preparation for removing the use of the 'Attribute' class as a collection of attributes. That will shift to the AttributeSet class instead. llvm-svn: 171253	2012-12-30 10:32:01 +00:00
Craig Topper	fe82eb6bcd	Remove intrinsic specific instructions for (V)SQRTPS/PD. Instead lower to target-independent ISD nodes and use the existing patterns for those. llvm-svn: 171237	2012-12-29 18:18:20 +00:00
Craig Topper	6b27251a76	Remove intrinsic specific instructions for SSE/SSE2/AVX floating point max/min instructions. Lower them to target specific nodes and use those patterns instead. This also allows them to be commuted if UnsafeFPMath is enabled. llvm-svn: 171227	2012-12-29 16:44:25 +00:00
Craig Topper	81d1e596bb	Remove alignment from a bunch more VEX encoded operations in the folding tables. llvm-svn: 171082	2012-12-26 02:44:47 +00:00
Craig Topper	b2922164f0	Remove alignment from folding table for VMOVUPD as an unaligned instruction it shouldn't require alignment... llvm-svn: 171081	2012-12-26 02:14:19 +00:00
Craig Topper	d09a9af9b6	Remove alignment requirements from (V)EXTRACTPS. This instruction does 32-bit stores which aren't required to be aligned on SSE or AVX. llvm-svn: 171080	2012-12-26 01:47:12 +00:00
Craig Topper	caef1c5d86	Remove alignment requirement from VCVTSS2SD in folding tables. Reverting r171049. This instruction doesn't require alignment. llvm-svn: 171078	2012-12-26 00:35:47 +00:00
Nadav Rotem	00410ae625	VCVTSS2SD requires a strict alignment. Thanks Elena. llvm-svn: 171049	2012-12-25 03:29:18 +00:00
Nadav Rotem	dc0ad92b64	Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned. When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding tables and removes the alignment restrictions from VEX-encoded instructions. llvm-svn: 171024	2012-12-24 09:40:33 +00:00
Nadav Rotem	d5aae980cb	In some cases, due to scheduling constraints we copy the EFLAGS. The only way to read the eflags is using push and pop. If we don't adjust the stack then we run over the first frame index. This is not something that we want to do, so we have to make sure that our machine function does not copy the flags. If it does then we have to emit the prolog that adjusts the stack. rdar://12896831 llvm-svn: 170961	2012-12-21 23:48:49 +00:00
Benjamin Kramer	4669d18893	X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics This is very mechanical, no functionality change. Preparation for PR14667. llvm-svn: 170898	2012-12-21 14:04:55 +00:00
Jakob Stoklund Olesen	b159b5ff0d	Remove the explicit MachineInstrBuilder(MI) constructor. Use the version that also takes an MF reference instead. It would technically be possible to extract an MF reference from the MI as MI->getParent()->getParent(), but that would not work for MIs that are not inserted into any basic block. Given the reasonably small number of places this constructor was used at all, I preferred the compile time check to a run time assertion. llvm-svn: 170588	2012-12-19 21:31:56 +00:00
Bill Wendling	3d7b0b8ac7	Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future. llvm-svn: 170502	2012-12-19 07:18:57 +00:00
Craig Topper	f3ff6ae066	Simplify BMI ANDN matching to use patterns instead of a DAG combine. Also add ANDN to isDefConvertible. llvm-svn: 170305	2012-12-17 05:12:30 +00:00
Craig Topper	f924a58af1	Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt and lzcnt. llvm-svn: 170304	2012-12-17 05:02:29 +00:00
Craig Topper	5b08cf7736	Remove store forms of DEC/INC from isDefConvertible. Since they are stores they don't have a register def. llvm-svn: 170303	2012-12-17 04:55:07 +00:00
Craig Topper	922f10aec4	Mark MOVDQ(A/U)rm as ReMaterializable. Mark all MOVDQ(A/U) instructions as neverHasSideEffects. llvm-svn: 169477	2012-12-06 06:49:16 +00:00
Chandler Carruth	ed0881b2a6	Use the new script to sort the includes of every file under lib. Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131	2012-12-03 16:50:05 +00:00
Jakob Stoklund Olesen	9de596e650	Remove all references to TargetInstrInfoImpl. This class has been merged into its super-class TargetInstrInfo. llvm-svn: 168760	2012-11-28 02:35:17 +00:00
Manman Ren	5b4628201f	X86: do not fold load instructions such as [V]MOVS[S\|D] to other instructions when the destination register is wider than the memory load. These load instructions load from m32 or m64 and set the upper bits to zero, while the folded instructions may accept m128. rdar://12721174 llvm-svn: 168710	2012-11-27 18:09:26 +00:00
Craig Topper	3b530ea605	Remove alignments from folding tables for scalar FMA4 instructions. llvm-svn: 167366	2012-11-04 04:40:08 +00:00
Craig Topper	8cd3b07a51	Add scalar forms of FMA4 VFNMSUB/VFNMADD to folding tables. Patch from Cameron McInally. llvm-svn: 167106	2012-10-31 04:59:46 +00:00
Bill Wendling	c9b22d735a	Create enums for the different attributes. We use the enums to query whether an Attributes object has that attribute. The opaque layer is responsible for knowing where that specific attribute is stored. llvm-svn: 165488	2012-10-09 07:45:08 +00:00
Craig Topper	9384902ef1	Move expansion of SETB_C(8/16/32/64)r from MCInstLower to ExpandPostRAPseudos and mark them as pseudos in the td file. llvm-svn: 165302	2012-10-05 06:05:15 +00:00
Sylvestre Ledru	91ce36c986	Revert 'Fix a typo 'iff' => 'if''. iff is an abbreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	721cffd53a	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Bill Wendling	863bab689a	Remove the `hasFnAttr' method from Function. The hasFnAttr method has been replaced by querying the Attributes explicitly. No intended functionality change. llvm-svn: 164725	2012-09-26 21:48:26 +00:00
Michael Liao	2b425e1e24	Add SARX/SHRX/SHLX code generation support llvm-svn: 164675	2012-09-26 08:26:25 +00:00
Michael Liao	2de86af22d	Add RORX code generation support llvm-svn: 164674	2012-09-26 08:24:51 +00:00
Michael Liao	f9f7b5518a	Add MULX code generation support llvm-svn: 164673	2012-09-26 08:22:37 +00:00
Michael Liao	3237662b65	Re-work X86 code generation of atomic ops with spin-loop - Rewrite/merge pseudo-atomic instruction emitters to address the following issue: * Reduce one unnecessary load in spin-loop previously the spin-loop looks like thisMBB: newMBB: ld t1 = [bitinstr.addr] op t2 = t1, [bitinstr.val] not t3 = t2 (if Invert) mov EAX = t1 lcs dest = [bitinstr.addr], t3 [EAX is implicit] bz newMBB fallthrough -->nextMBB the 'ld' at the beginning of newMBB should be lifted out of the loop as lcs (or CMPXCHG on x86) will load the current memory value into EAX. This loop is refined as: thisMBB: EAX = LOAD [MI.addr] mainMBB: t1 = OP [MI.val], EAX LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined] JNE mainMBB sinkMBB: * Remove immopc as, so far, all pseudo-atomic instructions have all-register form only, there is no immediate operand. * Remove unnecessary attributes/modifiers in pseudo-atomic instruction td * Fix issues in PR13458 - Add comprehensive tests on atomic ops on various data types. NOTE: Some of them are turned off due to missing functionality. - Revise tests due to the new spin-loop generated. llvm-svn: 164281	2012-09-20 03:06:15 +00:00
Jan Wen Voung	4ce1d7b4f1	Add some cases to x86 OptimizeCompare to handle DEC and INC, too. While we are setting the earlier def to true, also make it live. llvm-svn: 164056	2012-09-17 22:04:23 +00:00
Craig Topper	908e685102	Mark FMA4 instructions as commutable and add them to the folding tables. llvm-svn: 163035	2012-08-31 23:10:34 +00:00
Craig Topper	7573c8f081	Add selection of RegOp2MemOpTable3 to canFoldMemoryOperand llvm-svn: 163029	2012-08-31 22:12:16 +00:00
Craig Topper	72f51c3986	Convert V_SETALLONES/AVX_SETALLONES/AVX2_SETALLONES to Post-RA pseudos. llvm-svn: 162740	2012-08-28 07:30:47 +00:00
Craig Topper	bd509eea4a	Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo. llvm-svn: 162738	2012-08-28 07:05:28 +00:00
Jakob Stoklund Olesen	7030427623	Preserve operand flags in convertToThreeAddress() by copying operands. No test case, this is a generalization of r160260. llvm-svn: 162485	2012-08-23 22:36:31 +00:00
Craig Topper	f911597494	Use a switch statement instead of a bunch of if-else checks and pull out the common function call. llvm-svn: 162428	2012-08-23 04:57:36 +00:00
Craig Topper	bab0c76674	Fix up indentation and remove a couple else's after returns. llvm-svn: 162270	2012-08-21 08:29:51 +00:00
Craig Topper	bfcfdeb563	Use uint16_t for tables of opcodes. llvm-svn: 162267	2012-08-21 08:23:21 +00:00
Craig Topper	a0cabf19f8	Fix up indentation. No functional change. llvm-svn: 162264	2012-08-21 08:17:07 +00:00
Craig Topper	4bc3e5a1bf	Add a couple llvm_unreachables. Add a message to several others. llvm-svn: 162263	2012-08-21 08:16:16 +00:00
Craig Topper	653e759046	Replace a break with llvm_unreachable in the default case of a nested switch. Condense code a bit. No functional change. llvm-svn: 162261	2012-08-21 07:32:16 +00:00
Craig Topper	b58eec4eaf	Remove FMA3 intrinsic instructions in favor of patterns. llvm-svn: 162194	2012-08-20 06:21:25 +00:00
Manman Ren	959acb106b	X86: move Int_CVTSD2SSrr, Int_CVTSI2SSrr, Int_CVTSI2SDrr, Int_CVTSS2SDrr from OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed to a memory operand. PR13576 llvm-svn: 161769	2012-08-13 18:29:41 +00:00
Manman Ren	1be131ba27	X86: enable CSE between CMP and SUB We perform the following: 1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering. 2> Modify MachineCSE to correctly handle implicit defs. 3> Convert SUB back to CMP if possible at peephole. Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled by peephole now. rdar://11873276 llvm-svn: 161462	2012-08-08 00:51:41 +00:00
Jakob Stoklund Olesen	3b9a442841	Don't scan physreg use-def chains looking for a PIC base. We can't rematerialize a PIC base after register allocation anyway, and scanning physreg use-def chains is very expensive in a function with many calls. <rdar://problem/12047515> llvm-svn: 161461	2012-08-08 00:40:47 +00:00
Manman Ren	5759d01230	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. This patch is a rework of r160919 and was tested on clang self-host on my local machine. rdar://10554090 and rdar://11873276 llvm-svn: 161152	2012-08-02 00:56:42 +00:00
Elena Demikhovsky	3cb3b0045c	Added FMA functionality to X86 target. llvm-svn: 161110	2012-08-01 12:06:00 +00:00
Manman Ren	f87dd7c01b	Revert r160920 and r160919 due to dragonegg and clang selfhost failure llvm-svn: 160927	2012-07-29 02:44:09 +00:00
Manman Ren	0fa3ab88ba	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. rdar://10554090 and rdar://11873276 llvm-svn: 160919	2012-07-28 16:48:01 +00:00
Manman Ren	32367c063b	X86 Peephole: fix PR13475 in optimizeCompare. It is possible that an instruction can use and update EFLAGS. When checking the safety, we should check the usage of EFLAGS first before declaring it is safe to optimize due to the update. llvm-svn: 160912	2012-07-28 03:15:46 +00:00
Manman Ren	d0a4ee8427	X86: remove redundant cmp against zero. Updated OptimizeCompare in peephole to remove redundant cmp against zero. We only remove Compare if CF and OF are not used. rdar://11855129 llvm-svn: 160454	2012-07-18 21:40:01 +00:00
Nadav Rotem	4968e45b9f	Fix a bug in the 3-address conversion of LEA when one of the operands is an undef virtual register. The problem is that ProcessImplicitDefs removes the definition of the register and marks all uses as undef. If we lose the undef marker then we get a register which has no def and is not marked as undef. The live interval analysis does not collect information for these virtual registers and we crash in later passes. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160260	2012-07-16 10:52:25 +00:00
Nadav Rotem	ee3552f88d	Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention. Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot. PR12782. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160230	2012-07-15 12:26:30 +00:00
Benjamin Kramer	abbfe69356	Make helper functions static. llvm-svn: 160173	2012-07-13 13:25:15 +00:00
Manman Ren	1553ce0e81	X86: Update to peephole optimization to move Movr0 before (Sub, Cmp) pair. When Movr0 is between sub and cmp, we move Movr0 before sub if it enables removal of Cmp. llvm-svn: 160066	2012-07-11 19:35:12 +00:00
Manman Ren	5f6fa428fa	X86: implement functions to analyze & synthesize CMOV|SET|Jcc getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond No functional change intended. If we want to update the condition code of CMOV|SET|Jcc, we first analyze the opcode to get the condition code, then update the condition code, finally synthesize the new opcode from the new condition code. llvm-svn: 159955	2012-07-09 18:57:12 +00:00
Manman Ren	bb36074047	X86: Fix optimizeCompare to correctly check safe condition. It is safe if EFLAGS is killed or re-defined. When we are done with the basic block, check whether EFLAGS is live-out. Do not optimize away cmp if EFLAGS is live-out. llvm-svn: 159888	2012-07-07 03:34:46 +00:00
Manman Ren	c965673707	X86: peephole optimization to remove cmp instruction For each Cmp, we check whether there is an earlier Sub which makes Cmp redundant. We handle the case where SUB operates on the same source operands as Cmp, including the case where the two source operands are swapped. llvm-svn: 159838	2012-07-06 17:36:20 +00:00
Jakob Stoklund Olesen	49e4d4b3ef	Add early if-conversion support to X86. Implement the TII hooks needed by EarlyIfConversion to create cmov instructions and estimate their latency. Early if-conversion is still not enabled by default. llvm-svn: 159695	2012-07-04 00:09:58 +00:00
Craig Topper	b6eb513c68	Remove codegen only instruction in favor of one that has the same definition. Make some pattern operands more explicit about types. llvm-svn: 159126	2012-06-25 06:16:00 +00:00
Craig Topper	fd5e6e7db1	Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns. llvm-svn: 159109	2012-06-24 07:07:16 +00:00
Craig Topper	b925230fb1	Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns. llvm-svn: 159108	2012-06-24 06:55:37 +00:00
Craig Topper	f48ec7a708	Fix build failures from r159106. llvm-svn: 159107	2012-06-24 06:08:31 +00:00
Craig Topper	3cee08ce7d	Remove intrinsic specific instructions for CVTPD2DQ. Replace with patterns. llvm-svn: 159105	2012-06-24 05:33:24 +00:00
Craig Topper	a899cc15f1	Remove intrinsic specific instructions for (V)CVTDQ2PS. Use a Pat instead. llvm-svn: 159090	2012-06-23 22:33:14 +00:00
Craig Topper	1cac50bc5e	Compress flags in X86 op folding to reduce space in static tables. llvm-svn: 159073	2012-06-23 08:01:18 +00:00
Craig Topper	431f1e7192	Remove intrinsic specific instructions for 128-bit (V)CVTDQ2PD. Replace with intrinsic patterns. Mem forms omitted because the load size is only 64-bits. llvm-svn: 159070	2012-06-23 04:23:36 +00:00
Craig Topper	11913052d6	Move AVX version of convert instructions that write to GPRs to the Op1 table. llvm-svn: 158497	2012-06-15 07:02:58 +00:00
Pete Cooper	8bbce768d8	Move X86::VCVTTSD2SIrr from the 2 operand to 1 operand MemRegOp table. Can someone with more knowledge of this please look at other entries to see if others need moved. llvm-svn: 158474	2012-06-14 22:12:58 +00:00
Manman Ren	9c9641812c	Revert r157755. The commit is intended to fix rdar://11540023. It is implemented as part of peephole optimization. We can actually implement this in the SelectionDAG lowering phase. llvm-svn: 158122	2012-06-06 23:53:03 +00:00
Benjamin Kramer	628a39faa3	Remove unused private fields found by clang's new -Wunused-private-field. There are some that I didn't remove this round because they looked like obvious stubs. There are dead variables in gtest too, they should be fixed upstream. llvm-svn: 158090	2012-06-06 18:25:08 +00:00
Craig Topper	c6ac4cefcc	Add intrinsic forms for FMA instructions to opcode folding tables. llvm-svn: 157917	2012-06-04 07:46:16 +00:00
Craig Topper	3cb143016d	Add VFMADDSUB and VFMSUBADD FMA instructions to folding tables. Also add 213 forms of scalar FMA instructions. llvm-svn: 157914	2012-06-04 07:08:21 +00:00
Manman Ren	5097e4f38a	Revert r157831 llvm-svn: 157896	2012-06-03 03:14:24 +00:00
Manman Ren	879ca9d47d	X86: peephole optimization to remove cmp instruction This patch will optimize the following: sub r1, r3 cmp r3, r1 or cmp r1, r3 bge L1 TO sub r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can eliminate the "cmp" instruction. llvm-svn: 157831	2012-06-01 19:49:33 +00:00
Hans Wennborg	789acfb63d	Implement the local-dynamic TLS model for x86 (PR3985) This implements codegen support for accesses to thread-local variables using the local-dynamic model, and adds a clean-up pass so that the base address for the TLS block can be re-used between local-dynamic access on an execution path. llvm-svn: 157818	2012-06-01 16:27:21 +00:00
Craig Topper	2e127b5274	Add VFNSUB* instructions to folding table. llvm-svn: 157802	2012-06-01 05:48:39 +00:00
Manman Ren	9bccb64e56	X86: replace SUB with CMP if possible This patch will optimize the following movq %rdi, %rax subq %rsi, %rax cmovsq %rsi, %rdi movq %rdi, %rax to cmpq %rsi, %rdi cmovsq %rsi, %rdi movq %rdi, %rax Perform this optimization if the actual result of SUB is not used. rdar: 11540023 llvm-svn: 157755	2012-05-31 17:20:29 +00:00
Elena Demikhovsky	602f3a26d6	Added FMA3 Intel instructions. I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks. I added tests for CodeGen and intrinsics. I did not change llvm.fma.f32/64 - it may be done later. llvm-svn: 157737	2012-05-31 09:20:20 +00:00
Jakob Stoklund Olesen	38dcd598f9	Make the global base reg GR32_NOSP. It can sometimes be used in addressing modes that don't support %ESP. llvm-svn: 157165	2012-05-20 18:43:00 +00:00
Jakob Stoklund Olesen	3c52f0281f	Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass(). The getPointerRegClass() hook can return register classes that depend on the calling convention of the current function (ptr_rc_tailcall). So far, we have been able to infer the calling convention from the subtarget alone, but as we add support for multiple calling conventions per target, that no longer works. Patch by Yiannis Tsiouris! llvm-svn: 156328	2012-05-07 22:10:26 +00:00
Craig Topper	abadc660e0	Convert some uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent. llvm-svn: 155186	2012-04-20 06:31:50 +00:00
Elena Demikhovsky	779a72b49e	Added VPERM optimization for AVX2 shuffles llvm-svn: 154761	2012-04-15 11:18:59 +00:00
Craig Topper	b25fda95f6	Reorder includes in Target backends to follow coding standards. Remove some superfluous forward declarations. llvm-svn: 152997	2012-03-17 18:46:09 +00:00

1009 Commits