llvm-project

Commit Graph

Author	SHA1	Message	Date
NAKAMURA Takumi	9d29eff198	Fix whitespace. llvm-svn: 124270	2011-01-26 02:03:37 +00:00
Chris Lattner	218092e68e	fix PR8981, a crash trying to form a conditional inc with a floating point compare. llvm-svn: 123560	2011-01-16 02:56:53 +00:00
Anton Korobeynikov	2f93128109	Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there. llvm-svn: 123170	2011-01-10 12:39:04 +00:00
Jakob Stoklund Olesen	2fb5b31578	Simplify a bunch of isVirtualRegister() and isPhysicalRegister() logic. These functions no longer assert when passed 0, but simply return false instead. No functional change intended. llvm-svn: 123155	2011-01-10 02:58:51 +00:00
Evan Cheng	078b0b095e	Recognize inline asm 'rev $0, $1' as a bswap intrinsic call. llvm-svn: 123048	2011-01-08 01:24:27 +00:00
Evan Cheng	a048c83fe4	Revert r122955. It seems using movups to lower memcpy can cause massive regression (even on Nehalem) in edge cases. I also didn't see any real performance benefit. llvm-svn: 123015	2011-01-07 19:35:30 +00:00
Evan Cheng	7998b1d6fe	Use movups to lower memcpy and memset even if it's not fast (like corei7). The theory is it's still faster than a pair of movq / a quad of movl. This will probably hurt older chips like P4 but should run faster on current and future Intel processors. rdar://8817010 llvm-svn: 122955	2011-01-06 07:58:36 +00:00
Evan Cheng	3ae2b79aa3	Re-implement r122936 with proper target hooks. Now getMaxStoresPerMemcpy etc. takes an option OptSize. If OptSize is true, it would return the inline limit for functions with attribute OptSize. llvm-svn: 122952	2011-01-06 06:52:41 +00:00
Benjamin Kramer	6020ed9d99	X86: Lower a select directly to a setcc_carry if possible. int test(unsigned long a, unsigned long b) { return -(a < b); } compiles to _test: ## @test cmpq %rsi, %rdi ## encoding: [0x48,0x39,0xf7] sbbl %eax, %eax ## encoding: [0x19,0xc0] ret ## encoding: [0xc3] instead of _test: ## @test xorl %ecx, %ecx ## encoding: [0x31,0xc9] cmpq %rsi, %rdi ## encoding: [0x48,0x39,0xf7] movl $-1, %eax ## encoding: [0xb8,0xff,0xff,0xff,0xff] cmovael %ecx, %eax ## encoding: [0x0f,0x43,0xc1] ret ## encoding: [0xc3] llvm-svn: 122451	2010-12-22 23:09:28 +00:00
Benjamin Kramer	f6ddc4a1de	Add some x86 specific dagcombines for conditional increments. (add Y, (sete X, 0)) -> cmp X, 1; adc 0, Y (add Y, (setne X, 0)) -> cmp X, 1; sbb -1, Y (sub (sete X, 0), Y) -> cmp X, 1; sbb 0, Y (sub (setne X, 0), Y) -> cmp X, 1; adc -1, Y for unsigned foo(unsigned a, unsigned b) { if (a == 0) b++; return b; } we now get: foo: cmpl $1, %edi movl %esi, %eax adcl $0, %eax ret instead of: foo: testl %edi, %edi sete %al movzbl %al, %eax addl %esi, %eax ret llvm-svn: 122364	2010-12-21 21:41:44 +00:00
Chris Lattner	3e5fbd74ed	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310	2010-12-21 02:38:05 +00:00
Nate Begeman	4b9db07b02	Implement feedback from Bruno on making pblendvb an x86-specific ISD node in addition to being an intrinsic, and convert lowering to use it. Hopefully the pattern fragment is doing the right thing with XMM0, looks correct in testing. llvm-svn: 122277	2010-12-20 22:04:24 +00:00
Chris Lattner	5c00d41688	now that addc/adde are gone, "ADDC" in the X86 backend uses EFLAGS results, the same as setcc. Optimize ADDC(0,0,FLAGS) -> SET_CARRY(FLAGS). This is a step towards finishing off PR5443. In the testcase in that bug we now get: movq %rdi, %rax addq %rsi, %rax sbbq %rcx, %rcx testb $1, %cl setne %dl ret instead of: movq %rdi, %rax addq %rsi, %rax movl $0, %ecx adcq $0, %rcx testq %rcx, %rcx setne %dl ret llvm-svn: 122219	2010-12-20 01:37:09 +00:00
Chris Lattner	9c26d2711b	use for loop over types. llvm-svn: 122214	2010-12-20 01:03:27 +00:00
Chris Lattner	846c20d4e6	Change the X86 backend to stop using the evil ADDC/ADDE/SUBC/SUBE nodes (which their carry dependencies with MVT::Flag operands) and use clean and beautiful EFLAGS dependences instead. We do this by changing the modelling of SBB/ADC to have EFLAGS input and outputs (which is what requires the previous scheduler change) and change X86 ISelLowering to custom lower ADDC and friends down to X86ISD::ADD/ADC/SUB/SBB nodes. With the previous series of changes, this causes no changes in the testsuite, woo. llvm-svn: 122213	2010-12-20 00:59:46 +00:00
Mon P Wang	1064992c84	Prevents PerformShuffleCombine from creating a node with an illegal type after legalize types has run, e.g., prevent creating an i64 node from a v2i64 when i64 is not a legal type. llvm-svn: 122206	2010-12-19 23:55:53 +00:00
Chris Lattner	9edf3f50bf	improve the setcc -> setcc_carry optimization to happen more consistently by moving it out of lowering into dag combine. Add some missing patterns for matching away extended versions of setcc_c. llvm-svn: 122201	2010-12-19 22:08:31 +00:00
Chris Lattner	6dddab2ffe	simplify some code to just reuse a setcc if we can instead of going through the CSE maps to get it. llvm-svn: 122196	2010-12-19 21:23:48 +00:00
Chris Lattner	c37bb023b1	now that generic vector types aren't selected onto MMX operations, we don't need -disable-mmx anymore. llvm-svn: 122189	2010-12-19 20:19:20 +00:00
Chris Lattner	ae756e1980	reduce copy/paste programming with the power of for loops. llvm-svn: 122187	2010-12-19 20:07:10 +00:00
Chris Lattner	1e8c032a6e	X86 supports i8/i16 overflow ops (except i8 multiplies), we should generate them. Now we compile: define zeroext i8 @X(i8 signext %a, i8 signext %b) nounwind ssp { entry: %0 = tail call %0 @llvm.sadd.with.overflow.i8(i8 %a, i8 %b) %cmp = extractvalue %0 %0, 1 br i1 %cmp, label %if.then, label %if.end into: _X: ## @X ## BB#0: ## %entry subl $12, %esp movb 16(%esp), %al addb 20(%esp), %al jo LBB0_2 Before we were generating: _X: ## @X ## BB#0: ## %entry pushl %ebp movl %esp, %ebp subl $8, %esp movb 12(%ebp), %al testb %al, %al setge %cl movb 8(%ebp), %dl testb %dl, %dl setge %ah cmpb %cl, %ah sete %cl addb %al, %dl testb %dl, %dl setge %al cmpb %al, %ah setne %al andb %cl, %al testb %al, %al jne LBB0_2 llvm-svn: 122186	2010-12-19 20:03:11 +00:00
Nate Begeman	97b72c99d2	Add support for matching psign & pblendvb to the x86 target Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts llvm-svn: 122098	2010-12-17 22:55:37 +00:00
Nate Begeman	8b08f5232b	Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable. llvm-svn: 121439	2010-12-10 00:26:57 +00:00
Eric Christopher	a8aaaee379	Rewrite the darwin tlv support to use a chain and return to copying the output to the correct register. Fixes a hidden problem uncovered by the last patch where we'd try to DAG combine our MVT::Other node oddly. llvm-svn: 121358	2010-12-09 06:25:53 +00:00
Eric Christopher	8783074091	Stop confusing people, it's not really a chain, or a tumor. llvm-svn: 121340	2010-12-09 00:57:19 +00:00
Eric Christopher	d84970ae8b	Remove extraneous copy from DAG conversion for darwin tls. This was popping up at O0 when it wasn't folded and the fast allocator would complain. llvm-svn: 121330	2010-12-09 00:27:58 +00:00
Chris Lattner	6886171792	Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags result. This allows us to compile: void *test12(long count) { return new int[count]; } into: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx movq $-1, %rdi cmovnoq %rax, %rdi jmp __Znam ## TAILCALL instead of: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx seto %cl testb %cl, %cl movq $-1, %rdi cmoveq %rax, %rdi jmp __Znam Of course it would be even better if the regalloc inverted the cmov to 'cmovoq', which would eliminate the need for the 'movq %rdi, %rax'. llvm-svn: 120936	2010-12-05 07:49:54 +00:00
Chris Lattner	364bb0a081	it turns out that when ".with.overflow" intrinsics were added to the X86 backend that they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile: void *func(long count) { return new int[count]; } into: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] seto %cl ## encoding: [0x0f,0x90,0xc1] testb %cl, %cl ## encoding: [0x84,0xc9] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL instead of: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction. llvm-svn: 120935	2010-12-05 07:30:36 +00:00
Chris Lattner	116580a11c	generalize the previous check to handle -1 on either side of the select, inserting a not to compensate. Add a missing isZero check that I lost somehow. This improves codegen of: void *func(long count) { return new int[count]; } from: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL ## encoding: [0xeb,A] to: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] cmpq $1, %rdx ## encoding: [0x48,0x83,0xfa,0x01] sbbq %rdi, %rdi ## encoding: [0x48,0x19,0xff] notq %rdi ## encoding: [0x48,0xf7,0xd7] orq %rax, %rdi ## encoding: [0x48,0x09,0xc7] jmp __Znam ## TAILCALL ## encoding: [0xeb,A] llvm-svn: 120932	2010-12-05 02:00:51 +00:00
Chris Lattner	342e6ea5f9	Improve an integer select optimization in two ways: 1. generalize (select (x == 0), -1, 0) -> (sign_bit (x - 1)) to: (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y 2. Handle the identical pattern that happens with !=: (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y cmov is often high latency and can't fold immediates or memory operands. For example for (x == 0) ? -1 : 1, before we got: < testb %sil, %sil < movl $-1, %ecx < movl $1, %eax < cmovel %ecx, %eax now we get: > cmpb $1, %sil > sbbl %eax, %eax > orl $1, %eax llvm-svn: 120929	2010-12-05 01:23:24 +00:00
Benjamin Kramer	2f489236ab	Add patterns for the x86 popcnt instruction. - Also adds a new POPCNT subtarget feature that is currently enabled if the target supports SSE4.2 (nehalem) or SSE4A (barcelona). llvm-svn: 120917	2010-12-04 20:32:23 +00:00
Benjamin Kramer	8ceebfaa04	Simplify code. No functionality change. llvm-svn: 120907	2010-12-04 14:22:24 +00:00
Evan Cheng	419ea286ee	Fix and re-enable tail call optimization of expanded libcalls. llvm-svn: 120622	2010-12-01 22:59:46 +00:00
Duncan Sands	c4fb38b821	I don't think it makes any sense to assert that the target supports SSE3 here. The user (i.e. whoever generated a call to the intrinsic in the first place) is essentially asking for a particular instruction to be placed in the assembler. If that instruction won't execute on the target machine, that's their problem not ours. Two buildbots with processors that don't support SSE3 were barfing on the apm.ll test in CodeGen/X86 because of this assertion. llvm-svn: 120574	2010-12-01 12:58:13 +00:00
Evan Cheng	a695abde49	Speculatively disable x86 portion of r120501 to appease the x86_64 buildbot. llvm-svn: 120549	2010-12-01 03:27:20 +00:00
Evan Cheng	d4b0873c06	Enable sibling call optimization of libcalls which are expanded during legalization time. Since at legalization time there is no mapping from SDNode back to the corresponding LLVM instruction and the return SDNode is target specific, this requires a target hook to check for eligibility. Only x86 and ARM support this form of sibcall optimization right now. rdar://8707777 llvm-svn: 120501	2010-11-30 23:55:39 +00:00
Eric Christopher	2d1bcf4aea	Fix insertion point in pcmp expander. While I'm there, clean up too many \n even for me. llvm-svn: 120411	2010-11-30 08:20:21 +00:00
Eric Christopher	1a86e8461a	Fix some cleanups from my last patch. llvm-svn: 120410	2010-11-30 08:10:28 +00:00
Eric Christopher	fa6657cec0	Rewrite mwait and monitor support and custom lower arguments. Fixes PR8573. llvm-svn: 120404	2010-11-30 07:20:12 +00:00
Rafael Espindola	c4774795ce	Move lowering of TLS_addr32 and TLS_addr64 to X86MCInstLower. llvm-svn: 120263	2010-11-28 21:16:39 +00:00
Rafael Espindola	5d882894d8	Lower TLS_addr32 and TLS_addr64. llvm-svn: 120225	2010-11-27 20:43:02 +00:00
Wesley Peck	527da1b6e2	Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept. llvm-svn: 119990	2010-11-23 03:31:01 +00:00
Anton Korobeynikov	0eecf5d201	Move hasFP() and few related hooks to TargetFrameInfo. llvm-svn: 119740	2010-11-18 21:19:35 +00:00
Chris Lattner	edb9d84dcc	add targetoperand flags for jump tables, constant pool and block address nodes to indicate when ha16/lo16 modifiers should be used. This lets us pass PowerPC/indirectbr.ll. The one annoying thing about this patch is that the MCSymbolExpr isn't expressive enough to represent ha16(label1-label2) which we need on PowerPC. I have a terrible hack in the meantime, but this will have to be revisited at some point. Last major conversion item left is global variable references. llvm-svn: 119105	2010-11-15 02:46:57 +00:00
Chris Lattner	7077efe894	move the pic base symbol stuff up to MachineFunction since it is trivial and will be shared between ppc and x86. This substantially simplifies the X86 backend also. llvm-svn: 119089	2010-11-14 22:48:15 +00:00
Chris Lattner	239f9a35ed	simplify getPICBaseSymbol a bit. llvm-svn: 119088	2010-11-14 22:37:11 +00:00
Peter Collingbourne	feea10bcdf	Recognise 32-bit ror-based bswap implementation used by uclibc llvm-svn: 119007	2010-11-13 19:54:30 +00:00
Peter Collingbourne	1c6437a62a	Support ; as asm separator llvm-svn: 119006	2010-11-13 19:54:23 +00:00
Dale Johannesen	6d95ed1760	Remove possibly useful info from comment, per Chris. llvm-svn: 118865	2010-11-12 00:43:18 +00:00
Duncan Sands	1462777017	Simplify uses of MVT and EVT. An MVT can be compared directly with a SimpleValueType, while an EVT supports equality and inequality comparisons with SimpleValueType. llvm-svn: 118169	2010-11-03 12:17:33 +00:00
Duncan Sands	fb0a48ef96	Factorize the duplicated logic for choosing the right argument calling convention out of the fast and normal ISel files, and into the calling convention TD file. llvm-svn: 117856	2010-10-31 13:21:44 +00:00
John Thompson	e8360b7182	Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support. llvm-svn: 117667	2010-10-29 17:29:13 +00:00
Michael J. Spencer	7db918f1e9	x86-Win32: Switch ftol2 calling convention from stdcall to C. llvm-svn: 117474	2010-10-27 18:52:38 +00:00
Dale Johannesen	ec57ac1c3c	An stdcall function calling a non-stdcall function cannot use tailcall. PR 8461. llvm-svn: 117322	2010-10-25 22:17:05 +00:00
Duncan Sands	1f0d37e892	Add parentheses to pacify gcc, which warns otherwise. llvm-svn: 117020	2010-10-21 16:02:12 +00:00
Michael J. Spencer	f509c6ca27	X86: Add alloca probing to dynamic alloca on Windows. Fixes PR8424. llvm-svn: 116984	2010-10-21 01:41:01 +00:00
Dale Johannesen	320a553319	Remove Synthesizable from the Type system; as MMX vector types are no longer Legal on X86, we don't need it. No functional change. 8499854. llvm-svn: 116947	2010-10-20 21:32:10 +00:00
Michael J. Spencer	3e64de9504	X86: Add MS-CRT libcalls. llvm-svn: 116801	2010-10-19 07:32:52 +00:00
Michael J. Spencer	8b382e7e10	Fix Whitespace. llvm-svn: 116800	2010-10-19 07:32:42 +00:00
Eric Christopher	604e142844	Combine these together - should probably have some text associated that says what why what we just asserted is wrong. llvm-svn: 116333	2010-10-12 19:44:17 +00:00
Nick Lewycky	eb7b91d417	Mark variable 'NoImplicitFloatOps' used only in an assert as used. llvm-svn: 116323	2010-10-12 18:18:03 +00:00
Dan Gohman	395a898b2b	Initial va_arg support for x86-64. Patch by David Meyer! llvm-svn: 116319	2010-10-12 18:00:49 +00:00
Andrew Trick	e01c9001c9	Fixes bug 8297: i386 cmpxchg8b, missing MachineMemOperand llvm-svn: 116214	2010-10-11 19:02:04 +00:00
Michael J. Spencer	8dedb62019	X86: Call ulldiv and ftol2 on Windows instead of their libgcc equivalents. llvm-svn: 116188	2010-10-11 05:29:15 +00:00
Michael J. Spencer	00765e5be0	X86: MinGW should always use libgcc on Windows. llvm-svn: 116177	2010-10-10 23:11:06 +00:00
Michael J. Spencer	7a573a5e1f	X86: Call _alldiv instead of __divdi3 on Windows (excluding cygwin). llvm-svn: 116174	2010-10-10 22:04:34 +00:00
Michael J. Spencer	bee1f7f5ba	Fix Whitespace. llvm-svn: 116173	2010-10-10 22:04:20 +00:00
Cameron Esfahani	d57f9ecd4a	Recommit 116056, now with the missing file... llvm-svn: 116083	2010-10-08 19:24:18 +00:00
Andrew Trick	cf97db2402	reverting 116056: win64_params.ll may need to be conditionalized? llvm-svn: 116063	2010-10-08 17:22:42 +00:00
Cameron Esfahani	a07b5c291d	Small patch to restore home register stack space allocation for the Win64 case. Add test case. This code eventually needs to be tighter, since it's always allocating it, even in leaf routines. llvm-svn: 116056	2010-10-08 10:31:30 +00:00
Evan Cheng	5c31bf0619	Canonicalize X86ISD::MOVDDUP nodes to v2f64 to make sure all cases match. Also eliminate unneeded isel patterns. rdar://8520311 llvm-svn: 115977	2010-10-07 20:50:20 +00:00
Anton Korobeynikov	d77a443631	va_args support for Win64. Patch by Cameron! llvm-svn: 115480	2010-10-03 22:52:07 +00:00
Dale Johannesen	dd224d2333	Massive rewrite of MMX: The x86_mmx type is used for MMX intrinsics, parameters and return values where these use MMX registers, and is also supported in load, store, and bitcast. Only the above operations generate MMX instructions, and optimizations do not operate on or produce MMX intrinsics. MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into smaller pieces. Optimizations may occur on these forms and the result casted back to x86_mmx, provided the result feeds into a previous existing x86_mmx operation. The point of all this is prevent optimizations from introducing MMX operations, which is unsafe due to the EMMS problem. llvm-svn: 115243	2010-09-30 23:57:10 +00:00
Chris Lattner	b5b71e07af	improve indentation llvm-svn: 114815	2010-09-27 06:34:01 +00:00
Eric Christopher	422e463be7	This code should never fire on non-darwin subtargets. llvm-svn: 114811	2010-09-27 06:01:51 +00:00
Dale Johannesen	6a4cd59b08	We can't return SSE/MMX vectors if SSE is disabled. llvm-svn: 114745	2010-09-24 19:05:48 +00:00
Bob Wilson	e1223fb583	Attempt to fix llvm-gcc build. It was crashing when building gcov.o for an ARM cross-compiler on x86, because the MMO size did not match the type size. This fixes the MMO size and also the size of the stack object to match the type size. llvm-svn: 114554	2010-09-22 17:35:14 +00:00
Chris Lattner	8a236b63d8	reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress. llvm-svn: 114529	2010-09-22 04:39:11 +00:00
Chris Lattner	a5156c30ed	convert the last 4 X86ISD nodes that should have memoperands to have them. llvm-svn: 114523	2010-09-22 01:28:21 +00:00
Chris Lattner	ed85da5600	give X86ISD::FNSTCW16m a memoperand, since it touches memory. It only can access the stack due to how it is generated though. llvm-svn: 114522	2010-09-22 01:11:26 +00:00
Chris Lattner	78f518b79b	give FP_TO_INT16_IN_MEM and friends a memoperand. They are only used with stack slots, but hey, lets be safe. llvm-svn: 114521	2010-09-22 01:05:16 +00:00
Chris Lattner	54e5329545	give VZEXT_LOAD a memory operand, it now works with segment registers. llvm-svn: 114515	2010-09-22 00:34:38 +00:00
Chris Lattner	e479e9643b	give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257 llvm-svn: 114508	2010-09-21 23:59:42 +00:00
Owen Anderson	5e65dfbb97	Reimplement r114460 in target-independent DAGCombine rather than target-dependent, by using the predicate to discover the number of sign bits. Enhance X86's target lowering to provide a useful response to this query. llvm-svn: 114473	2010-09-21 20:42:50 +00:00
Chris Lattner	886250c8f0	convert a couple more places to use the new getStore() llvm-svn: 114463	2010-09-21 18:51:21 +00:00
Owen Anderson	f4b1a5bdc4	When adding the carry bit to another value on X86, exploit the fact that the carry-materialization (sbbl x, x) sets the registers to 0 or ~0. Combined with two's complement arithmetic, we can fold the intermediate AND and the ADD into a single SUB. This fixes <rdar://problem/8449754>. llvm-svn: 114460	2010-09-21 18:41:19 +00:00
Chris Lattner	802527adad	eliminate some uses of the getStore overload. llvm-svn: 114453	2010-09-21 17:50:43 +00:00
Chris Lattner	7727d05dbb	convert the targets off the non-MachinePointerInfo of getLoad. llvm-svn: 114410	2010-09-21 06:44:06 +00:00
Chris Lattner	82fd06d3ce	it's more elegant to put the "getConstantPool" and "getFixedStack" on the MachinePointerInfo class. While this isn't the problem I'm setting out to solve, it is the right way to eliminate PseudoSourceValue, so lets go with it. llvm-svn: 114406	2010-09-21 06:22:23 +00:00
Chris Lattner	c3e05d6e50	update the X86 backend to use the MachinePointerInfo version of one of the getLoad methods. This fixes at least one bug where an incorrect svoffset is passed in (a potential combiner-aa miscompile). llvm-svn: 114404	2010-09-21 06:02:19 +00:00
Chris Lattner	2510de2bea	reimplement memcpy/memmove/memset lowering to use MachinePointerInfo instead of srcvalue/offset pairs. This corrects SV info for mem operations whose size is > 32-bits. llvm-svn: 114401	2010-09-21 05:40:29 +00:00
Chris Lattner	e3d864b857	convert targets to the new MF.getMachineMemOperand interface. llvm-svn: 114391	2010-09-21 04:39:43 +00:00
John Thompson	1094c80281	Added skeleton for inline asm multiple alternative constraint support. llvm-svn: 113766	2010-09-13 18:15:37 +00:00
Bruno Cardoso Lopes	99a9f4661a	Minor change. Fix comments and remove unused and redundant code llvm-svn: 113378	2010-09-08 18:12:31 +00:00
Bruno Cardoso Lopes	f7fee1c185	x86 vector shuffle lowering now relies only on target specific nodes to emit shuffles and don't do isel mask matching anymore. - Add the selection of the remaining shuffle opcode (movddup) - Introduce two new functions to "recognize" where we may get potential folds and add several comments to them explaining why they are not yet in the desired shape. - Add more patterns to fallback the case where we select a specific shuffle opcode as if it could fold a load, but it can't, so remap to a valid instruction. - Add a couple of FIXMEs to address in the following days once there's a good solution to the current folding problem. llvm-svn: 113369	2010-09-08 17:43:25 +00:00
Bruno Cardoso Lopes	6b1d62c529	Factor out some x86 vector shuffle rewriting and add comments about the direction the shuffle lowering is heading to llvm-svn: 113286	2010-09-07 21:03:14 +00:00
Bruno Cardoso Lopes	7c483028fb	Move code around to prepare for moving some of the logic together to another function llvm-svn: 113267	2010-09-07 20:20:27 +00:00
Bill Wendling	353802114f	Add an MVT::x86mmx type. It will take the place of all current MMX vector types. llvm-svn: 113261	2010-09-07 20:03:56 +00:00
Bruno Cardoso Lopes	5a45db3e6c	decouple MMX check from regular splat checks. Some refactoring is coming, and MMX should be left alone to be easily removed after moving to intrinsics llvm-svn: 113247	2010-09-07 18:41:45 +00:00
Bruno Cardoso Lopes	4f5d4b4a6e	Remove now useless check, because the code can be matched below, no need to leave it for isel llvm-svn: 113242	2010-09-07 18:29:03 +00:00
Bruno Cardoso Lopes	c9b3316fea	Minor change. Since the checks are equivalent, use isMMX llvm-svn: 113239	2010-09-07 18:24:00 +00:00
Bruno Cardoso Lopes	c6accda78e	Remove the last bit of isShuffleMaskLegal checks and improve the comment regarding mmx shuffles llvm-svn: 113059	2010-09-04 02:58:56 +00:00
Bruno Cardoso Lopes	731bcc1abf	make explicit that we not handle several mmx shuffles llvm-svn: 113058	2010-09-04 02:50:13 +00:00
Bruno Cardoso Lopes	20779ee157	Emit target specific nodes to handle palignr. Do not touch it for MMX versions yet. llvm-svn: 113056	2010-09-04 02:36:07 +00:00
Bruno Cardoso Lopes	cff7cd18ab	Emit target specific nodes to handle splats starting at zero indices llvm-svn: 113055	2010-09-04 02:02:14 +00:00
Bruno Cardoso Lopes	95759917eb	Emit target specific nodes for isPSHUFHWMask and isPSHUFLWMask llvm-svn: 113050	2010-09-04 01:36:45 +00:00
Bruno Cardoso Lopes	2b57008c72	Emit target specific nodes for isSHUFPMask llvm-svn: 113048	2010-09-04 01:22:57 +00:00
Bruno Cardoso Lopes	2f7af36134	Previous isMOVLMask matching already emits targets nodes, remove check llvm-svn: 113047	2010-09-04 00:50:08 +00:00
Bruno Cardoso Lopes	9f8e704151	One more check from the original isShuffleMaskLegal goes away llvm-svn: 113045	2010-09-04 00:46:16 +00:00
Bruno Cardoso Lopes	16959372bb	Remove a duplicated but useless check that i've inserted in the previous commit. llvm-svn: 113044	2010-09-04 00:43:12 +00:00
Bruno Cardoso Lopes	44578d38d3	Refactor some code and remove the extra checks for unpckl_undef and unpckh_undef llvm-svn: 113043	2010-09-04 00:39:43 +00:00
Bruno Cardoso Lopes	7829d0e74b	Remove check for unpckh mask llvm-svn: 113035	2010-09-03 23:32:47 +00:00
Bruno Cardoso Lopes	d1dacc57aa	Remove check for unpckl mask llvm-svn: 113034	2010-09-03 23:31:50 +00:00
Bruno Cardoso Lopes	207b9d6218	Inline isShuffleMaskLegal into LowerVECTOR_SHUFFLE, so we can start checking each standalone condition and decide whether emit target specific nodes or remove the condition if it's already matched before. llvm-svn: 113031	2010-09-03 23:24:06 +00:00
Bruno Cardoso Lopes	2bef20eda7	Reapply considered harmful part of r112934 and r112942. "Use target specific nodes instead of relying in unpckl and unpckh pattern fragments during isel time. Also place a depth limit in getShuffleScalarElt. llvm-svn: 113020	2010-09-03 22:09:41 +00:00
Bruno Cardoso Lopes	fe8717c573	Reintroduce a simple function refactoring done in r112934, also without any functionality changes llvm-svn: 113008	2010-09-03 20:20:02 +00:00
Bruno Cardoso Lopes	48e589b122	Reapply pieces of r112942 and r112934 which don't do functional changes llvm-svn: 113007	2010-09-03 20:10:35 +00:00
Bruno Cardoso Lopes	6979cf0808	Reapply Fix comment llvm-svn: 113006	2010-09-03 19:55:05 +00:00
Daniel Dunbar	6f3da24d70	Revert r112934, "- Use specific nodes to match unpckl masks.", which introduced some infinite loop and select failures. - Apologies for eager reverting, but its branch day. llvm-svn: 113000	2010-09-03 19:38:11 +00:00
Daniel Dunbar	f1aacd55c0	Revert r112938 "Fix comment", which depends on r112934, which introduced some infinite loop and select failures. llvm-svn: 112999	2010-09-03 19:38:08 +00:00
Daniel Dunbar	0ffe4db45c	Revert r112942, "Use punpckh and unpckh family of nodes instead of using unpckh mask pattern fragment", which depends on r112934, which introduced some infinite loop and select failures. llvm-svn: 112998	2010-09-03 19:38:05 +00:00
Bruno Cardoso Lopes	a85ec10483	Use punpckh and unpckh family of nodes instead of using unpckh mask pattern fragment llvm-svn: 112942	2010-09-03 01:39:08 +00:00
Bruno Cardoso Lopes	adc6bca2dd	Fix comment llvm-svn: 112938	2010-09-03 01:28:51 +00:00
Bruno Cardoso Lopes	cce44678b4	- Use specific nodes to match unpckl masks. - Teach getShuffleScalarElt how to handle more target specific nodes, so the DAGCombine can make use of it. - Add another hack to avoid the node update problem during legalization. More description on the comments llvm-svn: 112934	2010-09-03 01:24:00 +00:00
Anton Korobeynikov	a689c5b2c0	Revert win64 changes. They seem to be incomplete llvm-svn: 112885	2010-09-02 22:31:32 +00:00
Anton Korobeynikov	56291f7e53	Properly allocate win64 shadow reg area. Patch by Jan Sjodin! llvm-svn: 112875	2010-09-02 22:16:28 +00:00
Bruno Cardoso Lopes	489613f1e5	Replace unpckl_undef and unpckh_undef matching with target specific opcodes llvm-svn: 112806	2010-09-02 05:23:12 +00:00
Bruno Cardoso Lopes	e4e4be3885	Move condition out to prepare for more matching llvm-svn: 112805	2010-09-02 04:20:26 +00:00
Bruno Cardoso Lopes	bf7fd146c7	Remove checking for isUNPCKL_v_undef_Mask, the specific node is already emitted for it llvm-svn: 112804	2010-09-02 03:57:58 +00:00
Bruno Cardoso Lopes	6a7f634487	become more strict about when it's safe to use X86ISD::MOVLPS llvm-svn: 112799	2010-09-02 02:35:51 +00:00
Bruno Cardoso Lopes	04c25c15c7	Revert r112689, avoid those kind of checks cause they mess up with mmx llvm-svn: 112760	2010-09-01 22:59:03 +00:00
Bruno Cardoso Lopes	b3825216ce	Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment llvm-svn: 112694	2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes	6aaebe877b	minor change, simplify some logic llvm-svn: 112689	2010-09-01 00:57:08 +00:00
Bruno Cardoso Lopes	2b025707a2	Move some functions around so they can be used for some other to come function llvm-svn: 112687	2010-09-01 00:51:36 +00:00
Bruno Cardoso Lopes	4b56d87290	Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes llvm-svn: 112661	2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes	61996ef835	Use x86 specific MOVSHDUP node and add more patterns to match it llvm-svn: 112657	2010-08-31 22:22:11 +00:00
Bruno Cardoso Lopes	5de15ce468	Use MOVHLPS node instead of matching using movhlps and movhlps_undef pattern fragments llvm-svn: 112644	2010-08-31 21:38:49 +00:00
Bruno Cardoso Lopes	03e4c35302	Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes llvm-svn: 112642	2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes	dfd9dd5d75	Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the handling of those nodes when seeking for scalars inside vector shuffles llvm-svn: 112570	2010-08-31 02:26:40 +00:00
Chris Lattner	94656b1c8c	fix the buildvector->insertp[sd] logic to not always create a redundant insertp[sd] $0, which is a noop. Before: _f32: ## @f32 pshufd $1, %xmm1, %xmm2 pshufd $1, %xmm0, %xmm3 addss %xmm2, %xmm3 addss %xmm1, %xmm0 ## kill: XMM0<def> XMM0<kill> XMM0<def> insertps $0, %xmm0, %xmm0 insertps $16, %xmm3, %xmm0 ret after: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movdqa %xmm2, %xmm0 insertps $16, %xmm3, %xmm0 ret The extra movs are due to a random (poor) scheduling decision. llvm-svn: 112379	2010-08-28 17:59:08 +00:00
Chris Lattner	bcb6090ad0	fix the BuildVector -> unpcklps logic to not do pointless shuffles when the top elements of a vector are undefined. This happens all the time for X86-64 ABI stuff because only the low 2 elements of a 4 element vector are defined. For example, on: _Complex float f32(_Complex float A, _Complex float B) { return A+B; } We used to produce (with SSE2, SSE4.1+ uses insertps): _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $16, %xmm2, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm0 addss %xmm1, %xmm0 pshufd $16, %xmm0, %xmm1 movdqa %xmm2, %xmm0 unpcklps %xmm1, %xmm0 ret We now produce: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movaps %xmm2, %xmm0 unpcklps %xmm3, %xmm0 ret This implements rdar://8368414 llvm-svn: 112378	2010-08-28 17:28:30 +00:00
Chris Lattner	96db6e66f4	improve comments in the unpcklps generating logic, introduce a new EltStride variable instead of reusing NumElems variable for a non-obvious purpose. No functionality change. llvm-svn: 112377	2010-08-28 17:15:43 +00:00
Bruno Cardoso Lopes	a982aa24ef	Clean up the logic of vector shuffles -> vector shifts. Also teach this logic how to handle target specific shuffles if needed, this is necessary while searching recursively for zeroed scalar elements in vector shuffle operands. llvm-svn: 112348	2010-08-28 02:46:39 +00:00
Anton Korobeynikov	c0b36921c2	Properly handle passing of FP stuff to varargs function on Win64: value should be copied to the corresponding shadow reg as well. Patch by Cameron Esfahani! llvm-svn: 112262	2010-08-27 14:43:06 +00:00
Bruno Cardoso Lopes	e25ba0c7c2	zap the now unused MVT::getIntVectorWithNumElements llvm-svn: 112218	2010-08-26 20:53:12 +00:00
Chris Lattner	eb2cc0ce0e	implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1. llvm-svn: 112171	2010-08-26 05:51:22 +00:00
Chris Lattner	cc60609cb4	fix sse1 only codegen in x86-64 mode, which is something we apparently try to support. llvm-svn: 112168	2010-08-26 05:24:29 +00:00
Bruno Cardoso Lopes	d4085f6e91	Revert this for now, PUNPCKLDQ doesn't operate on v4f32 llvm-svn: 112090	2010-08-25 21:26:37 +00:00
Anton Korobeynikov	b3b53ecac0	Fix nasty mingw32 bug, which e.g. prevented llvm-gcc bootstrap there. Mark _alloca call as clobbering EFLAGS, otherwise some DCE might remove other flags-clobbering stuff (e.g. cmp instructions) occurring after _alloca call. llvm-svn: 112034	2010-08-25 07:50:11 +00:00
Bruno Cardoso Lopes	0770d25758	PUNPCKLDQ should also be used for v4f32 llvm-svn: 112020	2010-08-25 02:55:40 +00:00
Bruno Cardoso Lopes	2e45d522c1	teach lowering to get target specific nodes for pshufd, emulating the same isel behavior for now, so we can pass all vector shuffle tests llvm-svn: 112017	2010-08-25 02:35:37 +00:00
Dan Gohman	c88fda477a	Fix X86's isLegalAddressingMode to recognize that static addresses need not be RIP-relative in small mode. llvm-svn: 111917	2010-08-24 15:55:12 +00:00
Bruno Cardoso Lopes	758d7b1f5c	Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments llvm-svn: 111890	2010-08-24 01:16:15 +00:00
Bruno Cardoso Lopes	264d90fff7	Start using target specific nodes for shuffles: pshufhw and pshuflw llvm-svn: 111837	2010-08-23 20:41:02 +00:00
Anton Korobeynikov	cbbe4501df	Revert invalid r111792. Jump tables are not broken on x86-64 / coff, it's COFF emitter which does not support differences of two symbols (and needs to be fixed). GAS is pretty fine with code produced. llvm-svn: 111801	2010-08-23 07:38:51 +00:00
Michael J. Spencer	e87231232a	Workaround broken jump tables on x86-64 COFF. llvm-svn: 111792	2010-08-23 04:45:37 +00:00
Bruno Cardoso Lopes	9f20e7a1bf	Prepare LowerVECTOR_SHUFFLEv8i16 to use x86 target specific nodes directly llvm-svn: 111704	2010-08-21 01:32:18 +00:00
Bruno Cardoso Lopes	6f3b38a851	This is the first step towards refactoring the x86 vector shuffle code. The general idea here is to have a group of x86 target specific nodes which are going to be selected during lowering and then directly matched in isel. The commit includes the addition of those specific nodes and a bunch of patterns, and incrementally we're going to switch between them and what we have right now. Both the patterns and target specific nodes can change as we move forward with this work. llvm-svn: 111691	2010-08-20 22:55:05 +00:00
Anton Korobeynikov	231ab847ca	More fixes for win64: - Do not clobber al during variadic calls, this is AMD64 ABI-only feature - Emit wincall64, where necessary Patch by Cameron Esfahani! llvm-svn: 111289	2010-08-17 21:06:07 +00:00
Eric Christopher	54194bd127	Rework how the non-sse2 memory barrier is lowered so that the encoding is correct for the built-in assembler. Based on a patch from Chris. llvm-svn: 111083	2010-08-14 21:51:50 +00:00
Chris Lattner	2f6c3434ac	improve indentation llvm-svn: 111073	2010-08-14 17:26:09 +00:00
Bruno Cardoso Lopes	081861b6b7	Fix comment to reflect code, and remove an unused argument llvm-svn: 111022	2010-08-13 17:50:47 +00:00
Bruno Cardoso Lopes	7306c86886	Begin to support some vector operations for AVX 256-bit instructions. The long term goal here is to be able to match enough of vector_shuffle and build_vector so all avx intrinsics which aren't mapped to their own built-ins but to shufflevector calls can be codegen'd. This is the first (baby) step, support building zeroed vectors. llvm-svn: 110897	2010-08-12 02:06:36 +00:00
Dan Gohman	5531aa4de1	Use ISD::ADD instead of ISD::SUB with a negated constant. This avoids trouble if the return type of TD->getPointerSize() is changed to something which doesn't promote to a signed type, and is simpler anyway. Also, use getCopyFromReg instead of getRegister to read a physical register's value. llvm-svn: 110835	2010-08-11 18:14:00 +00:00
Bruno Cardoso Lopes	91d61df3eb	Add AVX matching patterns to Packed Bit Test intrinsics. Apply the same approach of SSE4.1 ptest intrinsics but create a new x86 node "testp" since AVX introduces vtest{ps}{pd} instructions which set ZF and CF depending on sign bit AND and ANDN of packed floating-point sources. This is slightly different from what the "ptest" does. Tests coming with the other 256 intrinsics tests. llvm-svn: 110744	2010-08-10 23:25:42 +00:00
Bruno Cardoso Lopes	85da72a88f	Support AVX 256-bit load and store intrinsics llvm-svn: 110645	2010-08-10 01:43:16 +00:00
Bruno Cardoso Lopes	77954bdf7a	Support very basic (doesn't include ABI support in the front-end, varargs, ...) 256-bit argument passing and return for AVX llvm-svn: 110394	2010-08-05 23:35:51 +00:00
Eric Christopher	2db8464282	Make x86-64 membarriers work without sse and clean up some of the uses. llvm-svn: 110274	2010-08-04 23:03:04 +00:00
Bruno Cardoso Lopes	349165b48f	Support all 128-bit AVX vector intrinsics. Most part of them I already declared during the addition of the assembler support, the additional changes are: - Add missing intrinsics - Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file. - Duplicate some patterns to AVX mode. - Step into PCMPEST/PCMPIST custom inserter and add AVX versions. llvm-svn: 109878	2010-07-30 19:54:33 +00:00
Jakob Stoklund Olesen	ba0e124aaf	Revert r109652, and remove the offending assert in loadRegFromStackSlot instead. We do sometimes load from a too small stack slot when dealing with x86 arguments (varargs and smaller-than-32-bit args). It looks like we know what we are doing in those cases, so I am going to remove the assert instead of artificially enlarging stack slot sizes. The assert in storeRegToStackSlot stays in. We don't want to write beyond the bounds of a stack slot. llvm-svn: 109764	2010-07-29 17:42:27 +00:00
Jakob Stoklund Olesen	f2234fbe70	Create a fixed stack object for varargs that is as large as any register. The size of this object isn't used for anything - technically it is of variable size. This avoids a false positive from the assert in X86InstrInfo::loadRegFromStackSlot, and fixes PR7735. llvm-svn: 109652	2010-07-28 20:55:38 +00:00
Nate Begeman	53afc8f06a	Implement a vectorized algorithm for <16 x i8> << <16 x i8> This is about 4x faster and smaller than the existing scalarization. llvm-svn: 109566	2010-07-28 00:21:48 +00:00
Nate Begeman	269a6da023	~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller types coming in future patches. For: define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp { entry: %shl = shl <4 x i32> %r, %a ; <<4 x i32>> [#uses=1] %tmp2 = bitcast <4 x i32> %shl to <2 x i64> ; <<2 x i64>> [#uses=1] ret <2 x i64> %tmp2 } We get: _shl: ## @shl pslld $23, %xmm1 paddd LCPI0_0, %xmm1 cvttps2dq %xmm1, %xmm1 pmulld %xmm1, %xmm0 ret Instead of: _shl: ## @shl pshufd $3, %xmm0, %xmm2 movd %xmm2, %eax pshufd $3, %xmm1, %xmm2 movd %xmm2, %ecx shll %cl, %eax movd %eax, %xmm2 pshufd $1, %xmm0, %xmm3 movd %xmm3, %eax pshufd $1, %xmm1, %xmm3 movd %xmm3, %ecx shll %cl, %eax movd %eax, %xmm3 punpckldq %xmm2, %xmm3 movd %xmm0, %eax movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm2 movhlps %xmm0, %xmm0 movd %xmm0, %eax movhlps %xmm1, %xmm1 movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm0 punpckldq %xmm0, %xmm2 movdqa %xmm2, %xmm0 punpckldq %xmm3, %xmm0 ret llvm-svn: 109549	2010-07-27 22:37:06 +00:00
Evan Cheng	d4218b8793	On x86, f32 / f64 nodes share the same registers as 128-bit vector values. llvm-svn: 109450	2010-07-26 21:50:05 +00:00
Evan Cheng	37b740c4bf	Add an ILP scheduler. This is a register pressure aware scheduler that's appropriate for targets without detailed instruction itineraries. The scheduler schedules for increased instruction level parallelism in low register pressure situation; it schedules to reduce register pressure when the register pressure becomes high. On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%. llvm-svn: 109300	2010-07-24 00:39:05 +00:00
Dale Johannesen	f2d75670b7	The only supported calling convention for X86-64 uses SSE, so we can't return floating point values if this is disabled. Detect this error for clang. With SSE1 only, f64 is a problem; it can be done, but neither llvm-gcc nor clang has ever generated correct code for it. Since nobody noticed this I think it's OK to treat it as an error for now. This also handles SSE-sized vectors of floating point. 8207686, 8204109. llvm-svn: 109201	2010-07-23 00:30:35 +00:00
Eric Christopher	9a77382685	Custom lower the memory barrier instructions and add support for lowering without sse2. Add a couple of new testcases. Fixes a few libgomp tests and latent bugs. Remove a few todos. llvm-svn: 109078	2010-07-22 02:48:34 +00:00
Eric Christopher	a4c435f1fa	80-columns. llvm-svn: 109070	2010-07-22 00:26:08 +00:00
Nate Begeman	784e062b2a	Fix a couple issues with Win64 ABI 1) all registers were spilled as xmm, regardless of actual size 2) win64 abi doesn't do the varargs-size-in-%al thing Still to look into: xmm6-15 are marked as clobbered by call instructions on win64 even though they aren't. llvm-svn: 109035	2010-07-21 20:49:52 +00:00
Eric Christopher	d27913e516	Pulling out previous patch, must've run the tests in the wrong directory. llvm-svn: 109005	2010-07-21 09:23:56 +00:00
Eric Christopher	b2d1067024	Lower MEMBARRIER on x86 and support processors without SSE2. Fixes a pile of libgomp failures in the llvm-gcc testsuite due to the libcall not existing. llvm-svn: 109004	2010-07-21 09:05:23 +00:00
Evan Cheng	55f0c6b9fc	Split -enable-finite-only-fp-math to two options: -enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN. llvm-svn: 108465	2010-07-15 22:07:12 +00:00
Jakob Stoklund Olesen	9b449d5a92	Use TargetOpcode::COPY instead of X86-native register copy instructions when lowering atomics. This will allow those copies to still be coalesced after TII::isMoveInstr is removed. llvm-svn: 108385	2010-07-14 23:50:27 +00:00
Evan Cheng	a8e8874552	Fix for PR7193 was overly conservative. The only case where sibcall callee address cannot be allocated a register is in 32-bit mode where the first three arguments are marked inreg. In that case EAX, EDX, and ECX will be used for argument passing. This fixes PR7610. llvm-svn: 108327	2010-07-14 06:44:01 +00:00
Dan Gohman	d7b5ce3312	Reapply bottom-up fast-isel, with several fixes for x86-32: - Check getBytesToPopOnReturn(). - Eschew ST0 and ST1 for return values. - Fix the PIC base register initialization so that it doesn't ever fail to end up the top of the entry block. llvm-svn: 108039	2010-07-10 09:00:22 +00:00
Jakob Stoklund Olesen	be8d9b0bb8	An x86 function returns a floating point value in st(0), and we must make sure it is popped, even if it is unused. A CopyFromReg node is too weak to represent the required side effect, so insert an FpGET_ST0 instruction directly instead. This will matter when CopyFromReg gets lowered to a generic COPY instruction. llvm-svn: 108037	2010-07-10 04:04:25 +00:00
Bob Wilson	6586e9b203	--- Reverse-merging r107947 into '.': U utils/TableGen/FastISelEmitter.cpp --- Reverse-merging r107943 into '.': U test/CodeGen/X86/fast-isel.ll U test/CodeGen/X86/fast-isel-loads.ll U include/llvm/Target/TargetLowering.h U include/llvm/Support/PassNameParser.h U include/llvm/CodeGen/FunctionLoweringInfo.h U include/llvm/CodeGen/CallingConvLower.h U include/llvm/CodeGen/FastISel.h U include/llvm/CodeGen/SelectionDAGISel.h U lib/CodeGen/LLVMTargetMachine.cpp U lib/CodeGen/CallingConvLower.cpp U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp U lib/CodeGen/SelectionDAG/FastISel.cpp U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp U lib/CodeGen/SelectionDAG/InstrEmitter.cpp U lib/CodeGen/SelectionDAG/TargetLowering.cpp U lib/Target/XCore/XCoreISelLowering.cpp U lib/Target/XCore/XCoreISelLowering.h U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86ISelLowering.h llvm-svn: 107987	2010-07-09 16:37:18 +00:00
Dan Gohman	0a7d155d67	Fix the memoperand offsets in code generated for va_start. llvm-svn: 107948	2010-07-09 01:06:48 +00:00
Dan Gohman	0b5aa1cdd3	Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL. llvm-svn: 107943	2010-07-09 00:39:23 +00:00
Chris Lattner	f469307c77	Change LEA to have 5 operands for its memory operand, just like all other instructions, even though a segment is not allowed. This resolves a bunch of gross hacks in the encoder and makes LEA more consistent with the rest of the instruction set. No functionality change. llvm-svn: 107934	2010-07-08 23:46:44 +00:00
Chris Lattner	ec536276f0	add some long-overdue enums to refer to the parts of the 5-operand X86 memory operand. llvm-svn: 107925	2010-07-08 22:41:28 +00:00
Dan Gohman	e75704369d	Revert 107840 107839 107813 107804 107800 107797 107791. Debug info intrinsics win for now. llvm-svn: 107850	2010-07-08 01:00:56 +00:00
Evan Cheng	1c349f18f8	Move getExtLoad() and (some) getLoad() DebugLoc argument after EVT argument for consistency sake. llvm-svn: 107820	2010-07-07 22:15:37 +00:00
Dan Gohman	2d4d01d0de	Add X86FastISel support for return statements. This entails refactoring a bunch of stuff, to allow the target-independent calling convention logic to be employed. llvm-svn: 107800	2010-07-07 18:32:53 +00:00
Dan Gohman	87fb4e8fcd	Simplify FastISel's constructor by giving it a FunctionLoweringInfo instance, rather than pointers to all of FunctionLoweringInfo's members. This eliminates an NDEBUG ABI sensitivity. llvm-svn: 107789	2010-07-07 16:29:44 +00:00
Dan Gohman	fe7532a308	Split the SDValue out of OutputArg so that SelectionDAG-independent code can do calling-convention queries. This obviates OutputArgReg. llvm-svn: 107786	2010-07-07 15:54:55 +00:00
Dale Johannesen	ce65663330	Accept RIP-relative symbols with 'i' constraint, and print the (%rip) only if the 'a' modifier is present. PR 7528. llvm-svn: 107727	2010-07-06 23:27:00 +00:00
Dan Gohman	ee0cb70381	CanLowerReturn doesn't need a SelectionDAG; it just needs an LLVMContext. SelectBasicBlock doesn't need its BasicBlock argument. llvm-svn: 107712	2010-07-06 22:19:37 +00:00
Devang Patel	a3ca21b228	Propagate debug loc. llvm-svn: 107710	2010-07-06 22:08:15 +00:00
Dan Gohman	3439629239	Reapply r107655 with fixes; insert the pseudo instruction into the block before calling the expansion hook. And don't put EFLAGS in a mbb's live-in list twice. llvm-svn: 107691	2010-07-06 20:24:04 +00:00
Dan Gohman	f4f04107ef	Revert r107655. llvm-svn: 107668	2010-07-06 15:49:48 +00:00
Dan Gohman	12205645a6	Fix a bunch of custom-inserter functions to handle the case where the pseudo instruction is not at the end of the block. llvm-svn: 107655	2010-07-06 15:18:19 +00:00
Eric Christopher	2ad0c779c3	Fix up -fstack-protector on linux to use the segment registers. Split out testcases per architecture and os now. Patch from Nelson Elhage. llvm-svn: 107640	2010-07-06 05:18:56 +00:00
Eric Christopher	d429846eca	Have the X86 backend use Triple instead of a string and some enums. llvm-svn: 107625	2010-07-05 19:26:33 +00:00
Chris Lattner	c4a7073db3	more tidying. llvm-svn: 107615	2010-07-05 05:53:14 +00:00
Chris Lattner	45cc4d74a3	Just rip v2f32 support completely out of the X86 backend. In the example in the testcase, we now generate: _test1: ## @test1 movss 4(%esp), %xmm0 addss 8(%esp), %xmm0 movl 12(%esp), %eax movss %xmm0, (%eax) ret instead of: _test1: ## @test1 subl $20, %esp movl 24(%esp), %eax movq %mm0, (%esp) movq %mm0, 8(%esp) movss (%esp), %xmm0 addss 12(%esp), %xmm0 movss %xmm0, (%eax) addl $20, %esp ret v2f32 support did not work reliably because most of the X86 backend didn't know it was legal. It was apparently only added to support returning source-level v2f32 values in MMX registers in x86-32 mode. If ABI compatibility is important on this GCC-extended-vector type for some reason, then the frontend should generate IR that returns v2i32 instead of v2f32. However, we generally don't try very hard to be abi compatible on gcc extended vectors. llvm-svn: 107601	2010-07-04 23:07:25 +00:00
Chris Lattner	681b926d54	fix PR7518 - terrible codegen of <2 x float>, by only marking v2f32 as legal in 32-bit mode. It is just as terrible there, but I just care about x86-64 and no one claims it is valuable in 64-bit mode. llvm-svn: 107600	2010-07-04 22:57:10 +00:00
Evan Cheng	0664a67fe1	Remove isSS argument from CreateFixedObject. Fixed objects cannot be spill slots so it's always false. llvm-svn: 107550	2010-07-03 00:40:23 +00:00
Gabor Greif	12ca3d9fac	use ArgOperand API llvm-svn: 107280	2010-06-30 13:03:37 +00:00
Duncan Sands	67bfa9d109	Remove pointless and unused variables. llvm-svn: 107130	2010-06-29 12:48:49 +00:00
Bill Wendling	0a5bb081cc	Reduce indentation via early exit. NFC. llvm-svn: 107067	2010-06-28 21:08:32 +00:00
Gabor Greif	83205af3fa	use ArgOperand API llvm-svn: 106944	2010-06-26 11:51:52 +00:00
Dale Johannesen	ce97d55ad9	The hasMemory argument is irrelevant to how the argument for an "i" constraint should get lowered; PR 6309. While this argument was passed around a lot, this is the only place it was used, so it goes away from a lot of other places. llvm-svn: 106893	2010-06-25 21:55:36 +00:00
Bill Wendling	e41e40f689	- Reapply r106066 now that the bzip2 build regression has been fixed. - 2010-06-25-CoalescerSubRegDefDead.ll is the testcase for r106878. llvm-svn: 106880	2010-06-25 20:48:10 +00:00
Dale Johannesen	5ad5226c58	Disallow matching "i" constraint to symbol addresses when address requires a register or secondary load to compute (most PIC modes). This improves "g" constraint handling. 8015842. The test from 2007 is attempting to test the fix for PR1761, but since -relocation-model=static doesn't work on Darwin x86-64, it was not testing what it was supposed to be testing and was passing erroneously. Fixed to use Linux x86-64. llvm-svn: 106779	2010-06-24 20:14:51 +00:00
Dan Gohman	600f62b3ba	Reapply r106634, now that the bug it exposed is fixed. llvm-svn: 106746	2010-06-24 14:30:44 +00:00
Dan Gohman	c3e291c560	Fix a bug in the code which determines when it's safe to use the bt instruction, which was exposed by r106263. llvm-svn: 106718	2010-06-24 02:07:59 +00:00
Daniel Dunbar	4df321b7ad	Revert r106263, "Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass,"... it was causing both 'file' (with clang) and 176.gcc (with llvm-gcc) to be miscompiled. llvm-svn: 106634	2010-06-23 17:09:26 +00:00
Jim Grosbach	6f71039fa4	The generic DAG combiner can now fold atomic fences when needed, so switch to using that. llvm-svn: 106633	2010-06-23 16:25:07 +00:00
Daniel Dunbar	ef5a4383ad	Revert r106066, "Create a more targeted fix for not sinking instructions into a range where it"... it causes bzip2 to be miscompiled by Clang. Conflicts: lib/CodeGen/MachineSink.cpp llvm-svn: 106614	2010-06-23 00:48:25 +00:00
Jim Grosbach	6c275bc5a2	fix typo llvm-svn: 106574	2010-06-22 20:52:02 +00:00
Nick Lewycky	dcc7b6dcb6	Fix warning in no-asserts build. llvm-svn: 106405	2010-06-20 20:27:42 +00:00
Dan Gohman	92c11acdb8	Change UpdateNodeOperands' operand and return value from SDValue to SDNode *, since it doesn't care about the ResNo value. llvm-svn: 106282	2010-06-18 15:30:29 +00:00
Dan Gohman	c3479f5342	Delete unused variables. llvm-svn: 106280	2010-06-18 14:32:32 +00:00
Dan Gohman	f1d8304fe3	Eliminate unnecessary uses of getZExtValue(). llvm-svn: 106279	2010-06-18 14:22:04 +00:00
Dan Gohman	35b6f9a929	isValueValidForType can be a static member function. llvm-svn: 106278	2010-06-18 14:01:07 +00:00
Dan Gohman	b92156d5e4	Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass, which is faster, simpler, and less surprising. llvm-svn: 106263	2010-06-18 01:05:21 +00:00
Bill Wendling	8c0cf0994d	Create a more targeted fix for not sinking instructions into a range where it will conflict with another live range. The place which creates this scenario is the code in X86 that lowers a select instruction by splitting the MBBs. This eliminates the need to check from the bottom up in an MBB for live pregs. llvm-svn: 106066	2010-06-15 23:46:31 +00:00
Eric Christopher	6c4d63e1a5	For 32-bit non-pic tlv mach-o addressing we don't need a pic base or a relative address. llvm-svn: 106064	2010-06-15 23:08:42 +00:00
Eric Christopher	89d103a8ce	Ensure that mov and not lea are used to stick the address into the register. While we're at it, make sure it's in the right one. llvm-svn: 105645	2010-06-08 22:04:25 +00:00
Dale Johannesen	df1a7f83bf	Fix some liveout handling related to tail calls, see comments. I don't think this ever resulted in problems on x86, but it would on ARM. llvm-svn: 105509	2010-06-05 00:30:45 +00:00
Eric Christopher	b0e1a458ce	Add first pass at darwin tls compiler support. llvm-svn: 105381	2010-06-03 04:07:48 +00:00
Eli Friedman	6e3d5af945	Fix comment so it doesn't include comments which are irrelevant to the x86 backend. Add a FIXME noting what can be fixed here. llvm-svn: 105342	2010-06-02 19:35:46 +00:00
Dan Gohman	a690618c58	Use comments to document non-obvious code rather than mailing list archives. llvm-svn: 105341	2010-06-02 19:13:40 +00:00
Eli Friedman	526e6d045f	Don't try to custom-lower 64-bit add-with-overflow and friends on x86-32; the x86 backend currently doesn't know how to handle them. This doesn't really fix anything because LegalizeTypes doesn't know how to handle them either. We do get a better error message, though. llvm-svn: 105305	2010-06-02 00:27:18 +00:00
Evan Cheng	27c4933e02	Fix PR7193: if sibling call address can take a register, make sure there are enough registers available by counting inreg arguments. llvm-svn: 105092	2010-05-29 01:35:22 +00:00
Dale Johannesen	e8be73f3e7	Fix comment typos. llvm-svn: 105059	2010-05-28 23:24:28 +00:00
Dale Johannesen	9e43c07bc5	Mark some math lib intrinsic nodes Legal on SSE4.1. No functional effect as these nodes are not generated yet. llvm-svn: 104879	2010-05-27 20:12:41 +00:00
Dan Gohman	dc53f1cb5c	FastISel doesn't yet handle callee-pop functions. To support this, move IsCalleePop from X86ISelLowering to X86Subtarget. llvm-svn: 104866	2010-05-27 18:43:40 +00:00
Zhongxing Xu	730a977e02	SRetReturnReg was set in LowerFormalArguments(). So only assert it here. llvm-svn: 104691	2010-05-26 08:10:02 +00:00
Evan Cheng	168ced94d8	Implement @llvm.returnaddress. rdar://8015977. llvm-svn: 104421	2010-05-22 01:47:14 +00:00
Dale Johannesen	2b78565842	Previous commit message should refer to 104308. llvm-svn: 104337	2010-05-21 18:44:47 +00:00
Dale Johannesen	6361e3e8a2	Fix two bugs in 104348: Case where MMX is disabled wasn't handled right. MMX->MMX bitconverts are Legal. llvm-svn: 104336	2010-05-21 18:40:15 +00:00
Dale Johannesen	b3b9c8ac48	Fix i64->f64 conversion, x86-64, -no-sse. A bit tricky since there's a 3rd 64-bit type, MMX vectors. PR 7135. llvm-svn: 104308	2010-05-21 00:52:33 +00:00
Evan Cheng	738e920edf	Code refactoring: pull SchedPreference enum from TargetLowering.h to TargetMachine.h and put it in its own namespace. llvm-svn: 104147	2010-05-19 20:19:50 +00:00
Dale Johannesen	2ef974ee0e	Revert 103911; it broke a test that expects bitconvert <1xi64> -> i64 to work in MMX registers on hosts where -no-sse is the default (not mine). The right thing is to accept this and make i64->f64 conversions go through memory, but I don't have time right now. llvm-svn: 103914	2010-05-16 20:19:04 +00:00
Dale Johannesen	fc1492d71b	Make x86-64 64-bit bitconvert work when SSE is not available. (This worked as of about 6 months ago and I didn't track down exactly what broke it; I think this fix is appropriate.) llvm-svn: 103911	2010-05-16 18:22:38 +00:00
Anton Korobeynikov	8f35fabbc1	Add support for thiscall calling convention. Patch by Charles Davis and Steven Watanabe! llvm-svn: 103902	2010-05-16 09:08:45 +00:00
Dale Johannesen	3a366a88f2	Fix uint64->{float, double} conversion to do rounding correctly in 32-bit. The implementation in LegalizeIntegerTypes to handle this as sint64->float + appropriate power of 2 is subject to double rounding, considered incorrect by numerics people. Use this implementation only when it is safe. This leads to using library calls in some cases that produced inline code before, but it's correct now. (EVTToAPFloatSemantics belongs somewhere else, any suggestions?) Add a correctly rounding (though not particularly fast) conversion that uses X87 80-bit computations for x86-32. 7885399, 5901940. This shows up in gcc.c-torture/execute/ieee/rbug.c in the gcc testsuite on some platforms. llvm-svn: 103883	2010-05-15 18:51:12 +00:00
Bill Wendling	95f6ebcb37	Rename "HasCalls" in MachineFrameInfo to "AdjustsStack" to better describe what the variable actually tracks. N.B., several back-ends are using "HasCalls" as being synonymous for something that adjusts the stack. This isn't 100% correct and should be looked into. llvm-svn: 103802	2010-05-14 21:14:32 +00:00
Dan Gohman	35dd005d22	Lowering of atomic instructions can result in operands being used more than once. If ISel had put a kill flag on one of them, it's not valid to transfer the kill flag to each new instance. llvm-svn: 103799	2010-05-14 21:01:44 +00:00
Dan Gohman	bb919dfb6b	Implement a bunch more TargetSelectionDAGInfo infrastructure. Move EmitTargetCodeForMemcpy, EmitTargetCodeForMemset, and EmitTargetCodeForMemmove out of TargetLowering and into SelectionDAGInfo to exercise this. llvm-svn: 103481	2010-05-11 17:31:57 +00:00
Dan Gohman	25c1653700	Get rid of the EdgeMapping map. Instead, just check for BasicBlock changes before doing phi lowering for switches. llvm-svn: 102809	2010-05-01 00:01:06 +00:00
Dan Gohman	2e2cc87081	Make this code less confusing. Instead of reassigning BB, just operate on the original variables, so it's easier to see what is being done to which blocks. llvm-svn: 102759	2010-04-30 20:14:26 +00:00
Dan Gohman	57bb73c80b	Remove the -disable-16bit command-line option, which is now obsolete. llvm-svn: 102730	2010-04-30 18:30:26 +00:00
Evan Cheng	5117a555e0	Another sibcall bug. If caller and callee calling conventions differ, then it's only safe to do a tail call if the results are returned in the same way. llvm-svn: 102683	2010-04-30 01:12:32 +00:00
Evan Cheng	050df1b8de	Enable i16 to i32 promotion by default. llvm-svn: 102493	2010-04-28 08:30:49 +00:00
Evan Cheng	d21f564543	Unbreak the build. Only form shld / shrd after legalization. llvm-svn: 102488	2010-04-28 02:25:18 +00:00
Evan Cheng	347e3b8f15	Rather than having a ton of patterns for double shift instructions, e.g. SHLD16rrCL, just perform custom dag combine to form x86 specific dag so they match to the same pattern. This also makes sure later dag combine do not cause isel to miss them (e.g. promoting i16 to i32). llvm-svn: 102485	2010-04-28 01:18:01 +00:00
Stuart Hastings	c0458f1a40	Tweak x86 INC/DEC generation to look for CopyToReg or SETCC. Radar 7866163. llvm-svn: 102477	2010-04-28 00:35:10 +00:00
Evan Cheng	3b928af28f	SRA promotion is also not free. llvm-svn: 102456	2010-04-27 19:48:31 +00:00
Evan Cheng	6e45f1d1ff	Promoting 16-bit cmp / test aren't free. Don't do it. llvm-svn: 102366	2010-04-26 19:06:11 +00:00
Evan Cheng	ed69b382ea	- Move TargetLowering::EmitTargetCodeForFrameDebugValue to TargetInstrInfo and rename it to emitFrameIndexDebugValue. - Teach spiller to modify DBG_VALUE instructions to reference spill slots. llvm-svn: 102323	2010-04-26 07:38:55 +00:00
Dale Johannesen	582565e991	Stop abusing EmitInstrWithCustomInserter for target-dependent form of DEBUG_VALUE, as it doesn't have reasonable default behavior for unsupported targets. Add a new hook instead. No functional change. llvm-svn: 102320	2010-04-25 21:33:54 +00:00
Evan Cheng	a02d0e7d6b	Avoid promoting a i16 node if it would eliminate a (store (op (load))) opportunity. llvm-svn: 102237	2010-04-24 04:44:57 +00:00
Evan Cheng	0367559786	Fix X86ISD::CMP i16 to i32 promotion. llvm-svn: 102192	2010-04-23 18:21:16 +00:00
Dan Gohman	c594eab10f	Move HandlePHINodesInSuccessorBlocks functions out of SelectionDAGISel and into SelectionDAGBuilder and FastISel. llvm-svn: 102123	2010-04-22 20:46:50 +00:00
Evan Cheng	f1223bdec0	- It's not safe to promote rotates (at least not trivially). - Some code refactoring. llvm-svn: 102111	2010-04-22 20:19:46 +00:00
Evan Cheng	9c8cd8c061	isel (i32 anyext i16) as insert_subreg when 16-bit ops are being promoted. llvm-svn: 101979	2010-04-21 01:47:12 +00:00
Dale Johannesen	0522b90cdb	Because of the EMMS problem, right now we have to support user-defined operations that use MMX register types, but the compiler shouldn't generate them on its own. This adds a Synthesizable abstraction to represent this, and changes the vector widening computation so it won't produce MMX types. (The motivation is to remove noise from the ABI compatibility part of the gcc test suite, which has some breakage right now.) llvm-svn: 101951	2010-04-20 22:34:09 +00:00
Evan Cheng	e19aa5cc52	More progress on promoting i16 operations to i32 for x86. Work in progress. llvm-svn: 101808	2010-04-19 19:29:22 +00:00
Dan Gohman	21cea8ac2e	Use const qualifiers with TargetLowering. This eliminates several const_casts, and it reinforces the design of the Target classes being immutable. SelectionDAGISel::IsLegalToFold is now a static member function, because PIC16 uses it in an unconventional way. There is more room for API cleanup here. And PIC16's AsmPrinter no longer uses TargetLowering. llvm-svn: 101635	2010-04-17 15:26:15 +00:00
Dan Gohman	31ae586c74	Move per-function state out of TargetLowering subclasses and into MachineFunctionInfo subclasses. llvm-svn: 101634	2010-04-17 14:41:14 +00:00
Evan Cheng	f1bd5fcdb4	More work to allow dag combiner to promote 16-bit ops to 32-bit. llvm-svn: 101621	2010-04-17 06:13:15 +00:00
Eric Christopher	7258dcd77f	Revert 101465, it broke internal OpenGL testing. Probably the best way to know that all getOperand() calls have been handled is to replace that API instead of updating. llvm-svn: 101579	2010-04-16 23:37:20 +00:00
Dan Gohman	148c69a3f6	Eliminate an unnecessary SelectionDAG dependency in getOptimalMemOpType. llvm-svn: 101531	2010-04-16 20:11:05 +00:00
Gabor Greif	f375520f7b	reapply r101434 with a fix for self-hosting. rotate CallInst operands, i.e. move callee to the back of the operand array. the motivation for this patch is laid out in my mail to llvm-commits: more efficient access to operands and callee, faster callgraph-construction, smaller compiler binary llvm-svn: 101465	2010-04-16 15:33:14 +00:00
Evan Cheng	af56facacd	Adding support for dag combiner to promote operations for profit. This requires target specific queries. For example, x86 should promote i16 to i32 when it does not impact load folding. x86 support is off by default. It can be enabled with -promote-16bit. Work in progress. llvm-svn: 101448	2010-04-16 06:14:10 +00:00
Gabor Greif	403e9694f9	back out r101423 and r101397, they break llvm-gcc self-host on darwin10 llvm-svn: 101434	2010-04-16 01:16:20 +00:00
Gabor Greif	33ae80bff7	reapply r101364, which has been backed out in r101368 with a fix rotate CallInst operands, i.e. move callee to the back of the operand array the motivation for this patch are laid out in my mail to llvm-commits: more efficient access to operands and callee, faster callgraph-construction, smaller compiler binary llvm-svn: 101397	2010-04-15 20:51:13 +00:00
Gabor Greif	9fd00c7d25	back out r101364, as it trips the linux nightlybot on some clang C++ tests llvm-svn: 101368	2010-04-15 12:46:56 +00:00
Gabor Greif	aafd209632	rotate CallInst operands, i.e. move callee to the back of the operand array the motivation for this patch are laid out in my mail to llvm-commits: more efficient access to operands and callee, faster callgraph-construction, smaller compiler binary llvm-svn: 101364	2010-04-15 10:49:53 +00:00
Dan Gohman	bcaf681cde	Add const qualifiers to CodeGen's use of LLVM IR constructs. llvm-svn: 101334	2010-04-15 01:51:59 +00:00
Eric Christopher	eabc9623da	Allow lowering for palignr instructions for mmx sized vectors. Add patterns to handle the lowering. llvm-svn: 101331	2010-04-15 01:40:20 +00:00
Dan Gohman	7deb447781	Factor out EH landing pad code into a separate function, and constify a bunch of stuff to support it. llvm-svn: 101273	2010-04-14 19:53:31 +00:00
Evan Cheng	ebe47c872f	Avoid using f64 to lower memcpy from constant string. It's cheaper to use i32 store of immediates. llvm-svn: 100751	2010-04-08 07:37:57 +00:00
Chris Lattner	2104b8d36e	rename llvm::llvm_report_error -> llvm::report_fatal_error llvm-svn: 100709	2010-04-07 22:58:41 +00:00
John McCall	6ac5cc973c	Clean up some signedness oddities in this code noticed by clang. llvm-svn: 100599	2010-04-07 01:49:15 +00:00
Chris Lattner	305f2efb63	unthread MMI from FastISel llvm-svn: 100416	2010-04-05 06:05:26 +00:00
Chris Lattner	7cfa70e9b3	fastisel doesn't need DwarfWriter, remove some tendrils. llvm-svn: 100381	2010-04-05 02:19:28 +00:00
Mon P Wang	c576ee9040	Reapply address space patch after fixing an issue in MemCopyOptimizer. Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset, e.g., llvm.memcpy.i32(i8, i8, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8, i8, i32, i32, i1) llvm-svn: 100304	2010-04-04 03:10:48 +00:00
Chris Lattner	6f306d7d30	use DebugLoc default ctor instead of DebugLoc::getUnknownLoc() llvm-svn: 100214	2010-04-02 20:16:16 +00:00
Evan Cheng	61399375a2	Correctly lower memset / memcpy of undef. It should be a nop. PR6767. llvm-svn: 100208	2010-04-02 19:36:14 +00:00
Mon P Wang	999c1b927b	Revert r100191 since it breaks objc in clang llvm-svn: 100199	2010-04-02 18:43:02 +00:00
Mon P Wang	a972ab8564	Reapply address space patch after fixing an issue in MemCopyOptimizer. Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset, e.g., llvm.memcpy.i32(i8, i8, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8, i8, i32, i32, i1) llvm-svn: 100191	2010-04-02 18:04:15 +00:00
Eric Christopher	06a1639b98	Remove FIXME - if there's a better way to do this it isn't here. llvm-svn: 100176	2010-04-02 04:32:37 +00:00
Chandler Carruth	8d6d0d4c58	Disambiguate conditional expression for newer GCCs. llvm-svn: 100167	2010-04-02 01:31:24 +00:00
Evan Cheng	f997c31598	In 64-bit mode, use i64 to lower memcpy / memset instead of f64. llvm-svn: 100137	2010-04-01 20:27:45 +00:00
Evan Cheng	d9929f03cf	Add comments about DstAlign and SrcAlign. llvm-svn: 100132	2010-04-01 20:10:42 +00:00
Evan Cheng	4c014c892a	- Avoid using floating point stores to implement memset unless the value is zero. - Do not try to infer GV alignment unless its type is sized. It's not possible to infer alignment if it has opaque type. llvm-svn: 100118	2010-04-01 18:19:11 +00:00
Evan Cheng	43cd9e3845	Fix sdisel memcpy, memset, memmove lowering: 1. Makes it possible to lower with floating point loads and stores. 2. Avoid unaligned loads / stores unless it's fast. 3. Fix some memcpy lowering logic bug related to when to optimize a load from constant string into a constant. 4. Adjust x86 memcpy lowering threshold to make it more sane. 5. Fix x86 target hook so it uses vector and floating point memory ops more effectively. rdar://7774704 llvm-svn: 100090	2010-04-01 06:04:33 +00:00
Bob Wilson	6f7fd28824	Revert Mon Ping's change 99928, since it broke all the llvm-gcc buildbots. llvm-svn: 99948	2010-03-30 22:27:04 +00:00
Mon P Wang	7460571381	Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset, e.g., llvm.memcpy.i32(i8, i8, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8, i8, i32, i32, i1) An update of langref will occur in a subsequent checkin. llvm-svn: 99928	2010-03-30 20:55:56 +00:00
Chris Lattner	9897043928	Rip out the 'is temporary' nonsense from the MCContext interface to create symbols. It is extremely error prone and a source of a lot of the remaining integrated assembler bugs on x86-64. This fixes rdar://7807601. llvm-svn: 99902	2010-03-30 18:10:53 +00:00
Eric Christopher	c1ddaaf5b1	Add FIXME for operand promotion. llvm-svn: 99859	2010-03-30 01:04:59 +00:00
Benjamin Kramer	2788f797ca	Make isInt?? and isUint?? template specializations of the generic versions. This makes calls a little bit more consistent and allows easy removal of the specializations in the future. Convert all callers to the templated functions. llvm-svn: 99838	2010-03-29 21:13:41 +00:00
Evan Cheng	3365fb1412	Do not sibcall if stack needs to be dynamically aligned. llvm-svn: 99620	2010-03-26 16:26:03 +00:00
Evan Cheng	00a620c61e	Allow trivial sibcall of vararg callee when no arguments are being passed. llvm-svn: 99598	2010-03-26 02:13:13 +00:00
Nate Begeman	2ceb288416	Per chris's request, add some comments. llvm-svn: 99434	2010-03-24 22:19:06 +00:00
Nate Begeman	583e05d8ce	BUILD_VECTOR was missing out on some prime opportunities to use SSE 4.1 inserts. llvm-svn: 99423	2010-03-24 20:49:50 +00:00
Evan Cheng	3f6f769c4f	If call result is in ST0 and it is not being passed to the caller's caller, then it is not safe to optimize the call into a sibcall since the call result has to be popped off the x87 stack. llvm-svn: 99032	2010-03-20 02:58:15 +00:00
Daniel Dunbar	5599256415	MC: Allow modifiers in MCSymbolRefExpr, and eliminate X86MCTargetExpr. - Although it would be nice to allow this decoupling, the assembler needs to be able to reason about MCSymbolRefExprs in too many places to make this viable. We can use a target specific encoding of the variant if this becomes an issue. - This patch also extends llvm-mc to support parsing of the modifiers, as opposed to lumping them in with the symbol. llvm-svn: 98592	2010-03-15 23:51:06 +00:00
Dan Gohman	c6ddebd6d1	Recognize code for doing vector gather/scatter index calculations with 32-bit indices. Instead of shuffling each element out of the index vector, when all indices are needed, just store the input vector to the stack and load the elements out. llvm-svn: 98588	2010-03-15 23:23:03 +00:00
Bill Wendling	bbcaa40227	Now that the default for Darwin platforms is to place the LSDA into the TEXT section, remove the target-specific code that performs this. llvm-svn: 98580	2010-03-15 21:09:38 +00:00
Bill Wendling	0344874921	Place the LSDA into the TEXT section for x86 Darwin. If the global it's pointing to is local to the translation unit, we need to fill the value of that symbol into the non-lazy pointer. This should conclude all Darwin changes for placing the LSDA into the TEXT section. There is some cleanup to do. I.e., there's no longer a special need for target-specific code here. But that can come later. llvm-svn: 98564	2010-03-15 19:04:37 +00:00
Evan Cheng	ae5edee6c8	Avoid sibcall optimization if either caller or callee is using sret semantics. llvm-svn: 98561	2010-03-15 18:54:48 +00:00
Chris Lattner	6feb7e3325	fix PR6605, X86ISD::CMP always returns i32 (EFLAGS), not the operand type. llvm-svn: 98507	2010-03-14 18:44:35 +00:00
Chris Lattner	a30d4ce194	add support for pentium class CPUs which do not have cmov, PR4841. Patch by Craig Smith! llvm-svn: 98496	2010-03-14 18:31:44 +00:00
Evan Cheng	d703df67ce	Do not force indirect tailcall through fixed registers: eax, r11. Add support to allow loads to be folded to tail call instructions. llvm-svn: 98465	2010-03-14 03:48:46 +00:00
Chris Lattner	29bdac4928	eliminate the now-unneeded context argument of MBB::getSymbol() llvm-svn: 98451	2010-03-13 21:04:28 +00:00
Bill Wendling	3d0cd822a9	Add a beta-test for placing the LSDA into the TEXT section on X86. llvm-svn: 98370	2010-03-12 19:20:40 +00:00
Benjamin Kramer	d69ee90f2f	Use StringRef::substr instead of std::string::substr to avoid using a free'd string temporary. This should fix PR6590. llvm-svn: 98349	2010-03-12 13:54:59 +00:00
Dan Gohman	576aec4363	Remove getWidenVectorType, which is no longer used. llvm-svn: 98289	2010-03-11 21:39:57 +00:00
Bill Wendling	00810c39da	revert r98270. llvm-svn: 98281	2010-03-11 19:50:31 +00:00
Evan Cheng	31fe835bf2	Bad bad bug. x86 force indirect tail call address into eax when it's meant to force it into a call preserved register instead. Change it to ecx for now. llvm-svn: 98270	2010-03-11 18:49:14 +00:00
Chris Lattner	a179e4d0a8	add support, testcases, and dox for the new GHC calling convention. Patch by David Terei! llvm-svn: 98212	2010-03-11 00:22:57 +00:00
Dale Johannesen	49de0607a8	Progress towards shepherding debug info through SelectionDAG. No functional effect yet. This is still evolving and should not be viewed as final. llvm-svn: 98195	2010-03-10 22:13:47 +00:00
Chris Lattner	ac2361a9b0	set the temporary bit on MCSymbols correctly. llvm-svn: 98124	2010-03-10 02:25:11 +00:00
Anton Korobeynikov	d5e3fd6dc8	Lower dynamic stack allocation on mingw32 to separate instruction. We cannot use a normal call here since it has extra unmodelled side effects (it changes stack pointer). This should fix PR5292. llvm-svn: 97884	2010-03-06 19:32:29 +00:00
Evan Cheng	27494232d4	Fix typo. llvm-svn: 97818	2010-03-05 19:55:55 +00:00
Evan Cheng	654ec2a663	Fix an oops in x86 sibcall optimization. If the ByVal callee argument is itself passed as a pointer, then it's obviously not safe to do a tail call. llvm-svn: 97797	2010-03-05 08:38:04 +00:00
Evan Cheng	cf67ffa500	Revert 96389 and 96990. They are causing some miscompilation that I do not fully understand. llvm-svn: 97782	2010-03-05 03:08:23 +00:00
Dan Gohman	b8ebd408da	Fix recognition of 16-bit bswap for C front-ends which emit the clobber registers in a different order. llvm-svn: 97741	2010-03-04 19:58:08 +00:00
Bill Wendling	78c5b7a76d	Remove dead parameter passing. llvm-svn: 97536	2010-03-02 01:55:18 +00:00
Evan Cheng	87d50aa18a	Remove the optimize for code size limitation on r67917. Optimize 64-bit imul by constants into leas + shl regardless if optimizing for code size. The size saving from using imulq isn't worth it. Also, the lea and shl instructions may expose further optimization. llvm-svn: 97507	2010-03-01 22:00:11 +00:00
Evan Cheng	228c31f045	Re-apply 97040 with fix. This survives a ppc self-host llvm-gcc bootstrap. llvm-svn: 97310	2010-02-27 07:36:59 +00:00
Dan Gohman	ec4e1b67bf	Truncate from i64 to i32 is "free" on x86-32, because it involves just discarding one of the registers. llvm-svn: 97100	2010-02-25 03:04:36 +00:00
Daniel Dunbar	4811d004be	Speculatively revert r97011, "Re-apply 96540 and 96556 with fixes.", again in the hopes of fixing PPC bootstrap. llvm-svn: 97040	2010-02-24 17:05:47 +00:00
Dan Gohman	3860521406	When forming SSE min and max nodes for UGE and ULE comparisons, it's necessary to swap the operands to handle NaN and negative zero properly. Also, reintroduce logic for checking for NaN conditions when forming SSE min and max instructions, fixed to take into consideration NaNs and negative zeros. This allows forming min and max instructions in more cases. llvm-svn: 97025	2010-02-24 06:52:40 +00:00
Evan Cheng	328a607490	Re-apply 96540 and 96556 with fixes. llvm-svn: 97011	2010-02-24 01:42:31 +00:00
Evan Cheng	da52f449a0	Fix rev 96389 by restricting the xform to mask that's either signbit or max signed value. llvm-svn: 96990	2010-02-23 21:51:54 +00:00
Chris Lattner	a828850b4d	X86InstrInfoSSE.td declares PINSRW as having type v8i16, don't alias it in the MMX .td file with a different width, split into two X86ISD opcodes. This fixes an x86 testcase. llvm-svn: 96859	2010-02-23 02:07:48 +00:00
Arnold Schwaighofer	30ece5b807	Mark the return address stack slot as mutable when moving the return address during a tail call. A parameter might overwrite this stack slot during the tail call. The sequence during a tail call is: 1.) load return address to temp reg 2.) move parameters (might involve storing to return address stack slot) 3.) store return address to new location from temp reg If the stack location is marked immutable CodeGen can colocate load (1) with the store (3). This fixes bug 6225. llvm-svn: 96783	2010-02-22 16:18:09 +00:00
Dan Gohman	b87de8d30d	Remove the logic for reasoning about NaNs from the code that forms SSE min and max instructions. The real thing this code needs to be concerned about is negative zero. Update the sse-minmax.ll test accordingly, and add tests for -enable-unsafe-fp-math mode as well. llvm-svn: 96775	2010-02-22 04:03:39 +00:00
Chris Lattner	db8d6678e9	fix an incorrect VT: eflags is always i32. The bug was causing us to create an X86ISD::Cmp node with result type i64 on the CodeGen/X86/shift-i256.ll testcase and the new isel was assert on it downstream. llvm-svn: 96768	2010-02-22 00:28:59 +00:00
Anton Korobeynikov	31a9212b0b	It turned out that we failed to emit proper symbol stubs on non-x86/darwin for ages (we emitted a reference to a stub, but no stub was emitted). The code inside x86-32/macho target objfile lowering should actually be the generic one - move it there. This (I really, really hope) should fix EH issues on ppc/darwin and arm/darwin. llvm-svn: 96755	2010-02-21 20:28:15 +00:00
Duncan Sands	d0bf6f640f	Revert commits 96556 and 96640, because commit 96556 breaks the dragonegg self-host build. I reverted 96640 in order to revert 96556 (96640 goes on top of 96556), but it also looks like with both of them applied the breakage happens even earlier. The symptom of the 96556 miscompile is the following crash: llvm[3]: Compiling AlphaISelLowering.cpp for Release build cc1plus: /home/duncan/tmp/tmp/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4982: void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode, llvm::SDNode, llvm::SelectionDAG::DAGUpdateListener*): Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i)) && "Cannot use this version of ReplaceAllUsesWith!"' failed. Stack dump: 0. Running pass 'X86 DAG->DAG Instruction Selection' on function '@_ZN4llvm19AlphaTargetLowering14LowerOperationENS_7SDValueERNS_12SelectionDAGE' g++: Internal error: Aborted (program cc1plus) This occurs when building LLVM using LLVM built by LLVM (via dragonegg). Probably LLVM has miscompiled itself, though it may have miscompiled GCC and/or dragonegg itself: at this point of the self-host build, all of GCC, LLVM and dragonegg were built using LLVM. Unfortunately this kind of thing is extremely hard to debug, and while I did rummage around a bit I didn't find any smoking guns, aka obviously miscompiled code. Found by bisection. r96556 | evancheng | 2010-02-18 03:13:50 +0100 (Thu, 18 Feb 2010) | 5 lines Some dag combiner goodness: Transform br (xor (x, y)) -> br (x != y) Transform br (xor (xor (x,y), 1)) -> br (x == y) Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm" r96640 | evancheng | 2010-02-19 01:34:39 +0100 (Fri, 19 Feb 2010) | 16 lines Transform (xor (setcc), (setcc)) == / != 1 to (xor (setcc), (setcc)) != / == 1. e.g. On x86_64 %0 = icmp eq i32 %x, 0 %1 = icmp eq i32 %y, 0 %2 = xor i1 %1, %0 br i1 %2, label %bb, label %return => testl %edi, %edi sete %al testl %esi, %esi sete %cl cmpb %al, %cl je LBB1_2 llvm-svn: 96672	2010-02-19 11:30:41 +00:00
Evan Cheng	0ceb68a552	Some dag combiner goodness: Transform br (xor (x, y)) -> br (x != y) Transform br (xor (xor (x,y), 1)) -> br (x == y) Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm" llvm-svn: 96556	2010-02-18 02:13:50 +00:00
Evan Cheng	82b04130cb	Look for SSE and instructions of this form: (and x, (build_vector c1,c2,c3,c4)). If there exists a use of a build_vector that's the bitwise complement of the mask, then transform the node to (and (xor x, (build_vector -1,-1,-1,-1)), (build_vector ~c1,~c2,~c3,~c4)). Since this transformation is only useful when 1) the given build_vector will become a load from constpool, and 2) (and (xor x -1), y) matches to a single instruction, I decided this is appropriate as a x86 specific transformation. rdar://7323335 llvm-svn: 96389	2010-02-16 21:09:44 +00:00
Anton Korobeynikov	ae4ccc10da	Preliminary patch to improve dwarf EH generation - Hooks to return Personality / FDE / LSDA / TType encoding depending on target / options (e.g. code model / relocation model) - MCIzation of Dwarf EH printer to use encoding information - Stub generation for ELF target (needed for indirect references) - Some other small changes here and there llvm-svn: 96285	2010-02-15 22:35:59 +00:00

1791 Commits