Commit Graph

1483 Commits

Author SHA1 Message Date
Bruno Cardoso Lopes 9f8e704151 One more check from the original isShuffleMaskLegal goes away
llvm-svn: 113045
2010-09-04 00:46:16 +00:00
Bruno Cardoso Lopes 16959372bb Remove a duplicated but useless check that I inserted in the previous commit.
llvm-svn: 113044
2010-09-04 00:43:12 +00:00
Bruno Cardoso Lopes 44578d38d3 Refactor some code and remove the extra checks for unpckl_undef and unpckh_undef
llvm-svn: 113043
2010-09-04 00:39:43 +00:00
Bruno Cardoso Lopes 7829d0e74b Remove check for unpckh mask
llvm-svn: 113035
2010-09-03 23:32:47 +00:00
Bruno Cardoso Lopes d1dacc57aa Remove check for unpckl mask
llvm-svn: 113034
2010-09-03 23:31:50 +00:00
Bruno Cardoso Lopes 207b9d6218 Inline isShuffleMaskLegal into LowerVECTOR_SHUFFLE, so we can start
checking each standalone condition and decide whether to emit target-specific
nodes or to remove the condition if it has already been matched earlier.

llvm-svn: 113031
2010-09-03 23:24:06 +00:00
Bruno Cardoso Lopes 2bef20eda7 Reapply the considered-harmful part of r112934 and r112942:
"Use target-specific nodes instead of relying on unpckl and
unpckh pattern fragments during isel time. Also place a
depth limit in getShuffleScalarElt."

llvm-svn: 113020
2010-09-03 22:09:41 +00:00
Bruno Cardoso Lopes fe8717c573 Reintroduce a simple function refactoring done in r112934, also without any functionality changes
llvm-svn: 113008
2010-09-03 20:20:02 +00:00
Bruno Cardoso Lopes 48e589b122 Reapply the pieces of r112942 and r112934 which don't make
functional changes

llvm-svn: 113007
2010-09-03 20:10:35 +00:00
Bruno Cardoso Lopes 6979cf0808 Reapply "Fix comment"
llvm-svn: 113006
2010-09-03 19:55:05 +00:00
Daniel Dunbar 6f3da24d70 Revert r112934, "- Use specific nodes to match unpckl masks.", which introduced
some infinite loop and select failures.
 - Apologies for eagerly reverting, but it's branch day.

llvm-svn: 113000
2010-09-03 19:38:11 +00:00
Daniel Dunbar f1aacd55c0 Revert r112938 "Fix comment", which depends on r112934, which introduced some
infinite loop and select failures.

llvm-svn: 112999
2010-09-03 19:38:08 +00:00
Daniel Dunbar 0ffe4db45c Revert r112942, "Use punpckh and unpckh family of nodes instead of using unpckh
mask pattern fragment", which depends on r112934, which introduced some infinite
loop and select failures.

llvm-svn: 112998
2010-09-03 19:38:05 +00:00
Bruno Cardoso Lopes a85ec10483 Use punpckh and unpckh family of nodes instead of using unpckh mask pattern fragment
llvm-svn: 112942
2010-09-03 01:39:08 +00:00
Bruno Cardoso Lopes adc6bca2dd Fix comment
llvm-svn: 112938
2010-09-03 01:28:51 +00:00
Bruno Cardoso Lopes cce44678b4 - Use specific nodes to match unpckl masks.
- Teach getShuffleScalarElt how to handle more target
specific nodes, so the DAGCombine can make use of it.
- Add another hack to avoid the node update problem
during legalization. More description in the comments.

llvm-svn: 112934
2010-09-03 01:24:00 +00:00
Anton Korobeynikov a689c5b2c0 Revert win64 changes. They seem to be incomplete.
llvm-svn: 112885
2010-09-02 22:31:32 +00:00
Anton Korobeynikov 56291f7e53 Properly allocate win64 shadow reg area.
Patch by Jan Sjodin!

llvm-svn: 112875
2010-09-02 22:16:28 +00:00
Bruno Cardoso Lopes 489613f1e5 Replace unpckl_undef and unpckh_undef matching with target specific opcodes
llvm-svn: 112806
2010-09-02 05:23:12 +00:00
Bruno Cardoso Lopes e4e4be3885 Move condition out to prepare for more matching
llvm-svn: 112805
2010-09-02 04:20:26 +00:00
Bruno Cardoso Lopes bf7fd146c7 Remove checking for isUNPCKL_v_undef_Mask, the specific node is already emitted for it
llvm-svn: 112804
2010-09-02 03:57:58 +00:00
Bruno Cardoso Lopes 6a7f634487 become more strict about when it's safe to use X86ISD::MOVLPS
llvm-svn: 112799
2010-09-02 02:35:51 +00:00
Bruno Cardoso Lopes 04c25c15c7 Revert r112689; avoid those kinds of checks because they mess things up with MMX
llvm-svn: 112760
2010-09-01 22:59:03 +00:00
Bruno Cardoso Lopes b3825216ce Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment
llvm-svn: 112694
2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes 6aaebe877b minor change, simplify some logic
llvm-svn: 112689
2010-09-01 00:57:08 +00:00
Bruno Cardoso Lopes 2b025707a2 Move some functions around so they can be used by an upcoming function
llvm-svn: 112687
2010-09-01 00:51:36 +00:00
Bruno Cardoso Lopes 4b56d87290 Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes
llvm-svn: 112661
2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes 61996ef835 Use x86 specific MOVSHDUP node and add more patterns to match it
llvm-svn: 112657
2010-08-31 22:22:11 +00:00
Bruno Cardoso Lopes 5de15ce468 Use MOVHLPS node instead of matching using movhlps and movhlps_undef pattern fragments
llvm-svn: 112644
2010-08-31 21:38:49 +00:00
Bruno Cardoso Lopes 03e4c35302 Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes
llvm-svn: 112642
2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes dfd9dd5d75 Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the handling of those nodes when seeking for scalars inside vector shuffles
llvm-svn: 112570
2010-08-31 02:26:40 +00:00
Chris Lattner 94656b1c8c fix the buildvector->insertp[sd] logic to not always create a redundant
insertp[sd] $0, which is a noop.  Before:

_f32:                                   ## @f32
	pshufd	$1, %xmm1, %xmm2
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm2, %xmm3
	addss	%xmm1, %xmm0
                                        ## kill: XMM0<def> XMM0<kill> XMM0<def>
	insertps	$0, %xmm0, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

after:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movdqa	%xmm2, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

The extra movs are due to a random (poor) scheduling decision.

llvm-svn: 112379
2010-08-28 17:59:08 +00:00
Chris Lattner bcb6090ad0 fix the BuildVector -> unpcklps logic to not do pointless shuffles
when the top elements of a vector are undefined.  This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined.  For example, on:

_Complex float f32(_Complex float A, _Complex float B) {
  return A+B;
}

We used to produce (with SSE2, SSE4.1+ uses insertps):

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$16, %xmm2, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm1
	movdqa	%xmm2, %xmm0
	unpcklps	%xmm1, %xmm0
	ret

We now produce:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movaps	%xmm2, %xmm0
	unpcklps	%xmm3, %xmm0
	ret

This implements rdar://8368414

llvm-svn: 112378
2010-08-28 17:28:30 +00:00
Chris Lattner 96db6e66f4 improve comments in the unpcklps generating logic, introduce
a new EltStride variable instead of reusing NumElems variable
for a non-obvious purpose.  No functionality change.

llvm-svn: 112377
2010-08-28 17:15:43 +00:00
Bruno Cardoso Lopes a982aa24ef Clean up the logic of vector shuffles -> vector shifts.
Also teach this logic how to handle target specific shuffles if
needed, this is necessary while searching recursively for zeroed
scalar elements in vector shuffle operands.

llvm-svn: 112348
2010-08-28 02:46:39 +00:00
Anton Korobeynikov c0b36921c2 Properly handle passing of FP stuff to varargs function on Win64:
value should be copied to the corresponding shadow reg as well.
Patch by Cameron Esfahani!

llvm-svn: 112262
2010-08-27 14:43:06 +00:00
Bruno Cardoso Lopes e25ba0c7c2 zap the now unused MVT::getIntVectorWithNumElements
llvm-svn: 112218
2010-08-26 20:53:12 +00:00
Chris Lattner eb2cc0ce0e implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1.
llvm-svn: 112171
2010-08-26 05:51:22 +00:00
Chris Lattner cc60609cb4 fix sse1 only codegen in x86-64 mode, which is something we
apparently try to support.

llvm-svn: 112168
2010-08-26 05:24:29 +00:00
Bruno Cardoso Lopes d4085f6e91 Revert this for now; PUNPCKLDQ doesn't operate on v4f32
llvm-svn: 112090
2010-08-25 21:26:37 +00:00
Anton Korobeynikov b3b53ecac0 Fix nasty mingw32 bug, which e.g. prevented llvm-gcc bootstrap there.
Mark the _alloca call as clobbering EFLAGS, otherwise some DCE might remove
other flags-clobbering stuff (e.g. cmp instructions) occurring after
_alloca call.

llvm-svn: 112034
2010-08-25 07:50:11 +00:00
Bruno Cardoso Lopes 0770d25758 PUNPCKLDQ should also be used for v4f32
llvm-svn: 112020
2010-08-25 02:55:40 +00:00
Bruno Cardoso Lopes 2e45d522c1 teach lowering to get target specific nodes for pshufd, emulating the same isel behavior for now, so we can pass all vector shuffle tests
llvm-svn: 112017
2010-08-25 02:35:37 +00:00
Dan Gohman c88fda477a Fix X86's isLegalAddressingMode to recognize that static addresses
need not be RIP-relative in small mode.

llvm-svn: 111917
2010-08-24 15:55:12 +00:00
Bruno Cardoso Lopes 758d7b1f5c Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments
llvm-svn: 111890
2010-08-24 01:16:15 +00:00
Bruno Cardoso Lopes 264d90fff7 Start using target-specific nodes for shuffles: pshufhw and pshuflw
llvm-svn: 111837
2010-08-23 20:41:02 +00:00
Anton Korobeynikov cbbe4501df Revert invalid r111792. Jump tables are not broken on x86-64 / COFF;
it's the COFF emitter which does not support differences of two symbols
(and needs to be fixed). GAS is fine with the code produced.

llvm-svn: 111801
2010-08-23 07:38:51 +00:00
Michael J. Spencer e87231232a Workaround broken jump tables on x86-64 COFF.
llvm-svn: 111792
2010-08-23 04:45:37 +00:00
Bruno Cardoso Lopes 9f20e7a1bf Prepare LowerVECTOR_SHUFFLEv8i16 to use x86 target specific nodes directly
llvm-svn: 111704
2010-08-21 01:32:18 +00:00
Bruno Cardoso Lopes 6f3b38a851 This is the first step towards refactoring the x86 vector shuffle code. The
general idea here is to have a group of x86 target specific nodes which are
going to be selected during lowering and then directly matched in isel.

The commit includes the addition of those specific nodes and a *bunch* of
patterns, and incrementally we're going to switch between them and what we
have right now. Both the patterns and target specific nodes can change as
we move forward with this work.

llvm-svn: 111691
2010-08-20 22:55:05 +00:00
Anton Korobeynikov 231ab847ca More fixes for win64:
- Do not clobber al during variadic calls; this is an AMD64 ABI-only feature
  - Emit wincall64, where necessary
Patch by Cameron Esfahani!

llvm-svn: 111289
2010-08-17 21:06:07 +00:00
Eric Christopher 54194bd127 Rework how the non-sse2 memory barrier is lowered so that the
encoding is correct for the built-in assembler.

Based on a patch from Chris.

llvm-svn: 111083
2010-08-14 21:51:50 +00:00
Chris Lattner 2f6c3434ac improve indentation
llvm-svn: 111073
2010-08-14 17:26:09 +00:00
Bruno Cardoso Lopes 081861b6b7 Fix comment to reflect code, and remove an unused argument
llvm-svn: 111022
2010-08-13 17:50:47 +00:00
Bruno Cardoso Lopes 7306c86886 Begin to support some vector operations for AVX 256-bit instructions. The long
term goal here is to be able to match enough of vector_shuffle and build_vector
so all avx intrinsics which aren't mapped to their own built-ins but to
shufflevector calls can be codegen'd. This is the first (baby) step, support
building zeroed vectors.

llvm-svn: 110897
2010-08-12 02:06:36 +00:00
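A minimal sketch (hypothetical, not this commit's testcase) of the first case supported here, a zeroed 256-bit vector in LLVM IR:

define <8 x float> @zero256() nounwind readnone {
entry:
  ; the zeroed build_vector case this first step learns to lower for AVX
  ret <8 x float> zeroinitializer
}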
Dan Gohman 5531aa4de1 Use ISD::ADD instead of ISD::SUB with a negated constant. This
avoids trouble if the return type of TD->getPointerSize() is
changed to something which doesn't promote to a signed type,
and is simpler anyway.

Also, use getCopyFromReg instead of getRegister to read a
physical register's value.

llvm-svn: 110835
2010-08-11 18:14:00 +00:00
Bruno Cardoso Lopes 91d61df3eb Add AVX matching patterns to Packed Bit Test intrinsics.
Apply the same approach of SSE4.1 ptest intrinsics but
create a new x86 node "testp" since AVX introduces
vtest{ps}{pd} instructions which set ZF and CF depending
on sign bit AND and ANDN of packed floating-point sources.

This is slightly different from what the "ptest" does.
Tests are coming with the other 256-bit intrinsics tests.

llvm-svn: 110744
2010-08-10 23:25:42 +00:00
Bruno Cardoso Lopes 85da72a88f Support AVX 256-bit load and store intrinsics
llvm-svn: 110645
2010-08-10 01:43:16 +00:00
Bruno Cardoso Lopes 77954bdf7a Support very basic (doesn't include ABI support in the front-end, varargs, ...) 256-bit argument passing and return for AVX
llvm-svn: 110394
2010-08-05 23:35:51 +00:00
Eric Christopher 2db8464282 Make x86-64 membarriers work without sse and clean up some of the
uses.

llvm-svn: 110274
2010-08-04 23:03:04 +00:00
Bruno Cardoso Lopes 349165b48f Support all 128-bit AVX vector intrinsics. Most of them I had already
declared while adding the assembler support; the additional
changes are:
- Add missing intrinsics
- Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file.
- Duplicate some patterns to AVX mode.
- Step into PCMPEST/PCMPIST custom inserter and add AVX versions.

llvm-svn: 109878
2010-07-30 19:54:33 +00:00
Jakob Stoklund Olesen ba0e124aaf Revert r109652, and remove the offending assert in loadRegFromStackSlot instead.
We do sometimes load from a too small stack slot when dealing with x86 arguments
(varargs and smaller-than-32-bit args). It looks like we know what we are doing
in those cases, so I am going to remove the assert instead of artificially
enlarging stack slot sizes.

The assert in storeRegToStackSlot stays in. We don't want to write beyond the
bounds of a stack slot.

llvm-svn: 109764
2010-07-29 17:42:27 +00:00
Jakob Stoklund Olesen f2234fbe70 Create a fixed stack object for varargs that is as large as any register.
The size of this object isn't used for anything - technically it is of variable
size.

This avoids a false positive from the assert in
X86InstrInfo::loadRegFromStackSlot, and fixes PR7735.

llvm-svn: 109652
2010-07-28 20:55:38 +00:00
Nate Begeman 53afc8f06a Implement a vectorized algorithm for <16 x i8> << <16 x i8>
This is about 4x faster and smaller than the existing scalarization.

llvm-svn: 109566
2010-07-28 00:21:48 +00:00
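For reference, a minimal sketch (assumed, not the commit's own testcase) of the per-element shift that previously scalarized:

define <16 x i8> @shl_v16i8(<16 x i8> %r, <16 x i8> %a) nounwind readnone ssp {
entry:
  ; each of the 16 byte lanes is shifted by the corresponding lane of %a
  %shl = shl <16 x i8> %r, %a
  ret <16 x i8> %shl
}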
Nate Begeman 269a6da023 ~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller types are coming in future patches.
For:

define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}

We get:

_shl:                                   ## @shl
	pslld	$23, %xmm1
	paddd	LCPI0_0, %xmm1
	cvttps2dq	%xmm1, %xmm1
	pmulld	%xmm1, %xmm0
	ret

Instead of:

_shl:                                   ## @shl
	pshufd	$3, %xmm0, %xmm2
	movd	%xmm2, %eax
	pshufd	$3, %xmm1, %xmm2
	movd	%xmm2, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	pshufd	$1, %xmm0, %xmm3
	movd	%xmm3, %eax
	pshufd	$1, %xmm1, %xmm3
	movd	%xmm3, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm3
	punpckldq	%xmm2, %xmm3
	movd	%xmm0, %eax
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	movhlps	%xmm0, %xmm0
	movd	%xmm0, %eax
	movhlps	%xmm1, %xmm1
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm0
	punpckldq	%xmm0, %xmm2
	movdqa	%xmm2, %xmm0
	punpckldq	%xmm3, %xmm0
	ret

llvm-svn: 109549
2010-07-27 22:37:06 +00:00
Evan Cheng d4218b8793 On x86, f32 / f64 nodes share the same registers as 128-bit vector values.
llvm-svn: 109450
2010-07-26 21:50:05 +00:00
Evan Cheng 37b740c4bf Add an ILP scheduler. This is a register pressure aware scheduler that's
appropriate for targets without detailed instruction itineraries.
The scheduler schedules for increased instruction-level parallelism in
low register pressure situations; it schedules to reduce register pressure
when the register pressure becomes high.

On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.

llvm-svn: 109300
2010-07-24 00:39:05 +00:00
Dale Johannesen f2d75670b7 The only supported calling convention for X86-64 uses
SSE, so we can't return floating point values if this
is disabled.  Detect this error for clang.

With SSE1 only, f64 is a problem; it can be done, but
neither llvm-gcc nor clang has ever generated correct
code for it.  Since nobody noticed this I think it's
OK to treat it as an error for now.

This also handles SSE-sized vectors of floating point.
8207686, 8204109.

llvm-svn: 109201
2010-07-23 00:30:35 +00:00
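As a hypothetical illustration (not from the commit), the kind of function that now produces a clear error when SSE is disabled on x86-64:

define float @retf(float %a, float %b) nounwind {
entry:
  %sum = fadd float %a, %b
  ; with SSE disabled there is no register to return this in under the
  ; x86-64 calling convention, so it is reported as an error
  ret float %sum
}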
Eric Christopher 9a77382685 Custom lower the memory barrier instructions and add support
for lowering without sse2.  Add a couple of new testcases.

Fixes a few libgomp tests and latent bugs.  Remove a few todos.

llvm-svn: 109078
2010-07-22 02:48:34 +00:00
Eric Christopher a4c435f1fa 80-columns.
llvm-svn: 109070
2010-07-22 00:26:08 +00:00
Nate Begeman 784e062b2a Fix a couple of issues with the Win64 ABI
1) all registers were spilled as xmm, regardless of actual size
2) the Win64 ABI doesn't do the varargs-size-in-%al thing

Still to look into:

xmm6-15 are marked as clobbered by call instructions on win64 even though they aren't.

llvm-svn: 109035
2010-07-21 20:49:52 +00:00
Eric Christopher d27913e516 Pulling out previous patch, must've run the tests in
the wrong directory.

llvm-svn: 109005
2010-07-21 09:23:56 +00:00
Eric Christopher b2d1067024 Lower MEMBARRIER on x86 and support processors without SSE2.
Fixes a pile of libgomp failures in the llvm-gcc testsuite due
to the libcall not existing.

llvm-svn: 109004
2010-07-21 09:05:23 +00:00
Evan Cheng 55f0c6b9fc Split -enable-finite-only-fp-math into two options:
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen FP math optimizations only care whether the FP arithmetic's arguments and results can never be NaN.

llvm-svn: 108465
2010-07-15 22:07:12 +00:00
Jakob Stoklund Olesen 9b449d5a92 Use TargetOpcode::COPY instead of X86-native register copy instructions when
lowering atomics. This will allow those copies to still be coalesced after
TII::isMoveInstr is removed.

llvm-svn: 108385
2010-07-14 23:50:27 +00:00
Evan Cheng a8e8874552 Fix for PR7193 was overly conservative. The only case where sibcall callee
address cannot be allocated a register is in 32-bit mode where the first
three arguments are marked inreg. In that case EAX, EDX, and ECX will be
used for argument passing.

This fixes PR7610.

llvm-svn: 108327
2010-07-14 06:44:01 +00:00
Dan Gohman d7b5ce3312 Reapply bottom-up fast-isel, with several fixes for x86-32:
- Check getBytesToPopOnReturn().
 - Eschew ST0 and ST1 for return values.
 - Fix the PIC base register initialization so that it doesn't ever
   fail to end up the top of the entry block.

llvm-svn: 108039
2010-07-10 09:00:22 +00:00
Jakob Stoklund Olesen be8d9b0bb8 An x86 function returns a floating point value in st(0), and we must make sure
it is popped, even if it is unused. A CopyFromReg node is too weak to represent
the required side effect, so insert an FpGET_ST0 instruction directly instead.

This will matter when CopyFromReg gets lowered to a generic COPY instruction.

llvm-svn: 108037
2010-07-10 04:04:25 +00:00
Bob Wilson 6586e9b203 --- Reverse-merging r107947 into '.':
U    utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U    test/CodeGen/X86/fast-isel.ll
U    test/CodeGen/X86/fast-isel-loads.ll
U    include/llvm/Target/TargetLowering.h
U    include/llvm/Support/PassNameParser.h
U    include/llvm/CodeGen/FunctionLoweringInfo.h
U    include/llvm/CodeGen/CallingConvLower.h
U    include/llvm/CodeGen/FastISel.h
U    include/llvm/CodeGen/SelectionDAGISel.h
U    lib/CodeGen/LLVMTargetMachine.cpp
U    lib/CodeGen/CallingConvLower.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U    lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U    lib/CodeGen/SelectionDAG/FastISel.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U    lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U    lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U    lib/CodeGen/SelectionDAG/TargetLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.h
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86FastISel.cpp
U    lib/Target/X86/X86ISelLowering.h

llvm-svn: 107987
2010-07-09 16:37:18 +00:00
Dan Gohman 0a7d155d67 Fix the memoperand offsets in code generated for va_start.
llvm-svn: 107948
2010-07-09 01:06:48 +00:00
Dan Gohman 0b5aa1cdd3 Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting
a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL.

llvm-svn: 107943
2010-07-09 00:39:23 +00:00
Chris Lattner f469307c77 Change LEA to have 5 operands for its memory operand, just
like all other instructions, even though a segment is not
allowed.  This resolves a bunch of gross hacks in the 
encoder and makes LEA more consistent with the rest of the
instruction set.

No functionality change.

llvm-svn: 107934
2010-07-08 23:46:44 +00:00
Chris Lattner ec536276f0 add some long-overdue enums to refer to the parts of the 5-operand
X86 memory operand.

llvm-svn: 107925
2010-07-08 22:41:28 +00:00
Dan Gohman e75704369d Revert 107840 107839 107813 107804 107800 107797 107791.
Debug info intrinsics win for now.

llvm-svn: 107850
2010-07-08 01:00:56 +00:00
Evan Cheng 1c349f18f8 Move getExtLoad() and (some) getLoad() DebugLoc argument after EVT argument for consistency sake.
llvm-svn: 107820
2010-07-07 22:15:37 +00:00
Dan Gohman 2d4d01d0de Add X86FastISel support for return statements. This entails refactoring
a bunch of stuff, to allow the target-independent calling convention
logic to be employed.

llvm-svn: 107800
2010-07-07 18:32:53 +00:00
Dan Gohman 87fb4e8fcd Simplify FastISel's constructor by giving it a FunctionLoweringInfo
instance, rather than pointers to all of FunctionLoweringInfo's
members.

This eliminates an NDEBUG ABI sensitivity.

llvm-svn: 107789
2010-07-07 16:29:44 +00:00
Dan Gohman fe7532a308 Split the SDValue out of OutputArg so that SelectionDAG-independent
code can do calling-convention queries. This obviates OutputArgReg.

llvm-svn: 107786
2010-07-07 15:54:55 +00:00
Dale Johannesen ce65663330 Accept RIP-relative symbols with 'i' constraint, and
print the (%rip) only if the 'a' modifier is present.
PR 7528.

llvm-svn: 107727
2010-07-06 23:27:00 +00:00
Dan Gohman ee0cb70381 CanLowerReturn doesn't need a SelectionDAG; it just needs an LLVMContext.
SelectBasicBlock doesn't need its BasicBlock argument.

llvm-svn: 107712
2010-07-06 22:19:37 +00:00
Devang Patel a3ca21b228 Propagate debug loc.
llvm-svn: 107710
2010-07-06 22:08:15 +00:00
Dan Gohman 3439629239 Reapply r107655 with fixes; insert the pseudo instruction into
the block before calling the expansion hook. And don't
put EFLAGS in a mbb's live-in list twice.

llvm-svn: 107691
2010-07-06 20:24:04 +00:00
Dan Gohman f4f04107ef Revert r107655.
llvm-svn: 107668
2010-07-06 15:49:48 +00:00
Dan Gohman 12205645a6 Fix a bunch of custom-inserter functions to handle the case where
the pseudo instruction is not at the end of the block.

llvm-svn: 107655
2010-07-06 15:18:19 +00:00
Eric Christopher 2ad0c779c3 Fix up -fstack-protector on Linux to use the segment
registers.  Split out testcases per architecture and OS
now.

Patch from Nelson Elhage.

llvm-svn: 107640
2010-07-06 05:18:56 +00:00
Eric Christopher d429846eca Have the X86 backend use Triple instead of a string and some enums.
llvm-svn: 107625
2010-07-05 19:26:33 +00:00
Chris Lattner c4a7073db3 more tidying.
llvm-svn: 107615
2010-07-05 05:53:14 +00:00
Chris Lattner 45cc4d74a3 Just rip v2f32 support completely out of the X86 backend. In
the example in the testcase, we now generate:

_test1:                                 ## @test1
	movss	4(%esp), %xmm0
	addss	8(%esp), %xmm0
	movl	12(%esp), %eax
	movss	%xmm0, (%eax)
	ret

instead of:

_test1:                                                     ## @test1
	subl	$20, %esp
	movl	24(%esp), %eax
	movq	%mm0, (%esp)
	movq	%mm0, 8(%esp)
	movss	(%esp), %xmm0
	addss	12(%esp), %xmm0
	movss	%xmm0, (%eax)
	addl	$20, %esp
	ret

v2f32 support did not work reliably because most of the X86
backend didn't know it was legal.  It was apparently only added
to support returning source-level v2f32 values in MMX registers
in x86-32 mode.  If ABI compatibility is important on this
GCC-extended-vector type for some reason, then the frontend
should generate IR that returns v2i32 instead of v2f32.  However,
we generally don't try very hard to be ABI-compatible on GCC
extended vectors.

llvm-svn: 107601
2010-07-04 23:07:25 +00:00
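A minimal sketch (assumed signature, not the actual testcase) of the <2 x float> source behind the assembly above:

define void @test1(<2 x float> %A, <2 x float> %B, <2 x float>* %P) nounwind {
entry:
  %C = fadd <2 x float> %A, %B
  store <2 x float> %C, <2 x float>* %P
  ret void
}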
Chris Lattner 681b926d54 fix PR7518 - terrible codegen of <2 x float>, by only marking
v2f32 as legal in 32-bit mode.  It is just as terrible there,
but I just care about x86-64 and no one claims it is valuable
in 64-bit mode.

llvm-svn: 107600
2010-07-04 22:57:10 +00:00
Evan Cheng 0664a67fe1 Remove isSS argument from CreateFixedObject. Fixed objects cannot be spill slots so it's always false.
llvm-svn: 107550
2010-07-03 00:40:23 +00:00
Gabor Greif 12ca3d9fac use ArgOperand API
llvm-svn: 107280
2010-06-30 13:03:37 +00:00
Duncan Sands 67bfa9d109 Remove pointless and unused variables.
llvm-svn: 107130
2010-06-29 12:48:49 +00:00
Bill Wendling 0a5bb081cc Reduce indentation via early exit. NFC.
llvm-svn: 107067
2010-06-28 21:08:32 +00:00
Gabor Greif 83205af3fa use ArgOperand API
llvm-svn: 106944
2010-06-26 11:51:52 +00:00
Dale Johannesen ce97d55ad9 The hasMemory argument is irrelevant to how the argument
for an "i" constraint should get lowered; PR 6309.  While
this argument was passed around a lot, this is the only
place it was used, so it goes away from a lot of other
places.

llvm-svn: 106893
2010-06-25 21:55:36 +00:00
Bill Wendling e41e40f689 - Reapply r106066 now that the bzip2 build regression has been fixed.
- 2010-06-25-CoalescerSubRegDefDead.ll is the testcase for r106878.

llvm-svn: 106880
2010-06-25 20:48:10 +00:00
Dale Johannesen 5ad5226c58 Disallow matching the "i" constraint to symbol addresses when the
address requires a register or a secondary load to compute
(most PIC modes).  This improves "g" constraint handling.  8015842.

The test from 2007 is attempting to test the fix for PR1761,
but since -relocation-model=static doesn't work on Darwin
x86-64, it was not testing what it was supposed to be testing
and was passing erroneously.  Fixed to use Linux x86-64.

llvm-svn: 106779
2010-06-24 20:14:51 +00:00
Dan Gohman 600f62b3ba Reapply r106634, now that the bug it exposed is fixed.
llvm-svn: 106746
2010-06-24 14:30:44 +00:00
Dan Gohman c3e291c560 Fix a bug in the code which determines when it's safe to use the
bt instruction, which was exposed by r106263.

llvm-svn: 106718
2010-06-24 02:07:59 +00:00
Daniel Dunbar 4df321b7ad Revert r106263, "Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass,"... it was causing both 'file' (with clang) and 176.gcc (with llvm-gcc) to be miscompiled.
llvm-svn: 106634
2010-06-23 17:09:26 +00:00
Jim Grosbach 6f71039fa4 The generic DAG combiner can now fold atomic fences when needed, so switch
to using that.

llvm-svn: 106633
2010-06-23 16:25:07 +00:00
Daniel Dunbar ef5a4383ad Revert r106066, "Create a more targeted fix for not sinking instructions into a range where it"... it causes bzip2 to be miscompiled by Clang.
Conflicts:

	lib/CodeGen/MachineSink.cpp

llvm-svn: 106614
2010-06-23 00:48:25 +00:00
Jim Grosbach 6c275bc5a2 fix typo
llvm-svn: 106574
2010-06-22 20:52:02 +00:00
Nick Lewycky dcc7b6dcb6 Fix warning in no-asserts build.
llvm-svn: 106405
2010-06-20 20:27:42 +00:00
Dan Gohman 92c11acdb8 Change UpdateNodeOperands' operand and return value from SDValue to
SDNode *, since it doesn't care about the ResNo value.

llvm-svn: 106282
2010-06-18 15:30:29 +00:00
Dan Gohman c3479f5342 Delete unused variables.
llvm-svn: 106280
2010-06-18 14:32:32 +00:00
Dan Gohman f1d8304fe3 Eliminate unnecessary uses of getZExtValue().
llvm-svn: 106279
2010-06-18 14:22:04 +00:00
Dan Gohman 35b6f9a929 isValueValidForType can be a static member function.
llvm-svn: 106278
2010-06-18 14:01:07 +00:00
Dan Gohman b92156d5e4 Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass,
which is faster, simpler, and less surprising.

llvm-svn: 106263
2010-06-18 01:05:21 +00:00
Bill Wendling 8c0cf0994d Create a more targeted fix for not sinking instructions into a range where it
will conflict with another live range. The place that creates this scenario is
the code in X86 that lowers a select instruction by splitting the MBBs. This
eliminates the need to check from the bottom up in an MBB for live pregs.

llvm-svn: 106066
2010-06-15 23:46:31 +00:00
Eric Christopher 6c4d63e1a5 For 32-bit non-pic tlv mach-o addressing we don't need a pic base or
a relative address.

llvm-svn: 106064
2010-06-15 23:08:42 +00:00
Eric Christopher 89d103a8ce Ensure that mov and not lea are used to stick the address into
the register.  While we're at it, make sure it's in the right one.

llvm-svn: 105645
2010-06-08 22:04:25 +00:00
Dale Johannesen df1a7f83bf Fix some liveout handling related to tail calls, see comments.
I don't think this ever resulted in problems on x86, but it
would on ARM.

llvm-svn: 105509
2010-06-05 00:30:45 +00:00
Eric Christopher b0e1a458ce Add first pass at darwin tls compiler support.
llvm-svn: 105381
2010-06-03 04:07:48 +00:00
Eli Friedman 6e3d5af945 Fix comment so it doesn't include comments which are irrelevant to the x86
backend.  Add a FIXME noting what can be fixed here.

llvm-svn: 105342
2010-06-02 19:35:46 +00:00
Dan Gohman a690618c58 Use comments to document non-obvious code rather than
mailing list archives.

llvm-svn: 105341
2010-06-02 19:13:40 +00:00
Eli Friedman 526e6d045f Don't try to custom-lower 64-bit add-with-overflow and friends on x86-32; the
x86 backend currently doesn't know how to handle them.

This doesn't really fix anything because LegalizeTypes doesn't know how to
handle them either.  We do get a better error message, though.

llvm-svn: 105305
2010-06-02 00:27:18 +00:00
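For illustration, a hypothetical example (not the commit's testcase) of the 64-bit add-with-overflow being discussed:

declare { i64, i1 } @llvm.uadd.with.overflow.i64(i64, i64)

define i1 @add_overflows(i64 %a, i64 %b) nounwind {
entry:
  ; on x86-32 this i64 operation must be expanded, which the custom
  ; lowering did not handle
  %res = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
  %ov = extractvalue { i64, i1 } %res, 1
  ret i1 %ov
}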
Evan Cheng 27c4933e02 Fix PR7193: if sibling call address can take a register, make sure there are enough registers available by counting inreg arguments.
llvm-svn: 105092
2010-05-29 01:35:22 +00:00
Dale Johannesen e8be73f3e7 Fix comment typos.
llvm-svn: 105059
2010-05-28 23:24:28 +00:00
Dale Johannesen 9e43c07bc5 Mark some math lib intrinsic nodes Legal on SSE4.1.
No functional effect as these nodes are not generated yet.

llvm-svn: 104879
2010-05-27 20:12:41 +00:00
Dan Gohman dc53f1cb5c FastISel doesn't yet handle callee-pop functions.
To support this, move IsCalleePop from X86ISelLowering to X86Subtarget.

llvm-svn: 104866
2010-05-27 18:43:40 +00:00
Zhongxing Xu 730a977e02 SRetReturnReg was set in LowerFormalArguments(), so only assert it here.
llvm-svn: 104691
2010-05-26 08:10:02 +00:00
Evan Cheng 168ced94d8 Implement @llvm.returnaddress. rdar://8015977.
llvm-svn: 104421
2010-05-22 01:47:14 +00:00
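A minimal usage sketch (hypothetical) of the newly implemented intrinsic:

declare i8* @llvm.returnaddress(i32)

define i8* @retaddr() nounwind {
entry:
  ; level 0 = return address of the current function
  %ra = call i8* @llvm.returnaddress(i32 0)
  ret i8* %ra
}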
Dale Johannesen 2b78565842 Previous commit message should refer to 104308.
llvm-svn: 104337
2010-05-21 18:44:47 +00:00
Dale Johannesen 6361e3e8a2 Fix two bugs in 104348:
Case where MMX is disabled wasn't handled right.
MMX->MMX bitconverts are Legal.

llvm-svn: 104336
2010-05-21 18:40:15 +00:00
Dale Johannesen b3b9c8ac48 Fix i64->f64 conversion, x86-64, -no-sse. A bit
tricky since there's a 3rd 64-bit type, MMX vectors.
PR 7135.

llvm-svn: 104308
2010-05-21 00:52:33 +00:00
Evan Cheng 738e920edf Code refactoring: pull SchedPreference enum from TargetLowering.h to TargetMachine.h and put it in its own namespace.
llvm-svn: 104147
2010-05-19 20:19:50 +00:00
Dale Johannesen 2ef974ee0e Revert 103911; it broke a test that expects bitconvert
<1xi64> -> i64 to work in MMX registers on hosts where -no-sse
is the default (not mine).  The right thing is
to accept this and make i64->f64 conversions go through memory,
but I don't have time right now.

llvm-svn: 103914
2010-05-16 20:19:04 +00:00
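The bitconvert pattern referred to above, as a minimal sketch (assumed, not the broken test itself):

define i64 @bc(<1 x i64> %v) nounwind readnone {
entry:
  ; expected to work through an MMX register on hosts where -no-sse is the default
  %r = bitcast <1 x i64> %v to i64
  ret i64 %r
}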
Dale Johannesen fc1492d71b Make x86-64 64-bit bitconvert work when SSE is not available.
(This worked as of about 6 months ago and I didn't track down
exactly what broke it; I think this fix is appropriate.)

llvm-svn: 103911
2010-05-16 18:22:38 +00:00
Anton Korobeynikov 8f35fabbc1 Add support for thiscall calling convention.
Patch by Charles Davis and Steven Watanabe!

llvm-svn: 103902
2010-05-16 09:08:45 +00:00
Dale Johannesen 3a366a88f2 Fix uint64->{float, double} conversion to do rounding correctly in 32-bit.
The implementation in LegalizeIntegerTypes to handle this as 
sint64->float + appropriate power of 2 is subject to double rounding,
considered incorrect by numerics people.  Use this implementation only
when it is safe.  This leads to using library calls in some cases
that produced inline code before, but it's correct now.
(EVTToAPFloatSemantics belongs somewhere else, any suggestions?)

Add a correctly rounding (though not particularly fast) conversion
that uses X87 80-bit computations for x86-32.

7885399, 5901940.  This shows up in gcc.c-torture/execute/ieee/rbug.c
in the gcc testsuite on some platforms.

llvm-svn: 103883
2010-05-15 18:51:12 +00:00
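A minimal sketch (hypothetical) of the conversion whose rounding is being fixed:

define float @u64_to_f32(i64 %x) nounwind readnone {
entry:
  ; unsigned 64-bit to float; the old sint64->float + power-of-2 expansion
  ; could round twice
  %f = uitofp i64 %x to float
  ret float %f
}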
Bill Wendling 95f6ebcb37 Rename "HasCalls" in MachineFrameInfo to "AdjustsStack" to better describe what
the variable actually tracks.

N.B., several back-ends are using "HasCalls" as synonymous with something
that adjusts the stack. This isn't 100% correct and should be looked into.

llvm-svn: 103802
2010-05-14 21:14:32 +00:00
Dan Gohman 35dd005d22 Lowering of atomic instructions can result in operands being
used more than once. If ISel had put a kill flag on one of them,
it's not valid to transfer the kill flag to each new instance.

llvm-svn: 103799
2010-05-14 21:01:44 +00:00
Dan Gohman bb919dfb6b Implement a bunch more TargetSelectionDAGInfo infrastructure.
Move EmitTargetCodeForMemcpy, EmitTargetCodeForMemset, and
EmitTargetCodeForMemmove out of TargetLowering and into
SelectionDAGInfo to exercise this.

llvm-svn: 103481
2010-05-11 17:31:57 +00:00
Dan Gohman 25c1653700 Get rid of the EdgeMapping map. Instead, just check for BasicBlock
changes before doing phi lowering for switches.

llvm-svn: 102809
2010-05-01 00:01:06 +00:00
Dan Gohman 2e2cc87081 Make this code less confusing. Instead of reassigning BB, just operate
on the original variables, so it's easier to see what is being done
to which blocks.

llvm-svn: 102759
2010-04-30 20:14:26 +00:00
Dan Gohman 57bb73c80b Remove the -disable-16bit command-line option, which is now obsolete.
llvm-svn: 102730
2010-04-30 18:30:26 +00:00
Evan Cheng 5117a555e0 Another sibcall bug. If caller and callee calling conventions differ, then it's only safe to do a tail call if the results are returned in the same way.
llvm-svn: 102683
2010-04-30 01:12:32 +00:00
Evan Cheng 050df1b8de Enable i16 to i32 promotion by default.
llvm-svn: 102493
2010-04-28 08:30:49 +00:00
Evan Cheng d21f564543 Unbreak the build. Only form shld / shrd after legalization.
llvm-svn: 102488
2010-04-28 02:25:18 +00:00