llvm-project

Commit Graph

Author	SHA1	Message	Date
Nate Begeman	4b9db07b02	Implement feedback from Bruno on making pblendvb an x86-specific ISD node in addition to being an intrinsic, and convert lowering to use it. Hopefully the pattern fragment is doing the right thing with XMM0, looks correct in testing. llvm-svn: 122277	2010-12-20 22:04:24 +00:00
Chris Lattner	846c20d4e6	Change the X86 backend to stop using the evil ADDC/ADDE/SUBC/SUBE nodes (which their carry depenedencies with MVT::Flag operands) and use clean and beautiful EFLAGS dependences instead. We do this by changing the modelling of SBB/ADC to have EFLAGS input and outputs (which is what requires the previous scheduler change) and change X86 ISelLowering to custom lower ADDC and friends down to X86ISD::ADD/ADC/SUB/SBB nodes. With the previous series of changes, this causes no changes in the testsuite, woo. llvm-svn: 122213	2010-12-20 00:59:46 +00:00
Chris Lattner	9edf3f50bf	improve the setcc -> setcc_carry optimization to happen more consistently by moving it out of lowering into dag combine. Add some missing patterns for matching away extended versions of setcc_c. llvm-svn: 122201	2010-12-19 22:08:31 +00:00
Nate Begeman	97b72c99d2	Add support for matching psign & plendvb to the x86 target Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts llvm-svn: 122098	2010-12-17 22:55:37 +00:00
Chris Lattner	364bb0a081	it turns out that when ".with.overflow" intrinsics were added to the X86 backend that they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile: void *func(long count) { return new int[count]; } into: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] seto %cl ## encoding: [0x0f,0x90,0xc1] testb %cl, %cl ## encoding: [0x84,0xc9] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL instead of: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction. llvm-svn: 120935	2010-12-05 07:30:36 +00:00
Evan Cheng	d4b0873c06	Enable sibling call optimization of libcalls which are expanded during legalization time. Since at legalization time there is no mapping from SDNode back to the corresponding LLVM instruction and the return SDNode is target specific, this requires a target hook to check for eligibility. Only x86 and ARM support this form of sibcall optimization right now. rdar://8707777 llvm-svn: 120501	2010-11-30 23:55:39 +00:00
Eric Christopher	1a86e8461a	Fix some cleanups from my last patch. llvm-svn: 120410	2010-11-30 08:10:28 +00:00
Eric Christopher	fa6657cec0	Rewrite mwait and monitor support and custom lower arguments. Fixes PR8573. llvm-svn: 120404	2010-11-30 07:20:12 +00:00
Rafael Espindola	5d882894d8	Lower TLS_addr32 and TLS_addr64. llvm-svn: 120225	2010-11-27 20:43:02 +00:00
Wesley Peck	527da1b6e2	Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept. llvm-svn: 119990	2010-11-23 03:31:01 +00:00
Duncan Sands	7c601ded34	On X86, MEMBARRIER, MFENCE, SFENCE, LFENCE are not target memory intrinsics, so don't claim they are. They are allocated using DAG.getNode, so attempts to access MemSDNode fields results in reading off the end of the allocated memory. This fixes crashes with "llc -debug" due to debug code trying to print MemSDNode fields for these barrier nodes (since the crashes are not deterministic, use valgrind to see this). Add some nasty checking to try to catch this kind of thing in the future. llvm-svn: 119901	2010-11-20 11:25:00 +00:00
Chris Lattner	7077efe894	move the pic base symbol stuff up to MachineFunction since it is trivial and will be shared between ppc and x86. This substantially simplifies the X86 backend also. llvm-svn: 119089	2010-11-14 22:48:15 +00:00
Chris Lattner	239f9a35ed	simplify getPICBaseSymbol a bit. llvm-svn: 119088	2010-11-14 22:37:11 +00:00
Duncan Sands	fb0a48ef96	Factorize the duplicated logic for choosing the right argument calling convention out of the fast and normal ISel files, and into the calling convention TD file. llvm-svn: 117856	2010-10-31 13:21:44 +00:00
John Thompson	e8360b7182	Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support. llvm-svn: 117667	2010-10-29 17:29:13 +00:00
Michael J. Spencer	f509c6ca27	X86: Add alloca probing to dynamic alloca on Windows. Fixes PR8424. llvm-svn: 116984	2010-10-21 01:41:01 +00:00
Michael J. Spencer	9cafc872ab	Fix Whitespace. llvm-svn: 116972	2010-10-20 23:40:27 +00:00
Dan Gohman	395a898b2b	Initial va_arg support for x86-64. Patch by David Meyer! llvm-svn: 116319	2010-10-12 18:00:49 +00:00
Dale Johannesen	dd224d2333	Massive rewrite of MMX: The x86_mmx type is used for MMX intrinsics, parameters and return values where these use MMX registers, and is also supported in load, store, and bitcast. Only the above operations generate MMX instructions, and optimizations do not operate on or produce MMX intrinsics. MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into smaller pieces. Optimizations may occur on these forms and the result casted back to x86_mmx, provided the result feeds into a previous existing x86_mmx operation. The point of all this is prevent optimizations from introducing MMX operations, which is unsafe due to the EMMS problem. llvm-svn: 115243	2010-09-30 23:57:10 +00:00
Chris Lattner	8a236b63d8	reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress. llvm-svn: 114529	2010-09-22 04:39:11 +00:00
Chris Lattner	a5156c30ed	convert the last 4 X86ISD nodes that should have memoperands to have them. llvm-svn: 114523	2010-09-22 01:28:21 +00:00
Chris Lattner	ed85da5600	give X86ISD::FNSTCW16m a memoperand, since it touches memory. It only can access the stack due to how it is generated though. llvm-svn: 114522	2010-09-22 01:11:26 +00:00
Chris Lattner	78f518b79b	give FP_TO_INT16_IN_MEM and friends a memoperand. They are only used with stack slots, but hey, lets be safe. llvm-svn: 114521	2010-09-22 01:05:16 +00:00
Chris Lattner	54e5329545	give VZEXT_LOAD a memory operand, it now works with segment registers. llvm-svn: 114515	2010-09-22 00:34:38 +00:00
Chris Lattner	e479e9643b	give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257 llvm-svn: 114508	2010-09-21 23:59:42 +00:00
Owen Anderson	5e65dfbb97	Reimplement r114460 in target-independent DAGCombine rather than target-dependent, by using the predicate to discover the number of sign bits. Enhance X86's target lowering to provide a useful response to this query. llvm-svn: 114473	2010-09-21 20:42:50 +00:00
John Thompson	1094c80281	Added skeleton for inline asm multiple alternative constraint support. llvm-svn: 113766	2010-09-13 18:15:37 +00:00
Bruno Cardoso Lopes	b3825216ce	Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment llvm-svn: 112694	2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes	03e4c35302	Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes llvm-svn: 112642	2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes	9f20e7a1bf	Prepare LowerVECTOR_SHUFFLEv8i16 to use x86 target specific nodes directly llvm-svn: 111704	2010-08-21 01:32:18 +00:00
Bruno Cardoso Lopes	6f3b38a851	This is the first step towards refactoring the x86 vector shuffle code. The general idea here is to have a group of x86 target specific nodes which are going to be selected during lowering and then directly matched in isel. The commit includes the addition of those specific nodes and a bunch of patterns, and incrementally we're going to switch between them and what we have right now. Both the patterns and target specific nodes can change as we move forward with this work. llvm-svn: 111691	2010-08-20 22:55:05 +00:00
Bruno Cardoso Lopes	91d61df3eb	Add AVX matching patterns to Packed Bit Test intrinsics. Apply the same approach of SSE4.1 ptest intrinsics but create a new x86 node "testp" since AVX introduces vtest{ps}{pd} instructions which set ZF and CF depending on sign bit AND and ANDN of packed floating-point sources. This is slightly different from what the "ptest" does. Tests comming with the other 256 intrinsics tests. llvm-svn: 110744	2010-08-10 23:25:42 +00:00
Nate Begeman	269a6da023	~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller types coming in future patches. For: define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp { entry: %shl = shl <4 x i32> %r, %a ; <<4 x i32>> [#uses=1] %tmp2 = bitcast <4 x i32> %shl to <2 x i64> ; <<2 x i64>> [#uses=1] ret <2 x i64> %tmp2 } We get: _shl: ## @shl pslld $23, %xmm1 paddd LCPI0_0, %xmm1 cvttps2dq %xmm1, %xmm1 pmulld %xmm1, %xmm0 ret Instead of: _shl: ## @shl pshufd $3, %xmm0, %xmm2 movd %xmm2, %eax pshufd $3, %xmm1, %xmm2 movd %xmm2, %ecx shll %cl, %eax movd %eax, %xmm2 pshufd $1, %xmm0, %xmm3 movd %xmm3, %eax pshufd $1, %xmm1, %xmm3 movd %xmm3, %ecx shll %cl, %eax movd %eax, %xmm3 punpckldq %xmm2, %xmm3 movd %xmm0, %eax movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm2 movhlps %xmm0, %xmm0 movd %xmm0, %eax movhlps %xmm1, %xmm1 movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm0 punpckldq %xmm0, %xmm2 movdqa %xmm2, %xmm0 punpckldq %xmm3, %xmm0 ret llvm-svn: 109549	2010-07-27 22:37:06 +00:00
Evan Cheng	d4218b8793	On x86, f32 / f64 nodes share the same registers as 128-bit vector values. llvm-svn: 109450	2010-07-26 21:50:05 +00:00
Evan Cheng	37b740c4bf	Add an ILP scheduler. This is a register pressure aware scheduler that's appropriate for targets without detailed instruction iterineries. The scheduler schedules for increased instruction level parallelism in low register pressure situation; it schedules to reduce register pressure when the register pressure becomes high. On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%. llvm-svn: 109300	2010-07-24 00:39:05 +00:00
Eric Christopher	9a77382685	Custom lower the memory barrier instructions and add support for lowering without sse2. Add a couple of new testcases. Fixes a few libgomp tests and latent bugs. Remove a few todos. llvm-svn: 109078	2010-07-22 02:48:34 +00:00
Eric Christopher	d27913e516	Pulling out previous patch, must've run the tests in the wrong directory. llvm-svn: 109005	2010-07-21 09:23:56 +00:00
Eric Christopher	b2d1067024	Lower MEMBARRIER on x86 and support processors without SSE2. Fixes a pile of libgomp failures in the llvm-gcc testsuite due to the libcall not existing. llvm-svn: 109004	2010-07-21 09:05:23 +00:00
Jakob Stoklund Olesen	9b449d5a92	Use TargetOpcode::COPY instead of X86-native register copy instructions when lowering atomics. This will allow those copies to still be coalesced after TII::isMoveInstr is removed. llvm-svn: 108385	2010-07-14 23:50:27 +00:00
Dan Gohman	d7b5ce3312	Reapply bottom-up fast-isel, with several fixes for x86-32: - Check getBytesToPopOnReturn(). - Eschew ST0 and ST1 for return values. - Fix the PIC base register initialization so that it doesn't ever fail to end up the top of the entry block. llvm-svn: 108039	2010-07-10 09:00:22 +00:00
Bob Wilson	6586e9b203	--- Reverse-merging r107947 into '.': U utils/TableGen/FastISelEmitter.cpp --- Reverse-merging r107943 into '.': U test/CodeGen/X86/fast-isel.ll U test/CodeGen/X86/fast-isel-loads.ll U include/llvm/Target/TargetLowering.h U include/llvm/Support/PassNameParser.h U include/llvm/CodeGen/FunctionLoweringInfo.h U include/llvm/CodeGen/CallingConvLower.h U include/llvm/CodeGen/FastISel.h U include/llvm/CodeGen/SelectionDAGISel.h U lib/CodeGen/LLVMTargetMachine.cpp U lib/CodeGen/CallingConvLower.cpp U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp U lib/CodeGen/SelectionDAG/FastISel.cpp U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp U lib/CodeGen/SelectionDAG/InstrEmitter.cpp U lib/CodeGen/SelectionDAG/TargetLowering.cpp U lib/Target/XCore/XCoreISelLowering.cpp U lib/Target/XCore/XCoreISelLowering.h U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86ISelLowering.h llvm-svn: 107987	2010-07-09 16:37:18 +00:00
Dan Gohman	0b5aa1cdd3	Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL. llvm-svn: 107943	2010-07-09 00:39:23 +00:00
Dan Gohman	e75704369d	Revert 107840 107839 107813 107804 107800 107797 107791. Debug info intrinsics win for now. llvm-svn: 107850	2010-07-08 01:00:56 +00:00
Dan Gohman	2d4d01d0de	Add X86FastISel support for return statements. This entails refactoring a bunch of stuff, to allow the target-independent calling convention logic to be employed. llvm-svn: 107800	2010-07-07 18:32:53 +00:00
Dan Gohman	87fb4e8fcd	Simplify FastISel's constructor by giving it a FunctionLoweringInfo instance, rather than pointers to all of FunctionLoweringInfo's members. This eliminates an NDEBUG ABI sensitivity. llvm-svn: 107789	2010-07-07 16:29:44 +00:00
Dan Gohman	fe7532a308	Split the SDValue out of OutputArg so that SelectionDAG-independent code can do calling-convention queries. This obviates OutputArgReg. llvm-svn: 107786	2010-07-07 15:54:55 +00:00
Dan Gohman	ee0cb70381	CanLowerReturn doesn't need a SelectionDAG; it just needs an LLVMContext. SelectBasicBlock doesn't needs its BasicBlock argument. llvm-svn: 107712	2010-07-06 22:19:37 +00:00
Eric Christopher	2ad0c779c3	Fix up -fstack-protector on linux to use the segment registers. Split out testcases per architecture and os now. Patch from Nelson Elhage. llvm-svn: 107640	2010-07-06 05:18:56 +00:00
Dale Johannesen	ce97d55ad9	The hasMemory argument is irrelevant to how the argument for an "i" constraint should get lowered; PR 6309. While this argument was passed around a lot, this is the only place it was used, so it goes away from a lot of other places. llvm-svn: 106893	2010-06-25 21:55:36 +00:00
Eric Christopher	b0e1a458ce	Add first pass at darwin tls compiler support. llvm-svn: 105381	2010-06-03 04:07:48 +00:00

1 2 3 4 5 ...

376 Commits