llvm-project

Commit Graph

Author	SHA1	Message	Date
Owen Anderson	1cf8881299	Add an InstCombine transform to recognize instances of manual overflow-safe addition (performing the addition in a wider type and explicitly checking for overflow), and fold them down to intrinsics. This currently only supports signed-addition, but could be generalized if someone works out the magic constant formulas for other operations. Fixes <rdar://problem/8558713>. llvm-svn: 121905	2010-12-15 22:32:38 +00:00
Evan Cheng	b7ff5a0f20	Teach machine cse to commute instructions. llvm-svn: 121903	2010-12-15 22:16:21 +00:00
Bob Wilson	fa27a8621c	Add Neon VCVT instructions for f32 <-> f16 conversions. Clang is now providing intrinsics for these and so we need to support them in the backend. Radar 8068427. llvm-svn: 121902	2010-12-15 22:14:12 +00:00
Bob Wilson	c3ff538dcf	Fix misspelled target triples in MC/ARM test commands. llvm-svn: 121901	2010-12-15 22:14:01 +00:00
Wesley Peck	0c558b2080	Lower the MBlaze target specific calling conventions for "interrupt_handler" and "save_volatiles" correctly. This completes the custom calling convention functionality changes for the MBlaze backend that were started in 121888. llvm-svn: 121891	2010-12-15 20:27:28 +00:00
Duncan Sands	0a2c416894	Move Sub simplifications and additional Add simplifications out of instcombine and into InstructionSimplify. llvm-svn: 121861	2010-12-15 14:07:39 +00:00
Frits van Bommel	3d1803495e	Teach jump threading to "look through" a select when the branch direction of a terminator depends on it. When it sees a promising select it now tries to figure out whether the condition of the select is known in any of the predecessors and if so it maps the operands appropriately. llvm-svn: 121859	2010-12-15 09:51:20 +00:00
Rafael Espindola	844f6b6cfb	Relax alignment fragments. With this we don't need the EffectiveSize field anymore. Without that field LayoutFragment only updates offsets and we don't need to invalidate the current fragment when it is relaxed (only the ones following it). This is also a very small improvement in the accuracy of the layout info as we now use the after relaxation size immediately. llvm-svn: 121857	2010-12-15 08:45:53 +00:00
Rafael Espindola	8911d03504	Patch by David Meyer to avoid O(N^2) behaviour when relaxing fragments. Since we now don't update addresses so early, we might relax a bit more than we need to. This is similar to the issue in PR8467. llvm-svn: 121856	2010-12-15 07:39:29 +00:00
Chris Lattner	15090e1eb0	take care of some todos, transforming [us]mul_lohi into a wider mul if the wider mul is legal. llvm-svn: 121848	2010-12-15 06:04:19 +00:00
Chris Lattner	c3301e970c	merge two tests llvm-svn: 121847	2010-12-15 05:58:59 +00:00
Kevin Enderby	4886cc8be7	Add some more MC tests for ARM arithmetic instructions that update or don't update the condition codes. These come from my test generator and are just the ones that MC currently assembles correctly. llvm-svn: 121830	2010-12-15 01:24:36 +00:00
Owen Anderson	35609d97ae	Fix PR8790, another instance where unreachable code can cause instruction simplification to fail, this case involves a select that simplifies to itself. llvm-svn: 121817	2010-12-15 00:55:35 +00:00
Evan Cheng	19dc77cec6	Fix a minor bug in two-address pass. It was missing a commute opportunity. regB = move RCX regA = op regB, regC RAX = move regA where both regB and regC are killed. If regB is constrained to non-compatible physical registers but regC is not constrained at all, then it's better to commute the instruction. movl %edi, %eax shlq $32, %rcx leaq (%rcx,%rax), %rax => movl %edi, %eax shlq $32, %rcx orq %rcx, %rax rdar://8762995 llvm-svn: 121793	2010-12-14 21:34:53 +00:00
Daniel Dunbar	a9b9300bb8	MC/ARM: Fix-up fixup offset for fixup_arm_branch target specific fixup. llvm-svn: 121772	2010-12-14 17:37:16 +00:00
Chris Lattner	7499b452c1	- Insert new instructions before DomBlock's terminator, which is simpler than finding a place to insert in BB. - Don't perform the 'if condition hoisting' xform on certain i1 PHIs, as it interferes with switch formation. This re-fixes "example 7", without breaking the world hopefully. llvm-svn: 121764	2010-12-14 08:46:09 +00:00
Chris Lattner	335f0e4ad4	fix two significant issues with FoldTwoEntryPHINode: first, it can kick in on blocks whose conditions have been folded to a constant, even though one of the edges will be trivially folded. second, it doesn't clean up the "if diamond" that it just eliminated away. This is a problem because other simplifycfg xforms kick in depending on the order of block visitation, causing pointless work. llvm-svn: 121762	2010-12-14 08:01:53 +00:00
Chris Lattner	f130661688	fix yet another broken line llvm-svn: 121750	2010-12-14 06:09:07 +00:00
Chris Lattner	5a9d59d918	reapply my recent change that disables a piece of the switch formation work, but fixes 400.perlbmk. llvm-svn: 121749	2010-12-14 05:57:30 +00:00
Evan Cheng	c177813755	(bfi A, (and B, C1), C2) -> bfi A, B, C2 iff C1 & C2 == C1. rdar://8458663 llvm-svn: 121746	2010-12-14 03:22:07 +00:00
Jason W Kim	1296055841	fix fixme case typo :-) llvm-svn: 121743	2010-12-14 01:42:38 +00:00
Owen Anderson	3e5648896e	Fix recent buildbot breakage by pulling SimplifyCFG back to its state as of r121694, the most recent state where I'm confident there were no crashes or miscompilations. XFAIL the test added since then for now. llvm-svn: 121733	2010-12-13 23:49:28 +00:00
Jason W Kim	0e909c5f9c	First cut of ARM/MC/ELF PIC relocations. Test has fixme, to move to .s -> .o test when AsmParser works better. llvm-svn: 121732	2010-12-13 23:16:07 +00:00
Bob Wilson	651eaa02b8	Remove the rest of the _sfp Neon instruction patterns. Use the same COPY_TO_REGCLASS approach as for the 2-register _sfp instructions. This change made a big difference in the code generated for the CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing a fine job, but some instructions that were previously moved outside the loop are not moved now. It's using fewer VFP registers now, which is generally a good thing, so I think the estimates for register pressure changed and that affected the LICM behavior. Since that isn't obviously wrong, I've just changed the test file. This completes the work for Radar 8711675. llvm-svn: 121730	2010-12-13 23:02:37 +00:00
Chris Lattner	a6e5d5694a	temporarily disable part of my previous patch, which causes an iterator invalidation issue, causing a crash on some versions of perlbmk. llvm-svn: 121728	2010-12-13 23:02:19 +00:00
Dan Gohman	c4bf5cac9f	Reapply r121520, PartialAlias implementation for BasicAA, now that memdep is updated to handle it. llvm-svn: 121725	2010-12-13 22:50:24 +00:00
Benjamin Kramer	1e155ab7e1	Fix sort predicate. qsort(3)'s predicate semantics differ from std::sort's. Fixes PR 8780. llvm-svn: 121705	2010-12-13 18:20:38 +00:00
Chris Lattner	8e21a02c19	rename test llvm-svn: 121697	2010-12-13 08:39:40 +00:00
Chris Lattner	10bd29f1d4	Add a couple dag combines to transform mulhi/mullo into a wider multiply when the wider type is legal. This allows us to compile: define zeroext i16 @test1(i16 zeroext %x) nounwind { entry: %div = udiv i16 %x, 33 ret i16 %div } into: test1: # @test1 movzwl 4(%esp), %eax imull $63551, %eax, %eax # imm = 0xF83F shrl $21, %eax ret instead of: test1: # @test1 movw $-1985, %ax # imm = 0xFFFFFFFFFFFFF83F mulw 4(%esp) andl $65504, %edx # imm = 0xFFE0 movl %edx, %eax shrl $5, %eax ret Implementing rdar://8760399 and example #4 from: http://blog.regehr.org/archives/320 We should implement the same thing for [su]mul_hilo, but I don't have immediate plans to do this. llvm-svn: 121696	2010-12-13 08:39:01 +00:00
Chris Lattner	fb836f8c1a	reinstate my patch: the miscompile was caused by an inverted branch in the 'and' case. llvm-svn: 121695	2010-12-13 08:12:19 +00:00
Chris Lattner	79db357d80	Completely disable the optimization I added in r121680 until I can track down a miscompile. This should bring the buildbots back to life llvm-svn: 121693	2010-12-13 07:41:29 +00:00
Chris Lattner	fbeb55844b	Make simplifycfg reprocess newly formed "br (cond1 | cond2)" conditions when simplifying, allowing them to be eagerly turned into switches. This is the last step required to get "Example 7" from this blog post: http://blog.regehr.org/archives/320 On X86, we now generate this machine code, which (to my eye) seems better than the ICC generated code: _crud: ## @crud ## BB#0: ## %entry cmpb $33, %dil jb LBB0_4 ## BB#1: ## %switch.early.test addb $-34, %dil cmpb $58, %dil ja LBB0_3 ## BB#2: ## %switch.early.test movzbl %dil, %eax movabsq $288230376537592865, %rcx ## imm = 0x400000017001421 btq %rax, %rcx jb LBB0_4 LBB0_3: ## %lor.rhs xorl %eax, %eax ret LBB0_4: ## %lor.end movl $1, %eax ret llvm-svn: 121690	2010-12-13 07:00:06 +00:00
Chris Lattner	cb570f87e5	fix a bug in r121680 that upset the various buildbots. llvm-svn: 121687	2010-12-13 05:34:18 +00:00
Chris Lattner	bc9e6d9dbe	make these tests a bit less fragile llvm-svn: 121682	2010-12-13 05:10:30 +00:00
Chris Lattner	a442f24a36	enhance the "change or icmp's into switch" xform to handle one value in an 'or sequence' that it doesn't understand. This allows us to optimize something insane like this: int crud (unsigned char c, unsigned x) { if(((((((((( (int) c <= 32 || (int) c == 46) || (int) c == 44) || (int) c == 58) || (int) c == 59) || (int) c == 60) || (int) c == 62) || (int) c == 34) || (int) c == 92) || (int) c == 39) != 0) foo(); } into: define i32 @crud(i8 zeroext %c, i32 %x) nounwind ssp noredzone { entry: %cmp = icmp ult i8 %c, 33 br i1 %cmp, label %if.then, label %switch.early.test switch.early.test: ; preds = %entry switch i8 %c, label %if.end [ i8 39, label %if.then i8 44, label %if.then i8 58, label %if.then i8 59, label %if.then i8 60, label %if.then i8 62, label %if.then i8 46, label %if.then i8 92, label %if.then i8 34, label %if.then ] by pulling the < comparison out ahead of the newly formed switch. llvm-svn: 121680	2010-12-13 04:50:38 +00:00
Chris Lattner	a737721d14	merge two tests llvm-svn: 121679	2010-12-13 04:45:56 +00:00
Chris Lattner	62cc76e9cc	Fix my previous patch to handle a degenerate case that the llvm-gcc bootstrap buildbot tripped over. llvm-svn: 121674	2010-12-13 03:43:57 +00:00
Chris Lattner	d9bacc088a	fix a fairly serious oversight with switch formation from or'd conditions. Previously we'd compile something like this: int crud (unsigned char c) { return c == 62 || c == 34 || c == 92; } into: switch i8 %c, label %lor.rhs [ i8 62, label %lor.end i8 34, label %lor.end ] lor.rhs: ; preds = %entry %cmp8 = icmp eq i8 %c, 92 br label %lor.end lor.end: ; preds = %entry, %entry, %lor.rhs %0 = phi i1 [ true, %entry ], [ %cmp8, %lor.rhs ], [ true, %entry ] %lor.ext = zext i1 %0 to i32 ret i32 %lor.ext which failed to merge the compare-with-92 into the switch. With this patch we simplify this all the way to: switch i8 %c, label %lor.rhs [ i8 62, label %lor.end i8 34, label %lor.end i8 92, label %lor.end ] lor.rhs: ; preds = %entry br label %lor.end lor.end: ; preds = %entry, %entry, %entry, %lor.rhs %0 = phi i1 [ true, %entry ], [ false, %lor.rhs ], [ true, %entry ], [ true, %entry ] %lor.ext = zext i1 %0 to i32 ret i32 %lor.ext which is much better for codegen's switch lowering stuff. This kicks in 33 times on 176.gcc (for example) cutting 103 instructions off the generated code. llvm-svn: 121671	2010-12-13 03:18:54 +00:00
Bill Wendling	73ce4a6fd8	Add support for using the `!if' operator when initializing variables: class A<bit a, bits<3> x, bits<3> y> { bits<3> z; let z = !if(a, x, y); } The variable z will get the value of x when 'a' is 1 and 'y' when a is '0'. llvm-svn: 121666	2010-12-13 01:46:19 +00:00
Wesley Peck	b4f896ce90	Missed some ADDI <-> ADDIK conversions in 121649. llvm-svn: 121652	2010-12-12 22:53:14 +00:00
Benjamin Kramer	c4169cebe3	Generalize the and-icmp-select instcombine further by allowing selects of the form (x & 2^n) ? 2^m+C : C we can offset both arms by C to get the "(x & 2^n) ? 2^m : 0" form, optimize the select to a shift and apply the offset afterwards. llvm-svn: 121609	2010-12-11 10:49:22 +00:00
Benjamin Kramer	c8b035d006	Factor the (x & 2^n) ? 2^m : 0 instcombine into its own method and generalize it to catch cases where n != m with a shift. llvm-svn: 121608	2010-12-11 09:42:59 +00:00
Evan Cheng	3434575704	(or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == #shamt. rdar://8752056 llvm-svn: 121606	2010-12-11 04:11:38 +00:00
Bob Wilson	9375d27460	Add float patterns for Neon vld1-lane/dup and vst1-lane operations. llvm-svn: 121583	2010-12-10 22:13:32 +00:00
Dan Gohman	39de62348f	Revert r121520, which may have introduced miscompilations. llvm-svn: 121573	2010-12-10 21:48:28 +00:00
Dan Gohman	041f74e762	Implement PartialAlias checking in BasicAA. llvm-svn: 121520	2010-12-10 20:47:03 +00:00
Bob Wilson	d29b38c893	Fix some invalid alignments for Neon vld-dup and vld/st-lane instructions. Alignments smaller than the total size of the memory being loaded or stored, unless the alignment is 8 bytes, are not allowed. Add tests for this, too. llvm-svn: 121506	2010-12-10 19:37:42 +00:00
NAKAMURA Takumi	e737e26144	macho-dump: Fix CMake build, following up to r121466. llvm-svn: 121476	2010-12-10 09:18:26 +00:00
Rafael Espindola	0a017a6db2	Fixed version of 121434 with no new memory leaks. llvm-svn: 121471	2010-12-10 07:39:47 +00:00
Daniel Dunbar	a5d9f6df8f	macho-dump: Switch to C++ macho-dump tool. llvm-svn: 121466	2010-12-10 06:19:45 +00:00
Rafael Espindola	a945a34c73	Revert my previous patch to make the valgrind bots happy. llvm-svn: 121461	2010-12-10 04:01:09 +00:00
NAKAMURA Takumi	a8c1c3fe22	Add dependency to "make check". cmake/modules/AddLLVM.cmake: Add empty "phony" target in add_llvm_loadable_module() even if loadable modules are not supported. llvm-svn: 121455	2010-12-10 02:15:36 +00:00
Nate Begeman	8b08f5232b	Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable. llvm-svn: 121439	2010-12-10 00:26:57 +00:00
Rafael Espindola	56eb741237	Initial support for the cfi directives. This is just enough to get f: .cfi_startproc nop .cfi_endproc assembled (on ELF). llvm-svn: 121434	2010-12-09 23:48:29 +00:00
Kevin Enderby	3164a346e6	Add support for parsing ARM arithmetic instructions that update or don't update the condition codes. Where the ones that do have an 's' suffix and the ones that don't don't have the suffix. The trick is if MatchInstructionImpl() fails we try again after adding a CCOut operand with the correct value and removing the 's' if present. Four simple test cases added for now, lots more to come. llvm-svn: 121401	2010-12-09 19:19:43 +00:00
Jim Grosbach	5fccad84a3	ARM stm/ldm instructions require more than one register in the register list. Otherwise, a plain str/ldr should be used instead. Make sure we account for that in prologue/epilogue code generation. rdar://8745460 llvm-svn: 121391	2010-12-09 18:31:13 +00:00
Bruno Cardoso Lopes	d47180e45e	Add ROTR and ROTRV mips32 instructions. Patch by Akira Hatanaka llvm-svn: 121377	2010-12-09 17:32:30 +00:00
Chris Lattner	bc4457e317	enhance memcpyopt to zap memcpy's that have the same src/dst. llvm-svn: 121362	2010-12-09 07:45:45 +00:00
Chris Lattner	fd51c52ef6	fix PR8753, eliminating a case where we'd infinitely make a substitution because it doesn't actually change the IR. Patch by Jakub Staszak! llvm-svn: 121361	2010-12-09 07:39:50 +00:00
Eric Christopher	a8aaaee379	Rewrite the darwin tlv support to use a chain and return to copying the output to the correct register. Fixes a hidden problem uncovered by the last patch where we'd try to DAG combine our MVT::Other node oddly. llvm-svn: 121358	2010-12-09 06:25:53 +00:00
Dan Gohman	a32986e899	Really check that the bits that will become zero are actually already zero before eliminating the operation that zeros them. This fixes rdar://8739316. llvm-svn: 121353	2010-12-09 02:52:17 +00:00
Eric Christopher	d84970ae8b	Remove extraneous copy from DAG conversion for darwin tls. This was popping up at O0 when it wasn't folded and the fast allocator would complain. llvm-svn: 121330	2010-12-09 00:27:58 +00:00
Kevin Enderby	87bc591fc5	Allow a slash, '/', as a prefix separator for X86. rdar://8741045 llvm-svn: 121320	2010-12-08 23:57:59 +00:00
Eric Christopher	6a21b40bd6	Move this test to tlv* to make it easier to notice versus linux tls support. llvm-svn: 121316	2010-12-08 23:33:23 +00:00
Jason W Kim	c79c5f6e8c	ARM/MC/ELF TPsoft is now a proper pseudo inst. Added test to check bl __aeabi_read_tp gets emitted properly for ELF/ASM as well as ELF/OBJ (including fixup) Also added support for ELF::R_ARM_TLS_IE32 llvm-svn: 121312	2010-12-08 23:14:44 +00:00
Evan Cheng	775ead3293	Fix a bad prologue / epilogue codegen bug where the compiler would emit illegal vpush instructions to save / restore VFP / NEON registers like this: vpush {d8,d10,d11} vpop {d8,d10,d11} vpush and vpop do not allow gaps in the register list. rdar://8728956 llvm-svn: 121197	2010-12-07 23:08:38 +00:00
Bruno Cardoso Lopes	f0c6e3780d	Match a pattern generated by a dag combiner opt where: (select (load (load tga0)) (load tga1)) => (load (select (load tga0) tga1)) Thanks to Akira for pointing that out. llvm-svn: 121163	2010-12-07 19:00:20 +00:00
Rafael Espindola	e78d3b38de	Fix absolute recording of differences of symbols in two sections. Reduced from ctor_dtor_count-2.cpp. llvm-svn: 121152	2010-12-07 17:12:32 +00:00
Rafael Espindola	bdbe5a712d	Fix relocations with weak definitions. llvm-svn: 121114	2010-12-07 05:57:28 +00:00
NAKAMURA Takumi	98c9ae3761	Revert test/Archive/check_binary_output.ll. It fails on a buildbot. llvm-svn: 121113	2010-12-07 05:57:02 +00:00
Chris Lattner	0d71c4f564	reapply r121100 with a tweak to constant fold ConstExprs with TargetData (if available) as we go so that we get simple constantexprs not insane ones. This fixes the failure of clang/test/CodeGenCXX/virtual-base-ctor.cpp that the previous iteration of this patch had. llvm-svn: 121111	2010-12-07 04:33:29 +00:00
Rafael Espindola	2eabaae459	Fix pcrel relocations that cross sections. llvm-svn: 121107	2010-12-07 03:50:14 +00:00
NAKAMURA Takumi	1576fd7172	test/Archive/check_binary_output.ll: Add a new test to check output of 'llvm-ar -p' is sane. Thanks to Danil Malyshev! llvm-svn: 121106	2010-12-07 03:35:20 +00:00
NAKAMURA Takumi	2a61f4e364	test/Other/close-stderr.ll: Require the feature 'shell'. It is not executable on Win32 but it is executable on MSYS-bash. llvm-svn: 121105	2010-12-07 02:43:58 +00:00
NAKAMURA Takumi	75986b8c07	test: Add the feature 'shell' on LLVM_ON_UNIX. llvm-svn: 121104	2010-12-07 02:43:51 +00:00
Eric Christopher	f10dcfb9fb	Temporarily revert r121100 as it's causing clang to fail CodeGenCXX/virtual-base-ctor.cpp. llvm-svn: 121102	2010-12-07 02:41:11 +00:00
Chris Lattner	287f4366c1	fix PR8710 - teach global opt that some constantexprs are too complex to put in a global variable's initializer. llvm-svn: 121100	2010-12-07 01:59:32 +00:00
Michael J. Spencer	da817bf231	Test: Fix Support.Path and _all_ of the unittest death tests. GetTempPath defaults to \Windows\. If I typed anything else it would just decline into cursing. llvm-svn: 121095	2010-12-07 01:23:49 +00:00
Rafael Espindola	a2421ec705	Fix a crash reduced from gcc produced assembly. llvm-svn: 121085	2010-12-07 01:09:54 +00:00
Owen Anderson	99ea8a3510	Second attempt at converting Thumb2's LDRpci, including updating the gazillion places that need to know about it. llvm-svn: 121082	2010-12-07 00:45:21 +00:00
Frits van Bommel	d9df6eaa9c	Implement jump threading of 'indirectbr' by keeping track of whether we're looking for ConstantInts or BlockAddresss. llvm-svn: 121066	2010-12-06 23:36:56 +00:00
Devang Patel	c24048a718	If dbg_declare() or dbg_value() is not lowered by isel then emit DEBUG message instead of creating DBG_VALUE for undefined value in reg0. llvm-svn: 121059	2010-12-06 22:39:26 +00:00
Wesley Peck	8da34b6c35	Fixed reversed operands for IDIV and CMP instructions in MBlaze backend. Use BRAD instead of BRD for indirect branches in MBlaze backend. patch contributed by Jack Whitham! llvm-svn: 121044	2010-12-06 22:06:49 +00:00
Wesley Peck	6ce9b60811	Fix a 16-bit immediate value detection bug in the MBlaze delay slot filler. Address more hazards in the MBlaze delay slot filler. patch contributed by Jack Whitham! llvm-svn: 121037	2010-12-06 21:11:01 +00:00
Rafael Espindola	44bbe36de6	Second try at making direct object emission produce the same results as llc + llvm-mc. This time ELF is not changed and I tested that llvm-gcc bootstraps on darwin10 using darwin9's assembler and linker. llvm-svn: 121006	2010-12-06 17:27:56 +00:00
Rafael Espindola	dee3062373	Revert previous two patches while I try to find out how to make both linux and darwin assemblers happy :-( llvm-svn: 121004	2010-12-06 15:35:15 +00:00
Rafael Espindola	884d58a798	Update test for the extra =. llvm-svn: 121001	2010-12-06 15:05:36 +00:00
Che-Liang Chiou	9f2af628a6	ptx: add shift instructions llvm-svn: 120982	2010-12-06 04:00:03 +00:00
Rafael Espindola	ac60adb38d	Don't use PadSectionToAlignment on windows. llvm-svn: 120978	2010-12-06 03:03:44 +00:00
Chris Lattner	94fbdf3814	Fix PR8728, a miscompilation I recently introduced. When optimizing memcpy's like: memcpy(A, B) memcpy(A, C) we cannot delete the first memcpy as dead if A and C might be aliases. If so, we actually get: memcpy(A, B) memcpy(A, A) which is not correct to transform into: memcpy(A, A) This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks Jakub! llvm-svn: 120974	2010-12-06 01:48:06 +00:00
Evan Cheng	62c7b5bf76	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause an additional pipeline stall. So it's frequently better to simply codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by a vmla is a special case. Obviously, issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoids codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. llvm-svn: 120960	2010-12-05 22:04:16 +00:00
Frits van Bommel	16ebe77be0	Fix PR 4170 by having ExtractValueInst::getIndexedType() reject out-of-bounds indexing. Also add asserts that the indices are valid in InsertValueInst::init(). ExtractValueInst already asserts when constructed with invalid indices. llvm-svn: 120956	2010-12-05 20:50:26 +00:00
Frits van Bommel	8fb69ee805	Teach SimplifyCFG to turn (indirectbr (select cond, blockaddress(@fn, BlockA), blockaddress(@fn, BlockB))) into (br cond, BlockA, BlockB). llvm-svn: 120943	2010-12-05 18:29:03 +00:00
Chris Lattner	6886171792	Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags result. This allows us to compile: void *test12(long count) { return new int[count]; } into: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx movq $-1, %rdi cmovnoq %rax, %rdi jmp __Znam ## TAILCALL instead of: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx seto %cl testb %cl, %cl movq $-1, %rdi cmoveq %rax, %rdi jmp __Znam Of course it would be even better if the regalloc inverted the cmov to 'cmovoq', which would eliminate the need for the 'movq %rdi, %rax'. llvm-svn: 120936	2010-12-05 07:49:54 +00:00
Chris Lattner	364bb0a081	it turns out that when ".with.overflow" intrinsics were added to the X86 backend that they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile: void *func(long count) { return new int[count]; } into: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] seto %cl ## encoding: [0x0f,0x90,0xc1] testb %cl, %cl ## encoding: [0x84,0xc9] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL instead of: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction. llvm-svn: 120935	2010-12-05 07:30:36 +00:00
Chris Lattner	183ddd8ed3	fix the rest of the linux miscompares :) llvm-svn: 120933	2010-12-05 02:08:07 +00:00
Chris Lattner	116580a11c	generalize the previous check to handle -1 on either side of the select, inserting a not to compensate. Add a missing isZero check that I lost somehow. This improves codegen of: void *func(long count) { return new int[count]; } from: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL ## encoding: [0xeb,A] to: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] cmpq $1, %rdx ## encoding: [0x48,0x83,0xfa,0x01] sbbq %rdi, %rdi ## encoding: [0x48,0x19,0xff] notq %rdi ## encoding: [0x48,0xf7,0xd7] orq %rax, %rdi ## encoding: [0x48,0x09,0xc7] jmp __Znam ## TAILCALL ## encoding: [0xeb,A] llvm-svn: 120932	2010-12-05 02:00:51 +00:00
Chris Lattner	77a11c6174	relax this to handle linux defaulting to -static. llvm-svn: 120930	2010-12-05 01:31:13 +00:00
Chris Lattner	342e6ea5f9	Improve an integer select optimization in two ways: 1. generalize (select (x == 0), -1, 0) -> (sign_bit (x - 1)) to: (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y 2. Handle the identical pattern that happens with !=: (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y cmov is often high latency and can't fold immediates or memory operands. For example for (x == 0) ? -1 : 1, before we got: < testb %sil, %sil < movl $-1, %ecx < movl $1, %eax < cmovel %ecx, %eax now we get: > cmpb $1, %sil > sbbl %eax, %eax > orl $1, %eax llvm-svn: 120929	2010-12-05 01:23:24 +00:00
Chris Lattner	0523388d60	merge some tests into select.ll and make them more specific. llvm-svn: 120928	2010-12-05 01:13:58 +00:00
Chris Lattner	b89b6f17da	rename test llvm-svn: 120927	2010-12-05 01:02:23 +00:00
Chris Lattner	d4f8c9641a	remove two tests that aren't really testing anything. llvm-svn: 120926	2010-12-05 01:02:13 +00:00
Benjamin Kramer	2f489236ab	Add patterns for the x86 popcnt instruction. - Also adds a new POPCNT subtarget feature that is currently enabled if the target supports SSE4.2 (nehalem) or SSE4A (barcelona). llvm-svn: 120917	2010-12-04 20:32:23 +00:00
Bob Wilson	ed854baad5	The Thumb tADDrSPi instruction is not valid when the destination is SP. Check for that and try narrowing it to tADDspi instead. Radar 8724703. llvm-svn: 120892	2010-12-04 04:40:19 +00:00
Rafael Espindola	1c8ac8f027	There are two reasons why we might want to use foo = a - b .long foo instead of just .long a - b First, on darwin9 64 bits the assembler produces the wrong result. Second, if "a" is the end of the section all darwin assemblers (9, 10 and mc) will not consider a - b to be a constant but will if the dummy foo is created. Split how we handle these cases. The first one is something MC should take care of. The second one has to be handled by the caller. llvm-svn: 120889	2010-12-04 03:21:47 +00:00
Rafael Espindola	1048e75fb9	Next step: Only pad debug_line when the target is darwin. Add a FIXME to avoid doing that if the target is darwin10 or newer. This fixes two problems: direct object emission was producing objects without the workaround on darwin9, and assembly printing was producing objects with the workaround on linux. llvm-svn: 120866	2010-12-04 00:31:13 +00:00
Jim Grosbach	567ebd0cb5	Encode the 32-bit wide Thumb (and Thumb2) instructions with the high order halfword being emitted to the stream first. rdar://8728174 llvm-svn: 120848	2010-12-03 22:31:40 +00:00
Jim Grosbach	ca7eaaafda	When using the 'push' mnemonic for Thumb2 stmdb, be explicit when it's the 32-bit wide version by adding the .w suffix. llvm-svn: 120838	2010-12-03 20:33:01 +00:00
Devang Patel	88d794c628	Hide tests, that check .loc, .file in output assembly, from darwin9 buildbot. llvm-svn: 120750	2010-12-02 23:29:58 +00:00
Devang Patel	8cabd938ed	Use set directive for StartMinusEndExpr. This is a fix for llvm-gcc-i386-darwin9 buildbot failure. llvm-svn: 120742	2010-12-02 21:32:30 +00:00
Stuart Hastings	34744b1d3e	Test case for r120740. Radar 8712503. llvm-svn: 120741	2010-12-02 21:25:55 +00:00
Duncan Sands	4e7263b86f	Adjust this test for the fact that the stores are no longer being combined (which is being tracked as PR8699). llvm-svn: 120734	2010-12-02 20:56:51 +00:00
Jim Grosbach	d35424f9ca	XFAIL for now. If someone with access to an ARM/Linux host wants to have a look that would be great. They're ARM JIT failures, so without that, it's tough. llvm-svn: 120731	2010-12-02 20:20:32 +00:00
Evan Cheng	5709254bd5	Fix test. llvm-svn: 120730	2010-12-02 20:17:34 +00:00
Duncan Sands	050d93cb5a	This test dates from the time when llvm-gcc had problems if two types were named the same, so it had to qualify type names according to the enclosing scope to ensure uniqueness. This is no longer needed for correctness (though it may be helpful when reading the IR), so this test has lost its importance. Zap it because dragonegg will never be able to produce the qualified type name since modern gcc zaps language specific info (such as whether a type is nested inside another - needed to get X::Y here) before dragonegg is reached. llvm-svn: 120721	2010-12-02 18:19:23 +00:00
NAKAMURA Takumi	2dc7d5536a	test/Archive/extract.ll: Use cmp instead of diff. Thanks to Danil Malyshev! llvm-svn: 120698	2010-12-02 09:16:14 +00:00
Evan Cheng	419ea286ee	Fix and re-enable tail call optimization of expanded libcalls. llvm-svn: 120622	2010-12-01 22:59:46 +00:00
Rafael Espindola	5fe5f45352	Rename temporary symbols if they conflict with artificial symbols created by the assembler. This was blocking parsing any large .s produced by clang for example. Fixes PR8596. llvm-svn: 120603	2010-12-01 20:46:11 +00:00
Owen Anderson	943fb60b1f	Add correct encodings for STRD and LDRD, including fixup support. Additionally, update these to unified syntax. llvm-svn: 120589	2010-12-01 19:18:46 +00:00
Evan Cheng	a695abde49	Speculatively disable x86 portion of r120501 to appease the x86_64 buildbot. llvm-svn: 120549	2010-12-01 03:27:20 +00:00
Jason W Kim	29805961d8	ARM/MC/ELF relocation "hello world" for movw/movt. Lifted adjustFixupValue() from Darwin for sharing w ELF. Test added TODO: refactor ELFObjectWriter::RecordRelocation more. Possibly share more code with Darwin? Lots more relocations... llvm-svn: 120534	2010-12-01 02:40:06 +00:00
Chris Lattner	1c577b54b0	fix a bozo bug I introduced in r119930, causing a miscompile of 20040709-1.c from the gcc testsuite. I was using the size of a pointer instead of the pointee. This fixes rdar://8713376 llvm-svn: 120519	2010-12-01 01:24:55 +00:00
NAKAMURA Takumi	c8bf78e7f3	test/Archive: FileCheck-ize, and remove *.toc. These may be CRLF-tolerant. llvm-svn: 120506	2010-12-01 00:09:25 +00:00
Evan Cheng	d4b0873c06	Enable sibling call optimization of libcalls which are expanded during legalization time. Since at legalization time there is no mapping from SDNode back to the corresponding LLVM instruction and the return SDNode is target specific, this requires a target hook to check for eligibility. Only x86 and ARM support this form of sibcall optimization right now. rdar://8707777 llvm-svn: 120501	2010-11-30 23:55:39 +00:00
Chris Lattner	903add84d9	Enhance DSE to handle the variable index case in PR8657. llvm-svn: 120498	2010-11-30 23:43:23 +00:00
Chris Lattner	d513faf41f	remove fixme comment too. llvm-svn: 120493	2010-11-30 23:25:01 +00:00
Chris Lattner	370797a1fb	check in all files. This is now handled by my previous DSE commit. llvm-svn: 120492	2010-11-30 23:23:59 +00:00
Chris Lattner	c0f3379ae0	teach DSE to use GetPointerBaseWithConstantOffset to analyze may-aliasing stores that partially overlap with different base pointers. This implements PR6043 and the non-variable part of PR8657 llvm-svn: 120485	2010-11-30 23:05:20 +00:00
Chris Lattner	b63ba73b1b	enhance isRemovable to refuse to delete volatile mem transfers now that DSE hacks on them. This fixes a regression I introduced, by generalizing DSE to hack on transfers. llvm-svn: 120445	2010-11-30 19:12:10 +00:00
Owen Anderson	6187e66801	Add tests for more forms of Thumb2 loads and stores. llvm-svn: 120436	2010-11-30 18:15:21 +00:00
Che-Liang Chiou	e9baf13657	ptx: add command-line options for gpu target and ptx version llvm-svn: 120423	2010-11-30 10:14:14 +00:00
Eric Christopher	8e9fbcf0f0	Not all platforms use _<func>. Duh. llvm-svn: 120418	2010-11-30 09:23:54 +00:00
Bill Wendling	811c936ed5	Add parsing for the Thumb t_addrmode_s4 addressing mode. This can almost certainly be made more generic. But it does allow us to parse something like: ldr r3, [r2, r4] correctly in Thumb mode. llvm-svn: 120408	2010-11-30 07:44:32 +00:00
Chris Lattner	58b779e9c2	Rewrite the main DSE loop to be written in terms of reasoning about pairs of AA::Location's instead of looking for MemDep's "Def" predicate. This is more powerful and general, handling memset/memcpy/store all uniformly, and implementing PR8701 and probably obsoleting parts of memcpyoptimizer. This also fixes an obscure bug with init.trampoline and i8 stores, but I'm not surprised it hasn't been hit yet. Enhancing init.trampoline to carry the size that it stores would allow DSE to be much more aggressive about optimizing them. llvm-svn: 120406	2010-11-30 07:23:21 +00:00
Eric Christopher	fa6657cec0	Rewrite mwait and monitor support and custom lower arguments. Fixes PR8573. llvm-svn: 120404	2010-11-30 07:20:12 +00:00
Anders Carlsson	e3ea1cba79	Add a puts optimization that converts puts() to putchar('\n'). llvm-svn: 120398	2010-11-30 06:19:18 +00:00
Anders Carlsson	77e9892afd	Fix a typo. llvm-svn: 120394	2010-11-30 06:03:55 +00:00
Anders Carlsson	631d06bbce	Rename this test to FPuts.ll since it actually tests fputs. llvm-svn: 120393	2010-11-30 05:59:26 +00:00
Chris Lattner	6c7f64e0bc	remove a use of llvm-dis llvm-svn: 120383	2010-11-30 02:04:15 +00:00
Chris Lattner	c2e3445273	merge one more away llvm-svn: 120375	2010-11-30 01:06:43 +00:00
Chris Lattner	7578d0df51	I already merged partial-overwrite.ll -> PartialStore.ll Merge context-sensitive.ll -> simple.ll and upgrade it. llvm-svn: 120374	2010-11-30 01:05:07 +00:00
Chris Lattner	43e3a98675	clean up DSE tests, removing some poorly reduced and useless old test, merging more into other larger .ll files, filecheckizing along the way. llvm-svn: 120373	2010-11-30 01:00:34 +00:00
Chris Lattner	90c4947df7	enhance basicaa to return "Mod" for a memcpy call when the queried location doesn't overlap the source, and add a testcase. llvm-svn: 120370	2010-11-30 00:43:16 +00:00
Chris Lattner	9a146372b5	Teach basicaa that memset's modref set is at worst "mod" and never contains "ref". Enhance DSE to use a modref query instead of a store-specific hack to generalize the "ignore may-alias stores" optimization to handle memset and memcpy. llvm-svn: 120368	2010-11-30 00:28:45 +00:00
Owen Anderson	e22c7322b8	Correct Thumb2 encodings for a much wider range of loads and stores. llvm-svn: 120364	2010-11-30 00:14:31 +00:00
Chris Lattner	c3c754f750	my previous patch would cause us to start deleting some volatile stores, fix and add a testcase. llvm-svn: 120363	2010-11-30 00:12:39 +00:00
Bob Wilson	431ac4ef50	Add support for NEON VLD3-dup instructions. The encoding for alignment in VLD4-dup instructions is still a work in progress. llvm-svn: 120356	2010-11-30 00:00:35 +00:00
Owen Anderson	50d662b6cb	Provide Thumb2 encodings for basic loads and stores. llvm-svn: 120340	2010-11-29 22:44:32 +00:00
Evan Cheng	9a133f623c	Mark Darwin call instructions as using "r7" to prevent the frame-register assignment instructions from being moved below / above calls. rdar://8690640 llvm-svn: 120339	2010-11-29 22:43:27 +00:00
Benjamin Kramer	a22f0ce1a3	Add missing colon. llvm-svn: 120336	2010-11-29 22:39:38 +00:00
Benjamin Kramer	e6840ef4b3	Fix some broken CHECK lines. llvm-svn: 120332	2010-11-29 22:34:55 +00:00
Chris Lattner	2e8793482c	fix PR8677, patch by Jakub Staszak! llvm-svn: 120325	2010-11-29 21:59:31 +00:00
Frits van Bommel	28218aa8f1	Transform (extractvalue (load P), ...) to (load (gep P, 0, ...)) if the load has no other uses, shrinking the load. llvm-svn: 120323	2010-11-29 21:56:20 +00:00
Frits van Bommel	40a80ac963	Update this test to keep testing the -instcombine transform it's supposed to be testing instead of triggering the improved constant folding for insertvalue and extractvalue. llvm-svn: 120319	2010-11-29 20:55:40 +00:00
Frits van Bommel	a98214de10	Teach ConstantFoldInstruction() how to fold insertvalue and extractvalue. llvm-svn: 120316	2010-11-29 20:36:52 +00:00
Bob Wilson	77ab165afe	Add support for NEON VLD3-dup instructions. llvm-svn: 120312	2010-11-29 19:35:29 +00:00
Kalle Raiskila	1ff0bfa28f	Handle lshr for i128 correctly on SPU also when shiftamount > 7. llvm-svn: 120288	2010-11-29 14:44:28 +00:00
Kalle Raiskila	dc620afd1e	Enable PostRA scheduling for SPU. This speeds up selected test cases with up to 5% - no slowdowns observed. llvm-svn: 120286	2010-11-29 10:30:25 +00:00
NAKAMURA Takumi	6ea8a947e8	test: Check the feature 'loadable_module' with load modules in %llvmshlibdir. %llvmshlibdir should be 'bin' on Cygming. llvm-svn: 120282	2010-11-29 07:58:32 +00:00
Bill Wendling	232e52cfb7	Add more Thumb encodings. llvm-svn: 120279	2010-11-29 01:07:48 +00:00
Bill Wendling	ccba1a8d95	More Thumb encodings. llvm-svn: 120278	2010-11-29 01:00:43 +00:00
Bill Wendling	9600e97c60	Add Thumb encodings for REV instructions. llvm-svn: 120277	2010-11-29 00:42:50 +00:00
NAKAMURA Takumi	4fc56f0be7	test: Use $SharedLibDir for loadable modules. On Cygming, loadable modules are not in lib/ but bin. llvm-svn: 120274	2010-11-29 00:20:21 +00:00
NAKAMURA Takumi	5114d0afe3	test: Add the new feature 'loadable_module'. llvm-svn: 120273	2010-11-29 00:20:09 +00:00
Bill Wendling	775899eb2e	Add more Thumb encodings. llvm-svn: 120272	2010-11-29 00:18:15 +00:00
Chris Lattner	7e8a99b1c3	fix PR8686, accepting a 'b' suffix at the end of all the setcc instructions. I choose to handle this with an asmparser hack, though it could be handled by changing all the instruction definitions to allow "setneb" instead of "setne". The asm parser hack is better in this case, because we want the disassembler to produce setne, not setneb. llvm-svn: 120260	2010-11-28 20:23:50 +00:00
Bob Wilson	2d790df105	Add support for NEON VLD2-dup instructions. llvm-svn: 120236	2010-11-28 06:51:26 +00:00
Rafael Espindola	5d882894d8	Lower TLS_addr32 and TLS_addr64. llvm-svn: 120225	2010-11-27 20:43:02 +00:00
Rafael Espindola	eab0800695	Implement the data16 prefix. llvm-svn: 120224	2010-11-27 20:29:45 +00:00
NAKAMURA Takumi	f80507c28c	CMake: lit(check.vcproj) can run with multiple configurations on Visual Studio. Unittests need LLVM_BUILD_MODE to pick up each test. Confirmed on CentOS5, Mingw, MSYS, and with possible configurations on VS8 and VS10. llvm-svn: 120212	2010-11-27 13:10:11 +00:00
Bob Wilson	c92eea0175	Add NEON VLD1-dup instructions (load 1 element to all lanes). llvm-svn: 120194	2010-11-27 06:35:16 +00:00
Daniel Dunbar	1440fd3539	macho-dump: Fix typo. llvm-svn: 120185	2010-11-27 04:00:06 +00:00
NAKAMURA Takumi	c54a9692ce	test/site.exp.in: Add "emitir", for now, fixing up r120156. CMake depends on site.exp.in, though, "emitir" might be unused. llvm-svn: 120174	2010-11-26 08:30:15 +00:00
Duncan Sands	7904068186	Remove explicit uses of -emit-llvm, the test infrastructure adds it automatically. Use -S with llvm-gcc rather than -c, so tests can work when llvm-gcc is really dragonegg (which can output IR with -S but not -c). Yes, dragonegg supports objective-c++ (poorly though). llvm-svn: 120164	2010-11-25 21:48:20 +00:00
Duncan Sands	8182ac6a05	Remove explicit uses of -emit-llvm, the test infrastructure adds it automatically. Use -S with llvm-gcc rather than -c, so tests can work when llvm-gcc is really dragonegg (which can output IR with -S but not -c). Yes, dragonegg supports objective-c (poorly though). llvm-svn: 120163	2010-11-25 21:46:07 +00:00
Duncan Sands	e6c974b230	Use -S rather than -c for the benefit of dragonegg. llvm-svn: 120161	2010-11-25 21:41:35 +00:00
Duncan Sands	5fe97a0490	Remove explicit uses of -emit-llvm, the test infrastructure adds it automatically. Use -S with llvm-gcc rather than -c, so tests can work when llvm-gcc is really dragonegg (which can output IR with -S but not -c). llvm-svn: 120160	2010-11-25 21:39:17 +00:00
Duncan Sands	0be0ae625d	Judging from the comment, the system assembler is supposed to assemble the output of this test. Since it was producing bitcode, that clearly wasn't happening! Have it produce target assembler and assemble that instead. llvm-svn: 120159	2010-11-25 21:26:21 +00:00
Duncan Sands	b32d19de6a	Remove explicit uses of -emit-llvm, the test infrastructure adds it automatically. Use -S with llvm-gcc rather than -c, so tests can work when llvm-gcc is really dragonegg (which can output IR with -S but not -c). llvm-svn: 120158	2010-11-25 21:24:35 +00:00
Duncan Sands	2b5243d096	Dragonegg cannot output bitcode, only human readable IR, so use -S rather than -c. llvm-svn: 120157	2010-11-25 21:21:59 +00:00
Duncan Sands	c78fbf9877	Use LLVMCC_EMITIR_FLAG rather than hard-coding "-emit-llvm". llvm-svn: 120156	2010-11-25 21:19:52 +00:00
Rafael Espindola	7c2acd022e	Use multiple 0x66 prefixes so that all nops up to 15 bytes are a single instruction. llvm-svn: 120147	2010-11-25 17:14:16 +00:00
Rafael Espindola	f8e127eaf6	Factor some code to parseSectionFlags and fix the default type of a section. llvm-svn: 120145	2010-11-25 15:32:56 +00:00
Nick Lewycky	b8de00ee07	Treat a call of function pointer like a load of the pointer when considering whether the pointer can be replaced with the global variable it is a copy of. Fixes PR8680. llvm-svn: 120126	2010-11-24 22:04:20 +00:00
Rafael Espindola	9f75d5df0b	Behave a bit more like gnu as and use the symbol (instead of the section) for any relocation to a symbol defined in a tls section. llvm-svn: 120121	2010-11-24 21:57:39 +00:00
Rafael Espindola	708ac4d6ad	Relocate with the symbol if the relocation is of kind NTPOFF. Patch by David Meyer, I added the test. llvm-svn: 120104	2010-11-24 19:23:50 +00:00
Rafael Espindola	e98d483b71	Fix and add tests for all cases in x86 and x86_64 where gnu as implicitly sets the type of a symbol to STT_TLS. llvm-svn: 120100	2010-11-24 18:51:21 +00:00
Rafael Espindola	af9a7a3e92	Testcase for r120017. llvm-svn: 120099	2010-11-24 18:03:57 +00:00
Kalle Raiskila	97fc68774c	Allow for 'fcmp ogt' in SPU. Fix by Visa Putkinen! llvm-svn: 120090	2010-11-24 11:42:17 +00:00
Rafael Espindola	4e70ac7b68	If a symbol is used as tls, mark it as tls even if not declared as so. Probably fixes PR8659. llvm-svn: 120076	2010-11-24 02:19:40 +00:00
Benjamin Kramer	94a622af4c	The srem -> urem transform is not safe for any divisor that's not a power of two. E.g. -5 % 5 is 0 with srem and 1 with urem. Also addresses Frits van Bommel's comments. llvm-svn: 120049	2010-11-23 20:33:57 +00:00
Bob Wilson	d7d2cf7842	Recognize sign/zero-extended constant BUILD_VECTORs for VMULL operations. We need to check if the individual vector elements are sign/zero-extended values. For now this only handles constant values. Radar 8687140. llvm-svn: 120034	2010-11-23 19:38:38 +00:00
Benjamin Kramer	b5afa65b0a	InstCombine: Reduce "X shift (A srem B)" to "X shift (A urem B)" iff B is positive. This allows to transform the rem in "1 << ((int)x % 8);" to an and. llvm-svn: 120028	2010-11-23 18:52:42 +00:00
Duncan Sands	adc7771f18	Exploit distributive laws (eg: And distributes over Or, Mul over Add, etc) in a fairly systematic way in instcombine. Some of these cases were already dealt with, in which case I removed the existing code. The case of Add has a bunch of funky logic which covers some of this plus a few variants (considers shifts to be a form of multiplication), which I didn't touch. The simplification performed is: AB+AC -> A(B+C). The improvement is to do this in cases that were not already handled [such as AB-AC -> A(B-C), which was reported on the mailing list], and also to do it more often by not checking for "only one use" if "B+C" simplifies. llvm-svn: 120024	2010-11-23 14:23:47 +00:00
Kalle Raiskila	e1b6c273b8	Division by pow-of-2 is not cheap on SPU, do it with shifts. llvm-svn: 120022	2010-11-23 13:27:59 +00:00
Rafael Espindola	3c7cab1402	Produce a relocation for pcrel absolute values. Based on a patch by David Meyer. llvm-svn: 120006	2010-11-23 07:20:12 +00:00
Chris Lattner	e5afa15b77	duncan's spider sense was right, I completely reversed the condition on this instcombine xform. This fixes a miscompilation of 403.gcc. llvm-svn: 119988	2010-11-23 02:42:04 +00:00
Chris Lattner	adc29567fc	filecheckize llvm-svn: 119987	2010-11-23 02:26:52 +00:00
Benjamin Kramer	f1ebb63161	InstCombine: Implement X - A-B -> X + AB. llvm-svn: 119984	2010-11-22 20:31:27 +00:00
Evan Cheng	eb56dca4fd	Fix epilogue codegen to avoid leaving the stack pointer in an invalid state. Previously Thumb2 would restore sp from fp like this: "mov sp, r7; sub sp, #4". If an interrupt is taken after the 'mov' but before the 'sub', callee-saved registers might be clobbered by the interrupt handler. Instead, try restoring directly from sp: "add sp, #4". Or, if necessary (with VLA, etc.), use a scratch register to compute sp and then restore it: "sub.w r4, r7, #8; mov sp, r7". rdar://8465407 llvm-svn: 119977	2010-11-22 18:12:04 +00:00

11897 Commits