llvm-project

Author	SHA1	Message	Date
Chris Lattner	5191c65485	prepare for a change I'm about to make llvm-svn: 31248	2006-10-28 00:59:20 +00:00
Reid Spencer	00c482b7a2	Simplify code a bit by changing instances of: InsertNewInstBefore(new CastInst(Val, ValTy, Val->GetName()), I) into: InsertCastBefore(Val, ValTy, I) llvm-svn: 31204	2006-10-26 19:19:06 +00:00
Reid Spencer	7e80b0b31e	For PR950: Make necessary changes to support DIV -> [SUF]Div. This changes llvm to have three division instructions: signed, unsigned, floating point. The bytecode and assembler are backwards compatible, however. llvm-svn: 31195	2006-10-26 06:15:43 +00:00
Nick Lewycky	5b979ae531	Fix 2006-10-25-AddSetCC. A relational operator (like setlt) can never produce an EQ property. llvm-svn: 31193	2006-10-26 02:35:18 +00:00
Nick Lewycky	9d17c82a26	Resurrect r1.25. Fix and comment the "or", "and" and "xor" transformations. llvm-svn: 31189	2006-10-25 23:48:24 +00:00
Chris Lattner	53f53db919	hide symbols properly llvm-svn: 31184	2006-10-25 21:14:31 +00:00
Chris Lattner	ebb1ad4382	Fix Transforms/ScalarRepl/2006-10-23-PointerUnionCrash.ll llvm-svn: 31151	2006-10-24 06:26:32 +00:00
Chris Lattner	dc7b9beb20	Revert back to r1.21, which was the last revision of predsimplify that passes llvm-gcc bootstrap. llvm-svn: 31146	2006-10-24 00:36:21 +00:00
Chris Lattner	fe7b6ef346	Handle fallout from the recent branch-on-undef changes. This fixes Prolangs-C/agrep and SCCP/2006-10-23-IPSCCP-Crash.ll llvm-svn: 31132	2006-10-23 18:57:02 +00:00
Nick Lewycky	53b4158448	Remove the Backwards operation. Resolving now works at the time when a property is added by running through the list of uses of the value and adding resolved properties to the property set. llvm-svn: 31126	2006-10-23 01:56:02 +00:00
Nick Lewycky	6f5c30fcec	Fix similar missing optimization opportunity in XOR. llvm-svn: 31123	2006-10-22 22:22:58 +00:00
Nick Lewycky	af2b0571d0	Whoops! Add missing NULL check. llvm-svn: 31121	2006-10-22 21:38:24 +00:00
Nick Lewycky	2c734f3fc1	Handle "if ((x|y) != 0)" for ints like we do for bools. Fixes missed optimization opportunity pointed out by Chris Lattner. llvm-svn: 31118	2006-10-22 21:36:41 +00:00
Nick Lewycky	f345008339	AllocaInst can't return a null pointer. Fixes missed optimization opportunity pointed out by Andrew Lewycky. llvm-svn: 31115	2006-10-22 19:53:27 +00:00
Chris Lattner	250eff20da	Add a workaround for PR962, disabling the more aggressive form of this transformation. This speeds up a C++ app 2.25x. llvm-svn: 31113	2006-10-22 18:42:26 +00:00
Chris Lattner	af17096dcf	3 Changes: 1. Better document what is going on here. 2. Only hack on one branch per iteration, making the results less conservative. 3. Handle the problematic case by marking edges executable instead of by playing with value lattice states. This is far less pessimistic, and fixes SCCP/ipsccp-gvar.ll. llvm-svn: 31106	2006-10-22 05:59:17 +00:00
Chris Lattner	af1222c1a7	llvm-extract should remove module-level asm llvm-svn: 31086	2006-10-20 21:35:41 +00:00
Chris Lattner	319c86fd38	Fix an ugly problem in SCCP. This fixes Benchmarks/Misc-C++/mandel-text.cpp llvm-svn: 31073	2006-10-20 20:19:08 +00:00
Chris Lattner	5dee3b2526	Fix miscompilation of MallocBench/espresso which code review pointed out but apparently didn't make it into the final patch. llvm-svn: 31070	2006-10-20 18:20:21 +00:00
Reid Spencer	e0fc4dfc22	For PR950: This patch implements the first increment for the Signless Types feature. All changes pertain to removing the ConstantSInt and ConstantUInt classes in favor of just using ConstantInt. llvm-svn: 31063	2006-10-20 07:07:24 +00:00
Devang Patel	5d417e35bc	While creating mask, use 1ULL instead of 1. llvm-svn: 31062	2006-10-20 01:16:56 +00:00
Chris Lattner	b8b11599dd	Fix SimplifyCFG/2006-10-19-UncondDiv.ll by disabling a bad xform. llvm-svn: 31061	2006-10-20 00:42:07 +00:00
Devang Patel	5d6df959e3	It is OK to remove extra cast if operation is EQ/NE even though source and destination sign may not match but other conditions are met. llvm-svn: 31056	2006-10-19 20:59:13 +00:00
Devang Patel	88afd00d1d	Typo Typo. llvm-svn: 31055	2006-10-19 19:21:36 +00:00
Devang Patel	472530d9fc	Typo. llvm-svn: 31054	2006-10-19 19:05:38 +00:00
Devang Patel	b42aef4925	Fix bug in PR454 resolution. Added new test case. This fixes llvmAsmParser.cpp miscompile by llvm on PowerPC Darwin. llvm-svn: 31053	2006-10-19 18:54:08 +00:00
Reid Spencer	3c514959dd	Undo Chris' last patch, it caused a regression. llvm-svn: 30991	2006-10-16 23:08:08 +00:00
Chris Lattner	9a1c7dd27a	fix a buggy check that accidentally disabled this xform llvm-svn: 30967	2006-10-15 22:42:15 +00:00
Nick Lewycky	77e030bca9	Replace custom dispatch code with two uses of InstVisitor. Improves compile-time performance. llvm-svn: 30896	2006-10-12 02:02:44 +00:00
Chris Lattner	41b442242d	Implement SROA of unions with mixed pointers/integers in them. This implements PR892 and Transforms/ScalarRepl/union-pointer.ll:test2 llvm-svn: 30825	2006-10-08 23:53:04 +00:00
Chris Lattner	05f8272afa	Implement Transforms/ScalarRepl/union-pointer.ll:test llvm-svn: 30823	2006-10-08 23:28:04 +00:00
Chris Lattner	2deeaeaca7	add a new SimplifyDemandedVectorElts method, which works similarly to SimplifyDemandedBits. The idea is that some operations can be simplified if not all of the computed elements are needed. Some targets (like x86) have a large number of intrinsics that operate on a single element, but pass other elts through unmodified. If those other elements are not needed, the intrinsics can be simplified to scalar operations, and insertelement ops can be removed. This turns (f.e.): ushort %Convert_sse(float %f) { %tmp = insertelement <4 x float> undef, float %f, uint 0 ; <<4 x float>> [#uses=1] %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1 ; <<4 x float>> [#uses=1] %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2 ; <<4 x float>> [#uses=1] %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3 ; <<4 x float>> [#uses=1] %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer ) ; <<4 x float>> [#uses=1] %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 ) ; <int> [#uses=1] %tmp69 = cast int %tmp to ushort ; <ushort> [#uses=1] ret ushort %tmp69 } into: ushort %Convert_sse(float %f) { entry: %tmp28 = sub float %f, 1.000000e+00 ; <float> [#uses=1] %tmp37 = mul float %tmp28, 5.000000e-01 ; <float> [#uses=1] %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0 ; <<4 x float>> [#uses=1] %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > ) ; <<4 x float>> [#uses=1] %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > ) ; <<4 x float>> [#uses=1] %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 ) ; <int> [#uses=1] %tmp69 = cast int %tmp to ushort ; <ushort> [#uses=1] ret ushort %tmp69 } which improves codegen from: _Convert_sse: movss LCPI1_0, %xmm0 movss 4(%esp), %xmm1 subss %xmm0, %xmm1 movss LCPI1_1, %xmm0 mulss %xmm0, %xmm1 movss LCPI1_2, %xmm0 minss %xmm0, %xmm1 xorps %xmm0, %xmm0 maxss %xmm0, %xmm1 cvttss2si %xmm1, %eax andl $65535, %eax ret to: _Convert_sse: movss 4(%esp), %xmm0 subss LCPI1_0, %xmm0 mulss LCPI1_1, %xmm0 movss LCPI1_2, %xmm1 minss %xmm1, %xmm0 xorps %xmm1, %xmm1 maxss %xmm1, %xmm0 cvttss2si %xmm0, %eax andl $65535, %eax ret This is just a first step, it can be extended in many ways. Testcase here: Transforms/InstCombine/vec_demanded_elts.ll llvm-svn: 30752	2006-10-05 06:55:50 +00:00
Chris Lattner	52886e72d7	This case isn't implemented yet. It seems unlikely to be needed, but if it ever is, we want to get an assert instead of silent bad codegen. llvm-svn: 30716	2006-10-04 04:58:58 +00:00
Nick Lewycky	58a910dff5	Simplify logic further. Ensure that we copy KnownProperties before calling visitBasicBlock, else we may leak properties into blocks where they don't belong. llvm-svn: 30705	2006-10-03 17:36:01 +00:00
Nick Lewycky	1d00f3e144	Simplify, now that predsimplify depends on break-crit-edges. Fix SwitchInst where dest-block is the same as one of the cases. llvm-svn: 30700	2006-10-03 15:19:11 +00:00
Nick Lewycky	755f801adc	Move break-crit-edges before the predicate simplifier. Allows us to optimize in more cases. llvm-svn: 30699	2006-10-03 14:52:23 +00:00
Evan Cheng	ff510a58c2	Revert previous patch. Still breaking things. llvm-svn: 30698	2006-10-03 07:26:07 +00:00
Chris Lattner	8aca0ee8c3	Fix PR932 and Analysis/Dominators/2006-10-02-BreakCritEdges.ll: The critical edge block dominates the dest block if the destblock dominates all edges other than the one incoming from the critical edge. llvm-svn: 30696	2006-10-03 07:02:02 +00:00
Chris Lattner	7d19067c42	Fix a bug from r1.391 of this file, where we checked the size instead of the alignment when promoting allocations. This implements InstCombine/cast.ll:test32 llvm-svn: 30682	2006-10-01 19:40:58 +00:00
Chris Lattner	4797c891c0	Fix debug output llvm-svn: 30680	2006-09-30 23:32:50 +00:00
Chris Lattner	24d3d4280a	Implement SRA of heap allocations. llvm-svn: 30679	2006-09-30 23:32:09 +00:00
Chris Lattner	80a01ef6f0	Add some ifdef'd out debug info llvm-svn: 30676	2006-09-30 19:40:30 +00:00
Chris Lattner	6ab03f6a08	Eliminate ConstantBool::True and ConstantBool::False. Instead, provide ConstantBool::getTrue() and ConstantBool::getFalse(). llvm-svn: 30665	2006-09-28 23:35:22 +00:00
Owen Anderson	7cb6809c25	Another attempt at making ArgPromotion smarter. This patch no longer breaks Burg. llvm-svn: 30657	2006-09-28 23:02:22 +00:00
Chris Lattner	525804f31e	simplify code llvm-svn: 30656	2006-09-28 22:58:25 +00:00
Chris Lattner	e03ca2ca4a	set DEBUG_TYPE right llvm-svn: 30623	2006-09-27 04:58:23 +00:00
Nick Lewycky	059c79264f	Style changes only. Remove dead code, fix a comment. llvm-svn: 30588	2006-09-23 15:13:08 +00:00
Chris Lattner	6bd6da4097	Be far more careful when splitting a loop header, either to form a preheader or when splitting loops with a common header into multiple loops. In particular the old code would always insert the preheader before the old loop header. This is disastrous in cases where the loop hasn't been rotated. For example, it can produce code like: .. outside the loop... jmp LBB1_2 #bb13.outer LBB1_1: #bb1 movsd 8(%esp,%esi,8), %xmm1 mulsd (%edi), %xmm1 addsd %xmm0, %xmm1 addl $24, %edi incl %esi jmp LBB1_3 #bb13 LBB1_2: #bb13.outer leal (%edx,%eax,8), %edi pxor %xmm1, %xmm1 xorl %esi, %esi LBB1_3: #bb13 movapd %xmm1, %xmm0 cmpl $4, %esi jl LBB1_1 #bb1 Note that the loop body is actually LBB1_1 + LBB1_3, which means that the loop now contains an uncond branch WITHIN it to jump around the inserted loop header (LBB1_2). Doh. This patch changes the preheader insertion code to insert it in the right spot, producing this code: ... outside the loop, fall into the header ... LBB1_1: #bb13.outer leal (%edx,%eax,8), %esi pxor %xmm0, %xmm0 xorl %edi, %edi jmp LBB1_3 #bb13 LBB1_2: #bb1 movsd 8(%esp,%edi,8), %xmm0 mulsd (%esi), %xmm0 addsd %xmm1, %xmm0 addl $24, %esi incl %edi LBB1_3: #bb13 movapd %xmm0, %xmm1 cmpl $4, %edi jl LBB1_2 #bb1 Totally crazy, no branch in the loop! :) llvm-svn: 30587	2006-09-23 08:19:21 +00:00
Chris Lattner	608cd05e3f	Teach UpdateDomInfoForRevectoredPreds to handle revectored preds that are not reachable, making it general purpose enough for use by InsertPreheaderForLoop. Eliminate custom dominfo updating code in InsertPreheaderForLoop, using UpdateDomInfoForRevectoredPreds instead. llvm-svn: 30586	2006-09-23 07:40:52 +00:00
Chris Lattner	51c95cdd82	Fix Transforms/IndVarsSimplify/2006-09-20-LFTR-Crash.ll llvm-svn: 30555	2006-09-21 05:12:20 +00:00
Nick Lewycky	fde9c308b2	Don't rewrite ConstantExpr::get. llvm-svn: 30552	2006-09-21 01:05:35 +00:00
Nick Lewycky	d74c55f483	Once we're down to "setcc type constant1, constant2", at least come up with the right answer. llvm-svn: 30550	2006-09-20 23:02:24 +00:00
Nick Lewycky	cfff1c3f86	Use a total ordering to compare instructions. Fixes infinite loop in resolve(). llvm-svn: 30540	2006-09-20 17:04:01 +00:00
Andrew Lenharth	44cb67af5c	simplify llvm-svn: 30535	2006-09-20 15:37:57 +00:00
Chris Lattner	380c7e9a59	We went through all that trouble to compute whether it was safe to transform this comparison, but never checked it. Whoops, no wonder we miscompiled 177.mesa! llvm-svn: 30511	2006-09-20 04:44:59 +00:00
Evan Cheng	cd3f6ff0e5	Back out Chris' last set of changes. This breaks 177.mesa and povray somehow. llvm-svn: 30505	2006-09-20 01:39:40 +00:00
Evan Cheng	453280b94d	80 col. llvm-svn: 30504	2006-09-20 01:10:02 +00:00
Andrew Lenharth	4f339bebb0	If we have an add, do it in the pointer realm, not the int realm. This is critical in the linux kernel for pointer analysis correctness llvm-svn: 30496	2006-09-19 18:24:51 +00:00
Chris Lattner	12f52faf93	implement select.ll:test19-22 llvm-svn: 30482	2006-09-19 06:18:21 +00:00
Nick Lewycky	b9c5483a93	Walk down the dominator tree instead of the control flow graph. That means that we can't modify the CFG any more, at least not until it's possible to update the dominator tree (PR217). llvm-svn: 30469	2006-09-18 21:09:35 +00:00
Chris Lattner	de07792595	Fix an infinite loop building the CFE llvm-svn: 30465	2006-09-18 18:27:05 +00:00
Chris Lattner	67a35bbce7	Implement a trivial optzn: if vastart is never called in a function that takes ... args, remove the '...'. This is Transforms/DeadArgElim/dead_vaargs.ll llvm-svn: 30459	2006-09-18 07:02:31 +00:00
Chris Lattner	4922a0e53f	Implement InstCombine/cast.ll:test31. This speeds up 462.libquantum by 26%. llvm-svn: 30456	2006-09-18 05:27:43 +00:00
Chris Lattner	420c4bcc8d	Implement Transforms/InstCombine/shift-sra.ll:test0 llvm-svn: 30450	2006-09-18 04:31:40 +00:00
Chris Lattner	b3f24c91b0	Rewrite shift/and/compare sequences to promote better licm of the RHS. Use isLogicalShift/isArithmeticShift to simplify code. llvm-svn: 30448	2006-09-18 04:22:48 +00:00
Chris Lattner	850465d53f	Fix Transforms/InstCombine/2006-09-15-CastToBool.ll and PR913 llvm-svn: 30405	2006-09-16 03:14:10 +00:00
Chris Lattner	9482cc5b16	revert previous two patches. They cause miscompilation of MultiSource/Applications/Burg llvm-svn: 30397	2006-09-15 17:24:45 +00:00
Owen Anderson	edadd3faee	Revert my previous work on ArgumentPromotion. Further investigation has revealed these changes to be incorrect. They just weren't showing up in any of our current testcases. llvm-svn: 30385	2006-09-15 05:22:51 +00:00
Anton Korobeynikov	d61d39ec53	Adding dllimport, dllexport and external weak linkage types. DLL* linkages got full (I hope) codegeneration support in C & both x86 assembler backends. External weak linkage added for future use, we don't provide any codegeneration, etc. support for it. llvm-svn: 30374	2006-09-14 18:23:27 +00:00
Chris Lattner	237ccf2a51	Second half of the fix for Transforms/Inline/inline_cleanup.ll This folds unconditional branches that are often produced by code specialization. llvm-svn: 30307	2006-09-13 21:27:00 +00:00
Nick Lewycky	12efffc96b	Add some more consistency checks. llvm-svn: 30305	2006-09-13 19:32:53 +00:00
Nick Lewycky	51ce8d6b46	Fix unionSets so that it can merge correctly. llvm-svn: 30304	2006-09-13 19:24:01 +00:00
Chris Lattner	6ef6d06d21	Implement the first half of Transforms/Inline/inline_cleanup.ll llvm-svn: 30303	2006-09-13 19:23:57 +00:00
Nick Lewycky	3a4dc7b489	Erase dead instructions. llvm-svn: 30298	2006-09-13 18:55:37 +00:00
Devang Patel	fab4972a6e	Initialize DontInternalize. llvm-svn: 30281	2006-09-13 01:02:26 +00:00
Chris Lattner	1d7ec20a4d	A sinkable instruction may exist with uses, if those uses are in dead blocks. Handle this. This fixes PR908 and Transforms/LICM/2006-09-12-DeadUserOfSunkInstr.ll llvm-svn: 30275	2006-09-12 19:17:09 +00:00
Chris Lattner	d28627009a	Fix PR905 and InstCombine/2006-09-11-EmptyStructCrash.ll llvm-svn: 30266	2006-09-11 21:43:16 +00:00
Nick Lewycky	e94f42a740	Skip the linear search if the answer is already known. llvm-svn: 30251	2006-09-11 17:23:34 +00:00
Chris Lattner	d1f8e07808	Allow tail duplication in more cases, relaxing the previous restriction a bit. This fixes Regression/Transforms/TailDup/MergeTest.ll llvm-svn: 30237	2006-09-10 18:17:58 +00:00
Nick Lewycky	9a22d7b60f	Replace EquivalenceClasses with a custom-built data structure. Many common operations (like findProperties) should be faster, at the expense of unionSets being slower in cases that are rare in practise. Don't erase a dead Instruction. This fixes a memory corruption issue. llvm-svn: 30235	2006-09-10 02:27:07 +00:00
Chris Lattner	0468987592	Implement Transforms/InstCombine/hoist_instr.ll llvm-svn: 30234	2006-09-09 22:02:56 +00:00
Chris Lattner	27ff96d87a	Make inlining costs more accurate. llvm-svn: 30231	2006-09-09 20:40:44 +00:00
Chris Lattner	d79dc79831	Turn div X, (Cond ? Y : 0) -> div X, Y This implements select.ll::test18. llvm-svn: 30230	2006-09-09 20:26:32 +00:00
Chris Lattner	c465046e65	Throttle back tail duplication to avoid creating really ugly sequences of code. For Transforms/TailDup/if-tail-dup.ll, f.e., it produces: _foo: movl 8(%esp), %eax movl 4(%esp), %ecx testl $1, %ecx je LBB1_2 #cond_next LBB1_1: #cond_true movl $1, (%eax) LBB1_2: #cond_next testl $2, %ecx je LBB1_4 #cond_next10 LBB1_3: #cond_true6 movl $1, 4(%eax) LBB1_4: #cond_next10 testl $4, %ecx je LBB1_6 #cond_next18 LBB1_5: #cond_true14 movl $1, 8(%eax) LBB1_6: #cond_next18 testl $8, %ecx je LBB1_8 #return LBB1_7: #cond_true22 movl $1, 12(%eax) ret LBB1_8: #return ret instead of: _foo: movl 4(%esp), %eax testl $2, %eax sete %cl movl 8(%esp), %edx testl $1, %eax je LBB1_2 #cond_next LBB1_1: #cond_true movl $1, (%edx) testb %cl, %cl jne LBB1_4 #cond_next10 jmp LBB1_3 #cond_true6 LBB1_2: #cond_next testb %cl, %cl jne LBB1_4 #cond_next10 LBB1_3: #cond_true6 movl $1, 4(%edx) testl $4, %eax je LBB1_6 #cond_next18 jmp LBB1_5 #cond_true14 LBB1_4: #cond_next10 testl $4, %eax je LBB1_6 #cond_next18 LBB1_5: #cond_true14 movl $1, 8(%edx) testl $8, %eax je LBB1_8 #return jmp LBB1_7 #cond_true22 LBB1_6: #cond_next18 testl $8, %eax je LBB1_8 #return LBB1_7: #cond_true22 movl $1, 12(%edx) ret LBB1_8: #return ret llvm-svn: 30158	2006-09-07 21:30:15 +00:00
Chris Lattner	845b223da4	Fix Duraid's changes to work when TLI is null. This fixes the failing lowerinvoke regtests. llvm-svn: 30115	2006-09-05 17:48:07 +00:00
Duraid Madina	cf6749e4c0	add setJumpBufSize() and setJumpBufAlignment() to target-lowering. Call these from your backend to enjoy setjmp/longjmp goodness, see lib/Target/IA64/IA64ISelLowering.cpp for an example llvm-svn: 30095	2006-09-04 06:21:35 +00:00
Owen Anderson	19b80e76df	Make ArgumentPromotion handle recursive functions that pass pointers in their recursive calls. llvm-svn: 30057	2006-09-02 21:19:44 +00:00
Nick Lewycky	8e5599354a	Improve handling of SelectInst. Reorder operations to remove duplicated work. Fix to leave floating-point types out of the optimization. Add tests to predsimplify.ll for SwitchInst and SelectInst handling. llvm-svn: 30055	2006-09-02 19:40:38 +00:00
Nick Lewycky	f6f529d008	Don't confuse canonicalize and lookup. Fixes predsimplify.reg4.ll. Also corrects missing optimization opportunity removing cases from a switch. llvm-svn: 30009	2006-09-01 03:26:35 +00:00
Nick Lewycky	08674ab707	Properties where both Values weren't in the union (as being equal to another Value) weren't being found by findProperties. This fixes predsimplify.ll test6, a missed optimization opportunity. llvm-svn: 29991	2006-08-31 00:39:16 +00:00
Nick Lewycky	5f8f9af65c	Move to using the EquivalenceClass ADT. Removes SynSets. If a branch's condition has become a ConstantBool, simplify it immediately. Removing the edge saves work and exposes more optimization opportunities in the pass. Add support for SelectInst. llvm-svn: 29970	2006-08-30 02:46:48 +00:00
Devang Patel	f489d0f85c	Do not rely on std::sort and std::erase to get list of unique exit blocks. The output is dependent on addresses of basic block. Add and use Loop::getUniqueExitBlocks. llvm-svn: 29966	2006-08-29 22:29:16 +00:00
Owen Anderson	a8a2e5c666	Clean up a bit. llvm-svn: 29950	2006-08-29 06:10:56 +00:00
Nick Lewycky	b2e8ae1700	Add PredicateSimplifier pass. Collapses equal variables into one form and simplifies expressions. This implements the optimization described in PR807. llvm-svn: 29947	2006-08-28 22:44:55 +00:00
Owen Anderson	62c84fe371	Make LoopUnroll fold excessive BasicBlocks. This results in a significant speedup of gccas on 252.eon llvm-svn: 29936	2006-08-28 02:09:46 +00:00
Chris Lattner	97c9f20c52	simplify AnalysisGroup registration, eliminating one typeid call. llvm-svn: 29932	2006-08-28 00:42:29 +00:00
Chris Lattner	c2d3d3112e	eliminate RegisterOpt. It does the same thing as RegisterPass. llvm-svn: 29925	2006-08-27 22:42:52 +00:00
Chris Lattner	3d27be1333	s|llvm/Support/Visibility.h|llvm/Support/Compiler.h| llvm-svn: 29911	2006-08-27 12:54:02 +00:00
Owen Anderson	403b95af47	Fix a crash related to updating Phi nodes in the original header block. This was causing a crash in 175.vpr llvm-svn: 29887	2006-08-25 22:13:55 +00:00
Owen Anderson	8e4b029573	Add an assertion to check that we're really preserving LCSSA. llvm-svn: 29886	2006-08-25 22:12:36 +00:00
Owen Anderson	8cca95cf5d	Reapply the indvars patch, since nothing blew up last night. llvm-svn: 29874	2006-08-25 17:41:25 +00:00
Owen Anderson	94446a4267	Revert my previous patch. Since there are some major changes that went in today, I'm going to wait to put this in HEAD until tomorrow, so as not to clutter the nightly tester. llvm-svn: 29868	2006-08-25 03:45:57 +00:00
Owen Anderson	15a6423431	Specify that indvars actually preserve LCSSA. This has been done for a while, but I forgot to put in the analysis usage. llvm-svn: 29867	2006-08-25 03:32:13 +00:00
Owen Anderson	e001d811ba	Implement unrolling of multiblock loops. This significantly improves the utility of the LoopUnroll pass. Also, add a testcase for multiblock-loop unrolling. llvm-svn: 29859	2006-08-24 21:28:19 +00:00
Reid Spencer	5495fe8dd6	Fix a grammaro in a comment. llvm-svn: 29765	2006-08-18 09:01:07 +00:00
Chris Lattner	6441cf93c9	Handle single-entry PHI nodes correctly. This fixes PR877 and Transforms/CondProp/2006-08-14-SingleEntryPhiCrash.ll llvm-svn: 29673	2006-08-14 21:38:05 +00:00
Chris Lattner	f18b396cc2	Don't attempt to split subloops out of a loop with a huge number of backedges. Not only will this take huge amounts of compile time, the resultant loop nests won't be useful for optimization. This reduces loopsimplify time on Transforms/LoopSimplify/2006-08-11-LoopSimplifyLongTime.ll from ~32s to ~0.4s with a debug build of llvm on a 2.7Ghz G5. llvm-svn: 29647	2006-08-12 05:25:00 +00:00
Chris Lattner	85d9944f9a	Reimplement the loopsimplify code which deletes edges from unreachable blocks that target loop blocks. Before, the code was run once per loop, and depended on the number of predecessors each block in the loop had. Unfortunately, scanning preds can be really slow when huge numbers of phis exist or when phis with huge numbers of inputs exist. Now, the code is run once per function and scans successors instead of preds, which is far faster. In addition, the new code is simpler and is goto free, woo. This change speeds up a nasty testcase Duraid provided me from taking hours to taking ~72s with a debug build. The functionality this implements is already tested in the testsuite as Transforms/CodeExtractor/2004-03-13-LoopExtractorCrash.ll. llvm-svn: 29644	2006-08-12 04:51:20 +00:00
Reid Spencer	2b6d18a64f	Make this example pass use some things from lib/Support (EscapeString, SlowOperatingInfo, Statistics). Besides providing an example of how to use these facilities, it also serves to debug problems with runtime linking when dlopening a loadable module. These three support facilities exercise different combinations of Text/Weak Weak/Text and Text/Text linking between the executable and the module. llvm-svn: 29552	2006-08-07 23:17:24 +00:00
Reid Spencer	e6458c3fb2	For PR780: 1. Change the usage of LOADABLE_MODULE so that it implies all the things necessary to make a loadable module. This reduces the user's burden to get a loadable module correctly built. 2. Document the usage of LOADABLE_MODULE in the MakefileGuide 3. Adjust the makefile for lib/Transforms/Hello to use the new specification for building loadable modules 4. Adjust the sample project to not attempt to build a shared library for its little library. This was just wasteful and not instructive at all. llvm-svn: 29551	2006-08-07 23:12:15 +00:00
Chris Lattner	c9009d917d	Fix PR867 (and maybe 868) and testcase: Transforms/SimplifyCFG/2006-08-03-Crash.ll llvm-svn: 29515	2006-08-03 21:40:24 +00:00
Chris Lattner	3ff620178b	Changes: 1. Update an obsolete comment. 2. Make the sorting by base an explicit (though still N^2) step, so that the code is more clear on what it is doing. 3. Partition uses so that uses inside the loop are handled before uses outside the loop. Note that none of these changes currently changes the code inserted by LSR, but they are a stepping stone to getting there. This code is the result of some crazy pair programming with Nate. :) llvm-svn: 29493	2006-08-03 06:34:50 +00:00
Chris Lattner	38b6e8382a	Add special check to avoid isLoop call. Simple, but doesn't seem to speed up lcssa much in practice. llvm-svn: 29465	2006-08-02 00:16:47 +00:00
Chris Lattner	5a2bc786be	Replace the SSA update code in LCSSA with a bottom-up approach instead of a top down approach, inspired by discussions with Tanya. This approach is significantly faster, because it does not need dominator frontiers and it does not insert extraneous unused PHI nodes. For example, on 252.eon, in a release-asserts build, this speeds up LCSSA (which is the slowest pass in gccas) from 9.14s to 0.74s on my G5. This code is also slightly smaller and significantly simpler than the old code. Amusingly, in a normal Release build (which includes the "assert(L->isLCSSAForm());" assertion), asserting that the result of LCSSA is in LCSSA form is actually slower than the LCSSA transformation pass itself on 252.eon. I will see if Loop::isLCSSAForm can be sped up next. llvm-svn: 29463	2006-08-02 00:06:09 +00:00
Chris Lattner	85ea83e821	Add some advice llvm-svn: 29324	2006-07-27 04:24:14 +00:00
Chris Lattner	1b928478aa	Minor comment tweaks llvm-svn: 29226	2006-07-20 19:06:16 +00:00
Devang Patel	edd2f9952e	Make it fit into 80 cols. llvm-svn: 29223	2006-07-20 18:03:39 +00:00
Devang Patel	839d9260f0	Add new constructor to accept vector of exported names while creating InternalizePass. llvm-svn: 29222	2006-07-20 17:48:05 +00:00
Owen Anderson	8ef4c92ef8	Add an assertion. llvm-svn: 29199	2006-07-19 05:48:45 +00:00
Owen Anderson	aba8c199dd	Make LoopUnroll not die on LCSSA Phis. This makes lencod work again. llvm-svn: 29198	2006-07-19 05:45:14 +00:00
Owen Anderson	00b974cdbc	Fix an error that hadn't yet caused any problems, but I'm sure it would have somewhere down the road. llvm-svn: 29197	2006-07-19 03:51:48 +00:00
Chris Lattner	fea3974133	silence warnings in a release build llvm-svn: 29189	2006-07-18 21:48:57 +00:00
Evan Cheng	e9c68f52e1	Only reuse a previous IV if it would not require a type conversion. llvm-svn: 29186	2006-07-18 19:07:58 +00:00
Chris Lattner	19247f36ea	eliminate some ugly code, using ConstantExpr::getWithOperands instead. llvm-svn: 29149	2006-07-14 22:21:31 +00:00
Owen Anderson	bea70ee1de	Hopefully the final attempt at making IndVars preserve LCSSA. This should fix PR 831. llvm-svn: 29141	2006-07-14 18:49:15 +00:00
Chris Lattner	9b6c02ebe4	Revert this patch temporarily until PR831 is fixed. llvm-svn: 29134	2006-07-13 19:05:20 +00:00
Chris Lattner	b3c64f7ab3	Handle instructions in the map, but that map to a null pointer. This unbreaks smg2000. llvm-svn: 29127	2006-07-12 21:37:11 +00:00
Owen Anderson	dea9202e3b	IndVars now (correctly) preserves LCSSA form. llvm-svn: 29126	2006-07-12 21:29:14 +00:00
Chris Lattner	6148456ec2	In addition to deleting calls, the inliner can constant fold them as well. Handle this case, which doesn't require a new callgraph edge. This fixes a crash compiling MallocBench/gs. llvm-svn: 29121	2006-07-12 18:37:18 +00:00
Chris Lattner	5de3b8b262	Change the callgraph representation to store the callsite along with the target CG node. This allows the inliner to properly update the callgraph when using the pruning inliner. The pruning inliner may not copy over all call sites from a callee to a caller, so the edges corresponding to those call sites should not be copied over either. This fixes PR827 and Transforms/Inline/2006-07-12-InlinePruneCGUpdate.ll llvm-svn: 29120	2006-07-12 18:29:36 +00:00
Chris Lattner	091b6ea847	Silence a warning produced in assertions-disabled mode llvm-svn: 29108	2006-07-11 18:31:26 +00:00
Owen Anderson	15b1f7d2cd	Revert my indvars changes because they were breaking things. Unfortunately this didn't start showing up until after the recent instcombine fixes. llvm-svn: 29102	2006-07-11 07:25:33 +00:00
Owen Anderson	bbf8990ef7	Add a comment, and fix a typo that broke the build. llvm-svn: 29094	2006-07-10 22:15:25 +00:00
Owen Anderson	ae8aa646f1	Don't indent the entire function. llvm-svn: 29093	2006-07-10 22:03:18 +00:00
Chris Lattner	b7845d69db	Recognize 16-bit bswaps by relaxing overconstrained pattern. This implements Transforms/InstCombine/bswap.ll:test[34]. llvm-svn: 29087	2006-07-10 20:25:24 +00:00
Owen Anderson	a6968f83b2	Make instcombine not remove Phi nodes when LCSSA is live. llvm-svn: 29083	2006-07-10 19:03:49 +00:00
Owen Anderson	fe6e97d275	Fix typo in the comment. llvm-svn: 29078	2006-07-09 21:35:40 +00:00
Owen Anderson	aecaabb6e1	Add a fix for an issue where LCSSA would fail to insert undef's in some corner cases. Ideally, this issue will go away in the future as LCSSA gets smarter about which Phi nodes it inserts. llvm-svn: 29076	2006-07-09 08:14:06 +00:00
Chris Lattner	fd2e13b107	Fix PR820 and Transforms/GlobalOpt/2006-07-07-InlineAsmCrash.ll llvm-svn: 29071	2006-07-07 21:37:01 +00:00
Chris Lattner	996795b0dd	Use hidden visibility to make symbols in an anonymous namespace get dropped. This shrinks libllvmgcc.dylib another 67K llvm-svn: 28975	2006-06-28 23:17:24 +00:00
Chris Lattner	4a4c7fe7fa	Shrink libllvmgcc.dylib by another 23K llvm-svn: 28972	2006-06-28 22:08:15 +00:00
Owen Anderson	18e816f356	Switch to a very conservative heuristic for determining when loop-unswitching will be profitable. This is mainly to remove some cases where excessive unswitching would result in long compile times and/or huge generated code. Once someone comes up with a better heuristic that avoids these cases, this should be switched out. llvm-svn: 28962	2006-06-28 17:47:50 +00:00
Chris Lattner	3fda386965	Fix Transforms/InstCombine/2006-06-28-infloop.ll llvm-svn: 28961	2006-06-28 17:34:50 +00:00
Chris Lattner	0a2e11260e	Don't unswitch really large loops even if they are mostly filled with empty blocks. llvm-svn: 28959	2006-06-28 16:38:55 +00:00
Andrew Lenharth	ebfa24ee9a	Catch more function pointer casting problems Remove the Function pointer cast in these calls, converting it to a cast of argument. %tmp60 = tail call int cast (int (ulong)* %str to int (int))( int 10 ) %tmp60 = tail call int cast (int (ulong) %str to int (int)*)( uint %tmp51 ) llvm-svn: 28953	2006-06-28 01:01:52 +00:00
Owen Anderson	bb3ae5eb8f	Fix for 2006-06-27-DeadSwitchCase.ll Be more careful when updating Phi nodes after eliminating dead switch cases. Fix proposed by Chris. llvm-svn: 28947	2006-06-27 22:26:09 +00:00
Chris Lattner	c4998a0138	Fix Transforms/DeadArgElim/2006-06-27-struct-ret.ll. -deadargelim should not remove the struct return argument of a csret function, even if it is obviously dead. llvm-svn: 28943	2006-06-27 21:05:04 +00:00
Owen Anderson	b659bb4196	De-pessimize the handling of LCSSA Phi nodes in IndVarSimplify. Hopefully this will make Shootout-C/nestedloop faster. llvm-svn: 28924	2006-06-27 02:17:08 +00:00
Chris Lattner	49771a0462	random code cleanups, no functionality change llvm-svn: 28914	2006-06-26 19:10:05 +00:00
Owen Anderson	f52351e50f	Make LoopUnswitch able to unswitch loops with live-out values by taking advantage of LCSSA. This results in several times the number of unswitchings occurring on tests such as timberwolfmc, unix-tbl, and ldecod. llvm-svn: 28912	2006-06-26 07:44:36 +00:00
Chris Lattner	053fb9319d	Fix IndVarsSimplify/2006-06-16-Indvar-LCSSA-Crash.ll, a case where a "LCSSA" phi node causes indvars to break dominance properties. This fix causes indvars to avoid inserting aggressive code in this case; instead, indvars should be fixed to be more aggressive in the face of lcssa phi's. llvm-svn: 28850	2006-06-17 01:02:31 +00:00
Evan Cheng	8a417a2fde	Add missing casts. This fixed some regressions. llvm-svn: 28834	2006-06-16 18:37:15 +00:00
Evan Cheng	1fc4025a9c	More libcall transformations: printf("%s\n", str) -> puts(str) printf("%c", c) -> putchar(c) Also fixed fprintf(file, "%c", c) -> fputc(c, file) llvm-svn: 28815	2006-06-16 08:36:35 +00:00
Evan Cheng	f2ea587aa2	Simplify fprintf(file, "%s", str) to fputs(str, file). llvm-svn: 28814	2006-06-16 04:52:30 +00:00
Chris Lattner	c482a9e31a	Implement Transforms/InstCombine/bswap.ll, turning common shift/and/or bswap idioms into bswap intrinsics. llvm-svn: 28803	2006-06-15 19:07:26 +00:00
Chris Lattner	0c4f5a655a	Fix Transforms/LoopUnswitch/2006-06-13-SingleEntryPHI.ll, a loop unswitch bug exposed by the recent lcssa work. llvm-svn: 28779	2006-06-14 04:46:17 +00:00
Chris Lattner	e3abb14503	Use the PotDoms map to memoize 'dominating value' lookup. With this patch, LCSSA is still the slowest pass when gccas'ing 252.eon, but now it only takes 39s instead of 289s. :) llvm-svn: 28776	2006-06-14 01:13:57 +00:00
Owen Anderson	e714a5c549	Fix another instance where PHI nodes need special treatment. llvm-svn: 28774	2006-06-13 20:50:09 +00:00
Owen Anderson	3f8ff0449a	Fix a bug that was causing major slowdowns in povray. This was due to LCSSA not handling PHI nodes correctly when determining if a value was live-out. This patch reduces the number of detected live-out variables in the testcase from 6565 to 485. llvm-svn: 28771	2006-06-13 19:37:18 +00:00
Owen Anderson	fd0a3d6e5c	Reapply my 6/9 changes. The bug Evan saw no longer occurs. llvm-svn: 28759	2006-06-12 21:49:21 +00:00
Chris Lattner	b5c9d7a0af	Fix an infinite loop on Transforms/SimplifyCFG/2006-06-12-InfLoop.ll llvm-svn: 28758	2006-06-12 20:18:01 +00:00
Owen Anderson	0ac336965e	Fix for 2006-06-26-MultipleExitsSingleBlock. If a single exit block has multiple predecessors within the loop, it will appear in the exit blocks list more than once. LCSSA needs to take that into account so that it doesn't double process that exit block. llvm-svn: 28750	2006-06-12 07:10:16 +00:00
Owen Anderson	b538f14d2a	Re-commit the safe parts of my 6/9 patch. Still working on fixing the unsafe parts. llvm-svn: 28748	2006-06-11 19:22:28 +00:00
Evan Cheng	1b6e310e6f	Back out Owen's 6/9 changes. They broke MultiSource/Benchmarks/Prolangs-C/bison (and perhaps others). llvm-svn: 28747	2006-06-11 09:32:57 +00:00
Owen Anderson	b1dc1d44f8	Add LCSSA as a requirement for LoopUnswitch, and assert that LoopUnswitch preserves LCSSA. llvm-svn: 28739	2006-06-09 18:40:32 +00:00
Owen Anderson	505adff3f0	Make Loop able to verify that it is in LCSSA-form, and have the LCSSA pass assert on this. llvm-svn: 28738	2006-06-09 18:33:30 +00:00
Evan Cheng	398f70292c	RewriteExpr, either the new PHI node of the induction variable or the post-increment value, should first be cast to the appropriate type (the type of the common expr). Otherwise, the rewrite of a use based on (common + iv) may end up with an incorrect type. llvm-svn: 28735	2006-06-09 00:12:42 +00:00
Owen Anderson	5d029264ec	Update some comments, and expose LCSSAID in preparation for having other passes require LCSSA. llvm-svn: 28734	2006-06-08 20:02:53 +00:00
Reid Spencer	d4b795902c	Fix a spello in a comment. llvm-svn: 28714	2006-06-07 21:24:10 +00:00
Chris Lattner	95cebb082f	Fix a bug in a recent patch. This fixes UnitTests/Vector/Altivec/casts.c on PPC/altivec llvm-svn: 28698	2006-06-06 22:26:02 +00:00
Owen Anderson	ac601b4c4b	Fix some formatting, and use inLoop() when appropriate. llvm-svn: 28694	2006-06-06 04:36:36 +00:00
Owen Anderson	9e81c1bb03	Stop a memory leak, and update some comments. llvm-svn: 28693	2006-06-06 04:28:30 +00:00
Owen Anderson	766f90b08e	Some more clean-up, and squash an IDF-Phi related bug. llvm-svn: 28680	2006-06-04 00:55:19 +00:00
Owen Anderson	eb33815f1b	Various clean-ups suggested by Chris. llvm-svn: 28678	2006-06-04 00:02:23 +00:00
Owen Anderson	d00eacc4f9	Fix a bug in Phi-node insertion. Also, update some comments to reflect what's actually going on. llvm-svn: 28677	2006-06-03 23:22:50 +00:00
Chris Lattner	540886f0ae	Remove unneeded hook. Patch by Anton K. Thanks! llvm-svn: 28664	2006-06-02 19:11:46 +00:00
Chris Lattner	02e0b4ddb7	Force anything that #includes llvm/Transforms/Utils/UnifyFunctionExitNodes.h to link in the implementation. Thanks to Anton Korobeynikov for figuring out what was going on here. llvm-svn: 28660	2006-06-02 18:40:06 +00:00
Chris Lattner	cdf2b1fc30	Remove dead #include llvm-svn: 28642	2006-06-01 20:02:28 +00:00
Chris Lattner	cc340c02a4	Make the "pruning cloner" smarter. As it propagates constants through the code (while cloning) it often gets the branch/switch instructions. Since it knows that edges of the CFG are dead, it need not clone (or even look) at the obviously dead blocks. This should speed up the inliner substantially on code where there are lots of inlinable calls to functions with constant arguments. On C++ code in particular, this kicks in. llvm-svn: 28641	2006-06-01 19:19:23 +00:00
Chris Lattner	f905a7b994	Silence a -pedantic warning. llvm-svn: 28632	2006-06-01 17:16:21 +00:00
Owen Anderson	619e4ba57f	Remove a FIXME that was fixed with my last patch. llvm-svn: 28619	2006-06-01 06:07:40 +00:00
Owen Anderson	cd76fa04a1	More cleanups. Also, add a special case for updating PHI nodes, and reimplement getValueDominatingFunction to walk the DominanceTree rather than just searching blindly. llvm-svn: 28618	2006-06-01 06:05:47 +00:00
Chris Lattner	1df0e98ac2	Swap the order of operands created here. For +&|^, the order doesn't matter, but for sub, it really does! This fixes a miscompilation of fibheap_cut in llvmgcc4. llvm-svn: 28600	2006-05-31 21:14:00 +00:00
Owen Anderson	dad8c57340	Extract a huge loop into a helper method. Fix a few iterator-invalidation bugs. llvm-svn: 28599	2006-05-31 20:55:06 +00:00
Owen Anderson	8a8f278f15	Add Use replacement. Assuming there is nothing horribly wrong with this, LCSSA is now theoretically feature-complete. It has not, however, been thoroughly tested, and is still considered experimental. llvm-svn: 28529	2006-05-29 01:00:00 +00:00
Owen Anderson	152d063ccb	Major think-o. Iterate over all live out-of-loop values, and perform the other calculations on each individually, rather than trying to delay it and do them all at the end. llvm-svn: 28527	2006-05-28 19:33:28 +00:00
Owen Anderson	1310e42803	Make LCSSA insert proper Phi nodes throughout the rest of the CFG by computing the iterated Dominance Frontier of the loop-closure Phi's. This is the second phase of the LCSSA pass. The third phase (coming soon) will be to update all uses of loop variables to use the loop-closure Phi's instead. llvm-svn: 28524	2006-05-27 18:47:11 +00:00
Chris Lattner	67c424e010	Fix some regression from the inliner patch I committed last night. This fixes ldecod, lencod, and SPASS. llvm-svn: 28523	2006-05-27 17:28:13 +00:00
Chris Lattner	be853d77e9	Switch the inliner over to using CloneAndPruneFunctionInto. This effectively makes it so that it constant folds instructions on the fly. This is good for several reasons: 0. Many instructions are constant foldable after inlining, particularly if inlining a call with constant arguments. 1. Without this, the inliner has to allocate memory for all of the instructions that can be constant folded, then a subsequent pass has to delete them. This gets the job done without this extra work. 2. This makes the inliner pass a bit more aggressive: in particular, it partially solves a phase order issue where the inliner would inline lots of code that folds away to nothing, but think that the resultant function is big because of this code that will be gone. Now the code never exists. This is the first part of a 2-step process. The second part will be smart enough to see when this implicit constant folding propagates a constant into a branch or switch instruction, making CFG edges dead. This implements Transforms/Inline/inline_constprop.ll llvm-svn: 28521	2006-05-27 01:28:04 +00:00
Chris Lattner	3df13f4f22	Implement a new method, CloneAndPruneFunctionInto, as documented. llvm-svn: 28519	2006-05-27 01:22:24 +00:00
Chris Lattner	bc3c879fcf	Refactor some code to expose an interface to constant fold an instruction given its opcode, type and operands. llvm-svn: 28517	2006-05-27 01:18:04 +00:00
Owen Anderson	b4e16996f1	A few small clean-ups, and the addition of an LCSSA statistic. llvm-svn: 28512	2006-05-27 00:31:37 +00:00
Owen Anderson	6e047ab8fc	Fix a copy-and-paste-o that would break some compilers. llvm-svn: 28507	2006-05-26 21:19:17 +00:00
Owen Anderson	f3dd3e2bfd	Clean up and refactor LCSSA a bunch. It should also run faster now, though there's still a lot of work to be done on it. llvm-svn: 28506	2006-05-26 21:11:53 +00:00
Chris Lattner	dab43b2b0e	Implement Transforms/InstCombine/store.ll:test2. llvm-svn: 28503	2006-05-26 19:19:20 +00:00
Owen Anderson	8eca8910b6	Skeletal LCSSA pass. This is currently non-functional. Expect functionality and documentation updates soon. llvm-svn: 28495	2006-05-26 13:58:26 +00:00
Chris Lattner	0e47716e69	Transform things like (splat(splat)) -> splat llvm-svn: 28490	2006-05-26 00:29:06 +00:00
Chris Lattner	12249be286	Introduce a helper function that simplifies interpretation of shuffle masks. No functionality change. llvm-svn: 28489	2006-05-25 23:48:38 +00:00
Chris Lattner	99155be33f	Turn (cast (shuffle (cast)) -> shuffle (cast) if it reduces the # casts in the program. This exposes more opportunities for the instcombiner, and implements vec_shuffle.ll:test6 llvm-svn: 28487	2006-05-25 23:24:33 +00:00
Chris Lattner	83f6578b0c	An extractelement from a shuffle vector can be trivially turned into an extractelement from the SV's source. This implements vec_shuffle.ll:test[45] llvm-svn: 28485	2006-05-25 22:53:38 +00:00
Chris Lattner	0853700582	Revert a patch that is unsafe, due to out of range array accesses in inner array scopes possibly accessing valid memory in outer subscripts. llvm-svn: 28478	2006-05-25 21:25:12 +00:00
Chris Lattner	a643d528bd	Patch for a new instcombine xform, patch contributed by Nick Lewycky! This implements Transforms/InstCombine/2006-05-10-InvalidIndexUndef.ll llvm-svn: 28450	2006-05-24 17:34:30 +00:00
Chris Lattner	aa2372562e	Patches to make the LLVM sources more -pedantic clean. Patch provided by Anton Korobeynikov! This is a step towards closing PR786. llvm-svn: 28447	2006-05-24 17:04:05 +00:00
Chris Lattner	d0622b6894	Silence a bogus gcc warning llvm-svn: 28422	2006-05-20 23:14:03 +00:00
Reid Spencer	2452c94df4	Fix a doxygen problem and break lines at 80 columns llvm-svn: 28395	2006-05-19 19:09:46 +00:00
Chris Lattner	e4cb4768fa	Declare that lowerinvoke doesn't interact with other lowering passes. Patch written by Domagoj Babic! llvm-svn: 28367	2006-05-17 21:05:27 +00:00
Chris Lattner	2e266807c3	Add a CloneModule call that exposes the mapping of values from the old module to the new module. Patch provided by Nick Lewycky! llvm-svn: 28349	2006-05-17 18:05:35 +00:00
Chris Lattner	35515557c7	remove some dead code identified by coverity llvm-svn: 28289	2006-05-14 18:45:44 +00:00
Chris Lattner	3237da073e	remove dead variables llvm-svn: 28286	2006-05-14 18:33:57 +00:00
Evan Cheng	18d0438148	Backing out last check-in for now. It's causing an infinite loop gccas lencode. llvm-svn: 28284	2006-05-14 06:46:03 +00:00
Chris Lattner	3987a8532d	Add/Sub/Mul are safe to promote here as well. Incrementing a single-bit bitfield now gives this code: _plus: lwz r2, 0(r3) rlwimi r2, r2, 0, 1, 31 xoris r2, r2, 32768 stw r2, 0(r3) blr instead of this: _plus: lwz r2, 0(r3) srwi r4, r2, 31 slwi r4, r4, 31 addis r4, r4, -32768 rlwimi r2, r4, 0, 0, 0 stw r2, 0(r3) blr this can obviously still be improved. llvm-svn: 28275	2006-05-13 02:16:08 +00:00
Chris Lattner	1ebbe6a22e	Implement simple promotion for cast elimination in instcombine. This is currently very limited, but can be extended in the future. For example, we now compile: uint %test30(uint %c1) { %c2 = cast uint %c1 to ubyte %c3 = xor ubyte %c2, 1 %c4 = cast ubyte %c3 to uint ret uint %c4 } to: _xor: movzbl 4(%esp), %eax xorl $1, %eax ret instead of: _xor: movb $1, %al xorb 4(%esp), %al movzbl %al, %eax ret More impressively, we now compile: struct B { unsigned bit : 1; }; void xor(struct B *b) { b->bit = b->bit ^ 1; } To (X86/PPC): _xor: movl 4(%esp), %eax xorl $-2147483648, (%eax) ret _xor: lwz r2, 0(r3) xoris r2, r2, 32768 stw r2, 0(r3) blr instead of (X86/PPC): _xor: movl 4(%esp), %eax movl (%eax), %ecx movl %ecx, %edx shrl $31, %edx # TRUNCATE movb %dl, %dl xorb $1, %dl movzbl %dl, %edx andl $2147483647, %ecx shll $31, %edx orl %ecx, %edx movl %edx, (%eax) ret _xor: lwz r2, 0(r3) srwi r4, r2, 31 xori r4, r4, 1 rlwimi r2, r4, 31, 0, 0 stw r2, 0(r3) blr This implements InstCombine/cast.ll:test30. llvm-svn: 28273	2006-05-13 02:06:03 +00:00
Chris Lattner	cd60d38b30	Remove some dead variables. Fix a nasty bug in the memcmp optimizer where we used the wrong variable! llvm-svn: 28269	2006-05-12 23:35:26 +00:00
Chris Lattner	94acc47654	Remove dead stuff llvm-svn: 28268	2006-05-12 23:32:01 +00:00
Chris Lattner	1443bc52be	Refactor some code, making it simpler. When doing the initial pass of constant folding, if we get a constantexpr, simplify the constant expr like we would do if the constant is folded in the normal loop. This fixes the missed-optimization regression in Transforms/InstCombine/getelementptr.ll last night. llvm-svn: 28224	2006-05-11 17:11:52 +00:00
Chris Lattner	a36ee4ea34	Two changes: 1. Implement InstCombine/deadcode.ll by not adding instructions in unreachable blocks (due to constants in conditional branches/switches) to the worklist. This causes them to be deleted before instcombine starts up, leading to better optimization. 2. In the prepass over instructions, do trivial constprop/dce as we go. This has the effect of improving the effectiveness of #1. In addition, it significantly speeds up instcombine on test cases with large amounts of constant folding code (for example, that produced by code specialization or partial evaluation). In one example, it speeds up instcombine from 0.0589s to 0.0224s with a release build (a 2.6x speedup). llvm-svn: 28215	2006-05-10 19:00:36 +00:00
Chris Lattner	4fe87d67c4	Patch to make some xforms preserve each other. Patch contributed by Domagoj Babic! llvm-svn: 28181	2006-05-09 04:13:41 +00:00
Chris Lattner	1d441adfbf	Move some code around. Make the "fold (and (cast A), (cast B)) -> (cast (and A, B))" transformation only apply when both casts really will cause code to be generated. If one or both doesn't, then this xform doesn't remove a cast. This fixes Transforms/InstCombine/2006-05-06-Infloop.ll llvm-svn: 28141	2006-05-06 09:00:16 +00:00
Chris Lattner	e745c7de0e	Fix an infinite loop compiling oggenc last night. llvm-svn: 28128	2006-05-05 20:51:30 +00:00
Chris Lattner	3af1053488	Implement InstCombine/cast.ll:test29 llvm-svn: 28126	2006-05-05 06:39:07 +00:00
Chris Lattner	fb29692055	Fix Transforms/InstCombine/2006-05-04-DemandedBitCrash.ll llvm-svn: 28101	2006-05-04 17:33:35 +00:00
Chris Lattner	2d3a02725d	Add pass ID's for various passes, so they can be AddRequiredID. Patch by Domagoj Babic! llvm-svn: 28048	2006-05-02 04:24:36 +00:00
Chris Lattner	655d08fda8	Fix InstCombine/2006-04-28-ShiftShiftLongLong.ll llvm-svn: 28019	2006-04-28 22:21:41 +00:00
Chris Lattner	e63d808b6e	Fix Transforms/Reassociate/2006-04-27-ReassociateVector.ll llvm-svn: 28007	2006-04-28 04:14:49 +00:00
Chris Lattner	b6cb64b7e6	Add support for inserting undef into a vector. This implements Transforms/InstCombine/vec_insert_to_shuffle.ll llvm-svn: 27997	2006-04-27 21:14:21 +00:00
Chris Lattner	f98b4aa2e7	Fix some nondeterministic behavior in the mem2reg pass that (in addition to nondeterminism being bad) could cause some trivial missed optimizations (dead phi nodes being left around for later passes to clean up). With this, llvm-gcc4 now bootstraps and correctly compares. I don't know why I never tried to do it before... :) llvm-svn: 27984	2006-04-27 01:14:43 +00:00
Chris Lattner	dae49df407	Fix Transforms/ScalarRepl/2006-04-20-PromoteCrash.ll llvm-svn: 27912	2006-04-20 20:48:50 +00:00
Andrew Lenharth	f89e630b2f	Make code match cvs commit message :) llvm-svn: 27881	2006-04-20 15:41:37 +00:00
Andrew Lenharth	61eae29ad6	If we can convert the return pointer type into an integer that IntPtrType can be converted to losslessly, we can continue the conversion to a direct call. llvm-svn: 27880	2006-04-20 14:56:47 +00:00
Chris Lattner	36dd7c98d1	Turn x86 unaligned load/store intrinsics into aligned load/store instructions if the pointer is known aligned. llvm-svn: 27781	2006-04-17 22:26:56 +00:00
Chris Lattner	9095186deb	Fix a bug in the 'shuffle(undef,x,mask) -> shuffle(x, undef,mask')' xform Make the insert/extract elt -> shuffle code more aggressive. This fixes CodeGen/PowerPC/vec_shuffle.ll llvm-svn: 27728	2006-04-16 00:51:47 +00:00
Chris Lattner	34cebe785d	Canonicalize shuffle(undef,x,mask) -> shuffle(x, undef,mask'). llvm-svn: 27727	2006-04-16 00:03:56 +00:00
Chris Lattner	39fac448d6	significant cleanups to code that uses insert/extractelt heavily. This builds maximal shuffles out of them where possible. llvm-svn: 27717	2006-04-15 01:39:45 +00:00
Chris Lattner	3323ce165d	Teach scalarrepl to promote unions of vectors and floats, producing insert/extractelement operations. This implements Transforms/ScalarRepl/vector_promote.ll llvm-svn: 27710	2006-04-14 21:42:41 +00:00
Andrew Lenharth	92cf71f6d7	linear -> constant time llvm-svn: 27652	2006-04-13 13:43:31 +00:00
Reid Spencer	13a1a7a4a6	Get rid of a signed/unsigned compare warning. llvm-svn: 27625	2006-04-12 19:28:15 +00:00
Chris Lattner	b19a5c661b	Turn casts into getelementptr's when possible. This enables SROA to be more aggressive in some cases where LLVMGCC 4 is inserting casts for no reason. This implements InstCombine/cast.ll:test27/28. llvm-svn: 27620	2006-04-12 18:09:35 +00:00
Chris Lattner	2d37f920ad	Implement vec_shuffle.ll:test3 llvm-svn: 27573	2006-04-10 23:06:36 +00:00
Chris Lattner	fbb77a408b	Implement InstCombine/vec_shuffle.ll:test[12] llvm-svn: 27571	2006-04-10 22:45:52 +00:00
Andrew Lenharth	a9cdcca3c3	Add a simple pass to make sure that all (non-library) calls to malloc and free are visible to analysis as intrinsics. That is, make sure someone doesn't pass free around by address in some struct (as happens in say 176.gcc). This doesn't get rid of any indirect calls, just ensure calls to free and malloc are always direct. llvm-svn: 27560	2006-04-10 19:26:09 +00:00
Chris Lattner	17bd60588c	Add support for shufflevector llvm-svn: 27513	2006-04-08 01:19:12 +00:00
Chris Lattner	8ec0205de4	Fix inlining of insert/extract element constantexprs llvm-svn: 27478	2006-04-07 04:41:03 +00:00
Chris Lattner	e79d249c29	Lower vperm(x,y, mask) -> shuffle(x,y,mask) if mask is constant. This allows us to compile oh-so-realistic stuff like this: vec_vperm(A, B, (vector unsigned char){14}); to: vspltb v0, v0, 14 instead of: vspltisb v0, 14 vperm v0, v2, v1, v0 llvm-svn: 27452	2006-04-06 19:19:17 +00:00
Chris Lattner	caba72b6ff	vector casts of casts are eliminable. Transform this: %tmp = cast <4 x uint> %tmp to <4 x int> ; <<4 x int>> [#uses=1] %tmp = cast <4 x int> %tmp to <4 x float> ; <<4 x float>> [#uses=1] into: %tmp = cast <4 x uint> %tmp to <4 x float> ; <<4 x float>> [#uses=1] llvm-svn: 27355	2006-04-02 05:43:13 +00:00
Chris Lattner	ebca476b27	Allow transforming this: %tmp = cast <4 x uint>* %testData to <4 x int>* ; <<4 x int>*> [#uses=1] %tmp = load <4 x int>* %tmp ; <<4 x int>> [#uses=1] to this: %tmp = load <4 x uint>* %testData ; <<4 x uint>> [#uses=1] %tmp = cast <4 x uint> %tmp to <4 x int> ; <<4 x int>> [#uses=1] llvm-svn: 27353	2006-04-02 05:37:12 +00:00
Chris Lattner	f42d0aeda1	Turn altivec lvx/stvx intrinsics into loads and stores. This allows the elimination of one load from this: int AreSecondAndThirdElementsBothNegative( vector float *in ) { #define QNaN 0x7FC00000 const vector unsigned int testData = (vector unsigned int)( QNaN, 0, 0, QNaN ); vector float test = vec_ld( 0, (float*) &testData ); return ! vec_any_ge( test, *in ); } Now generating: _AreSecondAndThirdElementsBothNegative: mfspr r2, 256 oris r4, r2, 49152 mtspr 256, r4 li r4, lo16(LCPI1_0) lis r5, ha16(LCPI1_0) addi r6, r1, -16 lvx v0, r5, r4 stvx v0, 0, r6 lvx v1, 0, r3 vcmpgefp. v0, v0, v1 mfcr r3, 2 rlwinm r3, r3, 27, 31, 31 xori r3, r3, 1 cntlzw r3, r3 srwi r3, r3, 5 mtspr 256, r2 blr llvm-svn: 27352	2006-04-02 05:30:25 +00:00
Chris Lattner	70ec96fa32	Adjust to change in Intrinsics.gen interface. llvm-svn: 27344	2006-04-02 03:35:01 +00:00
Chris Lattner	1b2436a624	add valuemapper support for inline asm llvm-svn: 27332	2006-04-01 23:17:11 +00:00
Chris Lattner	6cf4914fd4	Fix InstCombine/2006-04-01-InfLoop.ll llvm-svn: 27330	2006-04-01 22:05:01 +00:00
Chris Lattner	dcd0792622	Fold A^(B&A) -> (B&A)^A Fold (B&A)^A == ~B & A This implements InstCombine/xor.ll:test2[56] llvm-svn: 27328	2006-04-01 08:03:55 +00:00
Chris Lattner	8d1d8d364c	If we can look through vector operations to find the scalar version of an extract_element'd value, do so. llvm-svn: 27323	2006-03-31 23:01:56 +00:00
Chris Lattner	92346c315e	extractelement(undef,x) -> undef llvm-svn: 27300	2006-03-31 18:25:14 +00:00
Chris Lattner	612fa8e6f3	Fix Transforms/InstCombine/2006-03-30-ExtractElement.ll llvm-svn: 27261	2006-03-30 22:02:40 +00:00
Chris Lattner	42e0ba09aa	teach the inliner to work with packed constants llvm-svn: 27161	2006-03-27 05:50:18 +00:00
Chris Lattner	d70d9f5b24	Don't crash on packed logical ops llvm-svn: 27125	2006-03-25 21:58:26 +00:00
Chris Lattner	f365f5f0c1	Fix spello llvm-svn: 27052	2006-03-24 07:14:34 +00:00
Chris Lattner	5821a6a17a	add the actual cost to the debug info llvm-svn: 27051	2006-03-24 07:14:00 +00:00
Jim Laskey	8f64426f5c	Strip changes to llvm.dbg intrinsics. llvm-svn: 26993	2006-03-23 18:11:33 +00:00
Jim Laskey	83f99115db	Can't combine anymore - we don't have a chain through llvm.dbg intrinsics. llvm-svn: 26992	2006-03-23 18:10:42 +00:00
Chris Lattner	7d80b4f366	silence a bogus gcc warning llvm-svn: 26953	2006-03-22 17:27:24 +00:00
Chris Lattner	d783c76c18	Teach cee to propagate through switch statements. This implements Transforms/CorrelatedExprs/switch.ll Patch contributed by Eric Kidd! llvm-svn: 26872	2006-03-19 19:37:24 +00:00
Evan Cheng	c28282bd87	- Fixed a bogus if condition. - Added more debugging info. - Allow reuse of IV of negative stride. e.g. -4 stride == 2 * iv of -2 stride. llvm-svn: 26841	2006-03-18 08:03:12 +00:00
Evan Cheng	f09f0ebd48	Sort StrideOrder so we can process the smallest strides first. This allows for more IV reuses. llvm-svn: 26837	2006-03-18 00:44:49 +00:00
Evan Cheng	4520698820	Allow users of iv / stride to be rewritten with an expression that is a multiple of a smaller stride even if they have a common loop invariant expression part. llvm-svn: 26828	2006-03-17 19:52:23 +00:00
Evan Cheng	3df447d354	For each loop, keep track of all the IV expressions inserted indexed by stride. For a set of uses of the IV of a stride which is a multiple of another stride, do not insert a new IV expression. Rather, reuse the previous IV and rewrite the uses as uses of IV expression multiplied by the factor. e.g. x = 0 ...; x ++ y = 0 ...; y += 4 then use of y can be rewritten as use of 4*x for x86. llvm-svn: 26803	2006-03-16 21:53:05 +00:00
Chris Lattner	6d6084fd04	Teach the strip pass to strip type names in addition to value names. This is fallout from the type/value split in the symtab long long ago :) llvm-svn: 26785	2006-03-15 19:22:41 +00:00
Chris Lattner	c5f866bb4a	Implement a FIXME, recursively reassociating A*A*B + A*A*C --> A*(A*B+A*C) --> A*(A*(B+C)) This implements Reassociate/mul-factor3.ll llvm-svn: 26757	2006-03-14 16:04:29 +00:00
Chris Lattner	2fc319d444	extract some code into a method, no functionality change llvm-svn: 26755	2006-03-14 07:11:11 +00:00
Chris Lattner	d6bde46d85	Promote shifts by a constant to multiplies so that we can reassociate (x<<1)+(y<<1) -> (X+Y)<<1. This implements Transforms/Reassociate/shift-factor.ll llvm-svn: 26753	2006-03-14 06:55:18 +00:00
Evan Cheng	c567c4efbb	Added target lowering hooks which LSR consults to make more intelligent transformation decisions. llvm-svn: 26738	2006-03-13 23:14:23 +00:00
Jim Laskey	acb6e34277	Handle the removal of the debug chain. llvm-svn: 26729	2006-03-13 13:07:37 +00:00
Chris Lattner	60f6833376	use autogenerated side-effect information llvm-svn: 26673	2006-03-09 22:38:10 +00:00
Chris Lattner	6b7847a5bc	fix a pasto llvm-svn: 26627	2006-03-09 06:09:41 +00:00
Chris Lattner	fc34f8bb48	Fix a miscompilation of 188.ammp with the new CFE. 188.ammp is accessing arrays out of range in a horrible way, but we shouldn't break it anyway. Details in the comments. llvm-svn: 26606	2006-03-08 01:05:29 +00:00
Jim Laskey	69effa2325	Switch to using a numeric id for anchors. llvm-svn: 26598	2006-03-07 20:53:47 +00:00
Chris Lattner	7b87fd53f9	Fix ConstantMerge/2006-03-07-DontMergeDiffSections.ll, a problem Jim hypotheticalized about, where we would incorrectly merge two globals in different sections. llvm-svn: 26597	2006-03-07 17:56:59 +00:00
Chris Lattner	53ef5a032c	Teach the alignment handling code to look through constant expr casts and GEPs llvm-svn: 26580	2006-03-07 01:28:57 +00:00
Chris Lattner	82f2ef20b6	Teach instcombine to increase the alignment of memset/memcpy/memmove when the pointer is known to come from either a global variable, alloca or malloc. This allows us to compile this: P = malloc(28); memset(P, 0, 28); into explicit stores on PPC instead of a memset call. llvm-svn: 26577	2006-03-06 20:18:44 +00:00
Chris Lattner	6bc98653c2	Make vector narrowing more effective, implementing Transforms/InstCombine/vec_narrow.ll. This adds support for narrowing extract_element(insertelement) also. llvm-svn: 26538	2006-03-05 00:22:33 +00:00
Chris Lattner	4c065091d8	Add factoring of multiplications, e.g. turning A*A+A*B into A*(A+B). Testcase here: Transforms/Reassociate/mulfactor.ll llvm-svn: 26524	2006-03-04 09:31:13 +00:00
Chris Lattner	32c01df299	Canonicalize (X+C1)*C2 -> X*C2+C1*C2 This implements Transforms/InstCombine/add.ll:test31 llvm-svn: 26519	2006-03-04 06:04:02 +00:00
Chris Lattner	681ef2f083	Change this to work with renamed intrinsics. llvm-svn: 26484	2006-03-03 01:34:17 +00:00
Chris Lattner	ea7986aeca	Make this work with renamed intrinsics. llvm-svn: 26482	2006-03-03 01:30:23 +00:00
Chris Lattner	85dda9a2bd	Generalize the REM folding code to handle another case Nick Lewycky pointed out: realize the AND can provide factors and look through Casts. llvm-svn: 26469	2006-03-02 06:50:58 +00:00
Chris Lattner	c5b6c9a12a	Fix a regression in a patch from a couple of days ago. This fixes Transforms/InstCombine/2006-02-28-Crash.ll llvm-svn: 26427	2006-02-28 19:47:20 +00:00
Chris Lattner	b70f141893	Implement rem.ll:test[7-9] and PR712 llvm-svn: 26415	2006-02-28 05:49:21 +00:00
Chris Lattner	2a7c7b8bab	Simplify some code now that the RHS of a rem can't be 0 llvm-svn: 26413	2006-02-28 05:40:55 +00:00
Chris Lattner	0de4a8d7b7	Rearrange some code, fold "rem X, 0", implementing rem.ll:test6 llvm-svn: 26411	2006-02-28 05:30:45 +00:00
Chris Lattner	c7bfed0f7b	Merge two almost-identical pieces of code. Make this code more powerful by using ComputeMaskedBits instead of looking for an AND operand. This lets us fold this: int %test23(int %a) { %tmp.1 = and int %a, 1 %tmp.2 = seteq int %tmp.1, 0 %tmp.3 = cast bool %tmp.2 to int ;; xor tmp1, 1 ret int %tmp.3 } into: xor (and a, 1), 1 llvm-svn: 26396	2006-02-27 02:38:23 +00:00
Chris Lattner	f5c8a0b83f	Fold (A^B) == A -> B == 0 and (A-B) == A -> B == 0 llvm-svn: 26394	2006-02-27 01:44:11 +00:00
Chris Lattner	f78df7c14d	Fold (X|C1)^C2 -> X^(C1|C2) when possible. This implements InstCombine/or.ll:test23. llvm-svn: 26385	2006-02-26 19:57:54 +00:00
Chris Lattner	b580d26e7d	Fix a problem that Nate noticed that boils down to an over conservative check in the code that does "select C, (X+Y), (X-Y) --> (X+(select C, Y, (-Y)))". We now compile this loop: LBB1_1: ; no_exit add r6, r2, r3 subf r3, r2, r3 cmpwi cr0, r2, 0 addi r7, r5, 4 lwz r2, 0(r5) addi r4, r4, 1 blt cr0, LBB1_4 ; no_exit LBB1_3: ; no_exit mr r3, r6 LBB1_4: ; no_exit cmpwi cr0, r4, 16 mr r5, r7 bne cr0, LBB1_1 ; no_exit into this instead: LBB1_1: ; no_exit srawi r6, r2, 31 add r2, r2, r6 xor r6, r2, r6 addi r7, r5, 4 lwz r2, 0(r5) addi r4, r4, 1 add r3, r3, r6 cmpwi cr0, r4, 16 mr r5, r7 bne cr0, LBB1_1 ; no_exit llvm-svn: 26356	2006-02-24 18:05:58 +00:00
Chris Lattner	e5521db5bc	Fix Regression/Transforms/LoopUnswitch/2006-02-22-UnswitchCrash.ll, which caused SPASS to fail building last night. We can't trivially unswitch a loop if the exit block has phi nodes in it, because we don't know which predecessor to use. llvm-svn: 26320	2006-02-22 23:55:00 +00:00
Chris Lattner	8a5a324dac	Add some comments, simplify some code, and fix a bug that caused rewriting to rewrite with the wrong value. llvm-svn: 26311	2006-02-22 06:37:14 +00:00
Chris Lattner	c2e3a7a4ce	improved support for branch folding, still not enabled. llvm-svn: 26289	2006-02-18 07:57:38 +00:00
Jeff Cohen	0add83e969	Fix bugs identified by VC++. llvm-svn: 26287	2006-02-18 03:20:33 +00:00
Chris Lattner	19fa8ac938	Implement deletion of dead blocks, currently disabled. llvm-svn: 26285	2006-02-18 02:42:34 +00:00
Chris Lattner	cb853de534	a previous patch completely disabled trivial unswitching, this fixes it. Thanks to nate for pointing this out :) llvm-svn: 26280	2006-02-18 01:32:04 +00:00
Chris Lattner	29f771ba21	initial trivial support for folding branches that have now-constant destinations. llvm-svn: 26279	2006-02-18 01:27:45 +00:00
Chris Lattner	8e44ff50b0	When unswitching a loop, make sure to update loop info with exit blocks in the right loop. llvm-svn: 26277	2006-02-18 00:55:32 +00:00
Chris Lattner	d95665188b	Fix Transforms/SimplifyCFG/2006-02-17-InfiniteUnroll.ll llvm-svn: 26275	2006-02-18 00:33:17 +00:00
Chris Lattner	baddba41c7	Fix loops where the header has an exit, fixing a loop-unswitch crash on crafty llvm-svn: 26258	2006-02-17 06:39:56 +00:00
Chris Lattner	6fd136239b	start of some new simplification code, not thoroughly tested, use at your own risk :) llvm-svn: 26248	2006-02-17 00:31:07 +00:00
Nate Begeman	8a77efe4f7	Rework the SelectionDAG-based implementations of SimplifyDemandedBits and ComputeMaskedBits to match the new improved versions in instcombine. Tested against all of multisource/benchmarks on ppc. llvm-svn: 26238	2006-02-16 21:11:51 +00:00
Chris Lattner	fa335f6083	Change SplitBlock to increment a BasicBlock::iterator, not an Instruction*. Apparently they do different things :) This fixes a testcase that nate reduced from spass. Also included are a couple minor code changes that don't affect the generated code at all. llvm-svn: 26235	2006-02-16 19:36:22 +00:00
Jeff Cohen	55f63f1b53	Fix VC++ warning. llvm-svn: 26228	2006-02-16 04:07:37 +00:00
Chris Lattner	ff42e81028	fix a bug where we unswitched the wrong way llvm-svn: 26225	2006-02-16 01:24:41 +00:00
Chris Lattner	fdff0bb43e	Implement trivial unswitching for switch stmts. This allows us to trivial unswitch this loop on 2 before sweating to unswitch on 1/3. void test4(int N, int i, int C, int *P, int *Q) { int j; for (j = 0; j < N; ++j) { switch (C) { // general unswitching. default: P[i+j] = 0; break; case 1: Q[i+j] = 0; break; case 3: P[i+j] = Q[i+j]; break; case 2: break; // TRIVIAL UNSWITCH on C==2 } } } llvm-svn: 26223	2006-02-15 22:52:05 +00:00
Chris Lattner	e5cb76d744	make "trivial" unswitching significantly more general. It can now handle this for example: for (j = 0; j < N; ++j) { // trivial unswitch if (C) P[i+j] = 0; } turning it into the obvious code without bothering to duplicate an empty loop. llvm-svn: 26220	2006-02-15 22:03:36 +00:00
Andrew Lenharth	47da60130a	fix a bunch of alpha regressions. see bug 709 llvm-svn: 26218	2006-02-15 21:13:37 +00:00
Chris Lattner	65152d80ec	Checking the wrong value. This caused us to emit silly code like Y = seteq bool X, true instead of just using X :) llvm-svn: 26215	2006-02-15 19:05:52 +00:00
Chris Lattner	01db04efb0	more refactoring, no functionality change. llvm-svn: 26194	2006-02-15 01:44:42 +00:00
Chris Lattner	b0cbe7106e	pull some code out into a function llvm-svn: 26191	2006-02-15 00:07:43 +00:00
Chris Lattner	9c5693fb2a	Canonicalize inner loops before outer loops. Inner loop canonicalization can provide work for the outer loop to canonicalize. This fixes a case that breaks unswitching. llvm-svn: 26189	2006-02-14 23:06:02 +00:00
Chris Lattner	cffbbee8d1	When splitting exit edges to canonicalize loops, make sure to put the new block in the appropriate loop nest. Third time is the charm, right? llvm-svn: 26187	2006-02-14 22:34:08 +00:00
Chris Lattner	0b8ec1a132	Use statistics to keep track of what flavors of loops we are unswitching llvm-svn: 26157	2006-02-14 01:01:41 +00:00
Chris Lattner	8b10ab3002	Implement Instcombine/and.ll:test34 llvm-svn: 26155	2006-02-13 23:07:23 +00:00
Chris Lattner	7d8522884b	If any of the sign extended bits are demanded, the input sign bit is demanded for a sign extension. This fixes InstCombine/2006-02-13-DemandedMiscompile.ll and Ptrdist/bc. llvm-svn: 26152	2006-02-13 22:41:07 +00:00
Chris Lattner	68e7475777	Be careful not to request or look at bits shifted in from outside the size of the input. This fixes the mediabench/gsm/toast failure last night. llvm-svn: 26138	2006-02-13 06:09:08 +00:00
Chris Lattner	f5b4ef7f58	remove some more dead special case code llvm-svn: 26135	2006-02-12 08:07:37 +00:00
Chris Lattner	5b2edb1fca	Eliminate special case hacks that are superseded by general purpose hacks llvm-svn: 26134	2006-02-12 08:02:11 +00:00
Chris Lattner	ee0f280743	Three changes: 1. Teach GetConstantInType to handle boolean constants. 2. Teach instcombine to fold (compare X, CST) when X has known 0/1 bits. Testcase here: set.ll:test22 3. Improve the "(X >> c1) & C2 == 0" folding code to allow a noop cast between the shift and and. More aggressive bitfolding for other reasons was turning signed shr's into unsigned shr's, leaving the noop cast in the way. llvm-svn: 26131	2006-02-12 02:07:56 +00:00
Chris Lattner	02f53ad3a2	Revert my last patch. It too breaks stuff llvm-svn: 26128	2006-02-12 01:59:10 +00:00
Chris Lattner	35248e06bc	Fix for my previously reverted patch llvm-svn: 26126	2006-02-11 21:24:54 +00:00
Chris Lattner	0157e7f55b	Port the recent innovations in ComputeMaskedBits to SimplifyDemandedBits. This allows us to simplify on conditions where bits are not known, but they are not demanded either! This also fixes a couple of bugs in ComputeMaskedBits that were exposed during this work. In the future, swaths of instcombine should be removed, as this code subsumes a bunch of ad-hockery. llvm-svn: 26122	2006-02-11 09:31:47 +00:00
Chris Lattner	b24ce3a2a8	revert my previous change, it exposed other problems. llvm-svn: 26121	2006-02-11 08:47:47 +00:00
Chris Lattner	05bf90dddf	Make this check stricter. Disallow loop exit blocks from being shared by loops and their subloops. llvm-svn: 26118	2006-02-11 02:13:17 +00:00
Chris Lattner	a6ae101afa	remove dead expr llvm-svn: 26116	2006-02-11 01:43:37 +00:00
Chris Lattner	fbadd7e1ee	implement unswitching of loops with switch stmts and selects in them llvm-svn: 26114	2006-02-11 00:43:37 +00:00
Chris Lattner	f1b151684d	Update PHI nodes in successors of exit blocks. llvm-svn: 26113	2006-02-10 23:26:14 +00:00
Chris Lattner	fe4151efe7	Reform the unswitching code in terms of edge splitting, not block splitting. llvm-svn: 26112	2006-02-10 23:16:39 +00:00
Chris Lattner	ec6b40a093	Fix a case where UnswitchTrivialCondition broke critical edges with phi's in the successors llvm-svn: 26108	2006-02-10 19:08:15 +00:00
Chris Lattner	6e263155a6	add some notes, move some code around. Implement unswitching of loops with branches on partially invariant computations. llvm-svn: 26104	2006-02-10 02:30:37 +00:00
Chris Lattner	4935417a84	Move code around to be more logical, no functionality change. llvm-svn: 26103	2006-02-10 02:01:22 +00:00
Chris Lattner	3fc3148b85	When unswitching a trivial loop, do admit we are doing it! :) llvm-svn: 26102	2006-02-10 01:36:35 +00:00
Chris Lattner	ed7a67b0de	Implement unconditional unswitching of 'trivial' loops, those loops that contain branches in their entry block that control whether or not the loop is a noop or not. llvm-svn: 26101	2006-02-10 01:24:09 +00:00
Chris Lattner	4f0e66df6a	Simplify control flow a bit, note that unswitch preserves canonical loop form llvm-svn: 26098	2006-02-09 22:15:42 +00:00
Chris Lattner	8976219850	Make the threshold a parameter llvm-svn: 26093	2006-02-09 20:15:48 +00:00
Chris Lattner	2826e0511b	Simplify the loop-unswitch pass, by not even trying to unswitch loops with uses of loop values outside the loop. We need loop-closed SSA form to do this right, or to use SSA rewriting if we really care. llvm-svn: 26089	2006-02-09 19:14:52 +00:00
Chris Lattner	24cd2fa269	Fix 80-column violations llvm-svn: 26088	2006-02-09 07:41:14 +00:00
Chris Lattner	4534dd59a3	Enhance MVIZ in three ways: 1. Teach it new tricks: in particular how to propagate through signed shr and sexts. 2. Teach it to return a bitset of known-1 and known-0 bits, instead of just zero. 3. Teach instcombine (AND X, C) to fold when we know all C bits of X. This implements Regression/Transforms/InstCombine/bittest.ll, and allows future things to be simplified. llvm-svn: 26087	2006-02-09 07:38:58 +00:00
Chris Lattner	ab2dc4d70d	Simplify some code, reducing calls to MaskedValueIsZero. Implement a minor optimization where we reduce the number of bits in AND masks when possible. llvm-svn: 26056	2006-02-08 07:34:50 +00:00
Chris Lattner	5997cf9381	Use EraseInstFromFunction in a few cases to put the uses of the removed instruction onto the worklist (in case they are now dead). Add a really trivial local DSE implementation to help out bitfield code. We now fold this: struct S { unsigned char a : 1, b : 1, c : 1, d : 2, e : 3; S(); }; S::S() : a(0), b(0), c(1), d(0), e(6) {} to this: void %_ZN1SC1Ev(%struct.S* %this) { entry: %tmp.1 = getelementptr %struct.S* %this, int 0, uint 0 store ubyte 38, ubyte* %tmp.1 ret void } much earlier (in gccas instead of only in gccld after DSE runs). llvm-svn: 26050	2006-02-08 03:25:32 +00:00
Chris Lattner	06a0ed1ee0	Implement some more interesting select sccp cases. This implements: test/Regression/Transforms/SCCP/select.ll llvm-svn: 26049	2006-02-08 02:38:11 +00:00
Chris Lattner	ddba3289b5	Fix a problem in my patch yesterday, causing a miscompilation of 176.gcc llvm-svn: 26045	2006-02-08 01:20:23 +00:00
Chris Lattner	44314827d6	Fix Transforms/InstCombine/2006-02-07-SextZextCrash.ll llvm-svn: 26040	2006-02-07 19:07:40 +00:00
Chris Lattner	92a6865321	Generalize MaskedValueIsZero into a ComputeMaskedNonZeroBits function, which is just as efficient as MVIZ and is also more general. Fix a few minor bugs introduced in recent patches llvm-svn: 26036	2006-02-07 08:05:22 +00:00
Chris Lattner	c3ebf40031	Make MaskedValueIsZero take a uint64_t instead of a ConstantIntegral as a mask. This allows the code to be simpler and more efficient. Also, generalize some of the cases in MVIZ a bit, making it slightly more aggressive. llvm-svn: 26035	2006-02-07 07:27:52 +00:00
Chris Lattner	77defbae0a	Use Type::getIntegralTypeMask() to simplify some code llvm-svn: 26034	2006-02-07 07:00:41 +00:00
Chris Lattner	2590e511d8	Implement the beginnings of a facility for simplifying expressions based on 'demanded bits', inspired by Nate's work in the dag combiner. This isn't complete, but needs two unrelated instcombiner changes to continue. llvm-svn: 26033	2006-02-07 06:56:34 +00:00
Chris Lattner	2e90b732fa	Turn A % (C << N), where C is 2^k, into A & ((C << N)-1) [urem only]. Turn A / (C1 << N), where C1 is "1<<C2" into A >> (N+C2) [udiv only]. Tested with: rem.ll:test5, div.ll:test10 llvm-svn: 26003	2006-02-05 07:54:04 +00:00
Chris Lattner	d30c4991a1	Use SCEVExpander::InsertCastOfTo instead of our own code. This reduces #LLVM LOC, and auto-cse's cast instructions. llvm-svn: 25974	2006-02-04 09:52:43 +00:00
Chris Lattner	2959f0003e	Fix two significant bugs in LSR: 1. When rewriting code in outer loops, sometimes we would insert code into inner loops that is invariant in that loop. 2. Notice that 4*(2+x) is 8+4*x and use that to simplify expressions. This is a performance neutral change. llvm-svn: 25964	2006-02-04 07:36:50 +00:00
Jeff Cohen	15a8c15a1f	Improve compatibility with VC2005, patch by Morten Ofstad! llvm-svn: 25661	2006-01-26 20:41:32 +00:00
Chris Lattner	120f31b1fd	teach the cloner to handle inline asms llvm-svn: 25633	2006-01-26 01:55:22 +00:00
Chris Lattner	c0f633a598	Fix Regression/Transforms/ScalarRepl/2006-01-24-IllegalUnionPromoteCrash.ll llvm-svn: 25587	2006-01-24 19:36:27 +00:00
Chris Lattner	00fcdfef0d	rename method llvm-svn: 25572	2006-01-24 04:16:34 +00:00
Chris Lattner	37992b34c2	When cloning a module, clone the inline asm. llvm-svn: 25559	2006-01-23 23:06:28 +00:00
Chris Lattner	5774040c09	add a bunch more optimizations for unary double math functions llvm-svn: 25530	2006-01-23 06:24:46 +00:00
Chris Lattner	57a2863cbb	Refactor/genericize this, no functionality change llvm-svn: 25525	2006-01-23 05:57:36 +00:00
Chris Lattner	c597b8a55e	Make iostream #inclusion explicit llvm-svn: 25514	2006-01-22 23:32:06 +00:00
Chris Lattner	33081b4648	Make this more efficient in the following ways: 1. Do not statically construct a map when the program starts up, this is expensive and cannot be optimized. Instead, create a list. 2. Do not insert entries for all function in the module into a hashmap that lives the full life of the compiler. llvm-svn: 25512	2006-01-22 23:10:26 +00:00
Chris Lattner	469640e506	Add explicit #includes of <iostream> llvm-svn: 25509	2006-01-22 22:53:01 +00:00
Chris Lattner	0d4ebfc15b	Several non-functionality changing changes: 1. Use the varargs version of getOrInsertFunction to simplify code. 2. remove #include 3. Reduce the number of #ifdef's. 4. remove extraneous vertical whitespace. llvm-svn: 25508	2006-01-22 22:35:08 +00:00
Robert Bocchino	027c18da98	ConstantFoldLoadThroughGEPConstantExpr wasn't handling pointers to packed types correctly. llvm-svn: 25470	2006-01-19 23:53:23 +00:00
Reid Spencer	ade182125f	For PR696: Don't do floor->floorf conversion if floorf is not available. This checks the compiler's host, not its target, which is incorrect for cross-compilers Not sure that's important as we don't build many cross-compilers. llvm-svn: 25456	2006-01-19 08:36:56 +00:00
Chris Lattner	e154abf9b3	Implement casts.ll:test26: a cast from float -> double -> integer, doesn't need the float->double part. llvm-svn: 25452	2006-01-19 07:40:22 +00:00
Chris Lattner	7be2203c9f	If not internalizing, don't mark llvm.global[cd]tors const, as a fix for a hypothetical future boog. llvm-svn: 25430	2006-01-19 00:46:54 +00:00
Chris Lattner	d693b7943a	Don't internalize llvm.global[cd]tor unless there are uses of it. This unbreaks front-ends that don't use __main (like the new CFE). llvm-svn: 25429	2006-01-19 00:40:39 +00:00
Chris Lattner	b98282d2d6	Make sure that cloning a module clones its target triple and dependent library list as well. This should help bugpoint. llvm-svn: 25424	2006-01-18 21:32:45 +00:00
Robert Bocchino	e6336a9b69	Constant folding support for the insertelement operation. llvm-svn: 25407	2006-01-17 20:07:07 +00:00
Robert Bocchino	6dce25019d	Lowerpacked and SCCP support for the insertelement operation. llvm-svn: 25406	2006-01-17 20:06:55 +00:00
Chris Lattner	801f47512d	Clean up the FFS optimization code, and make it correctly create the appropriate unsigned llvm.cttz.* intrinsic, fixing the 2005-05-11-Popcount-ffs-fls regression last night. llvm-svn: 25398	2006-01-17 18:27:17 +00:00
Reid Spencer	b4f9a6f110	For PR411: This patch is an incremental step towards supporting a flat symbol table. It de-overloads the intrinsic functions by providing type-specific intrinsics and arranging for automatically upgrading from the old overloaded name to the new non-overloaded name. Specifically: llvm.isunordered -> llvm.isunordered.f32, llvm.isunordered.f64 llvm.sqrt -> llvm.sqrt.f32, llvm.sqrt.f64 llvm.ctpop -> llvm.ctpop.i8, llvm.ctpop.i16, llvm.ctpop.i32, llvm.ctpop.i64 llvm.ctlz -> llvm.ctlz.i8, llvm.ctlz.i16, llvm.ctlz.i32, llvm.ctlz.i64 llvm.cttz -> llvm.cttz.i8, llvm.cttz.i16, llvm.cttz.i32, llvm.cttz.i64 New code should not use the overloaded intrinsic names. Warnings will be emitted if they are used. llvm-svn: 25366	2006-01-16 21:12:35 +00:00
Chris Lattner	307b7ea15f	fix a crash due to missing parens llvm-svn: 25363	2006-01-16 19:47:21 +00:00
Chris Lattner	0de2c7d3d8	This pass has never worked correctly. Remove. llvm-svn: 25349	2006-01-16 01:06:00 +00:00
Chris Lattner	f6d6823f09	Let the inliner update the callgraph to reflect the changes it makes, instead of doing it ourselves. This fixes Transforms/Inline/2006-01-14-CallGraphUpdate.ll llvm-svn: 25321	2006-01-14 20:09:18 +00:00
Chris Lattner	0841fb1d4c	Teach the inliner to update the CallGraph itself, and have it add edges to llvm.stacksave/restore when it inserts calls to them. llvm-svn: 25320	2006-01-14 20:07:50 +00:00
Chris Lattner	ef530c24c1	FunctionPass's cannot do IPO things. llvm-svn: 25315	2006-01-14 19:30:35 +00:00
Nate Begeman	82049eba2c	Add bswap intrinsics as documented in the Language Reference llvm-svn: 25309	2006-01-14 01:25:24 +00:00
Robert Bocchino	a83529678e	Added instcombine support for extractelement. llvm-svn: 25299	2006-01-13 22:48:06 +00:00
Chris Lattner	5fba6e6696	it is ok to dce stacksave. llvm-svn: 25295	2006-01-13 21:31:54 +00:00
Chris Lattner	503221f5c5	Do a simple instcombine xforms to delete llvm.stackrestore cases. llvm-svn: 25294	2006-01-13 21:28:09 +00:00
Chris Lattner	c66b223b28	Simplify this a tiny bit by using the new IntrinsicInst functionality. llvm-svn: 25292	2006-01-13 20:11:04 +00:00
Chris Lattner	45406c0c53	Permit inlining functions that contain dynamic allocations now that InlineFunction handles this case safely. This implements Transforms/Inline/dynamic_alloca_test.ll. llvm-svn: 25288	2006-01-13 19:35:43 +00:00
Chris Lattner	2be0607a8d	If inlining a call to a function that contains dynamic allocas, wrap the resultant code with llvm.stacksave/llvm.stackrestore intrinsics. llvm-svn: 25286	2006-01-13 19:34:14 +00:00
Chris Lattner	e24f79a032	Use ClonedCodeInfo to avoid another walk over the inlined code, this time in common C cases. llvm-svn: 25285	2006-01-13 19:18:11 +00:00
Chris Lattner	19e6a08d78	Use the ClonedCodeInfo object to avoid scans of the inlined code when it doesn't contain any calls. This is a fairly common case for C++ code, so it will probably speed up the inliner marginally in these cases. llvm-svn: 25284	2006-01-13 19:15:15 +00:00
Chris Lattner	908d79556d	Refactor a bunch of invoke handling stuff out into a new function "HandleInlinedInvoke". No functionality change. llvm-svn: 25283	2006-01-13 19:05:59 +00:00
Chris Lattner	edad1288fd	Allow the code cloning interfaces to capture some important info about the code being cloned if the client wants. llvm-svn: 25281	2006-01-13 18:39:17 +00:00
Chris Lattner	257492c0ab	Fix a bug I noticed by inspection: if the first instruction in the inlined function was not an alloca, we wouldn't check the entry block for any allocas, leading to increased stack space in some cases. In practice, allocas are almost always at the top of the block, so this was never noticed. llvm-svn: 25280	2006-01-13 18:16:48 +00:00
Chris Lattner	49c4d536bd	Fix 80 column violations llvm-svn: 25279	2006-01-13 18:06:56 +00:00
Chris Lattner	0770d8e326	Preserve and update ETForest. Patch by Daniel Berlin llvm-svn: 25203	2006-01-11 05:11:13 +00:00
Chris Lattner	cb36710ff9	Switch these to using ETForest instead of DominatorSet to compute itself. Patch written by Daniel Berlin! llvm-svn: 25202	2006-01-11 05:10:20 +00:00
Chris Lattner	48e4a2ebd8	Switch this to using ETForest instead of DominatorSet to compute itself. Patch written by Daniel Berlin! llvm-svn: 25201	2006-01-11 05:09:40 +00:00
Robert Bocchino	230044839d	Added support for the extractelement operation. llvm-svn: 25181	2006-01-10 19:05:34 +00:00
Robert Bocchino	bd518d153b	Added lower packed support for the extractelement operation. llvm-svn: 25180	2006-01-10 19:05:05 +00:00
Chris Lattner	cda4aa6eb4	Teach loopsimplify to update et-forest. Patch contributed by Daniel Berlin! llvm-svn: 25153	2006-01-09 08:03:08 +00:00
Chris Lattner	9cbfbc21bb	fix some 176.gcc miscompilation from my previous patch. llvm-svn: 25137	2006-01-07 01:32:28 +00:00
Chris Lattner	330628a6d8	silence some bogus gcc warnings on fenris llvm-svn: 25130	2006-01-06 17:59:59 +00:00
Chris Lattner	eb372a0276	Enhance the shift-shift folding code to allow a no-op cast to occur in between the shifts. This allows us to fold this (which is the 'integer add a constant' sequence from cozmic's scheme compiler): int %x(uint %anf-temporary776) { %anf-temporary777 = shr uint %anf-temporary776, ubyte 1 %anf-temporary800 = cast uint %anf-temporary777 to int %anf-temporary804 = shl int %anf-temporary800, ubyte 1 %anf-temporary805 = add int %anf-temporary804, -2 %anf-temporary806 = or int %anf-temporary805, 1 ret int %anf-temporary806 } into this: int %x(uint %anf-temporary776) { %anf-temporary776 = cast uint %anf-temporary776 to int %anf-temporary776.mask1 = add int %anf-temporary776, -2 %anf-temporary805 = or int %anf-temporary776.mask1, 1 ret int %anf-temporary805 } note that instcombine already knew how to eliminate the AND that the two shifts fold into. This is tested by InstCombine/shift.ll:test26 -Chris llvm-svn: 25128	2006-01-06 07:52:12 +00:00
Chris Lattner	b330939d90	Simplify the code a bit more llvm-svn: 25126	2006-01-06 07:22:22 +00:00
Chris Lattner	145539343f	Extract a bunch of code out of visitShiftInst into FoldShiftByConstant. No functionality changes. llvm-svn: 25125	2006-01-06 07:12:35 +00:00
Chris Lattner	8cdc773748	Pull inline methods out of the pass class definition to make it easier to read the code. Do not internalize debugger anchors. llvm-svn: 25067	2006-01-03 19:13:17 +00:00
Duraid Madina	7a3ad6cae2	getting there... llvm-svn: 25021	2005-12-26 13:48:44 +00:00
Chris Lattner	8c9e14620f	Fix Transforms/ScalarRepl/2005-12-14-UnionPromoteCrash.ll, a crash on undefined behavior in 126.gcc on big-endian systems. llvm-svn: 24708	2005-12-14 17:23:59 +00:00
Reid Spencer	175613adf6	Improve ResolveFunctions to: a) use better local variable names (OldMT -> OldFT) where "M" is used to mean "Function" (perhaps it was previously "Method"?) b) print out the module identifier in a warning message so that it is possible to track down in which module the error occurred. llvm-svn: 24698	2005-12-13 19:56:51 +00:00
Chris Lattner	3b0a62d8a5	Implement a little hack for parity with GCC on crafty. This speeds up 186.crafty by about 16% (from 15.109s to 13.045s) on my system. This turns allocas with unions/casts into scalars. For example crafty has something like this: union doub { unsigned short i[4]; long long d; }; int f(long long a) { return ((union doub){.d=a}).i[1]; } Instead of generating loads and stores to an alloca, we now promote the whole thing to a scalar long value. This implements: Transforms/ScalarRepl/AggregatePromote.ll llvm-svn: 24667	2005-12-12 07:19:13 +00:00
Chris Lattner	077200737c	getRawValue zero extends for unsigned values, use getsextvalue so that we know that small negative values fit into the immediate field of addressing modes. llvm-svn: 24608	2005-12-05 18:23:57 +00:00
Chris Lattner	165998207e	Wrap a long line, never internalize llvm.used. llvm-svn: 24602	2005-12-05 05:07:38 +00:00
Chris Lattner	2820b8c855	Fix SimplifyCFG/2005-12-03-IncorrectPHIFold.ll llvm-svn: 24581	2005-12-03 18:25:58 +00:00
Chris Lattner	dc4ffef633	Fix a bug where we didn't realize that vaarg reads memory. This fixes Transforms/DeadStoreElimination/2005-11-30-vaarg.ll llvm-svn: 24545	2005-11-30 19:38:22 +00:00
Andrew Lenharth	d251192910	a few more comments on the interfaces and functions llvm-svn: 24500	2005-11-28 18:10:59 +00:00
Andrew Lenharth	517caef495	Added documented rsprofiler interface. Also remove new profiler passes, the old ones have been updated to implement the interface. llvm-svn: 24499	2005-11-28 18:00:38 +00:00
Jeff Cohen	7ff44ec372	Fix VC++ warning. llvm-svn: 24496	2005-11-28 06:45:57 +00:00
Andrew Lenharth	93e59f6032	Random sampling (aka Arnold and Ryder) profiling. This is still preliminary, but it works on spec on x86 and alpha. The idea is to allow profiling passes to remember what profiling they inserted, then a random sampling framework is inserted which consists of duplicated basic blocks (without profiling), such that at each backedge in the program and entry into every function, the framework chooses whether to use the instrumented code or the instrumentation free code. The goal of such a framework is to make it reasonably cheap to do random sampling of very expensive profiling products (such as load-value profiling). The code is organized into 3 parts (2 passes) 1) a linked set of profiling passes, which implement an analysis group (linked, like alias analysis are). These insert profiling into the program, and remember what they inserted, so that at a later time they can be queried about any instruction. 2) a pass that handles inserting the random sampling framework. This also has options to control how random samples are chosen. Currently implemented are Global counters, register allocated global counters, and read cycle counter (see? there was a reason for it). The profiling passes are almost identical to the existing ones (block, function, and null profiling is supported right now), and they are valid passes without the sampling framework (hence the existing passes can be unified with the new ones, not done yet). Some things are a bit ugly still, but that should be fixed up soon enough. Other todo? making the counter values not "magic 2^16 -1" values, but dynamically choosable. llvm-svn: 24493	2005-11-28 00:58:09 +00:00
Andrew Lenharth	5fc3794e71	since reg2mem requires it, might as well mention that it preserves it llvm-svn: 24491	2005-11-25 16:04:54 +00:00
Andrew Lenharth	061029dee2	Reg2Mem is something a pass may depend on, so allow that llvm-svn: 24488	2005-11-22 22:14:23 +00:00
Andrew Lenharth	71b09bbb07	turns out, demotion and invokes and critical edges don't mix llvm-svn: 24487	2005-11-22 21:45:19 +00:00
Chris Lattner	9c37f23645	Fix a crash building 176.gcc due to my recent patch, which only fixed half the problem. llvm-svn: 24414	2005-11-18 18:30:47 +00:00
Chris Lattner	3e9e8bd25c	Implement a refinement to the mem2reg algorithm for cases where an alloca has a single def. In this case, look for uses that are dominated by the def and attempt to rewrite them to directly use the stored value. This speeds up mem2reg on these values and reduces the number of phi nodes inserted. This should address PR665. llvm-svn: 24411	2005-11-18 07:31:42 +00:00
Chris Lattner	31dc3827d3	This needs proper dominance llvm-svn: 24410	2005-11-18 07:29:44 +00:00
Chris Lattner	bca0be812d	This was checking the wrong GEP expression. Fixing this fixes a gccas crash compiling mysql reported by Ted Kremenek. llvm-svn: 24402	2005-11-17 19:35:42 +00:00
Andrew Lenharth	d9c13b1336	the pain isn't gone unless the phinodes are spilled too llvm-svn: 24288	2005-11-10 19:39:09 +00:00
Andrew Lenharth	8e66c0c8a9	this works with backedges to the existing entry block a lot better llvm-svn: 24270	2005-11-10 17:35:34 +00:00
Andrew Lenharth	4130a4f061	The pass everyone has been waiting for! Reg2Mem for fun you can opt -reg2mem -mem2reg llvm-svn: 24267	2005-11-10 01:58:38 +00:00
Nate Begeman	848622f87f	Add support for alignment of allocation instructions. Add support for specifying alignment and size of setjmp jmpbufs. No targets currently do anything with this information, nor is it preserved in the bytecode representation. That's coming up next. llvm-svn: 24196	2005-11-05 09:21:28 +00:00
Chris Lattner	16b29e9562	Implement Transforms/TailCallElim/return-undef.ll, a trivial case that has been sitting in my inbox since May 18. :) llvm-svn: 24194	2005-11-05 08:21:11 +00:00
Chris Lattner	dd0c174082	Turn sdiv into udiv if both operands have a clear sign bit. This occurs a few times in crafty: OLD: %tmp.36 = div int %tmp.35, 8 ; <int> [#uses=1] NEW: %tmp.36 = div uint %tmp.35, 8 ; <uint> [#uses=0] OLD: %tmp.19 = div int %tmp.18, 8 ; <int> [#uses=1] NEW: %tmp.19 = div uint %tmp.18, 8 ; <uint> [#uses=0] OLD: %tmp.117 = div int %tmp.116, 8 ; <int> [#uses=1] NEW: %tmp.117 = div uint %tmp.116, 8 ; <uint> [#uses=0] OLD: %tmp.92 = div int %tmp.91, 8 ; <int> [#uses=1] NEW: %tmp.92 = div uint %tmp.91, 8 ; <uint> [#uses=0] Which all turn into shrs. llvm-svn: 24190	2005-11-05 07:40:31 +00:00
Chris Lattner	e9ff0eaf5b	Turn srem -> urem when neither input has their sign bit set. This triggers 8 times in vortex, allowing the srems to be turned into shrs: OLD: %tmp.104 = rem int %tmp.5.i37, 16 ; <int> [#uses=1] NEW: %tmp.104 = rem uint %tmp.5.i37, 16 ; <uint> [#uses=0] OLD: %tmp.98 = rem int %tmp.5.i24, 16 ; <int> [#uses=1] NEW: %tmp.98 = rem uint %tmp.5.i24, 16 ; <uint> [#uses=0] OLD: %tmp.91 = rem int %tmp.5.i19, 8 ; <int> [#uses=1] NEW: %tmp.91 = rem uint %tmp.5.i19, 8 ; <uint> [#uses=0] OLD: %tmp.88 = rem int %tmp.5.i14, 8 ; <int> [#uses=1] NEW: %tmp.88 = rem uint %tmp.5.i14, 8 ; <uint> [#uses=0] OLD: %tmp.85 = rem int %tmp.5.i9, 1024 ; <int> [#uses=2] NEW: %tmp.85 = rem uint %tmp.5.i9, 1024 ; <uint> [#uses=0] OLD: %tmp.82 = rem int %tmp.5.i, 512 ; <int> [#uses=2] NEW: %tmp.82 = rem uint %tmp.5.i1, 512 ; <uint> [#uses=0] OLD: %tmp.48.i = rem int %tmp.5.i.i161, 4 ; <int> [#uses=1] NEW: %tmp.48.i = rem uint %tmp.5.i.i161, 4 ; <uint> [#uses=0] OLD: %tmp.20.i2 = rem int %tmp.5.i.i, 4 ; <int> [#uses=1] NEW: %tmp.20.i2 = rem uint %tmp.5.i.i, 4 ; <uint> [#uses=0] it also occurs 9 times in gcc, but with odd constant divisors (1009 and 61) so the payoff isn't as great. llvm-svn: 24189	2005-11-05 07:28:37 +00:00
Andrew Lenharth	662295587d	make this 64 bit clean, fixed test30 of /Regression/Transforms/InstCombine/add.ll llvm-svn: 24158	2005-11-02 18:35:40 +00:00
Chris Lattner	09efd4e5b6	Limit the search depth of MaskedValueIsZero to 6 instructions, to avoid bad cases. This fixes Markus's second testcase in PR639, and should seal it for good. llvm-svn: 24123	2005-10-31 18:35:52 +00:00
Chris Lattner	27d351f159	This pass is now obsolete since all targets have moved to the SelectionDAG infrastructure and the simple isels have been removed. llvm-svn: 24090	2005-10-29 05:33:46 +00:00
Chris Lattner	752717d4ec	Remove dead #include llvm-svn: 24083	2005-10-29 04:41:30 +00:00
Chris Lattner	ceb9d5adaa	Now that instcombine does this xform, remove it from the -raise pass llvm-svn: 24082	2005-10-29 04:40:23 +00:00
Chris Lattner	8f663e8bbc	Pull some code out into a function, give it the ability to see through +. This allows us to turn code like malloc(4*x+4) -> malloc int, (x+1) llvm-svn: 24081	2005-10-29 04:36:15 +00:00
Chris Lattner	8270c33606	Remove a special case, allowing the general case to handle it. No functionality change. llvm-svn: 24076	2005-10-29 03:19:53 +00:00
Chris Lattner	b9d3ca5c3c	Fix a bit of backwards logic that broke exptree and smg2000 llvm-svn: 24056	2005-10-28 16:27:35 +00:00
Chris Lattner	c4f67e67d2	Do not sink any instruction with side effects, including vaarg. This fixes PR640 llvm-svn: 24046	2005-10-27 17:13:11 +00:00
Chris Lattner	479911f971	Fix #include order llvm-svn: 24044	2005-10-27 16:34:00 +00:00
John Criswell	fe5f33b120	Move some constant folding code shared by Analysis and Transform passes into the LLVMAnalysis library. This allows LLVMTranform and LLVMTransformUtils to be archives and linked with LLVMAnalysis.a, which provides any missing definitions. llvm-svn: 24036	2005-10-27 15:54:34 +00:00
Chris Lattner	c6372cca78	Fix typo llvm-svn: 24033	2005-10-27 06:26:26 +00:00
Chris Lattner	0fe7551bc0	Teach instcombine to promote stuff like (cast (malloc sbyte, 8X) to int) into: malloc int, (2*X) llvm-svn: 24032	2005-10-27 06:24:46 +00:00
Chris Lattner	b3ecf96900	Promote cases like cast (malloc sbyte, 100) to int* into (malloc [25 x int]) directly without having to convert to (malloc [100 x sbyte]) first. llvm-svn: 24031	2005-10-27 06:12:00 +00:00
Chris Lattner	bb17180a23	Minor change to this file to support obscure cases with constant array amounts llvm-svn: 24030	2005-10-27 05:53:56 +00:00
John Criswell	94b7bea733	1. Remove libraries no longer created from the list of libraries linked into the SparcV9 JIT. 2. Make LLVMTransformUtils a relinked object file and always link it before LLVMAnalysis.a. These two libraries have circular dependencies on each other which creates problem when building the SparcV9 JIT. This change fixes the dependency on all platforms problems with a minimum of fuss. llvm-svn: 24023	2005-10-26 20:35:13 +00:00
Chris Lattner	38a1b00a0f	fold nested and's early to avoid inefficiencies in MaskedValueIsZero. This fixes a very slow compile in PR639. llvm-svn: 24011	2005-10-26 17:18:16 +00:00
Jeff Cohen	2b8cbf319c	Update Visual Studio projects to reflect moved file. llvm-svn: 23998	2005-10-26 05:36:51 +00:00
Alkis Evlogimenos	cb67b650b5	Stop using deprecated types llvm-svn: 23973	2005-10-25 11:18:06 +00:00
Chris Lattner	46705b2f2d	Handle allocations that, even after removing dead uses, still have more than one use (but one is a cast). This handles the very common case of: X = alloc [n x byte] Y = cast X to somethingbetter seteq X, null In order to avoid infinite looping when there are multiple casts, we only allow this if the xform is strictly increasing the alignment of the allocation. llvm-svn: 23961	2005-10-24 06:35:18 +00:00
Chris Lattner	355ecc09f8	Fix a bug where we would 'promote' an allocation from one type to another where the second has less alignment required. If we had explicit alignment support in the IR, we could handle this case, but we can't until we do. llvm-svn: 23960	2005-10-24 06:26:18 +00:00
Chris Lattner	ac87beb03a	Before promoting a malloc type, remove dead uses. This makes instcombine more effective at promoting these allocations, catching them earlier in the compile process. llvm-svn: 23959	2005-10-24 06:22:12 +00:00
Chris Lattner	216be91817	Pull some code out into a function, no functionality change llvm-svn: 23958	2005-10-24 06:03:58 +00:00
Chris Lattner	b37336978f	Remove some beta code that no longer has an owner. llvm-svn: 23944	2005-10-24 02:32:41 +00:00
Chris Lattner	f9998d9704	Do not build the ProfilePaths directory anymore llvm-svn: 23943	2005-10-24 02:31:49 +00:00
Chris Lattner	bde3845548	DONT_BUILD_RELINKED is gone and implied by BUILD_ARCHIVE now llvm-svn: 23940	2005-10-24 02:26:13 +00:00
Chris Lattner	8c087e962c	Only build .a file versions of these libraries, instead of .a and .o versions. This should speed up build times. llvm-svn: 23933	2005-10-24 01:59:48 +00:00
Chris Lattner	bd77fac034	Make sure that anything using the ADCE pass pulls in the UnifyFunctionExitNodes code llvm-svn: 23931	2005-10-24 01:40:23 +00:00
Jeff Cohen	11e26b52b2	When a function takes a variable number of pointer arguments, with a zero pointer marking the end of the list, the zero must be cast to the pointer type. An un-cast zero is a 32-bit int, and at least on x86_64, gcc will not extend the zero to 64 bits, thus allowing the upper 32 bits to be random junk. The new END_WITH_NULL macro may be used to annotate such a function so that GCC (version 4 or newer) will detect the use of an un-cast zero at compile time. llvm-svn: 23888	2005-10-23 04:37:20 +00:00
Chris Lattner	5df0e36e98	My previous patch was too conservative. Reject FP and void types, but do allow pointer types. llvm-svn: 23859	2005-10-21 05:45:41 +00:00
Chris Lattner	0c0b38bb4c	Do NOT touch FP ops with LSR. This fixes a testcase Nate sent me from an inner loop like this: LBB_RateConvertMono8AltiVec_2: ; no_exit lis r2, ha16(.CPI_RateConvertMono8AltiVec_0) lfs f3, lo16(.CPI_RateConvertMono8AltiVec_0)(r2) fmr f3, f3 fadd f0, f2, f0 fadd f3, f0, f3 fcmpu cr0, f3, f1 bge cr0, LBB_RateConvertMono8AltiVec_2 ; no_exit to an inner loop like this: LBB_RateConvertMono8AltiVec_1: ; no_exit fsub f2, f2, f1 fcmpu cr0, f2, f1 fmr f0, f2 bge cr0, LBB_RateConvertMono8AltiVec_1 ; no_exit Doh! good catch! llvm-svn: 23838	2005-10-20 04:47:10 +00:00
Chris Lattner	45517baf9f	Add an option to this pass. If it is set, we are allowed to internalize all but main. If it's not set, we can still internalize, but only if an explicit symbol list is provided. llvm-svn: 23783	2005-10-18 06:29:22 +00:00
Chris Lattner	da1b152c43	Make this work for FP constantexprs llvm-svn: 23773	2005-10-17 20:18:38 +00:00
Chris Lattner	7fde91e365	Oops, X+0.0 isn't foldable, but X+-0.0 is. llvm-svn: 23772	2005-10-17 17:56:38 +00:00
Chris Lattner	32979336a7	relax this a bit, as we only support the default rounding mode llvm-svn: 23771	2005-10-17 17:49:32 +00:00
Chris Lattner	192cd18f53	Fix (hopefully the last) issue where LSR is nondeterministic. When pulling out CSE's of base expressions it could build a result whose order was nondet. llvm-svn: 23698	2005-10-11 18:41:04 +00:00
Chris Lattner	5c9d63da31	Fix another problem where LSR was being nondeterministic. Also remove elements from the end of a vector instead of the beginning llvm-svn: 23697	2005-10-11 18:30:57 +00:00
Chris Lattner	b7a3894e7c	Fix another lsr-is-nondeterministic case llvm-svn: 23695	2005-10-11 18:17:57 +00:00
Chris Lattner	03b9eb506c	Make MaskedValueIsZero a bit more aggressive llvm-svn: 23677	2005-10-09 22:08:50 +00:00
Chris Lattner	62010c450f	Fix funky xcode indentation llvm-svn: 23674	2005-10-09 06:36:35 +00:00
Chris Lattner	eb4be8b942	Hrm, you didn't see this. llvm-svn: 23673	2005-10-09 06:24:02 +00:00
Chris Lattner	4ea0a3eaac	Fix a source of non-determinism in the backend: the order of processing IV strides depended on the pointer order of the strides in memory. Non-determinism is bad. llvm-svn: 23672	2005-10-09 06:20:55 +00:00
Jeff Cohen	572910c9a2	Remove useless variable. llvm-svn: 23656	2005-10-07 05:28:29 +00:00
Chris Lattner	20b0754c41	Fix DemoteRegToStack on an invoke. This fixes PR634. llvm-svn: 23618	2005-10-04 00:44:01 +00:00
Chris Lattner	4c3b2b536c	Clean up the code a bit. Use isInstructionTriviallyDead to be more aggressive and more correct than use_empty(). This fixes PR635 and SimplifyCFG/2005-10-02-InvokeSimplify.ll llvm-svn: 23616	2005-10-03 23:43:43 +00:00
Chris Lattner	f07a587c79	Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In particular, it should realize that phi's use their values in the pred block, not the phi block itself. This change turns our em3d loop from this: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r2, 0 b LBB_test_6 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit or r2, r6, r6 lwz r6, 0(r3) cmpw cr0, r6, r5 beq cr0, LBB_test_6 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r2, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; endif.loopexit.loopexit_crit_edge addi r3, r2, 1 blr LBB_test_6: ; loopexit or r3, r2, r2 blr into: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r2, 0 b LBB_test_5 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 or r2, r6, r6 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 or r2, r6, r6 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; loopexit or r3, r2, r2 blr Unfortunately, this is actually worse code, because the register coalescer is getting confused somehow. If it were doing its job right, it could turn the code into this: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r6, 0 b LBB_test_5 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; loopexit or r3, r6, r6 blr ... which I'll work on next. :) llvm-svn: 23604	2005-10-03 02:50:05 +00:00
Chris Lattner	e4ed42a426	Refactor some code into a function llvm-svn: 23603	2005-10-03 01:04:44 +00:00
Chris Lattner	360928dbed	This break is bogus and I have no idea why it was there. Basically it prevents memoizing code when IV's are used by phinodes outside of loops. In a simple example, we were getting this code before (note that r6 and r7 are isomorphic IV's): li r6, 0 or r7, r6, r6 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 or r2, r7, r7 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r2, r7, 1 addi r7, r7, 1 addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit Now we get: li r6, 0 LBB_test_3: ; no_exit or r2, r6, r6 lwz r6, 0(r3) cmpw cr0, r6, r5 beq cr0, LBB_test_6 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r2, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit this was noticed in em3d. llvm-svn: 23602	2005-10-03 00:37:33 +00:00
Chris Lattner	8fcce170cf	when checking if we should move a split edge block outside of a loop, check the presplit pred, not the post-split pred. This was causing us to make the wrong decision in some cases, leaving the critical edge block in the loop. llvm-svn: 23601	2005-10-03 00:31:52 +00:00
Jeff Cohen	f8a5e5ae6e	Fix VC++ warnings. llvm-svn: 23579	2005-10-01 03:57:14 +00:00
Chris Lattner	a554c9470b	Insert stores after phi nodes in the normal dest. This fixes LowerInvoke/2005-08-03-InvokeWithPHI.ll llvm-svn: 23525	2005-09-29 17:44:20 +00:00
Chris Lattner	87ef943a4c	Fold isascii into a simple comparison. This speeds up 197.parser by 7.4%, bringing the LLC time down to the CBE time. llvm-svn: 23521	2005-09-29 06:17:27 +00:00
Chris Lattner	5f6035feb0	remove a bunch of unneeded stuff, or self evident comments llvm-svn: 23519	2005-09-29 06:16:11 +00:00
Chris Lattner	c244e7c178	Implement a couple of memcmp folds from the todo list llvm-svn: 23517	2005-09-29 04:54:20 +00:00
Chris Lattner	ea7214b23d	Constant fold llvm.sqrt llvm-svn: 23487	2005-09-28 01:34:32 +00:00
Chris Lattner	3b63bb375c	add a note about a way to improve this code further, that I won't be getting to right now. llvm-svn: 23485	2005-09-27 22:44:59 +00:00
Chris Lattner	eb953f0ef8	Fix a regression in my previous patch, fixing GlobalOpt/2005-09-27-Crash.ll and PR632. llvm-svn: 23484	2005-09-27 22:28:11 +00:00
Chris Lattner	e285f5ed8f	Avoid spilling stack slots... to stack slots. llvm-svn: 23478	2005-09-27 21:33:12 +00:00
Chris Lattner	87eb249300	Completely rewrite 'correct' eh support. This changes how setjmp insertion is performed so it is only at most once per function that contains an invoke instead of once per invoke in the function. This patch has the following perks: 1. It fixes PR631, which complains about slowness. 2. It fixes PR240, which complains about non-volatile vars being live across setjmp/longjmps. 3. It improves (but does not fix) the jmpbuf alignment issue on itanium by not forcing the jmpbufs to always be 8-bytes off the alignment of the structure. 4. It speeds up 253.perlbmk from 338s to 13.70s (a 25x improvement!), making us now about 4% faster than GCC. Further improvements are also possible. llvm-svn: 23477	2005-09-27 21:18:17 +00:00
Chris Lattner	92233d2175	Make the pass name simpler llvm-svn: 23476	2005-09-27 21:10:32 +00:00
Chris Lattner	16cd356fb2	allow demotion to volatile values, add support for invoke llvm-svn: 23473	2005-09-27 19:39:00 +00:00
Chris Lattner	3d27e7f27f	Add support for external calls that we know how to constant fold. This implements ctor-list-opt.ll:CTOR8 llvm-svn: 23465	2005-09-27 05:02:43 +00:00
Chris Lattner	29b2780c8a	Fix a bug where we would evaluate stores into linkonce objects which could be potentially replaced at link-time. llvm-svn: 23463	2005-09-27 04:50:03 +00:00
Chris Lattner	65a3a0918f	Implement support for static constructors with calls in them. This is useful because gccas runs globalopt before inlining. This implements ctor-list-opt.ll:CTOR7 llvm-svn: 23462	2005-09-27 04:45:34 +00:00
Chris Lattner	da1889b778	Refactor this code a bit, no functionality changes. llvm-svn: 23460	2005-09-27 04:27:01 +00:00
Chris Lattner	f2f89af69a	Remove some dead code. ctor evaluation subsumes empty ctor elim llvm-svn: 23453	2005-09-26 20:38:20 +00:00
Chris Lattner	6bf2cd5735	Add support for alloca, implementing ctor-list-opt.ll:CTOR6 llvm-svn: 23452	2005-09-26 17:07:09 +00:00
Chris Lattner	46d9ff081d	Add a debug printout, fix a crash on kc++ llvm-svn: 23450	2005-09-26 07:34:35 +00:00
Chris Lattner	46af55e0e4	Implement loads/stores through GEP's of globals. This implements ctor-list-opt.ll:CTOR5. llvm-svn: 23449	2005-09-26 06:52:44 +00:00
Chris Lattner	61ff32cd70	Replace TraverseGEPInitializer with ConstantFoldLoadThroughGEPConstantExpr llvm-svn: 23447	2005-09-26 05:34:07 +00:00
Chris Lattner	02ae21e1e0	Eliminate GetGEPGlobalInitializer in favor of the more powerful ConstantFoldLoadThroughGEPConstantExpr function in the utils lib. llvm-svn: 23446	2005-09-26 05:28:52 +00:00
Chris Lattner	0b011ec8e2	Factor the GetGEPGlobalInitializer out of this pass and into Transforms/Utils as ConstantFoldLoadThroughGEPConstantExpr. llvm-svn: 23445	2005-09-26 05:28:06 +00:00
Chris Lattner	c13c7b9376	Move the ConstantFoldLoadThroughGEPConstantExpr function out of the InstCombine pass. llvm-svn: 23444	2005-09-26 05:27:10 +00:00
Chris Lattner	b009663e27	add a comment llvm-svn: 23442	2005-09-26 05:16:34 +00:00
Chris Lattner	4b05c322d5	Add support for getelementptr, load, and correctly reject volatile stores. llvm-svn: 23441	2005-09-26 05:15:37 +00:00
Chris Lattner	3e9ea5ffec	Add support for br/brcond/switch and phi llvm-svn: 23439	2005-09-26 04:57:38 +00:00
Chris Lattner	99e23fa74c	Add a simple interpreter to this code, allowing us to statically evaluate global ctors that are simple enough. This implements ctor-list-opt.ll:CTOR2. llvm-svn: 23437	2005-09-26 04:44:35 +00:00
Chris Lattner	696beefabb	factor some code into a InstallGlobalCtors method, add comments. No functionality change. llvm-svn: 23435	2005-09-26 02:31:18 +00:00
Chris Lattner	838bdc1836	Make the global opt optimizer work on modules with a null terminator, by accepting the null even with a non-65535 init prio llvm-svn: 23434	2005-09-26 02:19:27 +00:00
Chris Lattner	41b6a5a693	Factor this code out into a few methods. Implement the start of global ctor optimization. It is currently smart enough to remove the global ctor for cases like this: struct foo { foo() {} } x; ... saving a bit of startup time for the program. llvm-svn: 23433	2005-09-26 01:43:45 +00:00
Chris Lattner	f487768062	Fix some logic I broke that caused a regression on SimplifyLibCalls/2005-05-20-sprintf-crash.ll llvm-svn: 23430	2005-09-25 07:06:48 +00:00
Chris Lattner	0b3557f54a	Move MaskedValueIsZero up. Match a bunch of idioms for sign extensions, implementing InstCombine/signext.ll llvm-svn: 23428	2005-09-24 23:43:33 +00:00
Chris Lattner	175463a165	Simplify this code a bit by relying on recursive simplification. Support sprintf("%s", P)'s that have uses. s/hasNUses(0)/use_empty()/ llvm-svn: 23425	2005-09-24 22:17:06 +00:00
Chris Lattner	499e33646e	remove some debugging code llvm-svn: 23411	2005-09-23 18:49:09 +00:00
Chris Lattner	c59a371d45	Fold two consecutive branches that share a common destination between them. This implements SimplifyCFG/branch-fold.ll, and is useful on ?:/min/max heavy code llvm-svn: 23410	2005-09-23 18:47:20 +00:00
Chris Lattner	3a978bf66d	simplify some logic further llvm-svn: 23408	2005-09-23 07:23:18 +00:00
Chris Lattner	cc14ebc17b	pull a bunch of logic out of SimplifyCFG into a helper fn llvm-svn: 23407	2005-09-23 06:39:30 +00:00
Chris Lattner	6c70106053	Start threading across blocks with code in them, so long as the code does not define a value that is used outside of its block. This catches many more simplifications, e.g. 854 in 176.gcc, 137 in vpr, etc. This implements branch-phi-thread.ll:test3.ll llvm-svn: 23397	2005-09-20 01:48:40 +00:00
Chris Lattner	f0bd8d0107	Implement merging of blocks with the same condition if the block has multiple predecessors. This implements branch-phi-thread.ll::test1 llvm-svn: 23395	2005-09-20 00:43:16 +00:00
Chris Lattner	049cb4482f	Reject a case we don't handle yet llvm-svn: 23393	2005-09-19 23:57:04 +00:00
Chris Lattner	a160924d57	remove debugging code :-/ llvm-svn: 23392	2005-09-19 23:50:15 +00:00
Chris Lattner	748f903046	Implement SimplifyCFG/branch-phi-thread.ll, the most trivial case of threading control across branches with determined outcomes. More generality to follow. This triggers a couple thousand times in specint. llvm-svn: 23391	2005-09-19 23:49:37 +00:00
Chris Lattner	b4b2530a1a	Refactor this code a bit and make it more general. This now compiles: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus2 (unsigned int x) { b.j += x; } To: _plus2: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) slwi r3, r3, 6 add r3, r4, r3 rlwimi r3, r4, 0, 26, 14 stw r3, 0(r2) blr instead of: _plus2: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) rlwinm r5, r4, 26, 21, 31 add r3, r5, r3 rlwimi r4, r3, 6, 15, 25 stw r4, 0(r2) blr by eliminating an 'and'. I'm pretty sure this is as small as we can go :) llvm-svn: 23386	2005-09-18 07:22:02 +00:00
Chris Lattner	797dee7705	Compile struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus2 (unsigned int x) { b.j += x; } to: plus2: mov %EAX, DWORD PTR [b] mov %ECX, %EAX and %ECX, 131008 mov %EDX, DWORD PTR [%ESP + 4] shl %EDX, 6 add %EDX, %ECX and %EDX, 131008 and %EAX, -131009 or %EDX, %EAX mov DWORD PTR [b], %EDX ret instead of: plus2: mov %EAX, DWORD PTR [b] mov %ECX, %EAX shr %ECX, 6 and %ECX, 2047 add %ECX, DWORD PTR [%ESP + 4] shl %ECX, 6 and %ECX, 131008 and %EAX, -131009 or %ECX, %EAX mov DWORD PTR [b], %ECX ret llvm-svn: 23385	2005-09-18 06:30:59 +00:00
Chris Lattner	01f56c68e9	Generalize this transform, using MaskedValueIsZero, allowing us to compile: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus3 (unsigned int x) { b.k += x; } To: plus3: mov %EAX, DWORD PTR [%ESP + 4] shl %EAX, 17 add DWORD PTR [b], %EAX ret instead of: plus3: mov %EAX, DWORD PTR [%ESP + 4] shl %EAX, 17 mov %ECX, DWORD PTR [b] add %EAX, %ECX and %EAX, -131072 and %ECX, 131071 or %ECX, %EAX mov DWORD PTR [b], %ECX ret llvm-svn: 23384	2005-09-18 06:02:59 +00:00
Chris Lattner	4ebc8ab4e0	fix typo llvm-svn: 23383	2005-09-18 05:25:20 +00:00
Chris Lattner	e5b23a6d67	Remove unintentionally committed code llvm-svn: 23382	2005-09-18 05:12:51 +00:00
Chris Lattner	27cb9dbd35	implement shift.ll:test25. This compiles: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus3 (unsigned int x) { b.k += x; } to: _plus3: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r3, 0(r2) rlwinm r4, r3, 0, 0, 14 add r4, r4, r3 rlwimi r4, r3, 0, 15, 31 stw r4, 0(r2) blr instead of: _plus3: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) srwi r5, r4, 17 add r3, r5, r3 slwi r3, r3, 17 rlwimi r3, r4, 0, 15, 31 stw r3, 0(r2) blr llvm-svn: 23381	2005-09-18 05:12:10 +00:00
Chris Lattner	af517574ce	Implement add.ll:test29. Codegening: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus1 (unsigned int x) { b.i += x; } as: _plus1: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) add r3, r4, r3 rlwimi r3, r4, 0, 0, 25 stw r3, 0(r2) blr instead of: _plus1: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) rlwinm r5, r4, 0, 26, 31 add r3, r5, r3 rlwimi r3, r4, 0, 0, 25 stw r3, 0(r2) blr llvm-svn: 23379	2005-09-18 04:24:45 +00:00
Chris Lattner	027eaf01cf	remove debug output llvm-svn: 23377	2005-09-18 03:50:25 +00:00
Chris Lattner	1521298993	Implement or.ll:test21. This teaches instcombine to be able to turn this: struct { unsigned int bit0:1; unsigned int ubyte:31; } sdata; void foo() { sdata.ubyte++; } into this: foo: add DWORD PTR [sdata], 2 ret instead of this: foo: mov %EAX, DWORD PTR [sdata] mov %ECX, %EAX add %ECX, 2 and %ECX, -2 and %EAX, 1 or %EAX, %ECX mov DWORD PTR [sdata], %EAX ret llvm-svn: 23376	2005-09-18 03:42:07 +00:00
Chris Lattner	a393e4d4b3	Fix the regression last night compiling povray llvm-svn: 23348	2005-09-14 17:32:56 +00:00
Chris Lattner	2a8932960d	Add a simple xform to simplify array accesses with casts in the way. This is useful for 178.galgel where resolution of dope vectors (by the optimizer) causes the scales to become apparent. llvm-svn: 23328	2005-09-13 18:36:04 +00:00
Chris Lattner	fd018c8dfe	Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI. This fixes up a dot-product loop in galgel, speeding it up from 18.47s to 16.13s. llvm-svn: 23327	2005-09-13 02:09:55 +00:00
Chris Lattner	567b81f0d2	Add a helper function, allowing us to simplify some code a bit, changing indentation, no functionality change llvm-svn: 23325	2005-09-13 00:40:14 +00:00
Chris Lattner	219175c84d	Implement a simple xform to turn code like this: if () { store A -> P; } else { store B -> P; } into a PHI node with one store, in the most trivial case. This implements load.ll:test10. llvm-svn: 23324	2005-09-12 23:23:25 +00:00
Chris Lattner	e0bfdf1485	Another load-peephole optimization: do gcse when two loads are next to each other. This implements InstCombine/load.ll:test9 llvm-svn: 23322	2005-09-12 22:21:03 +00:00
Chris Lattner	b990f7d8ed	Implement a trivial form of store->load forwarding where the store and the load are exactly consecutive. This is picked up by other passes, but this triggers thousands of times in fortran programs that use static locals (and is thus a compile-time speedup). llvm-svn: 23320	2005-09-12 22:00:15 +00:00
Chris Lattner	8048b85e8f	Fix a regression from last night, which caused this pass to create invalid code for IV uses outside of loops that are not dominated by the latch block. We should only convert these uses to use the post-inc value if they ARE dominated by the latch block. Also use a new LoopInfo method to simplify some code. This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll llvm-svn: 23318	2005-09-12 17:11:27 +00:00
Chris Lattner	a67648396a	_test: li r2, 0 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r2, 1 stw r2, 0(r4) blr [zion ~/llvm]$ cat > ~/xx Uses of IV's outside of the loop should use the post-incremented version of the IV, not the preincremented version. This helps many loops (e.g. in sixtrack) which used to generate code like this (this is the code from the dont-hoist-simple-loop-constants.ll testcase): _test: li r2, 0 ** IV starts at 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 Copy for loop exit li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 IV+2 cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 IV+2 stw r2, 0(r4) blr And now generated code like this: _test: li r2, 1 * IV starts at 1 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 * IV.postinc + 0 blt cr0, LBB_test_1 LBB_test_2: ; loopexit.2.loopexit stw r2, 0(r4) * IV.postinc + 0 blr llvm-svn: 23313	2005-09-12 06:04:47 +00:00
Chris Lattner	530fe6ab30	implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll. We used to emit this code for it: _test: li r2, 1 ;; Value tying up a register for the whole loop li r5, 0 LBB_test_1: ; no_exit.2 or r6, r5, r5 li r5, 0 stw r5, 0(r3) addi r5, r6, 1 addi r3, r3, 4 add r7, r2, r5 ;; should be addi r7, r5, 1 cmpwi cr0, r7, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r6, 2 stw r2, 0(r4) blr now we emit this: _test: li r2, 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 ;; whoa, fold those adds! cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 stw r2, 0(r4) blr more improvement coming. llvm-svn: 23306	2005-09-10 01:18:45 +00:00
Chris Lattner	b5e381a8cf	Fix a problem that Dan Berlin noticed, where reassociation would not succeed in building maximal expressions before simplifying them. In particular, in cases like this: X-(A+B+X) the code would consider A+B+X to be a maximal expression (not understanding that the single use '-' would be turned into a + later), simplify it (a noop) then later get simplified again. Each of these simplify steps is where the cost of reassociation comes from, so this patch should speed up the already fast pass a bit. Thanks to Dan for noticing this! llvm-svn: 23214	2005-09-02 07:07:58 +00:00
Chris Lattner	9fe263aa75	Avoid creating garbage instructions, just move the old add instruction to where we need it when converting -(A+B+C) -> -A + -B + -C. llvm-svn: 23213	2005-09-02 06:38:04 +00:00
Chris Lattner	d1325da091	add some assertions and fix problems where reassociate could access the Ops vector out of range llvm-svn: 23211	2005-09-02 05:23:22 +00:00
Chris Lattner	8ca5b2a6d2	Fix Regression/Transforms/Reassociate/2005-08-24-Crash.ll llvm-svn: 23019	2005-08-24 17:55:32 +00:00
Chris Lattner	4201cd1bbc	Transform floor((double)FLT) -> (double)floorf(FLT), implementing Regression/Transforms/SimplifyLibCalls/floor.ll. This triggers 19 times in 177.mesa. llvm-svn: 23017	2005-08-24 17:22:17 +00:00
Chris Lattner	ea7dfd53d6	Fix Transforms/LoopStrengthReduce/2005-08-17-OutOfLoopVariant.ll, a crash on 177.mesa llvm-svn: 22843	2005-08-17 21:22:41 +00:00
Chris Lattner	2bf7cb5213	Use a new helper to split critical edges, making the code simpler. Do not claim to not change the CFG. We do change the cfg to split critical edges. This isn't causing us a problem now, but could likely do so in the future. llvm-svn: 22824	2005-08-17 06:35:16 +00:00
Chris Lattner	5cf983ee0f	Fix a bad case in gzip where we put lots of things in registers across the loop, because a IV-dependent value was used outside of the loop and didn't have immediate-folding capability llvm-svn: 22798	2005-08-16 00:38:11 +00:00
Chris Lattner	47d3ec3525	Ooops, don't forget to clear this. The real inner loop is now: .LBB_foo_3: ; no_exit.1 lfd f2, 0(r9) lfd f3, 8(r9) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r9) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfd f2, 0(r9) addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22782	2005-08-13 07:42:01 +00:00
Chris Lattner	5949d49032	Recursively scan scev expressions for common subexpressions. This allows us to handle nested loops much better, for example, by being able to tell that these two expressions: {( 8 + ( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp 12)}<loopentry.1> {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> Have the following common part that can be shared: {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> This allows us to codegen an important inner loop in 168.wupwise as: .LBB_foo_4: ; no_exit.1 lfd f2, 16(r9) fmul f3, f0, f2 fmul f2, f1, f2 fadd f4, f3, f2 stfd f4, 8(r9) fsub f2, f3, f2 stfd f2, 16(r9) addi r8, r8, 1 addi r9, r9, 16 cmpw cr0, r8, r4 ble .LBB_foo_4 ; no_exit.1 instead of: .LBB_foo_3: ; no_exit.1 lfdx f2, r6, r9 add r10, r6, r9 lfd f3, 8(r10) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r10) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfdx f2, r6, r9 addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22781	2005-08-13 07:27:18 +00:00
Chris Lattner	89c1dfc733	Teach SplitCriticalEdge to update LoopInfo if it is alive. This fixes a problem in LoopStrengthReduction, where it would split critical edges then confused itself with outdated loop information. llvm-svn: 22776	2005-08-13 01:38:43 +00:00
Chris Lattner	79396539d3	remove dead code. The exit block list is computed on demand, thus does not need to be updated. This code is a relic from when it did. llvm-svn: 22775	2005-08-13 01:30:36 +00:00
Chris Lattner	8447b49526	When splitting critical edges, make sure not to leave the new block in the middle of the loop. This turns a critical loop in gzip into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 bne .LBB_test_8 ; loopentry.loopexit_crit_edge .LBB_test_2: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 bne .LBB_test_7 ; shortcirc_next.0.loopexit_crit_edge .LBB_test_3: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 bne .LBB_test_6 ; shortcirc_next.1.loopexit_crit_edge .LBB_test_4: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry instead of this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_3: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 beq .LBB_test_5 ; shortcirc_next.1 .LBB_test_4: ; shortcirc_next.0.loopexit_crit_edge add r2, r11, r27 add r8, r12, r27 b .LBB_test_9 ; loopexit .LBB_test_5: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 beq .LBB_test_7 ; shortcirc_next.2 .LBB_test_6: ; shortcirc_next.1.loopexit_crit_edge add r2, r9, r27 add r8, r10, r27 b .LBB_test_9 ; loopexit .LBB_test_7: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry Next up, improve the code for the loop. llvm-svn: 22769	2005-08-12 22:22:17 +00:00
Chris Lattner	4fec86d348	Fix a FIXME: if we are inserting code for a PHI argument, split the critical edge so that the code is not always executed for both operands. This prevents LSR from inserting code into loops whose exit blocks contain PHI uses of IV expressions (which are outside of loops). On gzip, for example, we turn this ugly code: .LBB_test_1: ; loopentry add r27, r3, r28 lhz r27, 3(r27) add r26, r4, r28 lhz r26, 3(r26) add r25, r30, r28 ;; Only live if exiting the loop add r24, r29, r28 ;; Only live if exiting the loop cmpw cr0, r27, r26 bne .LBB_test_5 ; loopexit into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_2: ; shortcirc_next.0 ... blt .LBB_test_1 into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_t_3: ; shortcirc_next.0 .LBB_test_3: ; shortcirc_next.0 ... blt .LBB_test_1 Next step: get the block out of the loop so that the loop is all fall-throughs again. llvm-svn: 22766	2005-08-12 22:06:11 +00:00
Chris Lattner	b7ebe65c56	Change break critical edges to not remove, then insert, PHI node entries. Instead, just update the BB in-place. This is both faster, and it prevents split-critical-edges from shuffling the PHI argument list unnecessarily. llvm-svn: 22765	2005-08-12 21:58:07 +00:00
Chris Lattner	62df798919	remove some trickiness that broke yacr2 and some other programs last night llvm-svn: 22751	2005-08-10 17:15:20 +00:00
Chris Lattner	f83ce5faee	Make loop-simplify produce better loops by turning PHI nodes like X = phi [X, Y] into just Y. This often occurs when it separates loops that have collapsed loop headers. This implements LoopSimplify/phi-node-simplify.ll llvm-svn: 22746	2005-08-10 02:07:32 +00:00
Chris Lattner	677d85784a	Allow indvar simplify to canonicalize ANY affine IV, not just affine IVs with constant stride. This implements Transforms/IndVarsSimplify/variable-stride-ivs.ll llvm-svn: 22744	2005-08-10 01:12:06 +00:00
Chris Lattner	edff91a49a	Teach LSR to strength reduce IVs that have a loop-invariant but non-constant stride. For code like this: void foo(float *a, float *b, int n, int stride_a, int stride_b) { int i; for (i=0; i<n; i++) a[i*stride_a] = b[i*stride_b]; } we now emit: .LBB_foo2_2: ; no_exit lfs f0, 0(r4) stfs f0, 0(r3) addi r7, r7, 1 add r4, r2, r4 add r3, r6, r3 cmpw cr0, r7, r5 blt .LBB_foo2_2 ; no_exit instead of: .LBB_foo_2: ; no_exit mullw r8, r2, r7 ;; multiply! slwi r8, r8, 2 lfsx f0, r4, r8 mullw r8, r2, r6 ;; multiply! slwi r8, r8, 2 stfsx f0, r3, r8 addi r2, r2, 1 cmpw cr0, r2, r5 blt .LBB_foo_2 ; no_exit loops with variable strides occur pretty often. For example, in SPECFP2K there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp, 56 in 168.wupwise, 36 in 172.mgrid. Now we can allow indvars to turn functions written like this: void foo2(float *a, float *b, int n, int stride_a, int stride_b) { int i, ai = 0, bi = 0; for (i=0; i<n; i++) { a[ai] = b[bi]; ai += stride_a; bi += stride_b; } } into code like the above for better analysis. With this patch, they generate identical code. llvm-svn: 22740	2005-08-10 00:45:21 +00:00
Chris Lattner	dde7dc525e	Fix Regression/Transforms/LoopStrengthReduce/phi_node_update_multiple_preds.ll by being more careful about updating PHI nodes llvm-svn: 22739	2005-08-10 00:35:32 +00:00
Chris Lattner	c6c4d99a21	Fix some 80 column violations. Once we compute the evolution for a GEP, tell SE about it. This allows users of the GEP to know it, if the users are not direct. This allows us to compile this testcase: void fbSolidFillmmx(int w, unsigned char *d) { while (w >= 64) { *(unsigned long long *) (d + 0) = 0; *(unsigned long long *) (d + 8) = 0; *(unsigned long long *) (d + 16) = 0; *(unsigned long long *) (d + 24) = 0; *(unsigned long long *) (d + 32) = 0; *(unsigned long long *) (d + 40) = 0; *(unsigned long long *) (d + 48) = 0; *(unsigned long long *) (d + 56) = 0; w -= 64; d += 64; } } into: .LBB_fbSolidFillmmx_2: ; no_exit li r2, 0 stw r2, 0(r4) stw r2, 4(r4) stw r2, 8(r4) stw r2, 12(r4) stw r2, 16(r4) stw r2, 20(r4) stw r2, 24(r4) stw r2, 28(r4) stw r2, 32(r4) stw r2, 36(r4) stw r2, 40(r4) stw r2, 44(r4) stw r2, 48(r4) stw r2, 52(r4) stw r2, 56(r4) stw r2, 60(r4) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit instead of: .LBB_fbSolidFillmmx_2: ; no_exit li r11, 0 stw r11, 0(r4) stw r11, 4(r4) stwx r11, r10, r4 add r12, r10, r4 stw r11, 4(r12) stwx r11, r9, r4 add r12, r9, r4 stw r11, 4(r12) stwx r11, r8, r4 add r12, r8, r4 stw r11, 4(r12) stwx r11, r7, r4 add r12, r7, r4 stw r11, 4(r12) stwx r11, r6, r4 add r12, r6, r4 stw r11, 4(r12) stwx r11, r5, r4 add r12, r5, r4 stw r11, 4(r12) stwx r11, r2, r4 add r12, r2, r4 stw r11, 4(r12) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit llvm-svn: 22737	2005-08-09 23:39:36 +00:00
Chris Lattner	02742710f3	SCEVAddExpr::get() of an empty list is invalid. llvm-svn: 22724	2005-08-09 01:13:47 +00:00
Chris Lattner	a091ff1764	Implement: LoopStrengthReduce/share_ivs.ll Two changes: * Only insert one PHI node for each stride. Other values are live in values. This cannot introduce higher register pressure than the previous approach, and can take advantage of reg+reg addressing modes. * Factor common base values out of uses before moving values from the base to the immediate fields. This improves codegen by starting the stride-specific PHI node out at a common place for each IV use. As an example, we used to generate this for a loop in swim: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfd f0, 0(r8) stfd f0, 0(r3) lfd f0, 0(r6) stfd f0, 0(r7) lfd f0, 0(r2) stfd f0, 0(r5) addi r9, r9, 1 addi r2, r2, 8 addi r5, r5, 8 addi r6, r6, 8 addi r7, r7, 8 addi r8, r8, 8 addi r3, r3, 8 cmpw cr0, r9, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 now we emit: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfdx f0, r8, r2 stfdx f0, r9, r2 lfdx f0, r5, r2 stfdx f0, r7, r2 lfdx f0, r3, r2 stfdx f0, r6, r2 addi r10, r10, 1 addi r2, r2, 8 cmpw cr0, r10, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 As another more dramatic example, we used to emit this: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfd f0, 8(r21) lfd f4, 8(r3) lfd f5, 8(r27) lfd f6, 8(r22) lfd f7, 8(r5) lfd f8, 8(r6) lfd f9, 8(r30) lfd f10, 8(r11) lfd f11, 8(r12) fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfd f0, 8(r4) lfd f0, 8(r25) lfd f5, 8(r26) lfd f6, 8(r23) lfd f9, 8(r28) lfd f10, 8(r10) lfd f12, 8(r9) lfd f13, 8(r29) fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfd f0, 8(r24) lfd f0, 8(r8) fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfd f0, 8(r2) addi r20, r20, 1 addi r2, r2, 8 addi r8, r8, 8 addi r10, r10, 8 addi r12, r12, 8 addi r6, r6, 8 addi r29, r29, 8 addi r28, r28, 8 addi r26, r26, 8 addi r25, r25, 8 addi r24, r24, 8 addi r5, r5, 8 addi r23, r23, 8 addi r22, r22, 8 addi r3, r3, 8 addi r9, r9, 8 addi r11, r11, 8 addi r30, r30, 8 addi r27, r27, 8 addi r21, r21, 8 addi r4, r4, 8 cmpw cr0, r20, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 we now emit: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfdx f0, r21, r20 lfdx f4, r3, r20 lfdx f5, r27, r20 lfdx f6, r22, r20 lfdx f7, r5, r20 lfdx f8, r6, r20 lfdx f9, r30, r20 lfdx f10, r11, r20 lfdx f11, r12, r20 fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfdx f0, r4, r20 lfdx f0, r25, r20 lfdx f5, r26, r20 lfdx f6, r23, r20 lfdx f9, r28, r20 lfdx f10, r10, r20 lfdx f12, r9, r20 lfdx f13, r29, r20 fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfdx f0, r24, r20 lfdx f0, r8, r20 fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfdx f0, r2, r20 addi r19, r19, 1 addi r20, r20, 8 cmpw cr0, r19, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 llvm-svn: 22722	2005-08-09 00:18:09 +00:00
Chris Lattner	37c24cc98c	Suck the base value out of the UsersToProcess vector into the BasedUser class to simplify the code. Fuse two loops. llvm-svn: 22721	2005-08-08 22:56:21 +00:00
Chris Lattner	37ed895bf1	Split MoveLoopVariantsToImediateField out from MoveImmediateValues. The first is a correctness thing, and the latter is an optimization thing. This also is needed to support a future change. llvm-svn: 22720	2005-08-08 22:32:34 +00:00
Chris Lattner	9f269e40c9	Use the new 'moveBefore' method to simplify some code. Really, which is easier to understand? :) llvm-svn: 22706	2005-08-08 19:11:57 +00:00
Chris Lattner	14203e85b2	Not all constants are legal immediates in load/store instructions. llvm-svn: 22704	2005-08-08 06:25:50 +00:00
Chris Lattner	c70bbc0c41	Implement LoopStrengthReduce/share_code_in_preheader.ll by having one rewriter for all code inserted into the preheader, which is never flushed. llvm-svn: 22702	2005-08-08 05:47:49 +00:00
Chris Lattner	9bfa6f8784	Implement a simple optimization for the termination condition of the loop. The termination condition actually wants to use the post-incremented value of the loop, not a new indvar with an unusual base. On PPC, for example, this allows us to compile LoopStrengthReduce/exit_compare_live_range.ll to: _foo: li r2, 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r2, r2, 1 cmpw cr0, r2, r4 bne .LBB_foo_1 ; no_exit blr instead of: _foo: li r2, 1 ;; IV starts at 1, not 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r5, r2, 1 cmpw cr0, r2, r4 or r2, r5, r5 ;; Reg-reg copy, extra live range bne .LBB_foo_1 ; no_exit blr This implements LoopStrengthReduce/exit_compare_live_range.ll llvm-svn: 22699	2005-08-08 05:28:22 +00:00
Chris Lattner	579b20b747	All stats are "Number of ..." llvm-svn: 22694	2005-08-07 20:02:04 +00:00
Chris Lattner	2c14cf7b74	Add some simple folds that occur in bitfield cases. Fix a minor bug in isHighOnes, where it would consider 0 to have high ones. llvm-svn: 22693	2005-08-07 07:03:10 +00:00
Chris Lattner	134ebd0801	Fix typo llvm-svn: 22692	2005-08-07 07:00:52 +00:00
Chris Lattner	f4dd8c445c	* Use the new PHINode::hasConstantValue method to simplify some code * Teach this code to move allocas out of the loop when tail call eliminating a call marked 'tail'. This implements TailCallElim/move_alloca_for_tail_call.ll * Do not perform this transformation if a call is marked 'tail' and if there are allocas that we cannot move out of the loop in #2. Doing so would increase the stack usage of the function. This fixes PR615 and TailCallElim/dont-tce-tail-marked-call.ll. llvm-svn: 22690	2005-08-07 04:27:41 +00:00
Chris Lattner	11e7a5eda7	Make sure to clean CastedPointers after casts are potentially deleted. This fixes LSR crashes on 301.apsi, 191.fma3d, and 189.lucas llvm-svn: 22673	2005-08-05 01:30:11 +00:00
Chris Lattner	9f9c260b8c	now that hasConstantValue defaults to only returning values that dominate the PHI node, this ugly code can vanish. llvm-svn: 22672	2005-08-05 01:04:30 +00:00
Chris Lattner	257efb2ad3	This code can handle non-dominating instructions llvm-svn: 22667	2005-08-05 00:57:45 +00:00
Nate Begeman	b392321cae	Fix a fixme in CondPropagate.cpp by moving a PhiNode optimization into BasicBlock's removePredecessor routine. This requires shuffling around the definition and implementation of hasConstantValue from Utils.h,cpp into Instructions.h,cpp llvm-svn: 22664	2005-08-04 23:24:19 +00:00
Chris Lattner	45f8b6e7aa	Modify how immediates are removed from base expressions to deal with the fact that the symbolic evaluator is not always able to use subtraction to remove expressions. This makes the code faster, and fixes the last crash on 178.galgel. Finally, add a statistic to see how many phi nodes are inserted. On 178.galgel, we get the following stats: 2562 loop-reduce - Number of PHIs inserted 3927 loop-reduce - Number of GEPs strength reduced llvm-svn: 22662	2005-08-04 22:34:05 +00:00
Chris Lattner	a6d7c355bc	* Refactor some code into a new BasedUser::RewriteInstructionToUseNewBase method. * Fix a crash on 178.galgel, where we would insert expressions before PHI nodes instead of into the PHI node predecessor blocks. llvm-svn: 22657	2005-08-04 20:03:32 +00:00
Chris Lattner	0f7c0fa2a7	Fix a case that caused this to crash on 178.galgel llvm-svn: 22653	2005-08-04 19:26:19 +00:00
Chris Lattner	acc42c4df1	Teach LSR about loop-variant expressions, such as loops like this: for (i = 0; i < N; ++i) A[i][foo()] = 0; here we still want to strength reduce the A[i] part, even though foo() is loop-variant. This also simplifies some of the 'CanReduce' logic. This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll llvm-svn: 22652	2005-08-04 19:08:16 +00:00
Nate Begeman	456044b724	Remove some more dead code. llvm-svn: 22650	2005-08-04 18:13:56 +00:00
Chris Lattner	eaf24725b2	Refactor this code substantially with the following improvements: 1. We only analyze instructions once, guaranteed 2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with something much simpler. The next step is to handle expressions that are not all indvar+loop-invariant values (e.g. handling indvar+loopvariant). llvm-svn: 22649	2005-08-04 17:40:30 +00:00
Chris Lattner	6f286b760f	refactor some code llvm-svn: 22643	2005-08-04 01:19:13 +00:00
Chris Lattner	6510749050	invert two if's to make the logic simpler llvm-svn: 22641	2005-08-04 00:40:47 +00:00
Chris Lattner	a0102fbc4f	When processing outer loops and we find uses of an IV in inner loops, make sure to handle the use, just don't recurse into it. This permits us to generate this code for a simple nested loop case: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r29, 44(r1) stw r30, 40(r1) mflr r11 stw r11, 56(r1) lis r2, ha16(L_A$non_lazy_ptr) lwz r30, lo16(L_A$non_lazy_ptr)(r2) li r29, 1 .LBB_foo_1: ; no_exit.0 bl L_bar$stub li r2, 1 or r3, r30, r30 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r3) stfd f0, 0(r3) addi r4, r2, 1 addi r3, r3, 8 cmpwi cr0, r2, 100 or r2, r4, r4 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r30, r30, 800 addi r2, r29, 1 cmpwi cr0, r29, 100 or r29, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 40(r1) lwz r29, 44(r1) lwz r1, 0(r1) blr instead of this: _foo: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r28, 44(r1) ;; uses an extra register. stw r29, 40(r1) stw r30, 36(r1) mflr r11 stw r11, 56(r1) li r30, 1 li r29, 0 or r28, r29, r29 .LBB_foo_1: ; no_exit.0 bl L_bar$stub mulli r2, r28, 800 ;; unstrength-reduced multiply lis r3, ha16(L_A$non_lazy_ptr) ;; loop invariant address computation lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 mulli r4, r29, 800 ;; unstrength-reduced multiply addi r3, r3, 8 add r3, r4, r3 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 ;; multiple stride 8 IV's addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r28, r28, 1 ;;; Many IV's with stride 1 addi r29, r29, 1 addi r2, r30, 1 cmpwi cr0, r30, 100 or r30, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 36(r1) lwz r29, 40(r1) lwz r28, 44(r1) lwz r1, 0(r1) blr llvm-svn: 22640	2005-08-04 00:14:11 +00:00
Chris Lattner	fc62470466	Teach loop-reduce to see into nested loops, to pull out immediate values pushed down by SCEV. In a nested loop case, this allows us to emit this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 li r3, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r2) ;; Uses offset of 8 instead of 0 stfd f0, 0(r2) addi r4, r3, 1 addi r2, r2, 8 cmpwi cr0, r3, 100 or r3, r4, r4 bne .LBB_foo_2 ; no_exit.1 instead of this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 addi r3, r3, 8 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 llvm-svn: 22639	2005-08-03 23:44:42 +00:00
Chris Lattner	bb78c97e24	improve debug output llvm-svn: 22638	2005-08-03 23:30:08 +00:00
Chris Lattner	db23c74e5e	Move from Stage 0 to Stage 1. Only emit one PHI node for IV uses with identical bases and strides (after moving foldable immediates to the load/store instruction). This implements LoopStrengthReduce/dont_insert_redundant_ops.ll, allowing us to generate this PPC code for test1: or r30, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r30) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop instead of this code: or r30, r3, r3 or r29, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r29) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 ;; Two iv's with step of 8 addi r29, r29, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop llvm-svn: 22635	2005-08-03 22:51:21 +00:00
Chris Lattner	430d0022df	Rename IVUse to IVUsersOfOneStride, use a struct instead of a pair to unify some parallel vectors and get field names more descriptive than "first" and "second". This isn't lisp after all :) llvm-svn: 22633	2005-08-03 22:21:05 +00:00
Chris Lattner	84e9baa925	Fix a nasty dangling pointer issue. The ScalarEvolution pass would keep a map from instruction* to SCEVHandles. When we delete instructions, we have to tell it about it. We would run into nasty cases where new instructions were reallocated at old instruction addresses and get the old map values. Bad bad bad :( llvm-svn: 22632	2005-08-03 21:36:09 +00:00
Chris Lattner	3de05cc930	The correct fix for PR612, which also fixes Transforms/LowerInvoke/2005-08-03-InvokeWithPHIUse.ll llvm-svn: 22628	2005-08-03 18:51:44 +00:00
Chris Lattner	f8a81a9886	When inserting code, make sure not to insert it before PHI nodes. This fixes PR612 and Transforms/LowerInvoke/2005-08-03-InvokeWithPHI.ll llvm-svn: 22626	2005-08-03 18:34:29 +00:00
Chris Lattner	d683bdd0f8	Fix Transforms/SimplifyCFG/2005-08-03-PHIFactorCrash.ll, a problem that occurred while bugpointing another testcase llvm-svn: 22621	2005-08-03 17:59:45 +00:00
Chris Lattner	2dbf1960ff	Finally, add the required constraint checks to fix Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll the right way llvm-svn: 22615	2005-08-03 00:59:12 +00:00
Chris Lattner	908036942c	Simplify some code, add the correct pred checks llvm-svn: 22613	2005-08-03 00:38:27 +00:00
Chris Lattner	982b75c061	Refactor code out of PropagatePredecessorsForPHIs, turning it into a pure function with no side-effects llvm-svn: 22612	2005-08-03 00:29:26 +00:00
Chris Lattner	1f047fd513	use splice instead of remove/insert to avoid some symtab operations llvm-svn: 22611	2005-08-03 00:23:42 +00:00
Chris Lattner	76dc204488	move two functions up in the file, use SafeToMergeTerminators to eliminate some duplicated code llvm-svn: 22610	2005-08-03 00:19:45 +00:00
Chris Lattner	733d6704ce	Rip some code out of the main SimplifyCFG function into a subfunction and call it from the only place it is live. No functionality changes. llvm-svn: 22609	2005-08-03 00:11:16 +00:00
Chris Lattner	ac594de8dc	Disable this patch: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050801/027345.html This breaks real programs and only fixes an obscure regression testcase. A real fix is in development. llvm-svn: 22606	2005-08-02 23:31:38 +00:00
Chris Lattner	eee90f7eb4	Change a place to use an arbitrary value instead of null, when possible llvm-svn: 22605	2005-08-02 23:29:23 +00:00
Chris Lattner	22d00a8e90	Update to use the new MathExtras.h support for log2 computation. Patch contributed by Jim Laskey! llvm-svn: 22592	2005-08-02 19:16:58 +00:00
Chris Lattner	351b891cbc	Like the comment says, do not insert cast instructions before phi nodes llvm-svn: 22586	2005-08-02 03:31:14 +00:00
Chris Lattner	4fd3e16cbd	This code was very close, but not quite right. It did not take into consideration the case where a reference in an unreachable block could occur. This fixes Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll, something I ran into while bugpoint'ing another pass. llvm-svn: 22584	2005-08-02 03:24:05 +00:00
Chris Lattner	75a44e154e	add a comment, make a check more lenient llvm-svn: 22581	2005-08-02 02:52:02 +00:00
Chris Lattner	dcce49e006	Simplify for loop, clear a per-loop map after processing each loop llvm-svn: 22580	2005-08-02 02:44:31 +00:00
Chris Lattner	9ef1294210	Add a comment. Make LSR ignore GEP's that have loop variant base values, as we currently cannot codegen them llvm-svn: 22576	2005-08-02 01:32:29 +00:00
Chris Lattner	564900e5e5	Fix an iterator invalidation problem llvm-svn: 22575	2005-08-02 00:41:11 +00:00
Chris Lattner	e17c5d0e59	ConstantInt::get only works for arguments < 128. SimplifyLibCalls probably has to be audited to make sure it does not make this mistake elsewhere. Also, if this code knows that the type will be unsigned, obviously one arm of this is dead. Reid, can you take a look into this further? llvm-svn: 22566	2005-08-01 16:52:50 +00:00
Jeff Cohen	546fd5944e	Keep tabs and trailing spaces out. llvm-svn: 22565	2005-07-30 18:33:25 +00:00
Jeff Cohen	c500991055	Fix VC++ build problems. llvm-svn: 22564	2005-07-30 18:22:27 +00:00
Nate Begeman	17a0e2afea	Ack, typo llvm-svn: 22560	2005-07-30 00:21:31 +00:00
Nate Begeman	e68bcd1946	Commit a new LoopStrengthReduce pass that can use scalar evolutions and target data to decide which loop induction variables to strength reduce and how to do so. This work is mostly by Chris Lattner, with tweaks by me to get it working on some of MultiSource. llvm-svn: 22558	2005-07-30 00:15:07 +00:00
Nate Begeman	2bca4d9b7b	Break SCEVExpander out of IndVarSimplify into its own .h/.cpp file so that other passes may use it. llvm-svn: 22557	2005-07-30 00:12:19 +00:00
Jeff Cohen	5f4ef3c5a8	Eliminate all remaining tabs and trailing spaces. llvm-svn: 22523	2005-07-27 06:12:32 +00:00
Chris Lattner	31d0ac2414	ConvertibleToGEP always returns 0, remove some old crufty code which is actually dead because of this! llvm-svn: 22515	2005-07-26 16:38:28 +00:00
Chris Lattner	18aa4d8196	Do not let MaskedValueIsZero consider undef to be zero, for reasons explained in the comment. This fixes UnitTests/2003-09-18-BitFieldTest on darwin llvm-svn: 22483	2005-07-20 18:49:28 +00:00
Chris Lattner	247aef884c	When transforming &A[i] < &A[j] -> i < j, make sure to perform the comparison as a signed compare. This patch may fix PR597, but is correct in any case. llvm-svn: 22465	2005-07-18 23:07:33 +00:00
Chris Lattner	4ed40f7c6f	Fix a problem that instcombine would hit when dealing with unreachable code. Because the instcombine has to scan the entire function when it starts up to begin with, we might as well do it in DFO so we can nuke unreachable code. This fixes: Transforms/InstCombine/2005-07-07-DeadPHILoop.ll llvm-svn: 22348	2005-07-07 20:40:38 +00:00
Chris Lattner	937c71f2b3	Fix PR590 and Transforms/Mem2Reg/2005-06-30-ReadBeforeWrite.ll. The optimization for locally used allocas was not safe for allocas that were read before they were written. This change disables that optimization in that case. llvm-svn: 22318	2005-06-30 07:29:44 +00:00
John Criswell	810b4f8d55	Doh! Forgot to LLVMify the style. llvm-svn: 22312	2005-06-29 15:57:50 +00:00
John Criswell	4642afdcc1	Basic fix for PR#591; don't convert an fprintf() to an fwrite() if there is a mismatch in their character type pointers (i.e. fprintf() prints an array of ubytes while fwrite() takes an array of sbytes). We can probably do better than this (such as casting the ubyte to an sbyte). llvm-svn: 22310	2005-06-29 15:03:18 +00:00
Chris Lattner	9610c6f287	add a debug type llvm-svn: 22277	2005-06-24 16:00:46 +00:00
Andrew Lenharth	cf52eb2b99	prevent va_arg from being hoisted from a loop llvm-svn: 22265	2005-06-20 13:36:33 +00:00
Andrew Lenharth	d4b103107e	prevent DCE of vaarg intrinsics. This should take care of most regressions llvm-svn: 22263	2005-06-19 14:41:20 +00:00
Andrew Lenharth	9144ec4764	core changes for varargs llvm-svn: 22254	2005-06-18 18:34:52 +00:00
Reid Spencer	a7828baa3c	Fix a problem with the strcmp optimization checking the wrong string and not casting to the correct type. llvm-svn: 22250	2005-06-18 17:46:28 +00:00
Reid Spencer	4fdd96c4e0	Clean up some uninitialized variables and missing return statements that GCC 4.0.0 compiler (sometimes incorrectly) warns about under release build. llvm-svn: 22249	2005-06-18 17:37:34 +00:00
Chris Lattner	2ceb6ee576	This is not true: (X != 13 | X < 15) -> X < 15. It is actually always true. This fixes PR586 and Transforms/InstCombine/2005-06-16-SetCCOrSetCCMiscompile.ll llvm-svn: 22236	2005-06-17 03:59:17 +00:00
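The claim in the commit above is easy to check by brute force over a small range. This is an illustrative sketch only; `cond_or`, `cond_wrong_fold`, and `count_disagreements` are hypothetical names and do not correspond to anything in InstCombine.

```c
#include <assert.h>

/* The original condition from the commit message. */
static int cond_or(int x) { return (x != 13) | (x < 15); }

/* The incorrect simplification that the commit removed. */
static int cond_wrong_fold(int x) { return x < 15; }

/* Counts inputs in [lo, hi] where the wrong fold changes the result.
   Also asserts that the original condition holds for every input. */
static int count_disagreements(int lo, int hi) {
  int n = 0;
  for (int x = lo; x <= hi; x++) {
    assert(cond_or(x) == 1);
    if (cond_or(x) != cond_wrong_fold(x))
      n++;
  }
  return n;
}
```

Any `x` of 15 or more makes `x != 13` true while `x < 15` is false, so the two expressions disagree there; the only value that fails `x != 13` is 13 itself, which satisfies `x < 15`.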
Chris Lattner	73bcba5f61	Don't crash when dealing with INTMIN. This fixes PR585 and Transforms/InstCombine/2005-06-16-RangeCrash.ll llvm-svn: 22234	2005-06-17 02:05:55 +00:00
Chris Lattner	5e735294bf	Don't crash on: X = phi (X, X). This fixes PR584 and Transforms/SimplifyCFG/2005-06-16-PHICrash.ll llvm-svn: 22232	2005-06-17 01:45:53 +00:00
Chris Lattner	c53cb9d3ff	avoid constructing out of range shift amounts. llvm-svn: 22230	2005-06-17 01:29:28 +00:00
Chris Lattner	89dc4f16f5	Fix PR583 and testcase Transforms/InstCombine/2005-06-15-DivSelectCrash.ll llvm-svn: 22227	2005-06-16 04:55:52 +00:00
Chris Lattner	252a845e30	Fix PR571, removing code that does just the WRONG thing :) llvm-svn: 22225	2005-06-16 03:00:08 +00:00
Chris Lattner	104002bee3	Fix a bug in my previous patch. Do not get the shift amount type (which is always ubyte); get the type being shifted. This unbreaks espresso llvm-svn: 22224	2005-06-16 01:52:07 +00:00
Chris Lattner	d48b127aea	Fix PR575, patch provided by John Mellor-Crummey. Thanks! llvm-svn: 22223	2005-06-15 22:49:30 +00:00
Chris Lattner	df81539278	Fix PR582. The rewriter can move casts around, which invalidated the BB iterator. This fixes Transforms/IndVarsSimplify/2005-06-15-InstMoveCrash.ll llvm-svn: 22221	2005-06-15 21:29:31 +00:00
Chris Lattner	50bdfcb045	Do not promote globals only used by main to locals if there are constantexprs or other uses hanging off of them. llvm-svn: 22219	2005-06-15 21:11:48 +00:00
Chris Lattner	19b57f55aa	Fix PR577 and testcase InstCombine/2005-06-15-ShiftSetCCCrash.ll. Do not perform undefined out of range shifts. llvm-svn: 22217	2005-06-15 20:53:31 +00:00
Reid Spencer	a299d6f701	Put the hack back in that removes features, causes regressions to fail, but allows test programs to succeed. Actual fix for this is forthcoming. llvm-svn: 22213	2005-06-15 18:25:30 +00:00
Reid Spencer	6d231e55fa	Unbreak several InstCombine regression checks introduced by a hack to fix the bzip2 test. A better hack is needed. llvm-svn: 22209	2005-06-13 06:41:26 +00:00
Chris Lattner	1609a541cd	Fix a 64-bit problem, passing (int)0 through ... instead of (void*)0 llvm-svn: 22206	2005-06-09 03:32:54 +00:00
Chris Lattner	fbc45f10d0	Fix a problem on 64-bit targets where we passed (int)0 through ... instead of (void*)0. llvm-svn: 22205	2005-06-09 02:59:00 +00:00
Andrew Lenharth	ffe65458e7	hack to fix bzip2 (bug 571) llvm-svn: 22192	2005-06-04 12:43:56 +00:00
Reid Spencer	9fbad13dd7	Make the registration hash_map static. No other module needs it. Also, document what it's for a little better. llvm-svn: 22164	2005-05-21 01:27:04 +00:00
Reid Spencer	0b13cdabae	Adjust the file comment to read a little easier. llvm-svn: 22163	2005-05-21 00:57:44 +00:00
Reid Spencer	45bb4afc79	Make sure ... arguments are casted to sbyte* where needed. llvm-svn: 22162	2005-05-21 00:39:30 +00:00
Reid Spencer	895af9ef24	Add a "brief" comment for CastToCStr llvm-svn: 22161	2005-05-21 00:23:23 +00:00
Chris Lattner	f8053cee7c	Fix mismatched type problem that crashed on cases like this: sprintf(P, "%s", X); Where X is not an sbyte*. This fixes the bug JohnMC reported on llvm-bugs. llvm-svn: 22159	2005-05-20 22:22:25 +00:00
Chris Lattner	19f9f32a5c	Fix Transforms/SimplifyCFG/switch-simplify-crash.ll llvm-svn: 22158	2005-05-20 22:19:54 +00:00
Chris Lattner	05deb04cb0	teach the inliner about coldcc and noreturn functions llvm-svn: 22113	2005-05-18 04:30:33 +00:00
Reid Spencer	74305a6233	Don't look for __builtin_ffs, we'll never see it from llvm-gcc and there's no reason to include it for other front ends. llvm-svn: 22070	2005-05-15 21:27:34 +00:00
Reid Spencer	17f7784c5d	Provide this optimization as well: ffs(x) -> (x == 0 ? 0 : 1+llvm.cttz(x)) llvm-svn: 22068	2005-05-15 21:19:45 +00:00
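The ffs rewrite in the commit above can be checked against a straightforward reference. In this sketch `cttz_portable` is a hypothetical stand-in for the `llvm.cttz` intrinsic, and `ffs_reference` and `ffs_via_cttz` are names invented for illustration; none of this is the pass's actual code.

```c
#include <assert.h>
#include <limits.h>

/* Reference: 1-based index of the least significant set bit, 0 if none. */
static int ffs_reference(unsigned x) {
  for (int i = 0; i < (int)(sizeof x * CHAR_BIT); i++)
    if (x & (1u << i))
      return i + 1;
  return 0;
}

/* Hypothetical stand-in for llvm.cttz: counts trailing zero bits.
   The input must be nonzero, otherwise the loop would never end. */
static int cttz_portable(unsigned x) {
  assert(x != 0);
  int n = 0;
  while ((x & 1u) == 0) {
    x >>= 1;
    n++;
  }
  return n;
}

/* The rewritten form: ffs(x) -> (x == 0 ? 0 : 1 + cttz(x)). */
static int ffs_via_cttz(unsigned x) {
  return x == 0 ? 0 : 1 + cttz_portable(x);
}
```

The explicit zero check matters because the trailing-zero count has no meaningful bit index to report for a zero input, while `ffs` is defined to return 0 there.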
Reid Spencer	3de98ee643	Duh .. you actually have to #include Config/config.h before you can test for one of the values that it defines! llvm-svn: 22058	2005-05-15 17:20:47 +00:00
Reid Spencer	b195fcd5ef	Changes for ffs lib call simplification: * Check for availability of ffsll call in configure script * Support ffs, ffsl, and ffsll conversion to constant value if the argument is constant. llvm-svn: 22027	2005-05-14 16:42:52 +00:00
Chris Lattner	403d1c204c	Preserve calling conv when hacking on calls llvm-svn: 22025	2005-05-14 12:28:32 +00:00
Chris Lattner	05c703ea85	preserve calling conventions when hacking on code llvm-svn: 22024	2005-05-14 12:25:32 +00:00
Chris Lattner	bcefcf8552	Make sure to preserve the calling convention when changing an invoke into a call. This fixes Prolangs-C++/deriv2, kimwitu++, and Misc-C++/bigfib on X86 with -enable-x86-fastcc. llvm-svn: 22023	2005-05-14 12:21:56 +00:00
Chris Lattner	61d9d81770	calling a function with the wrong CC is undefined, turn it into an unreachable instruction. This is useful for catching optimizers that don't preserve calling conventions llvm-svn: 21928	2005-05-13 07:09:09 +00:00
Chris Lattner	ca968393ab	When lowering invokes to calls, make sure to preserve the calling conv. This fixes Ptrdist/anagram with x86 llcbeta llvm-svn: 21925	2005-05-13 06:27:02 +00:00
Chris Lattner	ae186e012c	Prefer int 0 instead of long 0 for GEP arguments. llvm-svn: 21924	2005-05-13 06:10:12 +00:00
Chris Lattner	31c667e234	Fix Reassociate/shifttest.ll llvm-svn: 21839	2005-05-10 03:39:25 +00:00
Chris Lattner	bfc796f622	If a function contains no allocas, all of the calls in it are trivially suitable for tail calls. llvm-svn: 21836	2005-05-09 23:51:13 +00:00
Chris Lattner	b62f5082c5	implement and.ll:test33 llvm-svn: 21809	2005-05-09 04:58:36 +00:00
Chris Lattner	d0525a29d1	Preserve calling conventions when doing IPO llvm-svn: 21798	2005-05-09 01:05:50 +00:00
Chris Lattner	21d1dde72a	wrap long lines, preserve calling conventions when cloning functions and turning calls into invokes llvm-svn: 21797	2005-05-09 01:04:34 +00:00
Chris Lattner	a4c8022caf	Convert non-address taken functions with C calling conventions to fastcc. llvm-svn: 21791	2005-05-08 22:18:06 +00:00
Chris Lattner	df3332660f	Implement Reassociate/mul-neg-add.ll llvm-svn: 21788	2005-05-08 21:41:35 +00:00
Chris Lattner	c4f8e2b0ed	Bail out earlier llvm-svn: 21786	2005-05-08 21:33:47 +00:00
Chris Lattner	877b114037	Teach reassociate that 0-X === X*-1 llvm-svn: 21785	2005-05-08 21:28:52 +00:00
Chris Lattner	9f284e0a3c	Fix PR557 and basictest[34].ll. This makes reassociate realize that loads should be treated as unmovable, and gives distinct ranks to distinct values defined in the same basic block, allowing reassociate to do its thing. llvm-svn: 21783	2005-05-08 20:57:04 +00:00
Chris Lattner	9187f3905e	Add debugging information llvm-svn: 21781	2005-05-08 20:09:57 +00:00
Chris Lattner	08582be283	eliminate gotos llvm-svn: 21780	2005-05-08 19:48:43 +00:00
Chris Lattner	5847e5e10c	Improve reassociation handling of inverses, implementing inverses.ll. llvm-svn: 21778	2005-05-08 18:59:37 +00:00
Chris Lattner	4922118dc4	clean up and modernize this pass. llvm-svn: 21776	2005-05-08 18:45:26 +00:00
Chris Lattner	b18dbbfff5	Strength reduce SAR into SHR if there is no way sign bits could be shifted in. This tends to get cases like this: X = cast ubyte to int Y = shr int X, ... Tested by: shift.ll:test24 llvm-svn: 21775	2005-05-08 17:34:56 +00:00
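The case named in the commit above, a value widened from an unsigned byte and then shifted right, can be modelled in C. This is a sketch under the assumption that `int` is wider than `unsigned char`; `shift_arith`, `shift_logical`, and `shifts_agree` are hypothetical names.

```c
#include <assert.h>

/* Signed (arithmetic) right shift of a value widened from an unsigned byte.
   The widened value is never negative, so no sign bits can be shifted in. */
static int shift_arith(unsigned char b, int k) {
  int x = b;
  return x >> k;
}

/* Unsigned (logical) right shift of the same value. */
static int shift_logical(unsigned char b, int k) {
  unsigned x = b;
  return (int)(x >> k);
}

/* Returns 1 when both shifts agree for every byte and shift amount 0..7. */
static int shifts_agree(void) {
  for (int b = 0; b <= 255; b++)
    for (int k = 0; k < 8; k++)
      if (shift_arith((unsigned char)b, k) != shift_logical((unsigned char)b, k))
        return 0;
  return 1;
}
```

Because the top bit of the widened value is known to be zero, the two shifts are interchangeable, which is what lets the optimizer pick the logical form.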
Chris Lattner	e1850b86b6	Refactor some code llvm-svn: 21772	2005-05-08 00:19:31 +00:00
Chris Lattner	6e2086d7e4	Handle some simple cases where we can see that values get annihilated. llvm-svn: 21771	2005-05-08 00:08:33 +00:00
Chris Lattner	4294cec0f1	Fix a miscompilation of crafty by clobbering the "A" variable. llvm-svn: 21770	2005-05-07 23:49:08 +00:00
Chris Lattner	1e5065052a	Rewrite the guts of the reassociate pass to be more efficient and logical. Instead of trying to do local reassociation tweaks at each level, only process an expression tree once (at its root). This does not improve the reassociation pass in any real way. llvm-svn: 21768	2005-05-07 21:59:39 +00:00
Reid Spencer	170ae7ff70	* Add two strlen optimizations: strlen(x) != 0 -> *x != 0 strlen(x) == 0 -> *x == 0 * Change nested statistics to use style of other LLVM statistics so that only the name of the optimization (simplify-libcalls) is used as the statistic name, and the description indicates which specific call is optimized. Cuts down on some redundancy and saves a few bytes of space. * Make note of stpcpy optimization that could be done. llvm-svn: 21766	2005-05-07 20:15:59 +00:00
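The strlen rewrites in the commit above replace a library call with a comparison against zero. The sketch below assumes that the comparison is made on the first character of the string; the function names are hypothetical and only illustrate the equivalence.

```c
#include <assert.h>
#include <string.h>

/* Emptiness test written with the library call. */
static int nonempty_via_strlen(const char *x) {
  assert(x != NULL);
  return strlen(x) != 0;
}

/* Equivalent test that only inspects the first character. */
static int nonempty_via_first_char(const char *x) {
  assert(x != NULL);
  return *x != 0;
}
```

A string has nonzero length exactly when its first character is not the terminator, so the second form gives the same answer without scanning the whole string.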
Reid Spencer	4f01a822b4	Don't increment the counter unless the debug flag is set. llvm-svn: 21762	2005-05-07 04:59:45 +00:00
Chris Lattner	cea579932d	Convert shifts to muls to assist reassociation. This implements Reassociate/shifttest.ll llvm-svn: 21761	2005-05-07 04:24:13 +00:00
Chris Lattner	f43e974abd	Simplify the code and rearrange it. No major functionality changes here. llvm-svn: 21759	2005-05-07 04:08:02 +00:00
Chris Lattner	7effa0ed06	BAD typo which caused many testsuite failures last night. Note to self, do not change code after testing it without retesting! llvm-svn: 21741	2005-05-06 17:13:16 +00:00
Chris Lattner	6aacb0f9da	Preserve tail marker llvm-svn: 21737	2005-05-06 06:48:21 +00:00
Chris Lattner	9f3dced2c7	Implement Transforms/Inline/inline-tail.ll llvm-svn: 21736	2005-05-06 06:47:52 +00:00
Chris Lattner	324d2eedb2	preserve the tail marker llvm-svn: 21734	2005-05-06 06:46:58 +00:00
Chris Lattner	53db546b97	Wrap long lines llvm-svn: 21720	2005-05-06 05:34:40 +00:00
Chris Lattner	a36d525741	DCE intrinsic instructions without side effects. llvm-svn: 21719	2005-05-06 05:27:34 +00:00
Chris Lattner	ef298a3b8a	Teach instcombine propagate zeroness through shl instructions, implementing and.ll:test31 llvm-svn: 21717	2005-05-06 04:53:20 +00:00
Chris Lattner	873804168e	Implement shift.ll:test23. If we are shifting right then immediately truncating the result, turn signed shift rights into unsigned shift rights if possible. This leads to later simplification and happens often in 176.gcc. For example, this testcase: struct xxx { unsigned int code : 8; }; enum codes { A, B, C, D, E, F }; int foo(struct xxx *P) { if ((enum codes)P->code == A) bar(); } used to be compiled to: int %foo(%struct.xxx* %P) { %tmp.1 = getelementptr %struct.xxx* %P, int 0, uint 0 ; <uint> [#uses=1] %tmp.2 = load uint %tmp.1 ; <uint> [#uses=1] %tmp.3 = cast uint %tmp.2 to int ; <int> [#uses=1] %tmp.4 = shl int %tmp.3, ubyte 24 ; <int> [#uses=1] %tmp.5 = shr int %tmp.4, ubyte 24 ; <int> [#uses=1] %tmp.6 = cast int %tmp.5 to sbyte ; <sbyte> [#uses=1] %tmp.8 = seteq sbyte %tmp.6, 0 ; <bool> [#uses=1] br bool %tmp.8, label %then, label %UnifiedReturnBlock Now it is compiled to: %tmp.1 = getelementptr %struct.xxx* %P, int 0, uint 0 ; <uint> [#uses=1] %tmp.2 = load uint %tmp.1 ; <uint> [#uses=1] %tmp.2 = cast uint %tmp.2 to sbyte ; <sbyte> [#uses=1] %tmp.8 = seteq sbyte %tmp.2, 0 ; <bool> [#uses=1] br bool %tmp.8, label %then, label %UnifiedReturnBlock which is the difference between this: foo: subl $4, %esp movl 8(%esp), %eax movl (%eax), %eax shll $24, %eax sarl $24, %eax testb %al, %al jne .LBBfoo_2 and this: foo: subl $4, %esp movl 8(%esp), %eax movl (%eax), %eax testb %al, %al jne .LBBfoo_2 This occurs 3243 times total in the External tests, 215x in povray, 6x in each f2c'd program, 1451x in 176.gcc, 7x in crafty, 20x in perl, 25x in gap, 3x in m88ksim, 25x in ijpeg. Maybe this will cause a little jump on gcc tomorrow :) llvm-svn: 21715	2005-05-06 04:18:52 +00:00
Chris Lattner	7208616ec0	Implement xor.ll:test22 llvm-svn: 21713	2005-05-06 02:07:39 +00:00
Chris Lattner	4c2d3781aa	implement and.ll:test30 and set.ll:test21 llvm-svn: 21712	2005-05-06 01:53:19 +00:00
Chris Lattner	dd1e562ec3	implement or.ll:test20 llvm-svn: 21709	2005-05-06 00:58:50 +00:00
Chris Lattner	807aa20f67	Fix a bug compiling Ruby, fixing this testcase: LowerSetJmp/2005-05-05-OldUses.ll llvm-svn: 21696	2005-05-05 15:47:43 +00:00
Chris Lattner	809dfac421	Instcombine: cast (X != 0) to int, cast (X == 1) to int -> X iff X has only the low bit set. This implements set.ll:test20. This triggers 2x on povray, 9x on mesa, 11x on gcc, 2x on crafty, 1x on eon, 6x on perlbmk and 11x on m88ksim. It allows us to compile these two functions into the same code: struct s { unsigned int bit : 1; }; unsigned foo(struct s *p) { if (p->bit) return 1; else return 0; } unsigned bar(struct s *p) { return p->bit; } llvm-svn: 21690	2005-05-04 19:10:26 +00:00
Reid Spencer	282d057485	Implement the IsDigitOptimization for simplifying calls to the isdigit library function: isdigit(chr) -> 0 or 1 if chr is constant isdigit(chr) -> chr - '0' <= 9 otherwise Although there are many calls to isdigit in llvm-test, most of them are compiled away by macros leaving only this: 2 MultiSource/Applications/hexxagon llvm-svn: 21688	2005-05-04 18:58:28 +00:00
Reid Spencer	1e520fd661	* Correct the function prototypes for some of the functions to match the actual spec (int -> uint) * Add the ability to get/cache the strlen function prototype. * Make sure generated values are appropriately named for debugging purposes * Add the SPrintFOptimization for 4 cases of sprintf optimization: sprintf(str,cstr) -> llvm.memcpy(str,cstr) (if cstr has no %) sprintf(str,"") -> store sbyte 0, str sprintf(str,"%s",src) -> llvm.memcpy(str,src) (if src is constant) sprintf(str,"%c",chr) -> store chr, str ; store sbyte 0, str+1 The sprintf optimization didn't fire as much as I had hoped: 2 MultiSource/Applications/SPASS 5 MultiSource/Benchmarks/McCat/18-imp 22 MultiSource/Benchmarks/Prolangs-C/TimberWolfMC 1 MultiSource/Benchmarks/Prolangs-C/assembler 6 MultiSource/Benchmarks/Prolangs-C/unix-smail 2 MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec llvm-svn: 21679	2005-05-04 03:20:21 +00:00
Reid Spencer	38cabd7265	Implement optimizations for the strchr and llvm.memset library calls. Neither of these activated as many times as was hoped: strchr: 9 MultiSource/Applications/siod 1 MultiSource/Applications/d 2 MultiSource/Prolangs-C/archie-client 1 External/SPEC/CINT2000/176.gcc/176.gcc llvm.memset: no hits llvm-svn: 21669	2005-05-03 07:23:44 +00:00
Reid Spencer	95d8efdfcf	Avoid garbage output in the statistics display by ensuring that the strings passed to Statistic's constructor are not destructible. The stats are printed during static destruction and the SimplifyLibCalls module was getting destructed before the statistics. llvm-svn: 21661	2005-05-03 02:54:54 +00:00
Reid Spencer	49fa070401	Add the StrNCmpOptimization which is similar to strcmp. Unfortunately, this optimization didn't trigger on any llvm-test tests. llvm-svn: 21660	2005-05-03 01:43:45 +00:00
Reid Spencer	2d5c7beebd	Implement the fprintf optimization which converts calls like this: fprintf(F,"hello") -> fwrite("hello",strlen("hello"),1,F) fprintf(F,"%s","hello") -> fwrite("hello",strlen("hello"),1,F) fprintf(F,"%c",'x') -> fputc('x',F) This optimization fires several times in llvm-test: 313 MultiSource/Applications/Burg 302 MultiSource/Benchmarks/Prolangs-C/TimberWolfMC 189 MultiSource/Benchmarks/Prolangs-C/mybison 175 MultiSource/Benchmarks/Prolangs-C/football 130 MultiSource/Benchmarks/Prolangs-C/unix-tbl llvm-svn: 21657	2005-05-02 23:59:26 +00:00
John Criswell	f42ed7bdaf	Fixed a comment. llvm-svn: 21653	2005-05-02 14:47:42 +00:00
Chris Lattner	a816eee427	Implement getelementptr.ll:test11 llvm-svn: 21647	2005-05-01 04:42:15 +00:00
Chris Lattner	a9d84e3388	Check for volatile loads only once. Implement load.ll:test7 llvm-svn: 21645	2005-05-01 04:24:53 +00:00
Reid Spencer	16449a9eb0	Fix a comment that stated the wrong thing. llvm-svn: 21638	2005-04-30 06:45:47 +00:00
Reid Spencer	4c444fe007	* Don't depend on "guessing" what a FILE* is, just require that the actual type be obtained from a CallInst we're optimizing. * Make it possible for getConstantStringLength to return the ConstantArray that it extracts in case the content is needed by an Optimization. * Implement the strcmp optimization * Implement the toascii optimization This pass is now firing several to many times in the following MultiSource tests: Applications/Burg - 7 (strcat,strcpy) Applications/siod - 13 (strcat,strcpy,strlen) Applications/spiff - 120 (exit,fputs,strcat,strcpy,strlen) Applications/treecc - 66 (exit,fputs,strcat,strcpy) Applications/kimwitu++ - 34 (strcmp,strcpy,strlen) Applications/SPASS - 588 (exit,fputs,strcat,strcpy,strlen) llvm-svn: 21626	2005-04-30 03:17:54 +00:00
Reid Spencer	9361697f93	Implement the optimizations for "pow" and "fputs" library calls. llvm-svn: 21618	2005-04-29 09:39:47 +00:00
Reid Spencer	c968ea0495	Remove optimizations that don't require both operands to be constant. These are moved to simplify-libcalls pass. llvm-svn: 21614	2005-04-29 05:55:35 +00:00
Jeff Cohen	4bc952f703	Consistently use 'class' to silence VC++ llvm-svn: 21612	2005-04-29 03:05:44 +00:00
Reid Spencer	ed55a6b5e0	* Add constant folding for additional floating point library calls such as sinh, cosh, etc. * Make the name comparisons for the fp libcalls a little more efficient by switching on the first character of the name before doing comparisons. llvm-svn: 21611	2005-04-28 23:01:59 +00:00
Reid Spencer	16983ca865	Remove from the TODO list those optimizations that are already handled by constant folding implemented in lib/Transforms/Utils/Local.cpp. llvm-svn: 21604	2005-04-28 18:05:16 +00:00
Reid Spencer	649ac283e4	Document additional libcall transformations that need to be written. Help Wanted! There's a lot of them to write. llvm-svn: 21603	2005-04-28 04:40:06 +00:00
Reid Spencer	7ddcfb3375	Doxygenate. llvm-svn: 21602	2005-04-27 21:29:20 +00:00
Chris Lattner	36ffb1ff37	remove 'statement with no effect' warning llvm-svn: 21600	2005-04-27 20:12:17 +00:00
Reid Spencer	08b4940509	More Cleanup: * Name the instructions by appending to name of original * Factor common part out of a switch statement. llvm-svn: 21597	2005-04-27 17:46:54 +00:00
Reid Spencer	e249a82e73	This is a cleanup commit: * Correct stale documentation in a few places * Re-order the file to better associate things and reduce line count * Make the pass thread safe by caching the Function* objects needed by the optimizers in the pass object instead of globally. * Provide the SimplifyLibCalls pass object to the optimizer classes so they can access cached Function* objects and TargetData info * Make sure the pass resets its cache if the Module passed to runOnModule changes * Rename CallOptimizer to LibCallOptimization. All the classes are named Optimization while the objects are Optimizer. * Don't cache Function* in the optimizer objects because they could be used by multiple PassManager's running in multiple threads * Add an optimization for strcpy which is similar to strcat * Add a "TODO" list at the end of the file for ideas on additional libcall optimizations that could be added (get ideas from other compilers). Sorry for the huge diff. It's mostly reorganization of code. That won't happen again as I believe the design and infrastructure for this pass is now done or close to it. llvm-svn: 21589	2005-04-27 07:54:40 +00:00
Chris Lattner	93f4e9dd26	detect functions that never return, and turn the instruction following a call to them into an 'unreachable' instruction. This triggers a bunch of times, particularly on gcc: gzip: 36 gcc: 601 eon: 12 bzip: 38 llvm-svn: 21587	2005-04-27 04:52:23 +00:00
Reid Spencer	dc11db68b6	Prefix the debug statistics so they group together. llvm-svn: 21583	2005-04-27 00:20:23 +00:00
Reid Spencer	e95a647b2a	In debug builds, make a statistic for each kind of call optimization. This helps track down what gets triggered in the pass so it's easier to identify good test cases. llvm-svn: 21582	2005-04-27 00:05:45 +00:00
Chris Lattner	7f4f773e9f	This analysis doesn't take 'throwing' into consideration, it looks at 'unwinding' llvm-svn: 21581	2005-04-26 23:53:25 +00:00
Reid Spencer	f9d4be187f	Fix up the debug statement to actually use a newline .. radical concept. llvm-svn: 21580	2005-04-26 23:07:08 +00:00
Reid Spencer	18b998192f	Uh, this isn't argpromotion. llvm-svn: 21579	2005-04-26 23:05:17 +00:00
Reid Spencer	2bc7a4f82a	Add some debugging output so we can tell which calls are getting triggered llvm-svn: 21578	2005-04-26 23:02:16 +00:00
Reid Spencer	f8c03d9db6	No, seriously folks, memcpy really does return void. llvm-svn: 21575	2005-04-26 22:49:48 +00:00
Reid Spencer	aaca170867	memcpy returns void!!!!! llvm-svn: 21574	2005-04-26 22:46:23 +00:00
Reid Spencer	4855ebf622	Fix some bugs found by running on llvm-test: * MemCpyOptimization can only be optimized if the 3rd and 4th arguments are constants and we weren't checking for that. * The result of llvm.memcpy (and llvm.memmove) is void* not sbyte*, put in a cast. llvm-svn: 21570	2005-04-26 19:55:57 +00:00
Reid Spencer	bb92b4fdfb	Changes From Review Feedback: * Have the SimplifyLibCalls pass acquire the TargetData and pass it down to the optimization classes so they can use it to make better choices for the signatures of functions, etc. * Rearrange the code a little so the utility functions are closer to their usage and keep the core of the pass near the top of the files. * Adjust the StrLen pass to get/use the correct prototype depending on the TargetData::getIntPtrType() result. The result of strlen is size_t which could be either uint or ulong depending on the platform. * Clean up some coding nits (cast vs. dyn_cast, remove redundant items from a switch, etc.) * Implement the MemMoveOptimization as a twin of MemCpyOptimization (they only differ in name). llvm-svn: 21569	2005-04-26 19:13:17 +00:00
Chris Lattner	bd43b9db9d	Fix the compile failures from last night. llvm-svn: 21565	2005-04-26 14:40:41 +00:00
Reid Spencer	b4f7b83dce	* Merge get_GVInitializer and getCharArrayLength into a single function named getConstantStringLength. This is the common part of StrCpy and StrLen optimizations and probably several others, yet to be written. It performs all the validity checks for looking at constant arrays that are supposed to be null-terminated strings and then computes the actual length of the string. * Implement the MemCpyOptimization class. This just turns memcpy of 1, 2, 4 and 8 byte data blocks that are properly aligned on those boundaries into a load and a store. Much more could be done here but alignment restrictions and lack of knowledge of the target instruction set prevent us from doing significantly more. That will have to be delegated to the code generators as they lower llvm.memcpy calls. llvm-svn: 21562	2005-04-26 07:45:18 +00:00
Reid Spencer	76dab9a523	* Implement StrLenOptimization * Factor out commonalities between StrLenOptimization and StrCatOptimization * Make sure that signatures return sbyte* not void* llvm-svn: 21559	2005-04-26 05:24:00 +00:00
Reid Spencer	8ee5aacc38	Incorporate feedback from Chris: * Change signatures of OptimizeCall and ValidateCalledFunction so they are non-const, allowing the optimization object to be modified. This is in support of caching things used across multiple calls. * Provide two functions for constructing and caching function types * Modify the StrCatOptimization to cache Function objects for strlen and llvm.memcpy so it doesn't regenerate them on each call site. Make sure these are invalidated each time we start the pass. * Handle both a GEP Instruction and a GEP ConstantExpr * Add additional checks to make sure we really are dealing with an array of sbyte and that all the element initializers are ConstantInt or ConstantExpr that reduce to ConstantInt. * Make sure the GlobalVariable is constant! * Don't use ConstantArray::getString as it can fail and it doesn't give us the right thing. We must check for null bytes in the middle of the array. * Use llvm.memcpy instead of memcpy so we can factor alignment into it. * Don't use void* types in signatures, replace with sbyte* instead. llvm-svn: 21555	2005-04-26 03:26:15 +00:00
Reid Spencer	fe91dfec91	Changes due to code review and new implementation: * Don't use std::string for the function names, const char* will suffice * Allow each CallOptimizer to validate the function signature before doing anything * Repeatedly loop over the functions until an iteration produces no more optimizations. This allows one optimization to insert a call that is optimized by another optimization. * Implement the ConstantArray portion of the StrCatOptimization * Provide a template for the MemCpyOptimization * Make ExitInMainOptimization split the block, not delete everything after the return instruction. (This covers revision 1.3 and 1.4, as the 1.3 comments were botched) llvm-svn: 21548	2005-04-25 21:20:38 +00:00
Reid Spencer	f2534c7291	Lots of changes based on review and new functionality: * Use a llvm-svn: 21546	2005-04-25 21:11:48 +00:00
Chris Lattner	a21bf8d1be	implement getelementptr.ll:test10 llvm-svn: 21541	2005-04-25 20:17:30 +00:00
Reid Spencer	9bbaa2ab7f	Post-Review Cleanup: * Fix comments at top of file * Change algorithm for running the call optimizations from n*n to something closer to n. Use a hash_map to store and lookup the optimizations since there will eventually (or potentially) be a large number of them. This gets lookup based on the name of the function to O(1). Each CallOptimizer now has a std::string member named func_name that tracks the name of the function that it applies to. It is this string that is entered into the hash_map for fast comparison against the function names encountered in the module. * Cleanup some style issues pertaining to iterator invalidation * Don't pass the Function pointer to the OptimizeCall function because if the optimization needs it, it can get it from the CallInst passed in. * Add the skeleton for a new CallOptimizer, StrCatOptimizer which will eventually replace strcat's of constant strings with direct copies. llvm-svn: 21526	2005-04-25 03:59:26 +00:00
Reid Spencer	39a762d149	A new pass to provide specific optimizations for certain well-known library calls. The pass visits all external functions in the module and determines if such function calls can be optimized. The optimizations are specific to the library calls involved. This initial version only optimizes calls to exit(3) when they occur in main(): it changes them to ret instructions. llvm-svn: 21522	2005-04-25 02:53:12 +00:00
Chris Lattner	2f1457fd83	Eliminate cases where we could << by 64, which is undefined in C. llvm-svn: 21500	2005-04-24 17:46:05 +00:00
Chris Lattner	d6f636a340	Implement xor.ll:test21: select (not C), A, B -> select C, B, A llvm-svn: 21495	2005-04-24 07:30:14 +00:00
Chris Lattner	d1f46d3bf9	Use getPrimitiveSizeInBits() instead of getPrimitiveSize()*8 Completely rework the 'setcc (cast x to larger), y' code. This code has the advantage of implementing setcc.ll:test19 (being more general than the previous code) and being correct in all cases. This allows us to unxfail 2004-11-27-SetCCForCastLargerAndConstant.ll, and close PR454. llvm-svn: 21491	2005-04-24 06:59:08 +00:00
Jeff Cohen	82639853c0	Eliminate tabs and trailing spaces llvm-svn: 21480	2005-04-23 21:38:35 +00:00
Chris Lattner	77c32c34d7	Generalize the setcc -> PHI and Select folding optimizations to work with any constant RHS, not just a constant integer RHS. This implements select.ll:test17 llvm-svn: 21470	2005-04-23 15:31:55 +00:00
Misha Brukman	b1c9317bb4	Remove trailing whitespace llvm-svn: 21427	2005-04-21 23:48:37 +00:00
Chris Lattner	a3159af703	Fix a bug where we would not promote calls to invokes if they occurred in the same block as the setjmp. Thanks to Greg Pettyjohn for noticing this! llvm-svn: 21403	2005-04-21 16:46:46 +00:00
Chris Lattner	7ceb081f3f	Improve doxygen documentation, patch contributed by Evan Jones! llvm-svn: 21393	2005-04-21 16:04:49 +00:00
Chris Lattner	374e659466	Instcombine this: %shortcirc_val = select bool %tmp.1, bool true, bool %tmp.4 ; <bool> [#uses=1] %tmp.6 = cast bool %shortcirc_val to int ; <int> [#uses=1] into this: %shortcirc_val = or bool %tmp.1, %tmp.4 ; <bool> [#uses=1] %tmp.6 = cast bool %shortcirc_val to int ; <int> [#uses=1] not this: %tmp.4.cast = cast bool %tmp.4 to int ; <int> [#uses=1] %tmp.6 = select bool %tmp.1, int 1, int %tmp.4.cast ; <int> [#uses=1] llvm-svn: 21389	2005-04-21 05:43:13 +00:00
Chris Lattner	b38b443b15	Teach simplifycfg that setcc is cheap and non-trapping, so that it can convert this: %tmp.1 = seteq int %i, 0 ; <bool> [#uses=1] br bool %tmp.1, label %shortcirc_done, label %shortcirc_next shortcirc_next: ; preds = %entry %tmp.4 = seteq int %j, 0 ; <bool> [#uses=1] br label %shortcirc_done shortcirc_done: ; preds = %shortcirc_next, %entry %shortcirc_val = phi bool [ %tmp.4, %shortcirc_next ], [ true, %entry ] ; <bool> [#uses=1] to this: %tmp.1 = seteq int %i, 0 ; <bool> [#uses=1] %tmp.4 = seteq int %j, 0 ; <bool> [#uses=1] %shortcirc_val = select bool %tmp.1, bool true, bool %tmp.4 ; <bool> [#uses=1] ... which is later simplified by instcombine into an or. llvm-svn: 21388	2005-04-21 05:31:13 +00:00
Chris Lattner	8cb10a1775	Wrap some long lines. Make IPSCCP strip off dead constant exprs that are using functions, making them appear as though their address is taken. This allows us to propagate some more pool descriptors, lowering the overhead of pool alloc. llvm-svn: 21363	2005-04-19 19:16:19 +00:00
Chris Lattner	5c219469a0	Eliminate a broken transformation, fixing PR548 llvm-svn: 21354	2005-04-19 06:04:18 +00:00
Chris Lattner	ee84413730	silence a bogus warning llvm-svn: 21320	2005-04-18 05:26:21 +00:00
Chris Lattner	16a50fd0a0	a new simple pass, which will be extended to be more useful in the future. This pass forwards branches through conditions when it can show that the condition is either always true or always false for a predecessor. This currently only handles the simplest cases of this, but is successful at threading across 2489 branches and 65 switch instructions in 176.gcc, which isn't bad. llvm-svn: 21306	2005-04-15 19:28:32 +00:00
Chris Lattner	95f16a3ac4	Get rid of this for_each loop llvm-svn: 21253	2005-04-12 18:51:33 +00:00
Chris Lattner	4236261930	Fix bug: InstCombine/2005-05-07-UDivSelectCrash.ll llvm-svn: 21152	2005-04-08 04:03:26 +00:00
Chris Lattner	4706046e68	Implement the following xforms: (X-Y)-X --> -Y A + (B - A) --> B (B - A) + A --> B llvm-svn: 21138	2005-04-07 17:14:51 +00:00
Chris Lattner	c7f3c1a00e	Implement InstCombine/add.ll:test28, transforming C1-(X+C2) --> (C1-C2)-X. This occurs several dozen times in specint2k, particularly in crafty and gcc apparently. llvm-svn: 21136	2005-04-07 16:28:01 +00:00
Chris Lattner	a9be4490d8	Transform X-(X+Y) == -Y and X-(Y+X) == -Y llvm-svn: 21134	2005-04-07 16:15:25 +00:00
Chris Lattner	ecfa9b5810	disable this transformation in the one obscure case that really pessimizes pointer analysis. llvm-svn: 20916	2005-03-29 06:37:47 +00:00
Alkis Evlogimenos	9ead0d7b4c	Rename createPromoteMemoryToRegister() to createPromoteMemoryToRegisterPass() to be consistent with other pass creation functions. llvm-svn: 20885	2005-03-28 02:01:12 +00:00
Chris Lattner	514e843e89	Enhance loopsimplify to preserve alias analysis instead of clobbering it. This prevents crashes on some programs when using -ds-aa -licm. llvm-svn: 20831	2005-03-25 06:37:22 +00:00
Chris Lattner	faf7791fea	Fix a bug where LICM was not updating AA information properly when sinking a pointer value out of a loop causing it to be duplicated. llvm-svn: 20828	2005-03-25 00:22:36 +00:00
Chris Lattner	1c790bf656	enable -debug-only=licm llvm-svn: 20788	2005-03-23 21:00:12 +00:00
Chris Lattner	7b9020a059	Fix the missing symbols problem Bill was hitting. Patch contributed by Bill Wendling!! llvm-svn: 20649	2005-03-17 15:38:16 +00:00
Chris Lattner	6cb4559369	stop using method. llvm-svn: 20603	2005-03-15 05:19:49 +00:00
Chris Lattner	531f9e92d4	This mega patch converts us from using Function::a{iterator\|begin\|end} to using Function::arg_{iterator\|begin\|end}. Likewise Module::g* -> Module::global_*. This patch is contributed by Gabor Greif, thanks! llvm-svn: 20597	2005-03-15 04:54:21 +00:00
Chris Lattner	8c79559443	fix a bug where we thought arguments were constants :( llvm-svn: 20506	2005-03-06 22:52:29 +00:00
Chris Lattner	2ce303b406	Fix Regression/Transforms/LoopStrengthReduce/dont_insert_redundant_ops.ll, hopefully not breaking too many other things. llvm-svn: 20505	2005-03-06 22:36:12 +00:00
Chris Lattner	45403e5052	implement Transforms/LoopStrengthReduce/invariant_value_first_arg.ll llvm-svn: 20501	2005-03-06 22:06:22 +00:00
Chris Lattner	d3874fad44	minor simplifications of the code. llvm-svn: 20497	2005-03-06 21:58:22 +00:00
Chris Lattner	dd3ec92085	trivial simplification llvm-svn: 20494	2005-03-06 21:35:38 +00:00
Chris Lattner	238f6df546	Fix a bug where we could corrupt a parent loop's header info if we unrolled a nested loop. This fixes Transforms/LoopUnroll/2005-03-06-BadLoopInfoUpdate.ll and PR532 llvm-svn: 20493	2005-03-06 20:57:32 +00:00
Chris Lattner	1b032f59e7	Make this MUCH faster by avoiding a linear search in the symbol table code. llvm-svn: 20479	2005-03-06 05:42:36 +00:00
Jeff Cohen	4abcea3a69	Reformat comments to fix 80 columns. llvm-svn: 20467	2005-03-05 22:45:40 +00:00
Jeff Cohen	be37fa07fd	Reuse induction variables created for strength-reduced GEPs by other similar GEPs. llvm-svn: 20466	2005-03-05 22:40:34 +00:00
Chris Lattner	6d0a24c608	second argument to Value::setName is now gone. llvm-svn: 20463	2005-03-05 19:05:20 +00:00
Chris Lattner	cfe2822cdf	Do not compute 1ULL << 64, which is undefined. This fixes Ptrdist/ks on the sparc, and testcase Regression/Transforms/InstCombine/2005-03-04-ShiftOverflow.ll llvm-svn: 20445	2005-03-04 23:21:33 +00:00
Jeff Cohen	a2c59b7423	Add support for not strength reducing GEPs where the element size is a small power of two. This emphatically includes the zeroth power of two. llvm-svn: 20429	2005-03-04 04:04:26 +00:00
Chris Lattner	ef1e989e4f	Add an optional argument to lower to a specific constant value instead of to a "sizeof" expression. llvm-svn: 20414	2005-03-03 01:03:43 +00:00
Jeff Cohen	8ea6f9e821	Fixed the following LSR bugs: * Loop invariant code does not dominate the loop header, but rather the end of the loop preheader. * The base for a reduced GEP isn't a constant unless all of its operands (preceding the induction variable) are constant. * Allow induction variable elimination for the simple case after all. Also made changes recommended by Chris for properly deleting instructions. llvm-svn: 20383	2005-03-01 03:46:11 +00:00
Jeff Cohen	dcaa48b5c4	Fix crash in LSR due to attempt to remove original induction variable. However, for reasons explained in the comments, I also deactivated this code as it needs more thought. llvm-svn: 20367	2005-02-28 00:08:56 +00:00
Jeff Cohen	fd63d3af0d	PHI nodes were incorrectly placed when more than one GEP is reduced in a loop. llvm-svn: 20360	2005-02-27 21:08:04 +00:00
Jeff Cohen	39751c3b7c	First pass at improved Loop Strength Reduction. Still not yet ready for prime time. llvm-svn: 20358	2005-02-27 19:37:07 +00:00
Chris Lattner	7561ca1d15	Teach globalopt how memset/cpy/move affect memory, to allow better optimization. llvm-svn: 20352	2005-02-27 18:58:52 +00:00
Chris Lattner	0ce80cd542	Fix spelling, patch contributed by Gabor Greif! llvm-svn: 20343	2005-02-27 06:18:25 +00:00
Chris Lattner	cc6d75fddf	remove extraneous cast llvm-svn: 20334	2005-02-26 18:33:28 +00:00
Chris Lattner	1cca959e5d	Implement Transforms/SimplifyCFG/switch_thread.ll This does a simple form of "jump threading", which eliminates CFG edges that are provably dead. This triggers 90 times in the external tests, and eliminating CFG edges is always a good thing! :) llvm-svn: 20300	2005-02-24 06:17:52 +00:00
Chris Lattner	25169caa80	make this more efficient. Scan up to 16 nodes, not the whole list. llvm-svn: 20289	2005-02-23 16:53:04 +00:00
Chris Lattner	52e931b37d	Remove use of bind_obj llvm-svn: 20276	2005-02-22 23:22:58 +00:00
Chris Lattner	7b5d9e2217	Do not mark obviously unreachable blocks live when processing PHI nodes, and handle incomplete control dependences correctly. This fixes: Regression/Transforms/ADCE/dead-phi-edge.ll -> a missed optimization Regression/Transforms/ADCE/dead-phi-edge.ll -> a compiler crash distilled from QT4 llvm-svn: 20227	2005-02-17 19:28:49 +00:00
Chris Lattner	31f3382b3b	Fix the second bug attached to PR504. llvm-svn: 20181	2005-02-14 20:11:45 +00:00
Chris Lattner	e616fea3bc	Fix for testcase Transforms/IndVarsSimplify/2005-02-11-InvokeCrash.ll and PR504. llvm-svn: 20129	2005-02-12 03:26:49 +00:00
Alkis Evlogimenos	c4a44c6b3d	Localize globals if they are only used in main(). This replaces the global with an alloca, which eventually gets promoted into a register. This enables a lot of other optimizations later on. llvm-svn: 20109	2005-02-10 18:36:30 +00:00
Alkis Evlogimenos	346bb20409	Fix crash on MallocInsts of unsized types. llvm-svn: 19988	2005-02-02 04:43:37 +00:00
Chris Lattner	82b42c5d85	API change. llvm-svn: 19959	2005-02-01 01:23:49 +00:00
Chris Lattner	d6a4492f81	Adjust to changes in APIs llvm-svn: 19958	2005-02-01 01:23:31 +00:00
Chris Lattner	f98a7bffb3	Hacks to make this ugly ugly code work with the new use lists. llvm-svn: 19957	2005-02-01 01:22:56 +00:00
Chris Lattner	72684fecf8	Implement InstCombine/cast.ll:test25, a case that occurs many times in spec llvm-svn: 19953	2005-01-31 05:51:45 +00:00
Chris Lattner	31f486c775	Implement the trivial cases in InstCombine/store.ll llvm-svn: 19950	2005-01-31 05:36:43 +00:00
Chris Lattner	fe1b0b8b24	Implement Transforms/InstCombine/cast-load-gep.ll, which allows us to devirtualize 11 indirect calls in perlbmk. llvm-svn: 19947	2005-01-31 04:50:46 +00:00
Chris Lattner	d8e20188c6	Adjust to changes in instruction interfaces. llvm-svn: 19900	2005-01-29 00:39:08 +00:00
Chris Lattner	a3f06fa2dd	Switchinst takes a hint for the number of cases it will have. llvm-svn: 19899	2005-01-29 00:38:45 +00:00
Chris Lattner	a35dfcedd3	switchinst ctor now takes a hint for the number of cases that it will have. llvm-svn: 19898	2005-01-29 00:38:26 +00:00
Chris Lattner	84d3137da7	Adjust Valuehandle to hold its operand directly in it. llvm-svn: 19897	2005-01-29 00:37:36 +00:00
Chris Lattner	cd517ff0c7	* add some DEBUG statements * Properly compile this: struct a {}; int test() { struct a b[2]; if (&b[0] != &b[1]) abort (); return 0; } to 'return 0', not abort(). llvm-svn: 19875	2005-01-28 19:32:01 +00:00
Alkis Evlogimenos	fbd921987f	Add a dependency to the trace library so that it gets pulled in automatically. llvm-svn: 19828	2005-01-25 16:23:57 +00:00
Chris Lattner	9e2c7facb2	Get rid of several dozen more 'and' instructions in specint. llvm-svn: 19786	2005-01-23 20:26:55 +00:00
Chris Lattner	fc4429e7c1	Handle comparisons of gep instructions that have different typed indices as long as they are the same size. llvm-svn: 19734	2005-01-21 23:06:49 +00:00
Chris Lattner	411336fe04	Add two optimizations. The first folds (X+Y)-X -> Y The second folds operations into selects, e.g. (select C, (X+Y), (Y+Z)) -> (Y+(select C, X, Z)) This occurs a few times across spec, e.g. select add/sub mesa: 83 0 povray: 5 2 gcc 4 2 parser 0 22 perlbmk 13 30 twolf 0 3 llvm-svn: 19706	2005-01-19 21:50:18 +00:00
Chris Lattner	a3cc1835ad	Fix 'raise' to work with packed types. Patch by Morten Ofstad. llvm-svn: 19693	2005-01-19 16:16:35 +00:00
Chris Lattner	715364364b	Delete PHI nodes that are not dead but are locked in a cycle of single useness. llvm-svn: 19629	2005-01-17 05:10:15 +00:00
Chris Lattner	03f06f11aa	Move code out of indentation one level to make it easier to read. Disable the xform for < > cases. It turns out that the following is being miscompiled: bool %test(sbyte %S) { %T = cast sbyte %S to uint %V = setgt uint %T, 255 ret bool %V } llvm-svn: 19628	2005-01-17 03:20:02 +00:00
Chris Lattner	51726c47fe	Fix some bugs in an xform added yesterday. This fixes Prolangs-C/allroots. llvm-svn: 19553	2005-01-14 17:35:12 +00:00
Chris Lattner	7aa41cfa88	Fix a compile crash on spiff llvm-svn: 19552	2005-01-14 17:17:59 +00:00
Chris Lattner	4fa89827e2	if two gep comparisons only differ by one index, compare that index directly. This allows us to better optimize begin() -> end() comparisons in common cases. llvm-svn: 19542	2005-01-14 00:20:05 +00:00
Chris Lattner	d35d210ea0	Do not overrun iterators. This fixes a 176.gcc crash llvm-svn: 19541	2005-01-13 23:26:48 +00:00
Chris Lattner	a04c904c4c	Turn select C, (X+Y), (X-Y) --> (X+(select C, Y, (-Y))). This occurs in the 'sim' program and probably elsewhere. In sim, it comes up for cases like this: #define round(x) ((x)>0.0 ? (x)+0.5 : (x)-0.5) double G; void T(double X) { G = round(X); } (it uses the round macro a lot). This changes the LLVM code from: %tmp.1 = setgt double %X, 0.000000e+00 ; <bool> [#uses=1] %tmp.4 = add double %X, 5.000000e-01 ; <double> [#uses=1] %tmp.6 = sub double %X, 5.000000e-01 ; <double> [#uses=1] %mem_tmp.0 = select bool %tmp.1, double %tmp.4, double %tmp.6 store double %mem_tmp.0, double* %G to: %tmp.1 = setgt double %X, 0.000000e+00 ; <bool> [#uses=1] %mem_tmp.0.p = select bool %tmp.1, double 5.000000e-01, double -5.000000e-01 %mem_tmp.0 = add double %mem_tmp.0.p, %X store double %mem_tmp.0, double* %G ret void llvm-svn: 19537	2005-01-13 22:52:24 +00:00
Chris Lattner	81e8417614	Implement an optimization for == and != comparisons like this: _Bool test2(int X, int Y) { return &arr[X][Y] == arr; } instead of generating this: bool %test2(int %X, int %Y) { %tmp.3.idx = mul int %X, 160 ; <int> [#uses=1] %tmp.3.idx1 = shl int %Y, ubyte 2 ; <int> [#uses=1] %tmp.3.offs2 = sub int 0, %tmp.3.idx ; <int> [#uses=1] %tmp.7 = seteq int %tmp.3.idx1, %tmp.3.offs2 ; <bool> [#uses=1] ret bool %tmp.7 } generate this: bool %test2(int %X, int %Y) { seteq int %X, 0 ; <bool>:0 [#uses=1] seteq int %Y, 0 ; <bool>:1 [#uses=1] %tmp.7 = and bool %0, %1 ; <bool> [#uses=1] ret bool %tmp.7 } This idiom occurs in C++ programs when iterating from begin() to end(), in a vector or array. For example, we now compile this: void test(int X, int Y) { for (int i = arr; i != arr+100; ++i) foo(i); } to this: no_exit: ; preds = %entry, %no_exit ... %exitcond = seteq uint %indvar.next, 100 ; <bool> [#uses=1] br bool %exitcond, label %return, label %no_exit instead of this: no_exit: ; preds = %entry, %no_exit ... %inc5 = getelementptr [100 x [40 x int]]* %arr, int 0, int 0, int %inc.rec ; <int> [#uses=1] %tmp.8 = seteq int %inc5, getelementptr ([100 x [40 x int]]* %arr, int 0, int 100, int 0) ; <bool> [#uses=1] %indvar.next = add uint %indvar, 1 ; <uint> [#uses=1] br bool %tmp.8, label %return, label %no_exit llvm-svn: 19536	2005-01-13 22:25:21 +00:00
Chris Lattner	4cb9fa373b	Fix some bugs in code I didn't mean to check in. llvm-svn: 19534	2005-01-13 20:40:58 +00:00
Chris Lattner	0798af33a5	Fix a crash compiling 129.compress llvm-svn: 19533	2005-01-13 20:14:25 +00:00
Reid Spencer	134f02d0c7	Add the LOADABLE_MODULE=1 directive to indicate that this shared library is intended to be a dlopenable module and not a "plain" shared library. llvm-svn: 19456	2005-01-11 04:33:32 +00:00
Jeff Cohen	3e62e7c68b	Apply feedback from Chris. llvm-svn: 19432	2005-01-10 04:23:32 +00:00
Chris Lattner	798e84f59e	Fix VS warnings llvm-svn: 19383	2005-01-08 19:48:40 +00:00
Chris Lattner	46fa04b531	Fix VS warnings. llvm-svn: 19382	2005-01-08 19:45:31 +00:00
Chris Lattner	fdfe3e49fe	Fix uint64_t -> unsigned VS warnings. llvm-svn: 19381	2005-01-08 19:42:22 +00:00
Chris Lattner	47f395cd85	Silence VS warnings. llvm-svn: 19380	2005-01-08 19:37:20 +00:00
Chris Lattner	ce274ce93d	Silence warnings llvm-svn: 19379	2005-01-08 19:34:41 +00:00
Jeff Cohen	677babc4d4	Add more missing createXxxPass functions. llvm-svn: 19370	2005-01-08 17:21:40 +00:00
Misha Brukman	417ca179a9	Convert tabs to spaces llvm-svn: 19320	2005-01-07 07:05:34 +00:00
Jeff Cohen	9a7ac16214	Add missing createXxxPass functions llvm-svn: 19319	2005-01-07 06:57:28 +00:00
Jeff Cohen	844410b48e	Add missing include llvm-svn: 19315	2005-01-07 05:42:13 +00:00
Jeff Cohen	eca0d0f2da	Put createLoopUnswitchPass() into proper namespace llvm-svn: 19306	2005-01-06 05:47:18 +00:00
Jeff Cohen	27595a4aec	Add missing include llvm-svn: 19305	2005-01-06 05:46:44 +00:00
Chris Lattner	86102b8ad5	This is a bulk commit that implements the following primary improvements: * We can now fold cast instructions into select instructions that have at least one constant operand. * We now optimize expressions more aggressively based on bits that are known to be zero. These optimizations occur a lot in code that uses bitfields even in simple ways. * We now turn more cast-cast sequences into AND instructions. Before we would only do this if all types were unsigned. Now only the middle type needs to be unsigned (guaranteeing a zero extend). * We transform sign extensions into zero extensions in several cases. This corresponds to these test/Regression/Transforms/InstCombine testcases: 2004-11-22-Missed-and-fold.ll and.ll: test28-29 cast.ll: test21-24 and-or-and.ll cast-cast-to-and.ll zeroext-and-reduce.ll llvm-svn: 19220	2005-01-01 16:22:27 +00:00
Chris Lattner	3215bb6049	Implement SimplifyCFG/DeadSetCC.ll SimplifyCFG is one of those passes that we use for final cleanup: it should not rely on other passes to clean up its garbage. This fixes the "why are trivially dead setcc's in the output of gccas" problem. llvm-svn: 19212	2005-01-01 16:02:12 +00:00
Chris Lattner	13516fe2e7	Fix PR491 and testcase Transforms/DeadStoreElimination/2004-12-28-PartialStore.ll llvm-svn: 19180	2004-12-29 04:36:02 +00:00
Chris Lattner	b17f3e13ec	Adjust to new interfaces llvm-svn: 18958	2004-12-15 07:22:25 +00:00
Chris Lattner	9ad0d55025	Constant exprs are not efficiently negatable in practice. This disables turning X - (constantexpr) into X + (-constantexpr) among other things. llvm-svn: 18935	2004-12-14 20:08:06 +00:00
Brian Gaeke	f9639d2a74	Fix link error in PPC optimized build of 'opt'. llvm-svn: 18913	2004-12-13 21:28:39 +00:00
Chris Lattner	8f430a3b59	Get rid of getSizeOf, using ConstantExpr::getSizeOf instead. Do not insert a prototype for malloc of: void* malloc(uint): on 64-bit targets this is not correct. Instead, prototype it as void *malloc(...), and pass the correct intptr_t through the "...". Finally, fix Regression/CodeGen/SparcV9/2004-12-13-MallocCrash.ll, by not forming constantexpr casts from pointer to uint. llvm-svn: 18908	2004-12-13 20:00:02 +00:00
Chris Lattner	a199e3c1e2	Change indentation of a whole bunch of code, no real changes here. llvm-svn: 18843	2004-12-12 23:49:37 +00:00
Chris Lattner	14d07db44d	More substantial simplifications and speedups. This makes ADCE about 20% faster in some cases. llvm-svn: 18842	2004-12-12 23:40:17 +00:00
Chris Lattner	9115eb3024	More minor microoptimizations llvm-svn: 18841	2004-12-12 22:44:30 +00:00
Chris Lattner	d4298781c1	Remove some more set operations llvm-svn: 18840	2004-12-12 22:22:18 +00:00
Chris Lattner	a538439bf0	Reduce number of set operations. llvm-svn: 18839	2004-12-12 22:16:13 +00:00
Chris Lattner	bf5b7cf638	Optimize div/rem + select combinations more. In particular, implement div.ll:test10 and rem.ll:test4. llvm-svn: 18838	2004-12-12 21:48:58 +00:00
Chris Lattner	745196a5fc	Properly implement copying of a global, fixing the 255.vortex & povray failures from last night. llvm-svn: 18832	2004-12-12 19:34:41 +00:00
Chris Lattner	88deefa303	Simplify code and do not invalidate iterators. This fixes a crash compiling TimberWolfMC that was exposed due to recent optimizer changes. llvm-svn: 18831	2004-12-12 18:23:20 +00:00
Chris Lattner	1cbd5be7a1	Though the previous xform applies to literally dozens (hundreds?) of variables in SPEC, the subsequent optimizations that we are after don't play well with FP values, so disable this xform for them. Really we just don't want stuff like: double G; (always 0 or 412312.312) = G; turning into: bool G_b; = G_b ? 412312.312 : 0; We'd rather just do the load. -Chris llvm-svn: 18819	2004-12-12 06:03:06 +00:00
Chris Lattner	40e4cec9ee	If a variable can only hold two values, and is not already a bool, shrink it down to actually BE a bool. This allows simple value range propagation stuff to work harder, deleting comparisons in bzip2 in some hot loops. This implements GlobalOpt/integer-bool.ll, which is the essence of the loop condition distilled into a testcase. llvm-svn: 18817	2004-12-12 05:53:50 +00:00
Chris Lattner	cbc0161d1f	If one side of and/or is known to be 0/-1, it doesn't matter if the other side is overdefined. This allows us to fold conditions like: if (X < Y || Y > Z) in some cases. llvm-svn: 18807	2004-12-11 23:15:19 +00:00
Chris Lattner	263b0a1669	Only count if we actually made a change. llvm-svn: 18800	2004-12-11 17:00:14 +00:00
Chris Lattner	ffefea0772	The split bb is really the exit of the old function llvm-svn: 18799	2004-12-11 16:59:54 +00:00
Chris Lattner	2f687fd9d6	Two bug fixes: 1. Actually increment the Statistic for the GV elim optzn 2. When resolving undef branches, only resolve branches in executable blocks, avoiding marking a bunch of completely dead blocks live. This has a big impact on the quality of the generated code. With this patch, we positively rip up vortex, compiling Ut_MoveBytes to a single memcpy call. In vortex we get this: 12 ipsccp - Number of globals found to be constant 986 ipsccp - Number of arguments constant propagated 1378 ipsccp - Number of basic blocks unreachable 8919 ipsccp - Number of instructions removed llvm-svn: 18796	2004-12-11 06:05:53 +00:00
Chris Lattner	8525ebe465	Do not delete the entry block to a function. llvm-svn: 18795	2004-12-11 05:32:19 +00:00
Chris Lattner	91dbae6fee	Implement Transforms/SCCP/ipsccp-gvar.ll, by tracking values stored to non-address-taken global variables. llvm-svn: 18790	2004-12-11 05:15:59 +00:00
Chris Lattner	99e1295645	Fix a bug where we could delete dead invoke instructions with uses. In functions where we fully constant prop the return value, replace all ret instructions with 'ret undef'. llvm-svn: 18786	2004-12-11 02:53:57 +00:00
Chris Lattner	bae4b64553	Implement SCCP/ipsccp-conditional.ll, by totally deleting dead blocks. llvm-svn: 18781	2004-12-10 22:29:08 +00:00
Chris Lattner	7285f43836	Fix SCCP/2004-12-10-UndefBranchBug.ll llvm-svn: 18776	2004-12-10 20:41:50 +00:00
Chris Lattner	4fc998da2e	Fix Regression/Transforms/SimplifyCFG/2004-12-10-SimplifyCFGCrash.ll, and the failure on make_dparser last night. llvm-svn: 18766	2004-12-10 17:42:31 +00:00
Chris Lattner	b439464c61	This is the initial implementation of IPSCCP, as requested by Brian. This implements SCCP/ipsccp-basic.ll, rips apart Olden/mst (as described in PR415), and does other nice things. There is still more to come with this, but it's a start. llvm-svn: 18752	2004-12-10 08:02:06 +00:00
Chris Lattner	36d39cecb4	note to self: Do not check in debugging code! llvm-svn: 18693	2004-12-09 07:15:52 +00:00
Chris Lattner	f17a2fb849	Implement trivial sinking for load instructions. This causes us to sink 567 loads in spec llvm-svn: 18692	2004-12-09 07:14:34 +00:00
Chris Lattner	39c98bb31c	Do extremely simple sinking of instructions when they are only used in a successor block. This turns cases like this: x = a op b if (c) { use x } into: if (c) { x = a op b use x } This triggers 3965 times in spec, and is tested by Regression/Transforms/InstCombine/sink_instruction.ll This appears to expose a bug in the X86 backend for 177.mesa, which I'm looking into. llvm-svn: 18677	2004-12-08 23:43:58 +00:00
Alkis Evlogimenos	a1291a0679	Fix this regression and remove the XFAIL from this test. llvm-svn: 18674	2004-12-08 23:10:30 +00:00
Chris Lattner	8f30caf549	Fix Transforms/InstCombine/2004-12-08-RemInfiniteLoop.ll llvm-svn: 18670	2004-12-08 22:20:34 +00:00
Chris Lattner	674ce86cd0	Add support for compilers without argument dependent name lookup, contributed by Bjørn Wennberg llvm-svn: 18627	2004-12-08 16:12:20 +00:00
Chris Lattner	407000c497	Remove unneeded class qualifier, contributed by Bjørn Wennberg llvm-svn: 18625	2004-12-08 16:05:02 +00:00
Reid Spencer	9273d480ad	For PR387: Add doInitialization method to avoid overloaded virtuals llvm-svn: 18602	2004-12-07 08:11:36 +00:00
Chris Lattner	9019e5cfa0	Implement stripping of debug symbols, making the --strip-debug options in gccas/gccld more than just a noop. llvm-svn: 18456	2004-12-03 16:22:08 +00:00
Chris Lattner	e8ebcb3300	Initial reimplementation of the -strip pass, with a stub for implementing -S llvm-svn: 18440	2004-12-02 21:25:03 +00:00
Chris Lattner	a4c9808603	This pass is moving to lib IPO llvm-svn: 18439	2004-12-02 21:24:40 +00:00
Chris Lattner	c0677c081d	Implement a FIXME by checking to make sure that a malloc is not being used in scary and unknown ways before we promote it. This fixes the miscompilation of 188.ammp that has been plaguing us since a globalopt patch went in. Thanks a ton to Tanya for helping me diagnose the problem! llvm-svn: 18418	2004-12-02 07:11:07 +00:00
Chris Lattner	3b18139b3c	Fix a minor bug where we set a var to initialized on malloc, not on store. This doesn't fix anything that I'm aware of, just noticed it by inspection llvm-svn: 18417	2004-12-02 06:25:58 +00:00
Chris Lattner	951673a94c	This pass is completely broken. llvm-svn: 18387	2004-11-30 17:09:06 +00:00
Chris Lattner	019445715e	Squelch warning llvm-svn: 18381	2004-11-30 07:47:34 +00:00
Chris Lattner	868ae13dc0	Fix test/Regression/Transforms/LICM/2004-09-14-AliasAnalysisInvalidate.llx This only fails on darwin or on X86 under valgrind. llvm-svn: 18377	2004-11-30 07:01:15 +00:00
Chris Lattner	fd8cbc257e	Alkis noticed that this variable is dead. Thanks! llvm-svn: 18369	2004-11-30 04:01:44 +00:00
Chris Lattner	389cfac0d1	If we have something like this: if (x) { code ... } else { code ... } Turn it into: code if (x) { ... } else { ... } This reduces code size and in some common cases allows us to completely eliminate the conditional. This turns several if/then/else blocks in loops into straightline code in 179.art, turning the loops into single basic blocks (good for modsched even!). Maybe now brg will leave me alone ;-) llvm-svn: 18366	2004-11-30 00:29:14 +00:00
Chris Lattner	6e455608e2	Allow hoisting loads of globals and alloca's in conditionals. llvm-svn: 18363	2004-11-29 21:26:12 +00:00
Reid Spencer	279fa256a2	Fix for PR454: * Make sure we handle signed to unsigned conversion correctly * Move this visitSetCondInst case to its own method. llvm-svn: 18312	2004-11-28 21:31:15 +00:00
Chris Lattner	6ea2888832	Make DSE potentially more aggressive by being more specific about alloca sizes. llvm-svn: 18309	2004-11-28 20:44:37 +00:00
Chris Lattner	14f3cdc227	Implement Regression/Transforms/InstCombine/getelementptr_cast.ll, which occurs many times in crafty llvm-svn: 18273	2004-11-27 17:55:46 +00:00
Chris Lattner	b137409926	Provide size information when checking to see if we can LICM a load, this allows us to hoist more loads in some cases. llvm-svn: 18265	2004-11-26 21:20:09 +00:00
Chris Lattner	540e5f92b4	Do not count debugger intrinsics in size estimation. llvm-svn: 18110	2004-11-22 17:23:57 +00:00
Chris Lattner	79e87e39eb	Ignore debugger intrinsics when doing inlining size computations. llvm-svn: 18109	2004-11-22 17:21:44 +00:00
Chris Lattner	6d048a0d32	Do not consider debug intrinsics in the size computations for loop unrolling. Patch contributed by Michael McCracken! llvm-svn: 18108	2004-11-22 17:18:36 +00:00
Misha Brukman	72a57c3259	Allow constructor parameter to override aggregating args; fix spacing llvm-svn: 18028	2004-11-20 02:20:27 +00:00
Chris Lattner	446948e094	Fix the exposed prototype for the lower packed pass, thanks to Morten Ofstad. llvm-svn: 17996	2004-11-19 16:49:34 +00:00
Chris Lattner	d137be2d0d	CPR is dead. llvm-svn: 17992	2004-11-19 16:24:57 +00:00
Chris Lattner	953075442d	Delete stoppoints that occur for the same source line. llvm-svn: 17970	2004-11-18 21:41:39 +00:00
Chris Lattner	c08ac110df	Check in hook that I forgot llvm-svn: 17956	2004-11-18 17:24:20 +00:00
Chris Lattner	27af257ea0	Do not delete dead invoke instructions! llvm-svn: 17897	2004-11-16 16:32:28 +00:00
Reid Spencer	9339638e9c	Remove unused variable for compilation by VC++. Patch contributed by Morten Ofstad. llvm-svn: 17830	2004-11-15 17:29:41 +00:00
Chris Lattner	1890f94413	Minor cleanups. There is no reason for SCCP to derive from instvisitor anymore. llvm-svn: 17825	2004-11-15 07:15:04 +00:00
Chris Lattner	9a038a3a5e	Count more accurately llvm-svn: 17824	2004-11-15 07:02:42 +00:00
Chris Lattner	97013636cd	Quiet warnings on the persephone tester llvm-svn: 17821	2004-11-15 05:54:07 +00:00
Chris Lattner	d18c16b842	Two minor improvements: 1. Speedup getValueState by having it not consider Arguments. It's better to just add them before we start SCCP'ing. 2. SCCP can delete the contents of dead blocks. No really, it's ok! This reduces the size of the IR for subsequent passes, even though simplifycfg would do the same job. In practice, simplifycfg does not run until much later than sccp in gccas llvm-svn: 17820	2004-11-15 05:45:33 +00:00
Chris Lattner	4f0316229c	rename InstValue to LatticeValue, as it holds for more than instructions. llvm-svn: 17818	2004-11-15 05:03:30 +00:00
Chris Lattner	074be1f6e4	Substantially refactor the SCCP class into an SCCP pass and an SCCPSolver class. The only changes are minor: * Do not try to SCCP instructions that return void in the rewrite loop. This is silly and foolhardy, wasting a map lookup and adding an entry to the map which is never used. * If we decide something has an undefined value, rewrite it to undef, potentially leading to further simplifications. llvm-svn: 17816	2004-11-15 04:44:20 +00:00
Chris Lattner	28eeb73f2f	If a global is just loaded and restored, realize that it is not changing value. This allows us to turn more globals into constants and eliminate them. This patch implements GlobalOpt/load-store-global.llx. Note that this patch speeds up 255.vortex from: Output/255.vortex.out-cbe.time:program 7.640000 Output/255.vortex.out-llc.time:program 9.810000 to: Output/255.vortex.out-cbe.time:program 7.250000 Output/255.vortex.out-llc.time:program 9.490000 Which isn't bad at all! llvm-svn: 17746	2004-11-14 20:50:30 +00:00
Chris Lattner	46dd5a6304	This optimization makes MANY phi nodes that all have the same incoming value. If this happens, detect it early instead of relying on instcombine to notice it later. This can be a big speedup, because PHI nodes can have many incoming values. llvm-svn: 17741	2004-11-14 19:29:34 +00:00
Chris Lattner	7515cabe2a	Implement instcombine/phi.ll:test6 - pulling operations through PHI nodes. This exposes subsequent optimization possibilities and reduces code size. This triggers 1423 times in spec. llvm-svn: 17740	2004-11-14 19:13:23 +00:00
Chris Lattner	15ff1e1885	Transform this: %X = alloca ... %Y = alloca ... X == Y into false. This allows us to simplify some stuff in eon (and probably many other C++ programs) where operator= was checking for self assignment. Folding this allows us to SROA several additional structs. llvm-svn: 17735	2004-11-14 07:33:16 +00:00
Chris Lattner	5a8b003a09	Remove note to self llvm-svn: 17734	2004-11-14 06:57:47 +00:00
Chris Lattner	af555adc15	If a function always returns a constant, replace all call sites with that constant value. This makes the return value dead and allows for simplification in the caller. This implements IPConstantProp/return-constant.ll This triggers several dozen times throughout SPEC. llvm-svn: 17730	2004-11-14 06:10:11 +00:00
Chris Lattner	fe3f4e6ebd	Teach SROA how to promote an array index that is variable, if the dimension of the array is just two. This occurs 8 times in gcc, 6 times in crafty, and 12 times in 099.go. This implements ScalarRepl/sroa_two.ll llvm-svn: 17727	2004-11-14 05:00:19 +00:00
Chris Lattner	8881912d71	Rearrange some code, no functionality changes. llvm-svn: 17724	2004-11-14 04:24:28 +00:00
Chris Lattner	9fa7f0ae0a	Remove debugging code llvm-svn: 17719	2004-11-13 23:32:53 +00:00
Chris Lattner	244031d306	Argument promotion transforms functions to unconditionally load their argument pointers. This is only valid to do if the function already unconditionally loaded an argument or if the pointer passed in is known to be valid. Make sure to do the required checks. This fixed ArgumentPromotion/control-flow.ll and the Burg program. llvm-svn: 17718	2004-11-13 23:31:34 +00:00
Chris Lattner	8c3e7b92af	Simplify handling of shifts to be the same as we do for adds. Add support for (X * C1) + (X * C2) (where * can be mul or shl), allowing us to fold: Y+Y+Y+Y+Y+Y+Y+Y into %tmp.8 = shl long %Y, ubyte 3 ; <long> [#uses=1] instead of %tmp.4 = shl long %Y, ubyte 2 ; <long> [#uses=1] %tmp.12 = shl long %Y, ubyte 2 ; <long> [#uses=1] %tmp.8 = add long %tmp.4, %tmp.12 ; <long> [#uses=1] This implements add.ll:test25 Also add support for (X*C1)-(X*C2) -> X*(C1-C2), implementing sub.ll:test18 llvm-svn: 17704	2004-11-13 19:50:12 +00:00
Chris Lattner	4efe20a103	Fold: (X + (X << C2)) --> X * ((1 << C2) + 1) ((X << C2) + X) --> X * ((1 << C2) + 1) This means that we now canonicalize "Y+Y+Y" into: %tmp.2 = mul long %Y, 3 ; <long> [#uses=1] instead of: %tmp.10 = shl long %Y, ubyte 1 ; <long> [#uses=1] %tmp.6 = add long %Y, %tmp.10 ; <long> [#uses=1] llvm-svn: 17701	2004-11-13 19:31:40 +00:00
Chris Lattner	2858e17538	Lazily create the abort message, so only translation units that use unwind will actually get it. llvm-svn: 17700	2004-11-13 19:07:32 +00:00
Chris Lattner	9b0291b18d	Fix: CodeExtractor/2004-11-12-InvokeExtract.ll llvm-svn: 17699	2004-11-13 00:06:45 +00:00
Chris Lattner	5bcca6058a	Fix a bug where the code extractor would get a bit confused handling invoke instructions, setting DefBlock to a block it did not have dom info for. llvm-svn: 17697	2004-11-12 23:50:44 +00:00
Chris Lattner	5c1d84c769	Simplify handling of constant initializers llvm-svn: 17696	2004-11-12 22:42:57 +00:00
Chris Lattner	9621dfab3f	Actually, leave the check in. This prevents us from counting dead arguments as IPCP opportunities. llvm-svn: 17680	2004-11-11 07:47:54 +00:00
Chris Lattner	5fa696f8e4	Fix bug: IPConstantProp/deadarg.ll llvm-svn: 17679	2004-11-11 07:46:29 +00:00
Chris Lattner	c1d24cd859	Make IP Constant prop more aggressive about handling self recursive calls. This implements IPConstantProp/recursion.ll llvm-svn: 17666	2004-11-10 19:43:59 +00:00
Chris Lattner	0d3773d8b1	Do not let dead constant expressions hanging off of functions prevent IPCP. This allows the elimination of a bunch of global pool descriptor args from programs being pool allocated (and is also generally useful!) llvm-svn: 17657	2004-11-09 20:47:30 +00:00
Chris Lattner	436285e75d	Change this back so that I get stable numbers to reflect the change from the nightly testers llvm-svn: 17646	2004-11-09 08:05:23 +00:00
Chris Lattner	1f0a97c6cb	Fix bug: 2004-11-08-FreeUseCrash.ll llvm-svn: 17642	2004-11-09 05:10:56 +00:00
Chris Lattner	49fa1ecd04	VERY large functions that are only called from one place are not really exciting to inline. Only inline medium or small sized functions with a single call site. llvm-svn: 17588	2004-11-07 21:46:47 +00:00
Chris Lattner	595016d090	This is V9 specific, move it there. llvm-svn: 17545	2004-11-07 00:39:26 +00:00
Chris Lattner	3c670cb65a	Remove dead vars llvm-svn: 17482	2004-11-05 04:46:22 +00:00
Chris Lattner	33eb909939	Fix some warnings on VC++ llvm-svn: 17481	2004-11-05 04:45:43 +00:00
Chris Lattner	96f6616479	* Rearrange code slightly * Disable broken transforms for simplifying (setcc (cast X to larger), CI) where CC is not != or == llvm-svn: 17422	2004-11-02 03:50:32 +00:00
Chris Lattner	8af7424920	Speed up the tail duplication pass on the testcase below from 68.2s to 1.23s: #define CL0(a) case a: f(); goto c; #define CL1(a) CL0(a##0) CL0(a##1) CL0(a##2) CL0(a##3) CL0(a##4) CL0(a##5) \ CL0(a##6) CL0(a##7) CL0(a##8) CL0(a##9) #define CL2(a) CL1(a##0) CL1(a##1) CL1(a##2) CL1(a##3) CL1(a##4) CL1(a##5) \ CL1(a##6) CL1(a##7) CL1(a##8) CL1(a##9) #define CL3(a) CL2(a##0) CL2(a##1) CL2(a##2) CL2(a##3) CL2(a##4) CL2(a##5) \ CL2(a##6) CL2(a##7) CL2(a##8) CL2(a##9) #define CL4(a) CL3(a##0) CL3(a##1) CL3(a##2) CL3(a##3) CL3(a##4) CL3(a##5) \ CL3(a##6) CL3(a##7) CL3(a##8) CL3(a##9) void f(); void a() { int b; c: switch (b) { CL4(1) } } This comes from GCC PR 15524 llvm-svn: 17390	2004-11-01 07:05:07 +00:00
Chris Lattner	93d1e39f3e	Do not compute the predecessor list for a block unless we need it. This speeds up simplifycfg on this program, from 44.87s to 0.29s (with a profiled build): #define CL0(a) case a: goto c; #define CL1(a) CL0(a##0) CL0(a##1) CL0(a##2) CL0(a##3) CL0(a##4) CL0(a##5) \ CL0(a##6) CL0(a##7) CL0(a##8) CL0(a##9) #define CL2(a) CL1(a##0) CL1(a##1) CL1(a##2) CL1(a##3) CL1(a##4) CL1(a##5) \ CL1(a##6) CL1(a##7) CL1(a##8) CL1(a##9) #define CL3(a) CL2(a##0) CL2(a##1) CL2(a##2) CL2(a##3) CL2(a##4) CL2(a##5) \ CL2(a##6) CL2(a##7) CL2(a##8) CL2(a##9) #define CL4(a) CL3(a##0) CL3(a##1) CL3(a##2) CL3(a##3) CL3(a##4) CL3(a##5) \ CL3(a##6) CL3(a##7) CL3(a##8) CL3(a##9) void f(); void a() { int b; c: switch (b) { CL4(1) } } This testcase is contrived to expose N^2 behavior, but this patch should speedup simplifycfg on any programs that use large switch statements. This testcase comes from GCC PR17895. llvm-svn: 17389	2004-11-01 06:53:58 +00:00
Reid Spencer	57cbe39d1e	Change Library Names Not To Conflict With Others When Installed llvm-svn: 17286	2004-10-27 23:18:45 +00:00
Chris Lattner	7dfc2d29ac	Convert 'struct' to 'class' in various places to adhere to the coding standards and work better with VC++. Patch contributed by Morten Ofstad! llvm-svn: 17281	2004-10-27 16:14:51 +00:00
Chris Lattner	70c2039b39	Hrm, this code was severely botched. As it turns out, this patch: http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041018/019708.html exposed ANOTHER latent bug in this xform, which caused Prolangs-C/bison to fill the zion nightly tester disk up and make the tester barf. This is obviously not a good thing, so lets fix this bug shall we? :) llvm-svn: 17276	2004-10-27 05:57:15 +00:00
Chris Lattner	845afe9b20	Initialize with the correct constant type llvm-svn: 17270	2004-10-27 03:55:24 +00:00
Chris Lattner	d57638c4a7	Fix compatibility with MSVC, patch by Morten Ofstad llvm-svn: 17218	2004-10-25 18:45:16 +00:00
Reid Spencer	fad217c847	Eliminate compilation warning on uninitialized variable. llvm-svn: 17163	2004-10-22 16:10:39 +00:00
Chris Lattner	fe9abf92de	*** empty log message *** llvm-svn: 17161	2004-10-22 06:43:28 +00:00
Chris Lattner	5c3c21e10a	Fix a bug Nate noticed, where we miscompiled a simple testcase llvm-svn: 17157	2004-10-22 04:53:16 +00:00
Reid Spencer	c1c320c335	We won't use automake llvm-svn: 17155	2004-10-22 03:35:04 +00:00
Brian Gaeke	c9d8b4d45c	Explain what this pass does. llvm-svn: 17146	2004-10-20 19:38:58 +00:00
Chris Lattner	257b284038	Hrm, some people complain when the compiler cheerfully tells them what it's doing... I guess they're right. llvm-svn: 17142	2004-10-19 06:33:16 +00:00
Reid Spencer	6a11a75f31	Initial automake generated Makefile template llvm-svn: 17136	2004-10-18 23:55:41 +00:00
Nate Begeman	b18121e6a9	Initial implementation of the strength reduction for GEP instructions in loops. This optimization is not turned on by default yet, but may be run with the opt tool's -loop-reduce flag. There are many FIXMEs listed in the code that will make it far more applicable to a wide range of code, but you have to start somewhere :) This limited version currently triggers on the following tests in the MultiSource directory: pcompress2: 7 times cfrac: 5 times anagram: 2 times ks: 6 times yacr2: 2 times llvm-svn: 17134	2004-10-18 21:08:22 +00:00
Chris Lattner	88a8a329c3	Get this file compiling with VC++, patch contributed by Morten Ofstad. Thanks Morten! llvm-svn: 17125	2004-10-18 15:43:46 +00:00
Reid Spencer	ce0783318b	Correction to allow compilation with Visual C++. Patch contributed by Morten Ofstad. Thanks Morten! llvm-svn: 17123	2004-10-18 14:38:48 +00:00
Chris Lattner	5edb2f32d0	Simplify code by deleting instructions that precede unreachable instructions. Simplify code by simplifying terminators that branch to blocks that start with an unreachable instruction. llvm-svn: 17116	2004-10-18 04:07:22 +00:00
Chris Lattner	a67dd32004	Turn store -> null/undef into the LLVM unreachable instruction! This simple change hacks off 10K of bytecode from perlbmk (.5%) even though the front-end is not generating them yet and we are not optimizing the resultant code. This isn't too bad. llvm-svn: 17111	2004-10-18 03:00:50 +00:00
Chris Lattner	8ba9ec9bbb	Turn things with obviously undefined semantics into 'store -> null' llvm-svn: 17110	2004-10-18 02:59:09 +00:00
Chris Lattner	3b92f17165	My friend the invoke instruction does not dominate all basic blocks if it occurs in the entry node of a function llvm-svn: 17109	2004-10-18 01:48:31 +00:00
Chris Lattner	34ae670706	Fix a bug that occurs when the constant value is the result of an invoke. In particular, invoke ret values are only live in the normal dest of the invoke not in the unwind dest. llvm-svn: 17108	2004-10-18 01:21:17 +00:00
Chris Lattner	6a792feb02	Getting ADCE to interact well with unreachable instructions seems like a nontrivial exercise that I'm not interested in tackling right now. Just punt and treat them like unwind's. This 'fixes' test/Regression/Transforms/ADCE/unreachable-function.ll llvm-svn: 17106	2004-10-17 23:45:06 +00:00
Chris Lattner	6e79e55aea	Fix Regression/Transforms/Inline/2004-10-17-InlineFunctionWithoutReturn.ll If a function had no return instruction in it, and the result of the inlined call instruction was used, we would crash. llvm-svn: 17104	2004-10-17 23:21:07 +00:00
Chris Lattner	107c15c33d	Remove printout, realize that instructions in the entry block dominate all other blocks. llvm-svn: 17099	2004-10-17 21:31:34 +00:00
Chris Lattner	215c7ebaa6	When inserting PHI nodes, don't insert any phi nodes that are obviously unnecessary. This allows us to delete several hundred phi nodes of the form PHI(x,x,x,undef) from 253.perlbmk and probably other programs as well. This implements Mem2Reg/UndefValuesMerge.ll llvm-svn: 17098	2004-10-17 21:25:56 +00:00
Chris Lattner	96db59e48a	Enhance hasConstantValue to ignore undef values in phi nodes. This allows it to think that PHI[4, undef] == 4. llvm-svn: 17096	2004-10-17 21:23:26 +00:00
Chris Lattner	e29d634a94	hasConstantValue will soon return instructions that don't dominate the PHI node, so prepare for this. llvm-svn: 17095	2004-10-17 21:22:38 +00:00
Chris Lattner	67f0545daf	Fix a type violation llvm-svn: 17069	2004-10-16 23:28:04 +00:00
Chris Lattner	684c5c6587	Kill the bogon that slipped into my buffer before I committed. llvm-svn: 17067	2004-10-16 19:46:33 +00:00
Chris Lattner	6580e09fef	Implement InstCombine/getelementptr.ll:test9, which is the source of many ugly and giant constant exprs in some programs. llvm-svn: 17066	2004-10-16 19:44:59 +00:00
Chris Lattner	98e541457b	Add support for unreachable llvm-svn: 17056	2004-10-16 18:21:33 +00:00
Chris Lattner	81a7a23494	Optimize instructions involving undef values. For example X+undef == undef. llvm-svn: 17047	2004-10-16 18:11:37 +00:00
Chris Lattner	7e6d4a12b5	Add support for UndefValue llvm-svn: 17046	2004-10-16 18:10:31 +00:00
Chris Lattner	c0e2e82477	When promoting mem2reg, make uninitialized values become undef instead of 0. llvm-svn: 17045	2004-10-16 18:10:06 +00:00
Chris Lattner	646354bae1	Handle undef values as undefined on the constant lattice; ignore unreachable instructions llvm-svn: 17044	2004-10-16 18:09:41 +00:00
Chris Lattner	6ac3ef950d	Add note llvm-svn: 17043	2004-10-16 18:09:25 +00:00
Chris Lattner	8e71c6a33d	Add support for the undef value. Implement a new optimization based on globals that are initialized with undef. When promoting malloc to a global, start out initialized to undef llvm-svn: 17042	2004-10-16 18:09:00 +00:00
Chris Lattner	5d33e8e73a	Fix a bug John tracked down in libstdc++ where we were incorrectly deleting weak functions. Thanks for finding this John! llvm-svn: 16997	2004-10-14 19:53:50 +00:00
Chris Lattner	45c35b1d1f	When converting phi nodes into select instructions, we shouldn't promote PHI nodes unless we KNOW that we are able to promote all of them. This fixes: test/Regression/Transforms/SimplifyCFG/PhiNoEliminate.ll llvm-svn: 16973	2004-10-14 05:13:36 +00:00
Reid Spencer	ace94df71f	Update to reflect changes in Makefile rules. llvm-svn: 16950	2004-10-13 11:46:52 +00:00
Chris Lattner	00648e1f86	Transform memmove -> memcpy when the source is obviously constant memory. llvm-svn: 16932	2004-10-12 04:52:52 +00:00
Chris Lattner	7cabf6f87a	Fix a REALLY obscure bug in my previous checkin, which was splicing the END marker from one ilist into the middle of another basic block! llvm-svn: 16925	2004-10-12 01:02:29 +00:00
Chris Lattner	9776f7259b	Handle a common case more carefully. In particular, instead of transforming pointer recurrences into expressions from this: %P_addr.0.i.0 = phi sbyte* [ getelementptr ([8 x sbyte]* %.str_1, int 0, int 0), %entry ], [ %inc.0.i, %no_exit.i ] %inc.0.i = getelementptr sbyte* %P_addr.0.i.0, int 1 ; <sbyte*> [#uses=2] into this: %inc.0.i = getelementptr sbyte* getelementptr ([8 x sbyte]* %.str_1, int 0, int 0), int %inc.0.i.rec Actually create something nice, like this: %inc.0.i = getelementptr [8 x sbyte]* %.str_1, int 0, int %inc.0.i.rec llvm-svn: 16924	2004-10-11 23:06:50 +00:00
Chris Lattner	a92af96c56	Reenable the transform, turning X/-10 < 1 into X > -10 llvm-svn: 16918	2004-10-11 19:40:04 +00:00
Chris Lattner	004e250cd2	This patch implements two things (sorry). First, it allows SRA of globals that have embedded arrays, implementing GlobalOpt/globalsra-partial.llx. This comes up infrequently, but does allow, for example, deleting several stores to dead parts of globals in dhrystone. Second, this implements GlobalOpt/malloc-promote-*.llx, which is the following nifty transformation: Basically if a global pointer is initialized with malloc, and we can tell that the program won't notice, we transform this: struct foo *FooPtr; ... FooPtr = malloc(sizeof(struct foo)); ... FooPtr->A FooPtr->B Into: struct foo FooPtrBody; ... FooPtrBody.A FooPtrBody.B This comes up occasionally, for example, the 'disp' global in 183.equake (where the xform speeds the CBE version of the program up from 56.16s to 52.40s (7%) on apoc), and the 'desired_accept', 'fixLRBT', 'macroArray', & 'key_queue' globals in 300.twolf (speeding it up from 22.29s to 21.55s (3.4%)). The nice thing about this xform is that it exposes the resulting global to global variable optimization and makes alias analysis easier in addition to eliminating a few loads. llvm-svn: 16916	2004-10-11 05:54:41 +00:00
Chris Lattner	e42eb31f7d	Just because we cannot completely eliminate all uses of a global, we can still optimize away all of the indirect calls and loads, etc from it. This turns code like this: if (G != 0) G(); into if (G != 0) ActualCallee(); This triggers a couple of times in gcc and libstdc++. llvm-svn: 16901	2004-10-10 23:14:11 +00:00
Reid Spencer	97327f05fc	Initial version of automake Makefile.am file. llvm-svn: 16893	2004-10-10 22:20:40 +00:00
Chris Lattner	604ed7aae8	Fix 2004-10-10-CastStoreOnce.llx, by adjusting types back if we strip off a cast llvm-svn: 16878	2004-10-10 17:07:12 +00:00
Chris Lattner	a0e769cc81	Implement GlobalOpt/deadglobal-2.llx, deletion of globals that are only stored to, but are stored at variable indexes. This occurs at least in 176.gcc, but probably others, and we should handle it for completeness. llvm-svn: 16876	2004-10-10 16:47:33 +00:00
Chris Lattner	cb9f152d8c	Avoid calling use_size() which could (in theory) be expensive if the global has a large number of users. Instead, just keep track of whether we're making changes as we do so. This patch has no functionality changes. llvm-svn: 16874	2004-10-10 16:43:46 +00:00
Chris Lattner	09a527290d	Eliminate global pointers that are only stored a single value and null if we know that all uses of the global will trap if the pointer contained is null. In this case, we forward substitute the stored value to any uses. This has the effect of devirtualizing trivial globals in trivial cases. For example, 164.gzip contains this: gzip.h:extern int (*read_buf) OF((char *buf, unsigned size)); bits.c: read_buf = file_read; deflate.c: lookahead = read_buf((char*)window, deflate.c: n = read_buf((char*)window+strstart+lookahead, more); Since read_buf has to point to file_read at every use, we just replace the calls through read_buf with a direct call to file_read. This occurs in several benchmarks, including 176.gcc and 164.gzip. Direct calls are good and stuff. llvm-svn: 16871	2004-10-09 21:48:45 +00:00
Chris Lattner	5c91c8f18b	Use DEBUG instead of DebugFlag directly, as DebugFlag does not respect -debug-only! llvm-svn: 16868	2004-10-09 19:30:36 +00:00
Chris Lattner	f369b38d55	Fix infinite loop due to iteration llvm-svn: 16864	2004-10-09 03:32:52 +00:00
Chris Lattner	4ad08352b4	Implement sub.ll:test17, -X/C -> X/-C llvm-svn: 16863	2004-10-09 02:50:40 +00:00
Chris Lattner	1b8d2957d3	If we found a dead global, we should at least delete it... llvm-svn: 16858	2004-10-08 22:05:31 +00:00
Chris Lattner	1c4bddc50d	* Pull out the meat of runOnModule into another function for clarity. * Do not let dangling dead constants prevent optimization * Iterate global optimization while we're making progress. These changes allow us to be more aggressive, handling cases like GlobalOpt/iterate.llx without a problem (turning it into 'ret int 0'). llvm-svn: 16857	2004-10-08 20:59:28 +00:00
Chris Lattner	73ad73e2d8	We might as well delete the known-dead global sooner rather than later since we know it is dead. llvm-svn: 16855	2004-10-08 20:25:55 +00:00
Chris Lattner	0b41e861b6	Temporarily disable a buggy transformation until it can be fixed. This fixes 254.gap. llvm-svn: 16853	2004-10-08 19:15:44 +00:00
Chris Lattner	abab0719af	Implement SRA for global variables. This allows the other global variable optimizations to trigger much more often. This allows the elimination of several dozen more global variables in Programs/External. Note that we only do this for non-constant globals: constant globals will already be optimized out if the accesses to them permit it. This implements Transforms/GlobalOpt/globalsra.llx llvm-svn: 16842	2004-10-08 17:32:09 +00:00
Chris Lattner	bff91d9a2e	Instcombine (X & FF00) + xx00 -> (X+xx00) & FF00, implementing and.ll:test27 This comes up when doing adds to bitfield elements. llvm-svn: 16836	2004-10-08 05:07:56 +00:00
Chris Lattner	44bd392cbf	Little patch to turn (shl (add X, 123), 4) -> (add (shl X, 4), 123 << 4) This triggers in cases of bitfield additions, opening opportunities for future improvements. llvm-svn: 16834	2004-10-08 03:46:20 +00:00
Chris Lattner	617f1a34f1	Improve comments, no functionality changes llvm-svn: 16814	2004-10-07 21:30:30 +00:00
Chris Lattner	02b6c918b7	Fix a bug in the safety analysis routine llvm-svn: 16804	2004-10-07 06:01:25 +00:00
Chris Lattner	f64799683e	Comment cleanups llvm-svn: 16803	2004-10-07 06:00:24 +00:00
Chris Lattner	25db58032d	* Rename pass to globalopt, since we do more than just constify * Instead of handling dead functions specially, just nuke them. * Be more aggressive about cleaning up after constification, in particular, handle getelementptr instructions and constantexprs. * Be a little bit more structured about how we process globals. *** Delete globals that are only stored to, and never read. These are clearly not useful, so they should go. This implements deadglobal.llx This last one triggers quite a few times. In particular, 2208 in the external tests, 1865 of which are in 252.eon. This shrinks eon from 1995094 to 1732341 bytes of bytecode. llvm-svn: 16802	2004-10-07 04:16:33 +00:00
Chris Lattner	1f849a08a3	Implement GlobalConstifier/trivialstore.llx, and also do some simplifications of the resultant program to avoid making later passes do it all. This allows us to constify globals that just have the same constant that they are initialized stored into them. Surprisingly this comes up ALL of the freaking time, dozens of times in SPEC, 30 times in vortex alone. For example, on 256.bzip2, it allows us to constify these two globals: %smallMode = internal global ubyte 0 ; <ubyte> [#uses=8] %verbosity = internal global int 0 ; <int> [#uses=49] Which (with later optimizations) results in the bytecode file shrinking from 82286 to 69686 bytes! Let's hear it for IPO :) For the record, it's nuking lots of "if (verbosity > 2) { do lots of stuff }" code. llvm-svn: 16793	2004-10-06 20:57:02 +00:00
Chris Lattner	0aee4b7947	Instcombine: -(X sdiv C) -> (X sdiv -C), tested by sub.ll:test16 llvm-svn: 16769	2004-10-06 15:08:25 +00:00
Chris Lattner	2ce32df8b0	Reduce code growth implied by the tail duplication pass by not duplicating an instruction if it can be hoisted to a common dominator of the block. This implements: test/Regression/Transforms/TailDup/MergeTest.ll llvm-svn: 16758	2004-10-06 03:27:37 +00:00
Brian Gaeke	33e834ebb0	Add accessor function. llvm-svn: 16622	2004-09-30 20:14:29 +00:00
Brian Gaeke	5a89bde564	Correct type of accessor functions. llvm-svn: 16621	2004-09-30 20:14:18 +00:00
Brian Gaeke	e80d4cd66b	Namespacify. Add accessor function. llvm-svn: 16620	2004-09-30 20:14:07 +00:00
Chris Lattner	9af8efddd3	Disable the 'WARNING: Found global types that are not compatible' warning that always prints when linking programs to libstdc++ :( llvm-svn: 16603	2004-09-30 00:12:29 +00:00
Chris Lattner	abae776b18	Hrm, debugging printouts do not need to be in here llvm-svn: 16598	2004-09-29 21:21:14 +00:00
Chris Lattner	6862fbd2cf	* Pull range optimization code out into new InsertRangeTest function. * SubOne/AddOne functions always return ConstantInt, declare them as such * Pull code for handling setcc X, cst, where cst is at the end of the range, or cc is LE or GE up earlier in visitSetCondInst. This reduces #iterations in some cases. * Fold: (div X, C1) op C2 -> range check, implementing div.ll:test6 - test9. llvm-svn: 16588	2004-09-29 17:40:11 +00:00
Chris Lattner	879ce7894c	Do not insert trivially dead select instructions, which allows us to potentially fold more in one pass. llvm-svn: 16583	2004-09-29 05:43:32 +00:00
Chris Lattner	6a4adcda4c	Fold binary expressions and casts into PHI nodes that have all constant inputs. This takes something like this: %A = phi int [ 3, %cond_false.0 ], [ 2, %endif.0.i ], [ 2, %endif.1.i ] %B = div int %tmp.243, 4 and turns it into: %A = phi int [ 3/4, %cond_false.0 ], [ 2/4, %endif.0.i ], [ 2/4, %endif.1.i ] which is later simplified (in this case) into %A = 0. This triggers thousands of times in spec, for example, 269 times in 176.gcc. This is tested by InstCombine/add.ll:test23 and set.ll:test18. llvm-svn: 16582	2004-09-29 05:07:12 +00:00
Chris Lattner	c949128b2f	Hrm, really, all tests passed without this, but it is scary to think how... llvm-svn: 16568	2004-09-29 03:16:24 +00:00
Chris Lattner	be7a69ebd8	Remove debugging printout Instcombine (setcc (truncate X), C1). This occurs THOUSANDS of times in many benchmarks. Particularly common seem to be things like (seteq (cast bool X to int), int 0) This turns it into (seteq bool %X, false), which then becomes (not %X). llvm-svn: 16567	2004-09-29 03:09:18 +00:00
Chris Lattner	dcf756ec22	Fold (X setcc C1) \| (X setcc C2) This implements or.ll:test1[89] llvm-svn: 16561	2004-09-28 22:33:08 +00:00
Chris Lattner	623826c888	Fold (and (setcc X, C1), (setcc X, C2)) This is important for several reasons: 1. Benchmarks have lots of code that looks like this (perlbmk in particular): %tmp.2.i = setne int %tmp.0.i, 128 ; <bool> [#uses=1] %tmp.6343 = seteq int %tmp.0.i, 1 ; <bool> [#uses=1] %tmp.63 = and bool %tmp.2.i, %tmp.6343 ; <bool> [#uses=1] we now fold away the setne, a clear improvement. 2. In the more important cases, such as (X >= 10) & (X < 20), we now produce smaller code: (X-10) < 10. 3. Perhaps the nicest effect of this patch is that it really helps out the code generators. In particular, for a 'range test' like the above, instead of generating this on X86 (the difference on PPC is even more pronounced): cmp %EAX, 50 setge %CL cmp %EAX, 100 setl %AL and %CL, %AL cmp %CL, 0 we now generate this: add %EAX, -50 cmp %EAX, 50 Furthermore, this causes setcc's to be folded into branches more often. These combinations trigger dozens of times in the spec benchmarks, particularly in 176.gcc, 186.crafty, 253.perlbmk, 254.gap, & 099.go. llvm-svn: 16559	2004-09-28 21:48:02 +00:00
Chris Lattner	272d5ca9e0	Implement X / C1 / C2 folding Implement (setcc (shl X, C1), C2) folding. The second one occurs several dozen times in spec. The first was added just in case. :) These are tested by shift.ll:test2[12], and div.ll:test5 llvm-svn: 16549	2004-09-28 18:22:15 +00:00
Chris Lattner	6afc02f816	shl is always zero extending, so always use a zero extending shift right. This latent bug was exposed by recent changes, and is tested as: llvm/test/Regression/Transforms/InstCombine/2004-09-28-BadShiftAndSetCC.llx llvm-svn: 16546	2004-09-28 17:54:07 +00:00
Alkis Evlogimenos	20f1b0bafb	Add includes and use std:: for standard library calls to make code compile on windows. This patch was contributed by Paolo Invernizzi. llvm-svn: 16539	2004-09-28 14:42:44 +00:00
Alkis Evlogimenos	3ce42ec7ee	Pull assignment out of for loop conditional in order for this to compile under windows. Patch contributed by Paolo Invernizzi! llvm-svn: 16534	2004-09-28 02:40:37 +00:00
Chris Lattner	bfff18a869	Fix two bugs: one where a condition was mistakenly swapped, and another where we folded (X & 254) -> X < 1 instead of X < 2. These problems were latent problems exposed by the latest patch. llvm-svn: 16528	2004-09-27 19:29:18 +00:00
Chris Lattner	1023b8726e	Fold: (setcc (shr X, ShAmt), CI), where 'cc' is eq or ne. This xform triggers often, for example: 6x in povray, 1x in gzip, 279x in gcc, 1x in crafty, 8x in eon, 11x in perlbmk, 362x in gap, 4x in vortex, 14 in m88ksim, 211x in 126.gcc, 1x in compress, 11x in ijpeg, and 4x in 147.vortex. llvm-svn: 16521	2004-09-27 16:18:50 +00:00
Chris Lattner	7e794273f5	Implement shift-and combinations, implementing InstCombine/and.ll:test19-21 These combinations trigger 4 times in povray, 7x in gcc, 4x in gap, and 2x in bzip2. llvm-svn: 16508	2004-09-24 15:21:34 +00:00
Chris Lattner	e1b4d2a470	Move LHSI->hasOneUse() into the arms of the conditional, reindenting code. No functionality changes here. llvm-svn: 16505	2004-09-23 21:52:49 +00:00
Chris Lattner	8fc5af4da9	Implement Transforms/InstCombine/and.ll:test18, a case that occurs 20 times in perlbmk llvm-svn: 16504	2004-09-23 21:46:38 +00:00
Chris Lattner	bdcf41a8a2	Implement select.ll:test16: fold load (select C, X, null) -> load X llvm-svn: 16499	2004-09-23 15:46:00 +00:00
Chris Lattner	b121ae1cec	Do not fold (X + C1 != C2) if there are other users of the add. Doing this transformation used to take a loop like this: int Array[1000]; void test(int X) { int i; for (i = 0; i < 1000; ++i) Array[i] += X; } Compiled to LLVM is: no_exit: ; preds = %entry, %no_exit %indvar = phi uint [ 0, %entry ], [ %indvar.next, %no_exit ] ; <uint> [#uses=2] %tmp.4 = getelementptr [1000 x int]* %Array, int 0, uint %indvar ; <int> [#uses=2] %tmp.7 = load int %tmp.4 ; <int> [#uses=1] %tmp.9 = add int %tmp.7, %X ; <int> [#uses=1] store int %tmp.9, int* %tmp.4 * %indvar.next = add uint %indvar, 1 ; <uint> [#uses=2] * %exitcond = seteq uint %indvar.next, 1000 ; <bool> [#uses=1] br bool %exitcond, label %return, label %no_exit and turn it into a loop like this: no_exit: ; preds = %entry, %no_exit %indvar = phi uint [ 0, %entry ], [ %indvar.next, %no_exit ] ; <uint> [#uses=3] %tmp.4 = getelementptr [1000 x int]* %Array, int 0, uint %indvar ; <int> [#uses=2] %tmp.7 = load int %tmp.4 ; <int> [#uses=1] %tmp.9 = add int %tmp.7, %X ; <int> [#uses=1] store int %tmp.9, int* %tmp.4 * %indvar.next = add uint %indvar, 1 ; <uint> [#uses=1] * %exitcond = seteq uint %indvar, 999 ; <bool> [#uses=1] br bool %exitcond, label %return, label %no_exit Note that indvar.next and indvar can no longer be coalesced. In machine code terms, this patch changes this code: .LBBtest_1: # no_exit mov %EDX, OFFSET Array mov %ESI, %EAX add %ESI, DWORD PTR [%EDX + 4*%ECX] mov %EDX, OFFSET Array mov DWORD PTR [%EDX + 4*%ECX], %ESI mov %EDX, %ECX inc %EDX cmp %ECX, 999 mov %ECX, %EDX jne .LBBtest_1 # no_exit into this: .LBBtest_1: # no_exit mov %EDX, OFFSET Array mov %ESI, %EAX add %ESI, DWORD PTR [%EDX + 4*%ECX] mov %EDX, OFFSET Array mov DWORD PTR [%EDX + 4*%ECX], %ESI inc %ECX cmp %ECX, 1000 jne .LBBtest_1 # no_exit We need better instruction selection to get this: .LBBtest_1: # no_exit add DWORD PTR [Array + 4*%ECX], EAX inc %ECX cmp %ECX, 1000 jne .LBBtest_1 # no_exit ... but at least there is less register juggling llvm-svn: 16473	2004-09-21 21:35:23 +00:00
Chris Lattner	42618551d5	Fix potential miscompilations: InstCombine/2004-09-20-BadLoadCombine*.llx llvm-svn: 16447	2004-09-20 10:15:10 +00:00
Alkis Evlogimenos	d59cebf87a	Fix loop condition so that we don't decrement off the beginning of the list. llvm-svn: 16440	2004-09-20 06:42:58 +00:00
Chris Lattner	4f2cf030e8	'Pass' should now not be derived from by clients. Instead, they should derive from ModulePass. Instead of implementing Pass::run, they should implement ModulePass::runOnModule. llvm-svn: 16436	2004-09-20 04:48:05 +00:00
Chris Lattner	cd671065be	Prototype more accurately llvm-svn: 16433	2004-09-20 04:43:57 +00:00
Chris Lattner	3e86084641	Prototype these functions more accurately llvm-svn: 16432	2004-09-20 04:43:15 +00:00
Chris Lattner	e6f13093e6	Make isSafeToLoadUnconditionally a bit smarter, implementing PR362 and Regression/Transforms/InstCombine/CPP_min_max.llx llvm-svn: 16409	2004-09-19 19:18:10 +00:00
Chris Lattner	855a4ff4dd	Remove a whole bunch of horrible hacky code that was used to promote allocas whose addresses were used by trivial phi nodes and select instructions. This is now performed by the instcombine pass, which is more powerful, is much simpler, and is faster. This allows the deletion of a bunch of code, two FIXME's and two gotos. llvm-svn: 16406	2004-09-19 18:51:51 +00:00
Chris Lattner	f62ea8ef4b	Make instruction combining a bit more aggressive in the face of volatile loads, and implement two new transforms: InstCombine/load.ll:test[56]. llvm-svn: 16404	2004-09-19 18:43:46 +00:00
Chris Lattner	9864df96ba	Add comment llvm-svn: 16400	2004-09-19 01:05:16 +00:00
Chris Lattner	6455c51ab6	Fix the inliner to always delete any edges from the external call node to a function being deleted. Due to optimizations done while inlining, there can be edges from the external call node to a function node that were not apparent any longer. This fixes the compiler crash while compiling 175.vpr llvm-svn: 16399	2004-09-18 21:37:03 +00:00
Chris Lattner	37b6c4f2d2	Convert this pass to be a CallGraphSCCPass instead of a Pass, which eliminates the worklist and makes it more efficient. This does not change functionality at all. llvm-svn: 16390	2004-09-18 00:34:13 +00:00
Chris Lattner	475dc2c93d	Make sure to remove the Select instruction as well llvm-svn: 16389	2004-09-18 00:32:40 +00:00
Chris Lattner	5065b240c8	Fix typo in comment llvm-svn: 16384	2004-09-17 03:58:39 +00:00
Chris Lattner	9face5eb1f	Add a newline llvm-svn: 16369	2004-09-15 17:53:52 +00:00
Reid Spencer	6614946443	Convert code to compile with vc7.1. Patch contributed by Paolo Invernizzi. Thanks Paolo! llvm-svn: 16368	2004-09-15 17:06:42 +00:00
Chris Lattner	f11216d24f	Fix a bug in the previous checkin that broke 255.vortex llvm-svn: 16355	2004-09-15 02:34:40 +00:00
Chris Lattner	a346578d92	Make sure to update alias analysis information as we transform the function. This fixes PR420 and Regression/Transforms/LICM/2004-09-14-AliasAnalysisInvalidate.llx llvm-svn: 16348	2004-09-15 01:04:07 +00:00
Chris Lattner	9b9932bd94	If given an AliasSetTracker object to update, update it. llvm-svn: 16347	2004-09-15 01:02:54 +00:00
Chris Lattner	f41b80a05f	Remove a long-dead pass. Actually, this pass was never used at all. llvm-svn: 16337	2004-09-14 16:33:01 +00:00
Alkis Evlogimenos	a5c04ee50f	Fixes to make LLVM compile with vc7.1. Patch contributed by Paolo Invernizzi! llvm-svn: 16152	2004-09-03 18:19:51 +00:00
Reid Spencer	7c16caa336	Changes For Bug 352 Move include/Config and include/Support into include/llvm/Config, include/llvm/ADT and include/llvm/Support. From here on out, all LLVM public header files must be under include/llvm/. llvm-svn: 16137	2004-09-01 22:55:40 +00:00
Reid Spencer	f39f66e3ef	Initial checkin of a pass to lower packed operations to scalars operations. This also registers the pass with opt with a -lower-packed command line option. Patch contributed by Brad Jones. llvm-svn: 15987	2004-08-21 21:39:24 +00:00
Chris Lattner	14c198d09a	If we are linking two global variables and they have the same size, do not spew warnings, even if the types don't match. llvm-svn: 15933	2004-08-20 00:30:39 +00:00
Chris Lattner	6139134715	Implement test/Regression/Transforms/GlobalConstifier/phi-select.llx This allows more globals to be marked constant, particularly global arrays. llvm-svn: 15735	2004-08-14 20:57:17 +00:00
Chris Lattner	56273827b1	If we are extracting a block that has multiple successors that are the same block (common in a switch), make sure to remove extra edges in successor blocks. This fixes CodeExtractor/2004-08-12-BlockExtractPHI.ll and should be pulled into LLVM 1.3 (though the regression test need not be, as that would require pulling in the LoopExtract.cpp changes). llvm-svn: 15717	2004-08-13 03:27:07 +00:00
Chris Lattner	f06b043204	When we code extract some stuff, leave the codeRepl block in the place where the extracted code was, instead of putting it at the end of the function llvm-svn: 15716	2004-08-13 03:17:39 +00:00
Chris Lattner	7386e6333d	"extract" the block extractor pass from bugpoint (haha) llvm-svn: 15714	2004-08-13 03:05:17 +00:00
Chris Lattner	889d346e6e	Add value mapper support for select constant exprs. This should fix a bug Nate ran into when bugpointing siod. This fix should go into LLVM 1.3 llvm-svn: 15712	2004-08-13 02:43:19 +00:00
Chris Lattner	cde351ee30	This patch makes the inliner refuse to inline functions that have alloca instructions in the body of the function (not the entry block). This fixes test/Programs/SingleSource/Regression/C/2004-08-12-InlinerAndAllocas.c and test/Programs/External/SPEC/CINT2000/176.gcc on zion. This should obviously be pulled into 1.3. llvm-svn: 15684	2004-08-12 05:45:09 +00:00
Chris Lattner	7f1c7ede5b	Fix code extraction of unwind blocks. This fixed bugs that bugpoint can run into. This should go into 1.3 llvm-svn: 15679	2004-08-12 03:17:02 +00:00
Chris Lattner	a7ba90e672	Hrm, this pass didn't compile. This bugfix should go into 1.3! llvm-svn: 15676	2004-08-12 02:44:23 +00:00
Chris Lattner	4456da6a4c	Fix InstCombine/2004-08-10-BoolSetCC.ll, a bug that is miscompiling 176.gcc. Note that this is apparently not the only bug miscompiling gcc though. :( llvm-svn: 15639	2004-08-11 00:50:51 +00:00
Chris Lattner	8e7260652b	Fix InstCombine/2004-08-09-RemInfLoop.llx This should go into the 1.3 branch llvm-svn: 15593	2004-08-09 21:05:48 +00:00
Chris Lattner	4956a32c9e	Fix another really nasty regression that Anshu pointed out. In cases where dangling constant users were removed from a function, causing it to be dead, we never removed the call graph edge from the external node to the function. In most cases, this didn't cause a problem (by luck). This should definitely go into 1.3 llvm-svn: 15570	2004-08-08 03:29:50 +00:00
Chris Lattner	92b9906199	Two fixes: 1. Fix a REALLY nasty cyclic replacement issue that Anshu discovered, causing nondeterministic crashes and memory corruption. 2. For performance, don't go inserting constantexpr casts of GV pointers. This should definitely go into 1.3 llvm-svn: 15568	2004-08-08 01:30:07 +00:00
Chris Lattner	6a93462144	This DEBUG is buggy. comment it out because it's not worth fixing. This should go into 1.3 llvm-svn: 15567	2004-08-08 01:27:56 +00:00
Alkis Evlogimenos	832437255d	Stop using getValues(). llvm-svn: 15487	2004-08-04 08:44:43 +00:00
Chris Lattner	7aa2d4747a	Fix a regression in InstCombine/xor.ll llvm-svn: 15410	2004-08-01 19:42:59 +00:00
Chris Lattner	7471b96a05	Expose this as a functionpass llvm-svn: 15369	2004-07-31 10:01:58 +00:00
Misha Brukman	9c003d8f65	Fix De Morgan's name. llvm-svn: 15343	2004-07-30 12:50:08 +00:00
Chris Lattner	d4252a7c64	Start using the PatternMatcher a bit. llvm-svn: 15342	2004-07-30 07:50:03 +00:00
Misha Brukman	f4a410f907	Fix #includes of i*.h => Instructions.h as per PR403. llvm-svn: 15337	2004-07-29 17:30:57 +00:00
Misha Brukman	63b38bd2ed	Fix #includes of i*.h => Instructions.h as per PR403. llvm-svn: 15334	2004-07-29 17:30:56 +00:00
Misha Brukman	2b3387a6d9	Fix #includes of i*.h => Instructions.h as per PR403. llvm-svn: 15328	2004-07-29 17:05:13 +00:00
Alkis Evlogimenos	fd7a2d4477	Merge i*.h headers into Instructions.h as part of bug403. llvm-svn: 15325	2004-07-29 12:17:34 +00:00
Robert Bocchino	7b5b86cd0f	This change fixed a bug in the function visitMul. The prior version assumed that a constant on the RHS of a multiplication was either an IntConstant or an FPConstant. It checked for an IntConstant and then, if it did not find one, did a hard cast to an FPConstant. That code would crash if the RHS were a ConstantExpr that was neither an IntConstant nor an FPConstant. This version replaces the hard cast with a dyn_cast. It performs the same way for IntConstants and FPConstants but does nothing, instead of crashing, for constant expressions. The regression test for this change is 2004-07-27-ConstantExprMul.ll. llvm-svn: 15291	2004-07-27 21:02:21 +00:00
Brian Gaeke	38b79e8fbc	Make the create...() functions for some of these passes return a FunctionPass *. llvm-svn: 15276	2004-07-27 17:43:21 +00:00
Chris Lattner	50eb771d37	Fix hoisting of void typed values, e.g. calls llvm-svn: 15263	2004-07-27 07:38:32 +00:00
Chris Lattner	f29807169a	Implement DeadStoreElim/alloca.llx by observing that allocas are dead at the end of the function (either return or unwind) llvm-svn: 15232	2004-07-26 06:14:11 +00:00
Chris Lattner	e5ad26dbb3	Throttle back indvar substitution from creating multiplies in loops. This is bad bad bad. llvm-svn: 15227	2004-07-26 02:47:12 +00:00
Chris Lattner	7b25bcdf52	* Substantially simplify how free instructions are handled (potentially fixing a bug in DSE). * Delete dead operand uses iteratively instead of recursively, using a SetVector. * Defer deletion of dead operand uses until the end of processing, which means we don't have to bother with updating the AliasSetTracker. This speeds up DSE substantially. llvm-svn: 15204	2004-07-25 11:09:56 +00:00
Chris Lattner	4c1c1ac7e4	Free instructions kill values too. This implements DeadStoreElim/free.llx llvm-svn: 15199	2004-07-25 07:58:38 +00:00
Chris Lattner	bad6478b00	obvious fix llvm-svn: 15162	2004-07-24 07:51:27 +00:00
Chris Lattner	3844c300de	This is a trivial dead store elimination pass. It is very, very simple and can be improved in many ways. But: stop laughing, even with -basicaa it deletes 15% of the stores in 252.eon :) llvm-svn: 15101	2004-07-22 08:00:28 +00:00
Chris Lattner	51f7c9e56d	Update GC intrinsics to take a pointer to the object as well as a pointer to the field being updated. Patch contributed by Tobias Nurmiranta llvm-svn: 15097	2004-07-22 05:51:13 +00:00
Brian Gaeke	902dcf0729	These files don't need to include <iostream> since they include "Support/Debug.h". llvm-svn: 15089	2004-07-21 20:50:33 +00:00
Chris Lattner	d8f5e2ccac	* Further cleanup. * Test for whether bits are shifted out during the optzn. If so, the fold is illegal, though it can be handled explicitly for setne/seteq This fixes the miscompilation of 254.gap last night, which was a latent bug exposed by other optimizer improvements. llvm-svn: 15085	2004-07-21 20:14:10 +00:00
Chris Lattner	1638de4499	Make cast-cast code a bit more defensive "simplify" a bit of code for comparison/and folding llvm-svn: 15082	2004-07-21 19:50:44 +00:00
Chris Lattner	4fbad968f8	Remove special casing of pointers and treat them generically as integers of the appropriate size. This gives us the ability to eliminate int -> ptr -> int llvm-svn: 15063	2004-07-21 04:27:24 +00:00
Chris Lattner	45b50d14c9	Fix a serious code pessimization problem. If an inlined function has a single return, clone the 'ret' BB code into the block AFTER the inlined call, not the other way around. llvm-svn: 15030	2004-07-20 05:45:24 +00:00
Chris Lattner	11ffd59e37	Implement Transforms/InstCombine/IntPtrCast.ll llvm-svn: 15029	2004-07-20 05:21:00 +00:00
Chris Lattner	ec67df0ed1	Ignore instructions that are in trivially dead functions. This allows us to constify 14 globals instead of 4 in a trivial C++ testcase. llvm-svn: 15027	2004-07-20 03:58:07 +00:00
Chris Lattner	44d0b9502a	Implement InstCombine/GEPIdxCanon.ll llvm-svn: 15024	2004-07-20 01:48:15 +00:00
Chris Lattner	5823ac1c21	Implement SimplifyCFG/BrUnwind.ll llvm-svn: 15022	2004-07-20 01:17:38 +00:00
Chris Lattner	4e2dbc6b4a	Rewrite cast->cast elimination code completely based on the information we actually care about. Someday when the cast instruction is gone, we can do better here, but this will do for now. This implements instcombine/cast.ll:test17/18 as well. llvm-svn: 15018	2004-07-20 00:59:32 +00:00
Chris Lattner	e2774757fe	Fix a performance regression from the CPR patch, simplify code llvm-svn: 14974	2004-07-18 21:34:16 +00:00
Chris Lattner	d47504d9db	Strip out and simplify some code. This also fixes the regression last night compiling cfrac. It did not realize that code like this: int G; int *H = &G; takes the address of G. llvm-svn: 14973	2004-07-18 19:56:20 +00:00
Chris Lattner	f3edc49ae2	Minor cleanup, no functionality change llvm-svn: 14972	2004-07-18 18:59:44 +00:00
Reid Spencer	3b4e83ec83	Remove an if statement that would never be reached. llvm-svn: 14968	2004-07-18 08:41:47 +00:00
Reid Spencer	f0a5bcaae4	Delete a redundant if branch. llvm-svn: 14967	2004-07-18 08:34:52 +00:00
Reid Spencer	c44cb6bd9f	Expand the coercion of constants to include the newly constant Globals. llvm-svn: 14966	2004-07-18 08:34:19 +00:00
Reid Spencer	539429d9b5	Delete a no-op loop. llvm-svn: 14965	2004-07-18 08:32:43 +00:00
Reid Spencer	6c2b627e23	Expand the scope to include global values because they are now constants too. llvm-svn: 14964	2004-07-18 08:32:10 +00:00
Reid Spencer	199aeb7f59	Avoid an unnecessary isa<Constant>. llvm-svn: 14963	2004-07-18 08:31:18 +00:00
Chris Lattner	9238d78dc3	Remove useless statistic, fix some slightly broken logic llvm-svn: 14958	2004-07-18 07:22:58 +00:00
Chris Lattner	2da5eee33c	Fix a rather serious bug in previous checkin llvm-svn: 14957	2004-07-18 06:56:58 +00:00
Reid Spencer	cb3fb5d4f5	bug 122: - Replace ConstantPointerRef usage with GlobalValue usage llvm-svn: 14953	2004-07-18 00:44:37 +00:00
Reid Spencer	874368790f	bug 122: - Replace ConstantPointerRef usage with GlobalValue usage - Minimize redundant isa<GlobalValue> usage - Correct isa<Constant> for GlobalValue subclass llvm-svn: 14950	2004-07-18 00:38:32 +00:00
Reid Spencer	ef784f01dd	bug 122: - Minimize redundant isa<GlobalValue> usage llvm-svn: 14948	2004-07-18 00:32:14 +00:00
Reid Spencer	c5afc9512b	bug 122: - Replace ConstantPointerRef usage with GlobalValue usage - Correct isa<Constant> for GlobalValue subclass llvm-svn: 14947	2004-07-18 00:31:05 +00:00
Reid Spencer	9e855c6832	bug 122: - Minimize redundant isa<GlobalValue> usage - Correct isa<Constant> for GlobalValue subclass llvm-svn: 14946	2004-07-18 00:29:57 +00:00
Reid Spencer	5f6815980b	bug 122: - Replace ConstantPointerRef usage with GlobalValue usage - Rename methods to get rid of ConstantPointerRef usage llvm-svn: 14945	2004-07-18 00:25:04 +00:00
Reid Spencer	83cae64faf	bug 122: - Excise dead CPR processing. llvm-svn: 14944	2004-07-18 00:23:51 +00:00
Reid Spencer	e4de22874e	bug 122: - Replace ConstantPointerRef usage with GlobalValue usage - Correct test ordering for GlobalValue subclass llvm-svn: 14943	2004-07-18 00:19:45 +00:00
Chris Lattner	d79334df33	This patch was contributed by Daniel Berlin! Speed up SCCP substantially by processing overdefined values quickly. This patch speeds up SCCP by about 30-40% on large testcases. llvm-svn: 14861	2004-07-15 23:36:43 +00:00
Chris Lattner	f2c018c0c1	Fix PR404 try #2 This version takes about 1s longer than the previous one (down to 2.35s), but on the positive side, it actually works :) llvm-svn: 14856	2004-07-15 08:20:22 +00:00
Chris Lattner	daa12135da	Revert previous patch until I get a bug fixed llvm-svn: 14853	2004-07-15 05:36:31 +00:00
Chris Lattner	70177e402d	Fix PR404: Loop simplify is really slow on 252.eon This eliminates an NNlogN algorithm from the loop simplify pass, replacing it with a much simpler and faster alternative. In a debug build, this reduces gccas time on eon from 85s to 42s. llvm-svn: 14851	2004-07-15 04:27:04 +00:00
Chris Lattner	32c518e526	Progress on PR341 llvm-svn: 14840	2004-07-15 02:06:12 +00:00
Chris Lattner	9a63520b1a	Fixes working towards PR341 llvm-svn: 14839	2004-07-15 01:50:47 +00:00
Chris Lattner	ba7aef39fd	Now that we codegen the portable "sizeof" efficiently, we can use it for malloc lowering. This means that lowerallocations doesn't need targetdata anymore. yaay. llvm-svn: 14835	2004-07-15 01:08:08 +00:00
Chris Lattner	35e24774eb	Factor some code to handle "load (constantexpr cast foo)" just like "load (cast foo)". This allows us to compile C++ code like this: class Bclass { public: virtual int operator()() { return 666; } }; class Dclass: public Bclass { public: virtual int operator()() { return 667; } } ; int main(int argc, char argv) { Dclass x; return x(); } Into this: int %main(int %argc, sbyte %argv) { entry: call void %__main( ) ret int 667 } Instead of this: int %main(int %argc, sbyte** %argv) { entry: %x = alloca "struct.std::bad_typeid" ; <"struct.std::bad_typeid"> [#uses=3] call void %__main( ) %tmp.1.i.i = getelementptr "struct.std::bad_typeid" %x, uint 0, uint 0, uint 0 ; <int (...)*> [#uses=1] store int (...) getelementptr ([3 x int (...)] %vtable for Bclass, int 0, long 2), int (...)*** %tmp.1.i.i %tmp.3.i = getelementptr "struct.std::bad_typeid"* %x, int 0, uint 0, uint 0 ; <int (...)*> [#uses=1] store int (...) getelementptr ([3 x int (...)] %vtable for Dclass, int 0, long 2), int (...)*** %tmp.3.i %tmp.5 = load int ("struct.std::bad_typeid")* cast (int (...)** getelementptr ([3 x int (...)] %vtable for Dclass, int 0, long 2) to int ("struct.std::bad_typeid")) ; <int ("struct.std::bad_typeid")> [#uses=1] %tmp.6 = call int %tmp.5( "struct.std::bad_typeid" %x ) ; <int> [#uses=1] ret int %tmp.6 ret int 0 } In other words, we now resolve the virtual function call. llvm-svn: 14783	2004-07-13 01:49:43 +00:00
Chris Lattner	9eb9ccd9f6	Check to make sure types are sized before calling getTypeSize on them. llvm-svn: 14649	2004-07-06 19:28:42 +00:00
Brian Gaeke	a501be556f	It doesn't matter what the 2nd operand is; if the GEP has 2 operands and the first is a zero, we should leave it alone. llvm-svn: 14648	2004-07-06 19:24:47 +00:00
Brian Gaeke	0e0fe8a2e9	Add helper function. Don't touch GEPs for which DecomposeArrayRef is not going to do anything special (e.g., < 2 indices, or 2 indices and the last one is a constant.) llvm-svn: 14647	2004-07-06 18:15:39 +00:00
Chris Lattner	23b47b6af9	Implement rem.ll:test3 llvm-svn: 14640	2004-07-06 07:38:18 +00:00
Chris Lattner	98c6bdf251	Fix a minor bug where we would go into infinite loops on some constants llvm-svn: 14638	2004-07-06 07:11:42 +00:00
Chris Lattner	7fd5f0745a	Implement InstCombine/sub.ll:test15: X % -Y === X % Y Also, remove X % -1 = 0, because it's not true for unsigneds, and the signed case is superseded by this new handling. llvm-svn: 14637	2004-07-06 07:01:22 +00:00
Reid Spencer	eb04d9bcb4	Add #include <iostream> since Value.h does not #include it any more. llvm-svn: 14622	2004-07-04 12:19:56 +00:00
Chris Lattner	4c9c20af28	Implement add.ll:test22, a common case in MSIL files llvm-svn: 14587	2004-07-03 00:26:11 +00:00
Chris Lattner	49df6cefa5	Do not call getTypeSize on a type that has no size llvm-svn: 14584	2004-07-02 22:55:47 +00:00
Brian Gaeke	e1a136fb4b	Get rid of a dead variable, and fix a typo in a comment. llvm-svn: 14560	2004-07-02 05:30:01 +00:00
Brian Gaeke	163c87fc32	Make this pass use a more specific debug message than "Processing:". llvm-svn: 14541	2004-07-01 19:27:10 +00:00
Vikram S. Adve	1097ed8467	Restoring this file. llvm-svn: 14478	2004-06-29 14:20:27 +00:00
Chris Lattner	3b11d3b294	Remove unused file llvm-svn: 14460	2004-06-28 00:46:58 +00:00
Chris Lattner	924882f775	These passes are long dead/obsolete. They never worked in the first place and are a maintenance burden. Nuke nuke nuke llvm-svn: 14457	2004-06-28 00:44:18 +00:00
Chris Lattner	6e07936ed2	Implement InstCombine/add.ll:test21 llvm-svn: 14443	2004-06-27 22:51:36 +00:00
Chris Lattner	7f4222237d	New constant expression lowering pass to simplify your instruction selection needs. Contributed by Vladimir Prus! llvm-svn: 14399	2004-06-25 07:48:09 +00:00
Vikram S. Adve	463556f889	This file is unused, and duplicates functionality in TraceValues.cpp. llvm-svn: 14369	2004-06-24 20:16:22 +00:00
Chris Lattner	7a002d6010	Two fixes. First, stop using the ugly shouldSubstituteIndVar method. Second, disable substitution of quadratic addrec expressions to avoid putting multiplies in loops! llvm-svn: 14358	2004-06-24 06:49:18 +00:00
Misha Brukman	49bb82a4b8	Moved to lib/VMCore llvm-svn: 14348	2004-06-23 17:21:17 +00:00
Brian Gaeke	1ea8447089	Use new IsNAN() wrapper. llvm-svn: 14340	2004-06-23 00:25:35 +00:00
Misha Brukman	ddc90adca3	File depends on DSA, moved to lib/Analysis/DataStructure llvm-svn: 14325	2004-06-22 18:11:38 +00:00
Chris Lattner	f12c4a3d37	FINALLY Fix a really nasty nondeterministic bug that has been haunting us since May 1st. In this code, the pred iterator was being invalidated sometimes causing the wrong entries to be added to PHI nodes. The fix for this is to dereference and save the *PI value before we hack on branch instructions, which changes use/def chains, which SOMETIMES invalidates the iterator. llvm-svn: 14278	2004-06-21 07:19:01 +00:00
Chris Lattner	46f60890a3	Comment out the isnan stuff until we get a proper autoconf test for it; breaking the build on sparc is not acceptable. llvm-svn: 14277	2004-06-21 06:17:21 +00:00
Chris Lattner	1c676f76b6	Make order of argument addition deterministic. In particular, the layout of ConstantInt objects in memory used to determine which order arguments were added in in some cases. llvm-svn: 14276	2004-06-21 00:07:58 +00:00
Chris Lattner	c9e06336ab	Make use of BinaryOperator::create* methods to shrinkify code. llvm-svn: 14262	2004-06-20 05:04:01 +00:00
Chris Lattner	7d30a6c145	Fix the inliner to be deterministic, not letting its output depend on the relative location of Function objects in memory. llvm-svn: 14260	2004-06-20 04:11:48 +00:00
Chris Lattner	9734fd0980	Add some DEBUG output to the simplifycfg routines Fix another non-deterministic behavior, this one should actually speed up the code though as it was doing silly things. llvm-svn: 14258	2004-06-20 01:13:18 +00:00
Chris Lattner	42ad646104	Now that dominator tree children are built in deterministic order, this horrible code can go away llvm-svn: 14254	2004-06-19 20:23:35 +00:00
Chris Lattner	940b7ba5ad	This will hopefully fix a heisenbug that Vladimir Merzliakov is running into valiantly trying to compile stuff on freebsd. llvm-svn: 14251	2004-06-19 19:01:26 +00:00
Chris Lattner	4027500e1c	Fix a nasty bug, noticed by Reid llvm-svn: 14249	2004-06-19 18:15:50 +00:00
Chris Lattner	ec2d34cc19	Fix one source of nondeterminism in the -licm pass: the hoist pass was processing blocks in whatever order they happened to end up in the dominator tree data structure. Force an ordering. llvm-svn: 14248	2004-06-19 08:56:43 +00:00
Chris Lattner	4db0f8260a	Change to use the StableBasicBlockNumbering class llvm-svn: 14247	2004-06-19 08:42:40 +00:00
Chris Lattner	a52ab6f57f	Do not let the numbering of PHI nodes placed in the function depend on non-deterministic things like the ordering of blocks in the dominance frontier of a BB. Unfortunately, I don't know of a better way to solve this problem than to explicitly sort the BB's in function-order before processing them. This is guaranteed to slow the pass down a bit, but is absolutely necessary to get usable diffs between two different tools executing the mem2reg or scalarrepl pass. Before this, bazillions of spurious diff failures occurred all over the place due to the different order of processing PHIs: - %tmp.111 = getelementptr %struct.Connector_struct* %upcon.0.0, uint 0, uint 0 + %tmp.111 = getelementptr %struct.Connector_struct* %upcon.0.1, uint 0, uint 0 Now, the diffs match. llvm-svn: 14244	2004-06-19 07:40:14 +00:00
Chris Lattner	b2b151d297	Do not sort by the address of LLVM ConstantInt* objects. This produces nondeterministic results that depend on where these objects land in memory. Instead, sort by the value of the constant, which is stable. Before this patch, the -simplifycfg pass run from two different compilers could cause different code to be generated, though it was semantically the same: @@ -12258,8 +12258,8 @@ %s_addr.1 = phi sbyte* [ %s, %entry ], [ %inc.0, %no_exit ] ; <sbyte> [#uses=5] %tmp.1 = load sbyte %s_addr.1 ; <sbyte> [#uses=1] switch sbyte %tmp.1, label %no_exit [ - sbyte 0, label %loopexit sbyte 46, label %loopexit + sbyte 0, label %loopexit ] We need to stomp all of this stuff out. llvm-svn: 14243	2004-06-19 07:02:14 +00:00
Chris Lattner	b5f8eb8315	Do not loop over uses as we delete them. This causes iterators to be invalidated out from under us. This bug goes back to revision 1.1: scary. llvm-svn: 14242	2004-06-19 02:02:22 +00:00
Chris Lattner	023a483c76	Implement Transforms/InstCombine/and.ll:test17, a common case that occurs due to unordered comparison macros in math.h llvm-svn: 14221	2004-06-18 06:07:51 +00:00
Chris Lattner	1e1abdd6ed	Do not function resolve intrinsics. This prevents warnings and possible bad things from happening due to declare bool %llvm.isunordered(double, double) declare bool %llvm.isunordered(float, float) llvm-svn: 14219	2004-06-18 05:50:48 +00:00
Brian Gaeke	27b13253d9	I love the smell of a freshly broken PowerPC build in the morning. llvm-svn: 14206	2004-06-17 22:27:04 +00:00
Chris Lattner	f03f320b79	Fix compilation problem on freebsd. Problem noted by Vladimir Merzliakov in PR371 llvm-svn: 14203	2004-06-17 21:20:52 +00:00
Chris Lattner	6b7275996c	Rename Type::PrimitiveID to TypeId and ::getPrimitiveID() to ::getTypeID() llvm-svn: 14201	2004-06-17 18:19:28 +00:00
Chris Lattner	97bfcea262	Rename Type::PrimitiveID to TypeId and ::getPrimitiveID() to ::getTypeID() Delete two functions that are now methods on the Type class llvm-svn: 14200	2004-06-17 18:16:02 +00:00
Brian Gaeke	661963c63f	Fix typo in DEBUG printout. llvm-svn: 14196	2004-06-17 07:26:52 +00:00
Brian Gaeke	20e09e5c7b	Um, did someone make a typo or something? llvm-svn: 14192	2004-06-15 23:09:50 +00:00
Chris Lattner	5a542aadc8	Remove support for the isnan intrinsic llvm-svn: 14186	2004-06-15 21:37:54 +00:00
Brian Gaeke	21370771ba	Quick hack to get this file compiling again on Mac OS X. The right thing to do is write an autoconf macro that checks whether __isnan or isnan actually works using the C++ compiler after #include <cmath>, instead of doing it the easy way with AC_CHECK_FUNCS(). llvm-svn: 14171	2004-06-14 06:33:19 +00:00
Alkis Evlogimenos	e395468ae5	Add constant folding capabilities to the isunordered intrinsic. llvm-svn: 14168	2004-06-13 01:23:56 +00:00
Chris Lattner	ec941f7abb	Constant fold the isnan intrinsic llvm-svn: 14150	2004-06-11 06:16:23 +00:00
Chris Lattner	ee59d4bf04	Fix a bug in my checkin from last night that caused miscompilations of 186.crafty, fhourstones and 132.ijpeg. Bugpoint makes really nasty miscompilations embarrassingly easy to find. It narrowed it down to the instcombiner and this testcase (from fhourstones): bool %l7153_l4706_htstat_loopentry_2E_4_no_exit_2E_4(int* %i, [32 x int]* %works, int* %tmp.98.out) { newFuncRoot: %tmp.96 = load int* %i ; <int> [#uses=1] %tmp.97 = getelementptr [32 x int]* %works, long 0, int %tmp.96 ; <int> [#uses=1] %tmp.98 = load int %tmp.97 ; <int> [#uses=2] %tmp.99 = load int* %i ; <int> [#uses=1] %tmp.100 = and int %tmp.99, 7 ; <int> [#uses=1] %tmp.101 = seteq int %tmp.100, 7 ; <bool> [#uses=2] %tmp.102 = cast bool %tmp.101 to int ; <int> [#uses=0] br bool %tmp.101, label %codeRepl4.exitStub, label %codeRepl3.exitStub codeRepl4.exitStub: ; preds = %newFuncRoot store int %tmp.98, int* %tmp.98.out ret bool true codeRepl3.exitStub: ; preds = %newFuncRoot store int %tmp.98, int* %tmp.98.out ret bool false } ... which only has one combination performed on it: $ llvm-as < t.ll | opt -instcombine -debug | llvm-dis IC: Old = %tmp.101 = seteq int %tmp.100, 7 ; <bool> [#uses=1] New = setne int %tmp.100, 0 ; <bool>:<badref> [#uses=0] IC: MOD = br bool %tmp.101, label %codeRepl3.exitStub, label %codeRepl4.exitStub IC: MOD = %tmp.97 = getelementptr [32 x int]* %works, uint 0, int %tmp.96 ; <int*> [#uses=1] It doesn't get much better than this. :) llvm-svn: 14109	2004-06-10 02:33:20 +00:00
Chris Lattner	c8e7e298c1	More minor cleanups llvm-svn: 14108	2004-06-10 02:12:35 +00:00
Chris Lattner	df20a4d589	Eliminate many occurrences of Instruction:: llvm-svn: 14107	2004-06-10 02:07:29 +00:00
Chris Lattner	35167c3087	Implement InstCombine/select.ll:test15* llvm-svn: 14095	2004-06-09 07:59:58 +00:00
Chris Lattner	396dbfe327	Be more careful about the order we put stuff onto the worklist. This allows us to collapse this: bool %le(int %A, int %B) { %c1 = setgt int %A, %B %tmp = select bool %c1, int 1, int 0 %c2 = setlt int %A, %B %result = select bool %c2, int -1, int %tmp %c3 = setle int %result, 0 ret bool %c3 } into: bool %le(int %A, int %B) { %c3 = setle int %A, %B ; <bool> [#uses=1] ret bool %c3 } which is handy, because the Java FE makes these sequences all over the place. This is tested as: test/Regression/Transforms/InstCombine/JavaCompare.ll llvm-svn: 14086	2004-06-09 05:08:07 +00:00
Chris Lattner	2dd017402b	Implement select.ll:test14* llvm-svn: 14083	2004-06-09 04:24:29 +00:00
Brian Gaeke	a9c5779a86	Expand head-of-file comment. llvm-svn: 13982	2004-06-03 05:03:02 +00:00
Brian Gaeke	c0b9b83450	Use new form of unconditional branch constructor. llvm-svn: 13930	2004-06-01 20:06:10 +00:00
Chris Lattner	523d3e6674	Fix one of the major things that is causing the C Backend to infinite loop llvm-svn: 13872	2004-05-28 05:02:13 +00:00
John Criswell	37d2ae92a7	Fix a bug in the -deadtypeelim pass. The SymbolTable re-write changed it to eliminate the wrong type. llvm-svn: 13855	2004-05-27 21:16:46 +00:00
Chris Lattner	ed79d8af53	Fix InstCombine/load.ll & PR347. This code hadn't been updated after the "structs with more than 256 elements" related changes to the GEP instruction. Also it was not handling the ConstantAggregateZero class. Now it does! llvm-svn: 13834	2004-05-27 17:30:27 +00:00
Chris Lattner	c6e21fbd5c	Implement constant folding of fmod, which is used a lot in povray llvm-svn: 13823	2004-05-27 07:25:00 +00:00
Chris Lattner	06158d140c	Restructure call constant folding code a bit to make it simpler Add support for acos/asin/atan. 188.ammp contains three calls to acos with constant arguments. Constant folding it allows elimination of those 3 calls and three FP divisions of the results. llvm-svn: 13821	2004-05-27 06:26:28 +00:00
Alkis Evlogimenos	0eefdcd73f	Do not pass a null pointer if this instruction is not prepended or appended anywhere. llvm-svn: 13798	2004-05-26 22:50:28 +00:00
Alkis Evlogimenos	9e84b503f0	Use one destination constructor for the unconditional branch. llvm-svn: 13792	2004-05-26 21:38:14 +00:00
Reid Spencer	e7e9671cad	Convert to SymbolTable's new iteration interface. llvm-svn: 13754	2004-05-25 08:53:40 +00:00
Reid Spencer	abb6f008ca	Convert to SymbolTable's new lookup and iteration interfaces. llvm-svn: 13751	2004-05-25 08:52:20 +00:00
Reid Spencer	297d7fe7e6	Remove unused header file. llvm-svn: 13750	2004-05-25 08:51:36 +00:00
Reid Spencer	1cc31f264f	Make this pass simply invoke SymbolTable::strip(). llvm-svn: 13749	2004-05-25 08:51:25 +00:00
Chris Lattner	e1e10e1883	Implement InstCombine:shift.ll:test16, which turns (X >> C1) & C2 != C3 into (X & (C2 << C1)) != (C3 << C1), where the shift may be either left or right and the compare may be any one. This triggers 1546 times in 176.gcc alone, as it is a common pattern that occurs for bitfield accesses. llvm-svn: 13740	2004-05-25 06:32:08 +00:00
Chris Lattner	03841659a4	Implement instcombine/cast.ll:test16: Canonicalize cast X to bool into a setne instruction llvm-svn: 13736	2004-05-25 04:29:21 +00:00
Chris Lattner	6f02714a10	Fix a bug in my previous checkin llvm-svn: 13717	2004-05-24 06:24:46 +00:00
Chris Lattner	99173879ad	Spelling people's names right is kinda important llvm-svn: 13702	2004-05-23 21:27:29 +00:00
Chris Lattner	6754b827c6	Fix cases where we missed inlining some more obvious candidates because the caller was in an SCC. llvm-svn: 13693	2004-05-23 21:22:17 +00:00
Chris Lattner	8d7ff5e3dd	Simplify the interface and remove an unneeded #include llvm-svn: 13692	2004-05-23 21:21:35 +00:00
Chris Lattner	254f8f8ad5	Fairly substantial changes to update the alias analysis we are querying as we make the transformation. This allows us to use interprocedural alias analyses successfully. llvm-svn: 13691	2004-05-23 21:21:17 +00:00
Chris Lattner	289ba2ac4d	Adjust to the changes in the AliasSetTracker interface llvm-svn: 13690	2004-05-23 21:20:19 +00:00
Chris Lattner	e67dbc2ae2	Add support for replacement of formal arguments with simpler expressions. llvm-svn: 13689	2004-05-23 21:19:55 +00:00
Chris Lattner	099c8cfe90	Implement the -lowergc pass which is used by code generators (like the CBE) that do not have builtin support for garbage collection. llvm-svn: 13688	2004-05-23 21:19:22 +00:00
Brian Gaeke	72185765bc	Add CloneTraceInto(), which is based on (and has mostly the same effects as) CloneFunctionInto(). llvm-svn: 13601	2004-05-19 09:08:14 +00:00
Brian Gaeke	6182acf92a	Move RemapInstruction() to ValueMapper, so that it can be shared with CloneTrace, and because it is primarily an operation on ValueMaps. It is now a global (non-static) function which can be pulled in using ValueMapper.h. llvm-svn: 13600	2004-05-19 09:08:12 +00:00
Brian Gaeke	27e4943516	Clean up this pass somewhat: Add better comments, including a better head-of-file comment. Prune #includes. Fix a FIXME that Chris put here by using doInitialization(). Use DEBUG() to print out debug msgs. Give names to basic blocks inserted by this pass. Expand tabs. Use InsertProfilingInitCall() from ProfilingUtils to insert the initialize call. llvm-svn: 13581	2004-05-14 21:21:52 +00:00
Chris Lattner	0026512bac	This was not meant to be committed llvm-svn: 13565	2004-05-13 20:56:34 +00:00
Chris Lattner	c12c945cc4	Fix a nasty bug that caused us to unroll EXTREMELY large loops due to overflow in the size calculation. This is not something you want to see: Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - UNROLLING! The problem was that 2*2147483648 == 0. Now we get: Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - TOO LARGE: 4294967296>100 Thanks to some anonymous person playing with the demo page that repeatedly caused zion to go into swapping land. That's one way to ensure you'll get a quick bugfix. :) Testcase here: Transforms/LoopUnroll/2004-05-13-DontUnrollTooMuch.ll llvm-svn: 13564	2004-05-13 20:43:31 +00:00
Chris Lattner	66219abac7	Do not pass in the same argument to the extracted function more than once, and give the extracted function a more useful name than just foo_code. llvm-svn: 13493	2004-05-12 16:26:18 +00:00
Chris Lattner	13d2ddfe9c	Implement support for code extracting basic blocks that have a return instruction in them. llvm-svn: 13490	2004-05-12 16:07:41 +00:00
Chris Lattner	795c9933e2	Implement splitting of PHI nodes, allowing block extraction of BB's that have PHI node entries from multiple outside-the-region blocks. This also fixes extraction of the entry block in a function. Yaay. This has successfully block extracted all (but one) block from the score_move function in obsequi (out of 33). Hrm, I wonder which block the bug is in. :) llvm-svn: 13489	2004-05-12 15:29:13 +00:00
Chris Lattner	3b2917bfcf	* Pull some code out into the definedInRegion/definedInCaller methods * Add a stub for the severSplitPHINodes which will allow us to bbextract bb's with PHI nodes in them soon. * Remove unused arguments from findInputsOutputs * Dramatically simplify the code in findInputsOutputs. In particular, nothing really cares whether or not a PHI node is using something. * Move moveCodeToFunction to after emitCallAndSwitchStatement as that's the order they get called. * Fix a bug where we would code extract a region that included a call to vastart. Like 'alloca', calls to vastart must stay in the function that they are defined in. * Add some comments. llvm-svn: 13482	2004-05-12 06:01:40 +00:00
Chris Lattner	ffc4926263	Generate substantially better code when there are a limited number of exits from the extracted region. If the return has 0 or 1 exit blocks, the new function returns void. If it has 2 exits, it returns bool, otherwise it returns a ushort as before. This allows us to use a conditional branch instruction when there are two exit blocks, as often happens during block extraction. llvm-svn: 13481	2004-05-12 04:14:24 +00:00
Chris Lattner	3d1ca67fdd	Two minor improvements: 1. Get rid of the silly abort block. When doing bb extraction, we get one abort block for every block extracted, which is kinda annoying. 2. If the switch ends up having a single destination, turn it into an unconditional branch. I would like to add support for conditional branches, but to do this we will want to have the function return a bool instead of a ushort. llvm-svn: 13478	2004-05-12 03:22:33 +00:00
Chris Lattner	8ec5f88c79	Fix stupid bug in my checkin yesterday llvm-svn: 13429	2004-05-08 22:41:42 +00:00
Chris Lattner	5f667a6f58	Implement folding of GEP's like: %tmp.0 = getelementptr [50 x sbyte]* %ar, uint 0, int 5 ; <sbyte> [#uses=2] %tmp.7 = getelementptr sbyte %tmp.0, int 8 ; <sbyte*> [#uses=1] together. This patch actually allows us to simplify and generalize the code. llvm-svn: 13415	2004-05-07 22:09:22 +00:00
Chris Lattner	d9e5813821	Fix PR336: The instcombine pass asserts when visiting load instruction llvm-svn: 13400	2004-05-07 15:35:56 +00:00
Chris Lattner	9490849028	Do not mark instructions in unreachable sections of the function as live. This fixes PR332 and ADCE/2004-05-04-UnreachableBlock.llx llvm-svn: 13349	2004-05-04 17:00:46 +00:00
Chris Lattner	dd1a86d858	Minor efficiency tweak, suggested by Patrick Meredith llvm-svn: 13341	2004-05-04 15:19:33 +00:00
Brian Gaeke	5237476f75	Fix typo llvm-svn: 13340	2004-05-03 23:52:07 +00:00
Brian Gaeke	e96196081e	In InsertProfilingInitCall(), make it legal to pass in a null array, in which case you'll get a null array and zero passed to the profiling function. llvm-svn: 13336	2004-05-03 22:06:33 +00:00
Brian Gaeke	088dd3e121	Add initial implementation of basic-block tracing instrumentation pass. llvm-svn: 13335	2004-05-03 22:06:32 +00:00
Chris Lattner	be6f06818c	Do not clone arbitrary condition instructions. llvm-svn: 13316	2004-05-02 05:19:36 +00:00
Chris Lattner	51a6dbcb65	Do not infinitely "unroll" single BB loops. llvm-svn: 13315	2004-05-02 05:02:03 +00:00
Chris Lattner	1e94ed606e	Don't merge terminators that are needed to select PHI node values. llvm-svn: 13312	2004-05-02 01:00:44 +00:00
Chris Lattner	2e93c4275e	Implement SimplifyCFG/branch-cond-merge.ll Turning "if (A < B && B < C)" into "if (A < B & B < C)" llvm-svn: 13311	2004-05-01 23:35:43 +00:00
Chris Lattner	63d75af920	Make sure to reprocess instructions used by deleted instructions to avoid missing opportunities for combination. llvm-svn: 13309	2004-05-01 23:27:23 +00:00
Chris Lattner	b643a9e675	Make sure the instruction combiner doesn't lose track of instructions when replacing them, missing the opportunity to do simplifications llvm-svn: 13308	2004-05-01 23:19:52 +00:00
Chris Lattner	4cbd160b45	Fix my missing parens llvm-svn: 13307	2004-05-01 22:41:51 +00:00
Chris Lattner	88da6f7b52	Implement SimplifyCFG/branch-cond-prop.ll llvm-svn: 13306	2004-05-01 22:36:37 +00:00
Chris Lattner	652064e3b8	Fix a major pessimization in the instcombiner. If an allocation instruction is only used by a cast, and the casted type is the same size as the original allocation, it would eliminate the cast by folding it into the allocation. Unfortunately, it was placing the new allocation instruction right before the cast, which could pull (for example) alloca instructions into the body of a function. This turns statically allocatable allocas into expensive dynamically allocated allocas, which is bad bad bad. This fixes the problem by placing the new allocation instruction at the same place the old one was, duh. :) llvm-svn: 13289	2004-04-30 04:37:52 +00:00
Chris Lattner	2d3a7a6ff0	Changes to fix up the inst_iterator to pass to boost iterator checks. This patch was graciously contributed by Vladimir Prus. llvm-svn: 13185	2004-04-27 15:13:33 +00:00
Chris Lattner	e20c334e65	Instcombine X/-1 --> 0-X llvm-svn: 13172	2004-04-26 14:01:59 +00:00
Misha Brukman	3596f0a180	* Allow aggregating extracted function arguments (controlled by flag) * Commandline option (for now) controls that flag that is passed in llvm-svn: 13141	2004-04-23 23:54:17 +00:00
Chris Lattner	83cd87efcd	Move the scev expansion code into this pass, where it belongs. There is still room for cleanup, but at least the code modification is out of the analysis now. llvm-svn: 13135	2004-04-23 21:29:48 +00:00
Misha Brukman	98aa516a9c	Clarify the logic: the flag is renamed to `deleteFn' to signify it will delete the function instead of isolating it. This also means the condition is reversed. llvm-svn: 13112	2004-04-22 23:00:51 +00:00
Misha Brukman	e0682426f0	Add a flag to choose between isolating a function or deleting the function from the Module. The default behavior keeps functionality as before: the chosen function is the one that remains. llvm-svn: 13111	2004-04-22 22:52:22 +00:00
Chris Lattner	c27302c79f	Disable a previous patch that was causing indvars to loop infinitely :( llvm-svn: 13108	2004-04-22 15:12:36 +00:00
Chris Lattner	c1a682dda0	Fix an extremely serious thinko I made in revision 1.60 of this file. llvm-svn: 13106	2004-04-22 14:59:40 +00:00
Chris Lattner	af532f27e7	Implement a todo, rewriting all possible scev expressions inside of the loop. This eliminates the extra add from the previous case, but it's not clear that this will be a performance win overall. Tomorrow's test results will tell. :) llvm-svn: 13103	2004-04-21 23:36:08 +00:00
Chris Lattner	fb9a299f68	This code really wants to iterate over the OPERANDS of an instruction, not over its USES. If it's dead it doesn't have any uses! :) Thanks to the fabulous and mysterious Bill Wendling for pointing this out. :) llvm-svn: 13102	2004-04-21 22:29:37 +00:00
Chris Lattner	dc7cc35088	Implement a fixme. This helps loops that have induction variables of different types in them. Instead of creating an induction variable for all types, it creates a single induction variable and casts to the other sizes. This generates this code: no_exit: ; preds = %entry, %no_exit %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ] ; <uint> [#uses=4] *** %j.0.0 = cast uint %indvar to short ; <short> [#uses=1] %indvar = cast uint %indvar to int ; <int> [#uses=1] %tmp.7 = getelementptr short* %P, uint %indvar ; <short> [#uses=1] store short %j.0.0, short %tmp.7 %inc.0 = add int %indvar, 1 ; <int> [#uses=2] %tmp.2 = setlt int %inc.0, %N ; <bool> [#uses=1] %indvar.next = add uint %indvar, 1 ; <uint> [#uses=1] br bool %tmp.2, label %no_exit, label %loopexit instead of: no_exit: ; preds = %entry, %no_exit %indvar = phi ushort [ %indvar.next, %no_exit ], [ 0, %entry ] ; <ushort> [#uses=2] *** %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ] ; <uint> [#uses=3] %indvar = cast uint %indvar to int ; <int> [#uses=1] %indvar = cast ushort %indvar to short ; <short> [#uses=1] %tmp.7 = getelementptr short* %P, uint %indvar ; <short> [#uses=1] store short %indvar, short %tmp.7 %inc.0 = add int %indvar, 1 ; <int> [#uses=2] %tmp.2 = setlt int %inc.0, %N ; <bool> [#uses=1] %indvar.next = add uint %indvar, 1 *** %indvar.next = add ushort %indvar, 1 br bool %tmp.2, label %no_exit, label %loopexit This is an improvement in register pressure, but probably doesn't happen that often. The more important fix will be to get rid of the redundant add. llvm-svn: 13101	2004-04-21 22:22:01 +00:00
Chris Lattner	be8bb804c5	Fix an incredibly nasty iterator invalidation problem. I am too spoiled by ilists :) Eventually it would be nice if CallGraph maintained an ilist of CallGraphNode's instead of a vector of pointers to them, but today is not that day. llvm-svn: 13100	2004-04-21 20:44:33 +00:00
Alkis Evlogimenos	f68f40ea42	Include cerrno (gcc-3.4 fix) llvm-svn: 13091	2004-04-21 16:11:40 +00:00
Chris Lattner	a9691fe70d	Fix typeo llvm-svn: 13089	2004-04-21 14:23:18 +00:00
Chris Lattner	c87784f1fc	REALLY fix PR324: don't delete linkonce functions until after the SCC traversal is done, which avoids invalidating iterators in the SCC traversal routines llvm-svn: 13088	2004-04-20 22:06:53 +00:00
Chris Lattner	c1aa21f5a7	Fix PR325 llvm-svn: 13081	2004-04-20 20:26:03 +00:00
Chris Lattner	514934051a	Fix PR324 and testcase: Inline/2004-04-20-InlineLinkOnce.llx llvm-svn: 13080	2004-04-20 20:20:59 +00:00
Chris Lattner	f48f777d4c	Initial checkin of a simple loop unswitching pass. It still needs work, but it's a start, and seems to do its basic job. llvm-svn: 13068	2004-04-19 18:07:02 +00:00
Chris Lattner	bc02177fdc	Add #include llvm-svn: 13057	2004-04-19 03:01:23 +00:00
Chris Lattner	fc44a25bcb	Move isLoopInvariant to the Loop class llvm-svn: 13051	2004-04-18 22:46:08 +00:00
Chris Lattner	827826320d	Correct rewriting of exit blocks after my last patch llvm-svn: 13048	2004-04-18 22:27:10 +00:00
Chris Lattner	35eaa55cfc	Loop exit sets are no longer explicitly held, they are dynamically computed on demand. llvm-svn: 13046	2004-04-18 22:15:13 +00:00
Chris Lattner	d72c3eb54e	Change the ExitBlocks list from being explicitly contained in the Loop structure to being dynamically computed on demand. This makes updating loop information MUCH easier. llvm-svn: 13045	2004-04-18 22:14:10 +00:00
Chris Lattner	d15250240c	Reduce the unrolling limit llvm-svn: 13040	2004-04-18 18:06:14 +00:00
Chris Lattner	30ae18155d	If the preheader of the loop was the entry block of the function, make sure that the exit block of the loop becomes the new entry block of the function. This was causing a verifier assertion on 252.eon. llvm-svn: 13039	2004-04-18 17:38:42 +00:00
Chris Lattner	230bcb6b35	Be much more careful about how we update instructions outside of the loop using instructions inside of the loop. This should fix the MishaTest failure from last night. llvm-svn: 13038	2004-04-18 17:32:39 +00:00
Chris Lattner	4d52e1e401	After unrolling our single basic block loop, fold it into the preheader and exit block. The primary motivation for doing this is that we can now unroll nested loops. This makes a pretty big difference in some cases. For example, in 183.equake, we are now beating the native compiler with the CBE, and we are a lot closer with LLC. I'm now going to play around a bit with the unroll factor and see what effect it really has. llvm-svn: 13034	2004-04-18 06:27:43 +00:00
Chris Lattner	f2cc841619	Fix a bug: this does not preserve the CFG! While we're at it, add support for updating loop information correctly. llvm-svn: 13033	2004-04-18 05:38:37 +00:00
Chris Lattner	946b255977	Initial checkin of a simple loop unroller. This pass is extremely basic and limited. Even in its extremely simple state (it can only fully unroll single basic block loops that execute a constant number of times), it already helps improve performance a LOT on some benchmarks, particularly with the native code generators. llvm-svn: 13028	2004-04-18 05:20:17 +00:00
Chris Lattner	c14da9600b	Make the tail duplication threshold accessible from the command line instead of hardcoded llvm-svn: 13025	2004-04-18 00:52:43 +00:00
Chris Lattner	a814080025	If the loop executes a constant number of times, try a bit harder to replace exit values. llvm-svn: 13018	2004-04-17 18:44:09 +00:00
Chris Lattner	1e9ac1a45e	Fix a HUGE pessimization on X86. The indvars pass was taking this (familiar) function: int _strlen(const char *str) { int len = 0; while (*str++) len++; return len; } And transforming it to use a ulong induction variable, because the type of the pointer index was left as a constant long. This is obviously very bad. The fix is to shrink long constants in getelementptr instructions to intptr_t, making the indvars pass insert a uint induction variable, which is much more efficient. Here's the before code for this function: int %_strlen(sbyte* %str) { entry: %tmp.13 = load sbyte* %str ; <sbyte> [#uses=1] %tmp.24 = seteq sbyte %tmp.13, 0 ; <bool> [#uses=1] br bool %tmp.24, label %loopexit, label %no_exit no_exit: ; preds = %entry, %no_exit * %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ] ; <uint> [#uses=2] * %indvar = phi ulong [ %indvar.next, %no_exit ], [ 0, %entry ] ; <ulong> [#uses=2] %indvar1 = cast ulong %indvar to uint ; <uint> [#uses=1] %inc.02.sum = add uint %indvar1, 1 ; <uint> [#uses=1] %inc.0.0 = getelementptr sbyte* %str, uint %inc.02.sum ; <sbyte> [#uses=1] %tmp.1 = load sbyte %inc.0.0 ; <sbyte> [#uses=1] %tmp.2 = seteq sbyte %tmp.1, 0 ; <bool> [#uses=1] %indvar.next = add ulong %indvar, 1 ; <ulong> [#uses=1] %indvar.next = add uint %indvar, 1 ; <uint> [#uses=1] br bool %tmp.2, label %loopexit.loopexit, label %no_exit loopexit.loopexit: ; preds = %no_exit %indvar = cast uint %indvar to int ; <int> [#uses=1] %inc.1 = add int %indvar, 1 ; <int> [#uses=1] ret int %inc.1 loopexit: ; preds = %entry ret int 0 } Here's the after code: int %_strlen(sbyte* %str) { entry: %inc.02 = getelementptr sbyte* %str, uint 1 ; <sbyte> [#uses=1] %tmp.13 = load sbyte %str ; <sbyte> [#uses=1] %tmp.24 = seteq sbyte %tmp.13, 0 ; <bool> [#uses=1] br bool %tmp.24, label %loopexit, label %no_exit no_exit: ; preds = %entry, %no_exit *** %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ] ; <uint> [#uses=3] %indvar = cast uint %indvar to int ; <int> [#uses=1] %inc.0.0 = getelementptr sbyte* %inc.02, uint %indvar ; <sbyte> [#uses=1] %inc.1 = add int %indvar, 1 ; <int> [#uses=1] %tmp.1 = load sbyte %inc.0.0 ; <sbyte> [#uses=1] %tmp.2 = seteq sbyte %tmp.1, 0 ; <bool> [#uses=1] %indvar.next = add uint %indvar, 1 ; <uint> [#uses=1] br bool %tmp.2, label %loopexit, label %no_exit loopexit: ; preds = %entry, %no_exit %len.0.1 = phi int [ 0, %entry ], [ %inc.1, %no_exit ] ; <int> [#uses=1] ret int %len.0.1 } llvm-svn: 13016	2004-04-17 18:16:10 +00:00
Chris Lattner	885a6eb74d	Even if there are not any induction variables in the loop, if we can compute the trip count for the loop, insert one so that we can canonicalize the exit condition. llvm-svn: 13015	2004-04-17 18:08:33 +00:00
Chris Lattner	a43312d30b	Add support for evaluation of exp/log/log10/pow llvm-svn: 13011	2004-04-16 22:35:33 +00:00
Chris Lattner	284d3b0311	Fix some really nasty dominance bugs that were exposed by my patch to make the verifier more strict. This fixes building zlib llvm-svn: 13002	2004-04-16 18:08:07 +00:00
Brian Gaeke	174633b078	Include <cmath> for compatibility with gcc 3.0.x (the system compiler on Debian.) llvm-svn: 12986	2004-04-16 15:57:32 +00:00
Chris Lattner	9e9b2b7474	Fix some of the strange CBE-only failures that happened last night. llvm-svn: 12980	2004-04-16 06:03:17 +00:00
Chris Lattner	0328d75c83	Fix Inline/2004-04-15-InlineDeletesCall.ll Basically we were using SimplifyCFG as a huge sledgehammer for a simple optimization. Because simplifycfg does so many things, we can't use it for this purpose. llvm-svn: 12977	2004-04-16 05:17:59 +00:00
Chris Lattner	d7a559e353	Fix a bug in the previous checkin: if the exit block is not the same as the back-edge block, we must check the preincremented value. llvm-svn: 12968	2004-04-15 20:26:22 +00:00
Chris Lattner	0cec5cb92c	Change the canonical induction variable that we insert. Instead of producing code like this: Loop: X = phi 0, X2 ... X2 = X + 1 if (X != N-1) goto Loop We now generate code that looks like this: Loop: X = phi 0, X2 ... X2 = X + 1 if (X2 != N) goto Loop This has two big advantages: 1. The trip count of the loop is now explicit in the code, allowing the direct implementation of Loop::getTripCount() 2. This reduces register pressure in the loop, and allows X and X2 to be put into the same register. As a consequence of the second point, the code we generate for loops went from: .LBB2: # no_exit.1 ... mov %EDI, %ESI inc %EDI cmp %ESI, 2 mov %ESI, %EDI jne .LBB2 # PC rel: no_exit.1 To: .LBB2: # no_exit.1 ... inc %ESI cmp %ESI, 3 jne .LBB2 # PC rel: no_exit.1 ... which has two fewer moves, and uses one less register. llvm-svn: 12961	2004-04-15 15:21:43 +00:00
Chris Lattner	6679e46b59	Add a trivial instcombine: load null -> null llvm-svn: 12940	2004-04-14 03:28:36 +00:00
Chris Lattner	ff9362a8da	Add SCCP support for constant folding calls, implementing: test/Regression/Transforms/SCCP/calltest.ll llvm-svn: 12921	2004-04-13 19:43:54 +00:00
Chris Lattner	ca52d0468e	Add a simple call constant propagation interface. llvm-svn: 12919	2004-04-13 19:28:52 +00:00
Chris Lattner	d0dc6d5295	Constant propagation should remove the dead instructions llvm-svn: 12917	2004-04-13 19:28:20 +00:00
Chris Lattner	89e959bb1f	Fix LoopSimplify/2004-04-13-LoopSimplifyUpdateDomFrontier.ll LoopSimplify was not updating dominator frontiers correctly in some cases. llvm-svn: 12890	2004-04-13 16:23:25 +00:00
Chris Lattner	a6e22814ab	Refactor code a bit to make it simpler and eliminate the goto llvm-svn: 12888	2004-04-13 15:21:18 +00:00
Chris Lattner	8417052938	This patch addresses PR35: Loop simplify should reconstruct nested loops. This is fairly straight-forward, but was a real nightmare to get just perfect. aarg. :) llvm-svn: 12884	2004-04-13 05:05:33 +00:00
Chris Lattner	be43544429	Actually update the call graph as the inliner changes it. This allows us to execute other CallGraphSCCPasses after the inliner without crashing. llvm-svn: 12861	2004-04-12 05:37:29 +00:00
Chris Lattner	494a685449	Add support for removing invoke instructions llvm-svn: 12858	2004-04-12 05:15:13 +00:00
Chris Lattner	08f201bee5	Stop printing Function* llvm-svn: 12857	2004-04-12 04:06:56 +00:00
Chris Lattner	d041dcd92f	Simplify code a bit, and be sure to mark the external node as potentially throwing llvm-svn: 12856	2004-04-12 04:06:38 +00:00
Chris Lattner	24cf0200c7	Fix a bug in my select transformation llvm-svn: 12826	2004-04-11 01:39:19 +00:00
Chris Lattner	f16fe7206c	Update the value numbering interface. llvm-svn: 12824	2004-04-10 22:33:34 +00:00
Chris Lattner	623fba1107	Implement InstCombine/select.ll:test13* llvm-svn: 12821	2004-04-10 22:21:27 +00:00
Chris Lattner	cf4a996cba	Implement InstCombine/add.ll:test20 Canonicalize add of sign bit constant into a xor llvm-svn: 12819	2004-04-10 22:01:55 +00:00
Chris Lattner	69c4900512	Rewrite the GCSE pass to be substantially simpler, a bit more efficient, and a bit more powerful llvm-svn: 12817	2004-04-10 21:11:11 +00:00
Chris Lattner	f9d9665138	Fix spurious warning in release mode llvm-svn: 12816	2004-04-10 19:15:56 +00:00
Chris Lattner	d95ef7eff0	Simplify code a bit, and fix a bug that was breaking perlbmk llvm-svn: 12814	2004-04-10 18:06:21 +00:00
Chris Lattner	7ebfe61dc1	Fix a bug in my checkin last night that was breaking programs using invoke. llvm-svn: 12813	2004-04-10 16:53:29 +00:00
Chris Lattner	5093213c40	Fix previous patch llvm-svn: 12811	2004-04-10 07:27:48 +00:00
Chris Lattner	6149ac8991	Correctly update counters llvm-svn: 12810	2004-04-10 07:02:02 +00:00
Chris Lattner	cfa1adcdb8	Simplify code a bit, and use alias analysis to allow us to delete unused call and invoke instructions that are known to not write to memory. llvm-svn: 12807	2004-04-10 06:53:09 +00:00
Chris Lattner	56e4d3d8ad	Implement select.ll:test12* This transforms code like this: %C = or %A, %B %D = select %cond, %C, %A into: %C = select %cond, %B, 0 %D = or %A, %C Since B is often a constant, the select can often be eliminated. In any case, this reduces the usage count of A, allowing subsequent optimizations to happen. This xform applies when the operator is any of: add, sub, mul, or, xor, and, shl, shr llvm-svn: 12800	2004-04-09 23:46:01 +00:00
Chris Lattner	0aa565647c	Fold code like: if (C) V1 |= V2; into: Vx = V1 | V2; V1 = select C, V1, Vx when the expression can be evaluated unconditionally and is cheap to execute. This limited form of if conversion is quite handy in lots of cases. For example, it turns this testcase into straight-line code: int in0 ; int in1 ; int in2 ; int in3 ; int in4 ; int in5 ; int in6 ; int in7 ; int in8 ; int in9 ; int in10; int in11; int in12; int in13; int in14; int in15; long output; void mux(void) { output = (in0 ? 0x00000001 : 0) | (in1 ? 0x00000002 : 0) | (in2 ? 0x00000004 : 0) | (in3 ? 0x00000008 : 0) | (in4 ? 0x00000010 : 0) | (in5 ? 0x00000020 : 0) | (in6 ? 0x00000040 : 0) | (in7 ? 0x00000080 : 0) | (in8 ? 0x00000100 : 0) | (in9 ? 0x00000200 : 0) | (in10 ? 0x00000400 : 0) | (in11 ? 0x00000800 : 0) | (in12 ? 0x00001000 : 0) | (in13 ? 0x00002000 : 0) | (in14 ? 0x00004000 : 0) | (in15 ? 0x00008000 : 0) ; } llvm-svn: 12798	2004-04-09 22:50:22 +00:00
Chris Lattner	183b336a54	Fold binary operators with a constant operand into select instructions that have a constant operand. This implements add.ll:test19, shift.ll:test15*, and others that are not tested llvm-svn: 12794	2004-04-09 19:05:30 +00:00
Chris Lattner	cf7baf3519	Implement select.ll:test11 llvm-svn: 12793	2004-04-09 18:19:44 +00:00
Chris Lattner	e228ee5870	Implement InstCombine/cast-propagate.ll llvm-svn: 12784	2004-04-08 20:39:49 +00:00
Chris Lattner	3b3861d305	Implement ScalarRepl/select_promote.ll llvm-svn: 12779	2004-04-08 19:59:34 +00:00
Chris Lattner	4d25c86b52	Remove the "really gross hacks" that are there to deal with recursive functions. Now we collect all of the call sites we are interested in inlining, then inline them. This entirely avoids issues with trying to inline a call site we got by inlining another call site. This also eliminates iterator invalidation issues. llvm-svn: 12770	2004-04-08 06:34:31 +00:00
Chris Lattner	1c631e813d	Implement InstCombine/select.ll:test[7-10] llvm-svn: 12769	2004-04-08 04:43:23 +00:00
Chris Lattner	2b2412d0c8	Implement test/Regression/Transforms/InstCombine/getelementptr_index.ll llvm-svn: 12762	2004-04-07 18:38:20 +00:00
Chris Lattner	4d1fcf1dcd	Fix a bug in yesterday's checkins which broke siod. siod is a great testcase! :) llvm-svn: 12659	2004-04-05 16:02:41 +00:00
Chris Lattner	8953b90aaa	Fix InstCombine/2004-04-04-InstCombineReplaceAllUsesWith.ll llvm-svn: 12658	2004-04-05 02:10:19 +00:00
Chris Lattner	69193f93b6	Support getelementptr instructions which use uint's to index into structure types and can have arbitrary 32- and 64-bit integer types indexing into sequential types. llvm-svn: 12653	2004-04-05 01:30:19 +00:00
Chris Lattner	e61b67d7d5	Rewrite the indvars pass to use the ScalarEvolution analysis. This also implements some new features for the indvars pass, including linear function test replacement, exit value substitution, and it works with a much more general class of induction variables and loops. llvm-svn: 12620	2004-04-02 20:24:31 +00:00
Chris Lattner	eed034bcd3	Fix the obvious bug in my previous checkin llvm-svn: 12618	2004-04-02 18:15:10 +00:00
Chris Lattner	9f0db32625	Implement Transforms/SimplifyCFG/return-merge.ll This actually causes us to turn code like: return C ? A : B; into a select instruction. llvm-svn: 12617	2004-04-02 18:13:43 +00:00
Chris Lattner	c24019c825	Fix PR310 and TailDup/2004-04-01-DemoteRegToStack.llx llvm-svn: 12597	2004-04-01 20:28:45 +00:00
Chris Lattner	59fdf74968	Remove some assertions that are now bogus with the last patch I put in llvm-svn: 12595	2004-04-01 19:21:46 +00:00
Chris Lattner	146d0df5e4	Fix PR306: Loop simplify incorrectly updates dominator information Testcase: LoopSimplify/2004-04-01-IncorrectDomUpdate.ll llvm-svn: 12592	2004-04-01 19:06:07 +00:00
Chris Lattner	61fab1409d	Add warning llvm-svn: 12573	2004-03-31 22:00:30 +00:00
Chris Lattner	709f03e2dd	Fix linking of constant expr casts due to type resolution changes. With this and the other patches 253.perlbmk links again. llvm-svn: 12565	2004-03-31 02:58:28 +00:00
Brian Gaeke	ef327be6ed	Start cleaning up this pass so that I can debug it. llvm-svn: 12548	2004-03-30 19:53:46 +00:00
Chris Lattner	81bdcb90ce	Now that all the code generators support the select instruction, and the instcombine pass can eliminate many nasty cases of them, start generating them in the optimizers llvm-svn: 12545	2004-03-30 19:44:05 +00:00
Chris Lattner	533bc49775	Implement select.ll:test[3-6] llvm-svn: 12544	2004-03-30 19:37:13 +00:00
Chris Lattner	059f390257	Add a simple select instruction lowering pass llvm-svn: 12540	2004-03-30 18:41:10 +00:00
Chris Lattner	56b5051428	X % -1 == X % 1 == 0 llvm-svn: 12520	2004-03-26 16:11:24 +00:00
Chris Lattner	57c67b06e9	Two changes: #1 is to unconditionally strip constantpointerrefs out of instruction operands where they are absolutely pointless and inhibit optimization. GRRR! #2 is to implement InstCombine/getelementptr_const.ll llvm-svn: 12519	2004-03-25 22:59:29 +00:00
Chris Lattner	abb77c9959	Teach the optimizer to delete zero sized alloca's (but not mallocs!) llvm-svn: 12507	2004-03-19 06:08:10 +00:00
Chris Lattner	232155dc1b	Fix bug: CodeExtractor/2004-03-17-MissedLiveIns.ll With this fix we now successfully extract all 149 loops from 256.bzip2 without crashing or miscompiling the program! llvm-svn: 12493	2004-03-18 05:56:32 +00:00
Chris Lattner	e83693560a	Add statistics to the loop extractor. The loop extractor has successfully extracted all 63 loops for Olden/bh without crashing and without miscompiling the program!!! llvm-svn: 12491	2004-03-18 05:46:10 +00:00
Chris Lattner	5bce0c807d	Fix problem with PHI nodes having multiple predecessors from different exit nodes llvm-svn: 12490	2004-03-18 05:43:18 +00:00
Chris Lattner	acd75986ee	Fix CodeExtractor/2004-03-17-UpdatePHIsOutsideRegion.ll llvm-svn: 12489	2004-03-18 05:38:31 +00:00
Chris Lattner	320d59f4cd	Seriously simplify and correct the PHI node handling code. llvm-svn: 12487	2004-03-18 05:28:49 +00:00
Chris Lattner	d8017a340d	Fix CodeExtractor/2004-03-17-OutputMismatch.ll llvm-svn: 12486	2004-03-18 04:12:05 +00:00
Chris Lattner	37de257ef0	Fix several bugs in the extractor: 1. Names were not put on the new arguments created (ok, this just helps sanity :) 2. Fix outgoing pointer values 3. Do not insert stores for values that had not been computed 4. Fix some weird problems with the outset calculation This fixes CodeExtractor/2004-03-14-DominanceProblem.ll, making the extractor work on at least one simple case! llvm-svn: 12484	2004-03-18 03:49:40 +00:00
Chris Lattner	e9235d2dde	The code extractor needs dominator info. Provide it llvm-svn: 12483	2004-03-18 03:48:06 +00:00
Chris Lattner	cee3404d0a	Prune #includes, moving the module interface to the front. Note that this exposed the fact that the header was not self-contained. There is a reason we do things :) llvm-svn: 12481	2004-03-18 03:15:29 +00:00
Chris Lattner	a078f47b39	Fix compilation of mesa, which I broke earlier today llvm-svn: 12465	2004-03-17 02:02:47 +00:00
Chris Lattner	684fa5ac64	Be more accurate llvm-svn: 12464	2004-03-17 01:59:27 +00:00
Chris Lattner	a3783a577e	Fix bug in previous checkin llvm-svn: 12458	2004-03-16 23:36:49 +00:00
Chris Lattner	95057f6ad1	Okay, so there is no reasonable way for tail duplication to update SSA form, as it is making effectively arbitrary modifications to the CFG and we don't have a domset/domfrontier implementations that can handle the dynamic updates. Instead of having a bunch of code that doesn't actually work in practice, just demote any potentially tricky values to the stack (causing the problem to go away entirely). Later invocations of mem2reg will rebuild SSA for us. This fixes all of the major performance regressions with tail duplication from LLVM 1.1. For example, this loop: --- int popcount(int x) { int result = 0; while (x != 0) { result = result + (x & 0x1); x = x >> 1; } return result; } --- Used to be compiled into: int %popcount(int %X) { entry: br label %loopentry loopentry: ; preds = %entry, %no_exit %x.0 = phi int [ %X, %entry ], [ %tmp.9, %no_exit ] ; <int> [#uses=3] %result.1.0 = phi int [ 0, %entry ], [ %tmp.6, %no_exit ] ; <int> [#uses=2] %tmp.1 = seteq int %x.0, 0 ; <bool> [#uses=1] br bool %tmp.1, label %loopexit, label %no_exit no_exit: ; preds = %loopentry %tmp.4 = and int %x.0, 1 ; <int> [#uses=1] %tmp.6 = add int %tmp.4, %result.1.0 ; <int> [#uses=1] %tmp.9 = shr int %x.0, ubyte 1 ; <int> [#uses=1] br label %loopentry loopexit: ; preds = %loopentry ret int %result.1.0 } And is now compiled into: int %popcount(int %X) { entry: br label %no_exit no_exit: ; preds = %entry, %no_exit %x.0.0 = phi int [ %X, %entry ], [ %tmp.9, %no_exit ] ; <int> [#uses=2] %result.1.0.0 = phi int [ 0, %entry ], [ %tmp.6, %no_exit ] ; <int> [#uses=1] %tmp.4 = and int %x.0.0, 1 ; <int> [#uses=1] %tmp.6 = add int %tmp.4, %result.1.0.0 ; <int> [#uses=2] %tmp.9 = shr int %x.0.0, ubyte 1 ; <int> [#uses=2] %tmp.1 = seteq int %tmp.9, 0 ; <bool> [#uses=1] br bool %tmp.1, label %loopexit, label %no_exit loopexit: ; preds = %no_exit ret int %tmp.6 } llvm-svn: 12457	2004-03-16 23:29:09 +00:00
Chris Lattner	bb1a2cc7ab	This code was both incredibly complex and incredibly broken. Fix it. llvm-svn: 12456	2004-03-16 23:23:11 +00:00
Chris Lattner	fa48edfb7d	Punt if we see gigantic PHI nodes. This improves a huge interpreter loop testcase from 32.5s in -raise to take .3s llvm-svn: 12443	2004-03-16 19:52:53 +00:00
Chris Lattner	7a7b114871	Do not try to optimize PHI nodes with incredibly high degree. This reduces SCCP time from 615s to 1.49s on a large testcase that has a gigantic switch statement that all of the blocks in the function go to (an interpreter). llvm-svn: 12442	2004-03-16 19:49:59 +00:00
Chris Lattner	a64923ad26	Do not copy gigantic switch instructions llvm-svn: 12441	2004-03-16 19:45:22 +00:00
Chris Lattner	db5b8f4d6b	Fix a regression from this patch: http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040308/013095.html Basically, this patch only updated the immediate dominatees of the header node to tell them that the preheader also dominated them. In practice, ALL dominatees of the header node are also dominated by the preheader. This fixes: LoopSimplify/2004-03-15-IncorrectDomUpdate. and PR293 llvm-svn: 12434	2004-03-16 06:00:15 +00:00
Chris Lattner	95ce36da0d	Restore old inlining heuristic. As the comment indicates, this is a nasty horrible hack. llvm-svn: 12423	2004-03-15 06:38:14 +00:00
Chris Lattner	cd83282df1	Add counters for the number of calls eliminated llvm-svn: 12420	2004-03-15 05:46:59 +00:00
Chris Lattner	20cda2645e	Implement LICM of calls in simple cases. This is sufficient to move around sin/cos/strlen calls and stuff. This implements: LICM/call_sink_pure_function.ll LICM/call_sink_const_function.ll llvm-svn: 12415	2004-03-15 04:11:30 +00:00
Chris Lattner	fb87cdecd8	Mostly cosmetic improvements. Do fix the bug where a global value was considered an input. llvm-svn: 12406	2004-03-15 01:26:44 +00:00
Chris Lattner	73ab1fa7c8	Assert that input blocks meet the invariants we expect Simplify the input/output finder. All elements of a basic block are instructions. Any used arguments are also inputs. An instruction can only be used by another instruction. llvm-svn: 12405	2004-03-15 01:18:23 +00:00
Chris Lattner	2f155d8734	Fix several bugs in the loop extractor. In particular, subloops were never extracted, and a function that contained a single top-level loop never had the loop extracted, regardless of how much non-loop code there was. llvm-svn: 12403	2004-03-15 00:02:02 +00:00
Chris Lattner	5b2072ecd3	No correctness fixes here, just minor qoi fixes: * Don't insert a branch to the switch instruction after the call, just make it a single block. * Insert the new alloca instructions in the entry block of the original function instead of having them execute dynamically * Don't make the default edge of the switch instruction go back to the switch. The loop extractor shouldn't create new loops! * Give meaningful names to the alloca slots and the reload instructions * Some minor code simplifications llvm-svn: 12402	2004-03-14 23:43:24 +00:00
Chris Lattner	b4d8bf365c	Simplify code a bit, and fix bug CodeExtractor/2004-03-14-NoSwitchSupport.ll This also implements two minor improvements: * Don't insert live-out stores IN the region, insert them on the code path that exits the region * If the region is exited to the same block from multiple paths, share the switch statement entry, live-out store code, and the basic block. llvm-svn: 12401	2004-03-14 23:05:49 +00:00
Chris Lattner	9c431f6c44	Simplify the code a bit by making the collection of basic blocks to extract a member of the class. While we're at it, turn the collection into a set instead of a vector to improve efficiency and make queries simpler. llvm-svn: 12400	2004-03-14 22:34:55 +00:00
Chris Lattner	a1672c1bd8	Split into two passes. Now there is the general loop extractor, usable on the command line, and the single loop extractor, usable by bugpoint llvm-svn: 12390	2004-03-14 20:01:36 +00:00
Chris Lattner	0137de5ecb	Passes don't print stuff! llvm-svn: 12385	2004-03-14 04:17:53 +00:00
Chris Lattner	b68659552a	Do not create empty basic blocks when the lowerswitch pass expects blocks to be non-empty! This fixes LowerSwitch/2004-03-13-SwitchIsDefaultCrash.ll llvm-svn: 12384	2004-03-14 04:14:31 +00:00
Chris Lattner	4fca71eb44	Minor random cleanups llvm-svn: 12382	2004-03-14 04:01:47 +00:00
Chris Lattner	6c3e8c78cf	FunctionPass's should not define their own 'run' method. Require 'simplified' loops, not just raw natural loops. This fixes CodeExtractor/2004-03-13-LoopExtractorCrash.ll llvm-svn: 12381	2004-03-14 04:01:06 +00:00
Chris Lattner	d078812f96	If a block is dead, dominators will not be calculated for it. Because of this loop information won't see it, and we could have unreachable blocks pointing to the non-header node of blocks in a natural loop. This isn't tidy, so have the loopsimplify pass clean it up. llvm-svn: 12380	2004-03-14 03:59:22 +00:00
Chris Lattner	3684469326	Verify functions as they are produced if -debug is specified. Reduce curly braceage llvm-svn: 12378	2004-03-14 03:17:22 +00:00
Chris Lattner	78a996aec4	Move prototype to IPO.h instead of Scalar.h. Make sure that the file interface header (IPO.h) is included first. Remove dead #include llvm-svn: 12375	2004-03-14 02:37:16 +00:00
Chris Lattner	692a47aeb9	Indent anon namespace properly, add copyright block llvm-svn: 12373	2004-03-14 02:34:07 +00:00
Chris Lattner	41ec709e00	Move to the IPO library. Utils shouldn't contain passes. llvm-svn: 12372	2004-03-14 02:32:27 +00:00
Chris Lattner	8eebc49884	DemoteRegToStack got moved from DemoteRegToStack.h to Local.h llvm-svn: 12368	2004-03-14 02:13:38 +00:00
Chris Lattner	7d2a539735	Add some debugging output Fix InstCombine/2004-03-13-InstCombineInfLoop.ll which caused an infinite loop compiling (I think) povray. llvm-svn: 12365	2004-03-13 23:54:27 +00:00
Chris Lattner	2dc85b27e4	This change makes two big adjustments. * Be a lot more accurate about what the effects will be when inlining a call to a function when an argument is an alloca. * Dramatically reduce the penalty for inlining a call in a large function. This heuristic made it almost impossible to inline a function into a large function, no matter how small the callee is. llvm-svn: 12363	2004-03-13 23:15:45 +00:00
Chris Lattner	797cb2f6c1	This little patch speeds up the loop used to update the dominator set analysis. On the testcase from GCC PR12440, which has a LOT of loops (1392 of which require preheaders to be inserted), this speeds up the loopsimplify pass from 1.931s to 0.1875s. The loop in question goes from 1.65s -> 0.0097s, which isn't bad. All of these times are a debug build. This adds a dependency on DominatorTree analysis that was not there before, but we always had dominatortree available anyway, because LICM requires both loop simplify and DT, so this doesn't add any extra analysis in practice. llvm-svn: 12362	2004-03-13 22:01:26 +00:00
Chris Lattner	022167f13b	Implement sub.ll:test14 llvm-svn: 12355	2004-03-13 00:11:49 +00:00
Chris Lattner	92295c5031	Implement InstCombine/sub.ll:test12 & test13 llvm-svn: 12353	2004-03-12 23:53:13 +00:00
Chris Lattner	cb015ee6c0	Add constant folding wrapper support for select instructions. llvm-svn: 12319	2004-03-12 05:53:03 +00:00
Chris Lattner	59db22dcd4	Add sccp support for select instructions llvm-svn: 12318	2004-03-12 05:52:44 +00:00
Chris Lattner	b909e8b0d4	Add trivial optimizations for select instructions llvm-svn: 12317	2004-03-12 05:52:32 +00:00
Chris Lattner	721264aecc	Initial support for edge profiling llvm-svn: 12225	2004-03-08 17:54:34 +00:00
Chris Lattner	dae48f93b0	Split utility functions out of BlockProfiling.cpp llvm-svn: 12224	2004-03-08 17:06:13 +00:00
Chris Lattner	d91e676700	finegrainify namespacification llvm-svn: 12221	2004-03-08 16:45:53 +00:00
Chris Lattner	fe6f2e3e80	Implement ArgumentPromotion/aggregate-promote.ll This allows pointers to aggregate objects, whose elements are only read, to be promoted and passed in by element instead of by reference. This can enable a LOT of subsequent optimizations in the caller function. It's worth pointing out that this stuff happens a LOT in C++ programs, because objects in templates are generally passed around by reference. When these templates are instantiated on small aggregate or scalar types, however, it is more efficient to pass them in by value than by reference. This transformation triggers most on C++ codes (e.g. 334 times on eon), but does happen on C codes as well. For example, on mesa it triggers 72 times, and on gcc it triggers 35 times. This is amazingly good considering that we are using 'basicaa' so far. llvm-svn: 12202	2004-03-08 01:04:36 +00:00
Chris Lattner	cc544e57f3	Implement: ArgumentPromotion/chained.ll llvm-svn: 12200	2004-03-07 22:52:53 +00:00
Chris Lattner	64b8d697ad	Fix another minor bug, exposed by perlbmk llvm-svn: 12198	2004-03-07 22:43:27 +00:00
Chris Lattner	538fee7aa2	Since 'load null' is undefined, we can make it do whatever we want. Returning a zero value is the most likely way to cause further simplification, so we do it. llvm-svn: 12197	2004-03-07 22:16:24 +00:00
Chris Lattner	6770842b67	Fix a minor bug and turn debug output into, well, debug output. llvm-svn: 12195	2004-03-07 21:54:50 +00:00
Chris Lattner	483ae01c9c	New LLVM pass: argument promotion. This version only handles simple scalar variables. llvm-svn: 12193	2004-03-07 21:29:54 +00:00
Chris Lattner	7abcc387de	Don't emit things like malloc(16*1). Allocation instructions are fixed arity now. llvm-svn: 12086	2004-03-03 01:40:53 +00:00
Misha Brukman	f44acae31e	Implement ExtractCodeRegion() llvm-svn: 12070	2004-03-02 00:20:57 +00:00
Misha Brukman	f272f9b3d5	Make a note that this is usually used via bugpoint. llvm-svn: 12068	2004-03-02 00:19:09 +00:00
Misha Brukman	5af2be7d09	* Add implementation of ExtractBasicBlock() * Add comments to ExtractLoop() llvm-svn: 12053	2004-03-01 18:28:34 +00:00
Chris Lattner	5cf39339d1	Disable tail duplication in a case that breaks on Olden/tsp llvm-svn: 12021	2004-03-01 01:12:13 +00:00
Misha Brukman	c91e1ff50d	* Remove function to find "main" in a Module, there's a method for that * Removing extraneous empty space and empty comment lines llvm-svn: 12014	2004-02-29 23:09:10 +00:00
Chris Lattner	2de229f31b	Fix bug: test/Regression/Transforms/LowerInvoke/2004-02-29-PHICrash.llx ... which tickled the lowerinvoke pass because it used the BCE routines. llvm-svn: 12012	2004-02-29 22:24:41 +00:00
Chris Lattner	bf2963ef91	Fix PR255: [tailduplication] Single basic block loops are very rare Note that this is a band-aid put over a band-aid. This just undisables tail duplication in one very specific case that it seems to work in. llvm-svn: 11989	2004-02-29 06:41:20 +00:00
Chris Lattner	d3e6ae263c	Implement switch->br and br->switch folding by ripping out the switch->switch and br->br code and generalizing it. This allows us to compile code like this: int test(Instruction I) { if (isa<CastInst>(I)) return foo(7); else if (isa<BranchInst>(I)) return foo(123); else if (isa<UnwindInst>(I)) return foo(1241); else if (isa<SetCondInst>(I)) return foo(1); else if (isa<VAArgInst>(I)) return foo(42); return foo(-1); } into: int %_Z4testPN4llvm11InstructionE("struct.llvm::Instruction" %I) { entry: %tmp.1.i.i.i.i.i.i.i = getelementptr "struct.llvm::Instruction"* %I, long 0, ubyte 4 ; <uint> [#uses=1] %tmp.2.i.i.i.i.i.i.i = load uint %tmp.1.i.i.i.i.i.i.i ; <uint> [#uses=2] %tmp.2.i.i.i.i.i.i = seteq uint %tmp.2.i.i.i.i.i.i.i, 27 ; <bool> [#uses=0] switch uint %tmp.2.i.i.i.i.i.i.i, label %endif.0 [ uint 27, label %then.0 uint 2, label %then.1 uint 5, label %then.2 uint 14, label %then.3 uint 15, label %then.3 uint 16, label %then.3 uint 17, label %then.3 uint 18, label %then.3 uint 19, label %then.3 uint 32, label %then.4 ] ... As well as handling the cases in 176.gcc and many other programs more effectively. llvm-svn: 11964	2004-02-28 21:28:10 +00:00
Chris Lattner	772eafa332	if there is already a prototype for malloc/free, use it, even if it's incorrect. Do not just inject a new prototype. llvm-svn: 11951	2004-02-28 18:51:45 +00:00
Chris Lattner	51ea127bf3	Rename AddUsesToWorkList -> AddUsersToWorkList, as that is what it does. Create a new AddUsesToWorkList method. Optimize memmove/set/cpy of zero bytes to a noop. llvm-svn: 11941	2004-02-28 05:22:00 +00:00
Chris Lattner	f3a366062c	Turn 'free null' into nothing llvm-svn: 11940	2004-02-28 04:57:37 +00:00
Misha Brukman	8a2c28fdda	Right, it's really Extractor, not Extraction. llvm-svn: 11939	2004-02-28 03:37:58 +00:00
Misha Brukman	03a11340ff	A pass that uses the generic CodeExtractor to rip out every loop in every function, as long as the loop isn't the only one in that function. This should make debugging passes easier with BugPoint. llvm-svn: 11936	2004-02-28 03:33:01 +00:00
Misha Brukman	caa1a5abeb	A generic code extractor: given a list of BasicBlocks, it will rip them out into a new function, taking care of inputs and outputs. llvm-svn: 11935	2004-02-28 03:26:20 +00:00
Chris Lattner	e82c217b2f	setcond instructions don't have aliasing implications. llvm-svn: 11919	2004-02-27 18:09:25 +00:00
Chris Lattner	4f7accab96	Implement test/Regression/Transforms/InstCombine/canonicalize_branch.ll This is a really minor thing, but might help out the 'switch statement induction' code in simplifycfg. llvm-svn: 11900	2004-02-27 06:27:46 +00:00
Chris Lattner	79636d7cd5	Since LLVM uses structure type equivalence, it isn't useful to keep around multiple type names for the same structural type. Make DTE eliminate all but one of the type names llvm-svn: 11879	2004-02-26 20:02:23 +00:00
Chris Lattner	21e941fbfd	turn things like: if (X == 0 || X == 2) ...where the comparisons and branches are in different blocks... into a switch instruction. This comes up a lot in various programs, and works well with the switch/switch merging code I checked in earlier. For example, this testcase: int switchtest(int C) { return C == 0 ? f(123) : C == 1 ? f(3123) : C == 4 ? f(312) : C == 5 ? f(1234): f(444); } is converted into this: switch int %C, label %cond_false.3 [ int 0, label %cond_true.0 int 1, label %cond_true.1 int 4, label %cond_true.2 int 5, label %cond_true.3 ] instead of a whole bunch of conditional branches. Admittedly the code is ugly, and incomplete. To be complete, we need to add br -> switch merging and switch -> br merging. For example, this testcase: struct foo { int Q, R, Z; }; #define A (X->Q+X->R * 123) int test(struct foo X) { return A == 123 ? X1() : A == 12321 ? X2(): (A == 111 || A == 222) ? X3() : A == 875 ? X4() : X5(); } Gets compiled to this: switch int %tmp.7, label %cond_false.2 [ int 123, label %cond_true.0 int 12321, label %cond_true.1 int 111, label %cond_true.2 int 222, label %cond_true.2 ] ... cond_false.2: ; preds = %entry %tmp.52 = seteq int %tmp.7, 875 ; <bool> [#uses=1] br bool %tmp.52, label %cond_true.3, label %cond_false.3 where the branch could be folded into the switch. This kind of thing occurs ALL OF THE TIME, especially in programs like 176.gcc, which is a horrible mess of code. It contains stuff like *shudder*: #define SWITCH_TAKES_ARG(CHAR) \ ( (CHAR) == 'D' \ || (CHAR) == 'U' \ || (CHAR) == 'o' \ || (CHAR) == 'e' \ || (CHAR) == 'u' \ || (CHAR) == 'I' \ || (CHAR) == 'm' \ || (CHAR) == 'L' \ || (CHAR) == 'A' \ || (CHAR) == 'h' \ || (CHAR) == 'z') and #define CONST_OK_FOR_LETTER_P(VALUE, C) \ ((C) == 'I' ? SMALL_INTVAL (VALUE) \ : (C) == 'J' ? SMALL_INTVAL (-(VALUE)) \ : (C) == 'K' ? (unsigned)(VALUE) < 32 \ : (C) == 'L' ? ((VALUE) & 0xffff) == 0 \ : (C) == 'M' ? integer_ok_for_set (VALUE) \ : (C) == 'N' ? (VALUE) < 0 \ : (C) == 'O' ? (VALUE) == 0 \ : (C) == 'P' ? (VALUE) >= 0 \ : 0) and #define LEGITIMIZE_ADDRESS(X,OLDX,MODE,WIN) \ { \ if (GET_CODE (X) == PLUS && CONSTANT_ADDRESS_P (XEXP (X, 1))) \ (X) = gen_rtx (PLUS, SImode, XEXP (X, 0), \ copy_to_mode_reg (SImode, XEXP (X, 1))); \ if (GET_CODE (X) == PLUS && CONSTANT_ADDRESS_P (XEXP (X, 0))) \ (X) = gen_rtx (PLUS, SImode, XEXP (X, 1), \ copy_to_mode_reg (SImode, XEXP (X, 0))); \ if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 0)) == MULT) \ (X) = gen_rtx (PLUS, SImode, XEXP (X, 1), \ force_operand (XEXP (X, 0), 0)); \ if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 1)) == MULT) \ (X) = gen_rtx (PLUS, SImode, XEXP (X, 0), \ force_operand (XEXP (X, 1), 0)); \ if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 0)) == PLUS) \ (X) = gen_rtx (PLUS, Pmode, force_operand (XEXP (X, 0), NULL_RTX),\ XEXP (X, 1)); \ if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 1)) == PLUS) \ (X) = gen_rtx (PLUS, Pmode, XEXP (X, 0), \ force_operand (XEXP (X, 1), NULL_RTX)); \ if (GET_CODE (X) == SYMBOL_REF || GET_CODE (X) == CONST \ || GET_CODE (X) == LABEL_REF) \ (X) = legitimize_address (flag_pic, X, 0, 0); \ if (memory_address_p (MODE, X)) \ goto WIN; } and others. These macros get used multiple times of course. These are such lovely candidates for macros, aren't they? :) This code also nicely handles LLVM constructs that look like this: if (isa<CastInst>(I)) ... else if (isa<BranchInst>(I)) ... else if (isa<SetCondInst>(I)) ... else if (isa<UnwindInst>(I)) ... else if (isa<VAArgInst>(I)) ... where the isa can obviously be a dyn_cast as well. Switch instructions are a good thing. llvm-svn: 11870	2004-02-26 07:13:46 +00:00
Chris Lattner	8d1da1abee	My faith in programmers has been found to be totally misplaced. One would assume that if they don't intend to write to a global variable, that they would mark it as constant. However, there are people that don't understand that the compiler can do nice things for them if they give it the information it needs. This pass looks for blatantly obvious globals that are only ever read from. Though it uses a trivially simple "alias analysis" of sorts, it is still able to do amazing things to important benchmarks. 253.perlbmk, for example, contains several *GIANT* function pointer tables that are not marked constant and should be. Marking them constant allows the optimizer to turn a whole bunch of indirect calls into direct calls. Note that only a link-time optimizer can do this transformation, but perlbmk does have several strings and other minor globals that can be marked constant by this pass when run from GCCAS. 176.gcc has a ton of strings and large tables that are marked constant, both at compile time (38 of them) and at link time (48 more). Other benchmarks give similar results, though it seems like big ones have disproportionally more than small ones. This pass is extremely quick and does good things. I'm going to enable it in gccas & gccld. Not bad for 50 SLOC. llvm-svn: 11836	2004-02-25 21:34:36 +00:00
Chris Lattner	9c6833c5ca	Fix incorrect debug code llvm-svn: 11821	2004-02-25 15:15:04 +00:00
Chris Lattner	8ee0593f0d	Fix a faulty optimization on FP values llvm-svn: 11801	2004-02-24 18:10:14 +00:00
Chris Lattner	90ea78edba	If a block is made dead, make sure to promptly remove it. llvm-svn: 11799	2004-02-24 16:09:21 +00:00
Chris Lattner	a2ab489135	Implement SimplifyCFG/switch_switch_fold.ll This case occurs many times in various benchmarks, especially when combined with the previous patch. This allows it to get stuff like: if (X == 4 || X == 3) if (X == 5 || X == 8) and switch (X) { case 4: case 5: case 6: if (X == 4 || X == 5) llvm-svn: 11797	2004-02-24 07:23:58 +00:00
Chris Lattner	3cd98f054a	Rearrange code a bit llvm-svn: 11793	2004-02-24 05:54:22 +00:00
Chris Lattner	6f4b45acf5	Implement: test/Regression/Transforms/SimplifyCFG/switch_create.ll This turns code like this: if (X == 4 | X == 7) and if (X != 4 & X != 7) into switch instructions. llvm-svn: 11792	2004-02-24 05:38:11 +00:00
Chris Lattner	ae739aefd7	Generate much more efficient code in programs like pifft llvm-svn: 11775	2004-02-23 21:46:58 +00:00
Chris Lattner	c40b9d7d51	Fix a small typeo in my checkin last night that broke vortex and other programs :( llvm-svn: 11774	2004-02-23 21:46:42 +00:00
Chris Lattner	f5ce254692	Fix InstCombine/2004-02-23-ShiftShiftOverflow.ll Also, turn 'shr int %X, 1234' into 'shr int %X, 31' llvm-svn: 11768	2004-02-23 20:30:06 +00:00
Chris Lattner	2b55ea38bc	Implement cast.ll::test14/15 llvm-svn: 11742	2004-02-23 07:16:20 +00:00
Chris Lattner	e79e854c5c	Refactor some code. In the mul - setcc folding case, we really care about whether this is the sign bit or not, so check unsigned comparisons as well. llvm-svn: 11740	2004-02-23 06:38:22 +00:00
Chris Lattner	c8a10c4b6a	Implement mul.ll:test11 llvm-svn: 11737	2004-02-23 06:00:11 +00:00
Chris Lattner	59611149ee	Implement "strength reduction" of X <= C and X >= C llvm-svn: 11735	2004-02-23 05:47:48 +00:00
Chris Lattner	2635b52d4e	Implement InstCombine/mul.ll:test10, which is a case that occurs when dealing with "predication" llvm-svn: 11734	2004-02-23 05:39:21 +00:00
Chris Lattner	8d0bacbb9e	Implement Transforms/InstCombine/cast.ll:test13, a case which occurs in a hot 164.gzip loop. llvm-svn: 11702	2004-02-22 05:25:17 +00:00
Chris Lattner	693e393fee	Fix PR245: Linking weak and strong global variables is dependent on link order llvm-svn: 11565	2004-02-17 21:56:04 +00:00
Chris Lattner	e42732e75f	Implement test/Regression/Transforms/SimplifyCFG/UncondBranchToReturn.ll, see the testcase for the reasoning. llvm-svn: 11496	2004-02-16 06:35:48 +00:00
Chris Lattner	4db2d22bea	Fold PHI nodes of constants which are only used by a single cast. This implements phi.ll:test4 llvm-svn: 11494	2004-02-16 05:07:08 +00:00
Chris Lattner	b36d908f7b	Teach LLVM to unravel the "swap idiom". This implements: Regression/Transforms/InstCombine/xor.ll:test20 llvm-svn: 11492	2004-02-16 03:54:20 +00:00
Chris Lattner	c207635fd5	Implement Transforms/InstCombine/xor.ll:test19 llvm-svn: 11490	2004-02-16 01:20:27 +00:00
Chris Lattner	d85e061575	Instead of producing calls to setjmp/longjmp, produce uses of the llvm.setjmp/llvm.longjmp intrinsics. llvm-svn: 11482	2004-02-15 22:24:27 +00:00
Chris Lattner	76b2ff4ded	Adjustments to support the new ConstantAggregateZero class llvm-svn: 11474	2004-02-15 05:55:15 +00:00
Chris Lattner	37a716fa80	Remove dependence on return type of ConstantStruct::get llvm-svn: 11466	2004-02-15 04:07:32 +00:00
Chris Lattner	c75bf528c1	Remove dependence on the return type of ConstantArray::get llvm-svn: 11463	2004-02-15 04:05:58 +00:00
Chris Lattner	283ffdfac5	Fix compilation of 126.gcc: intrinsic functions cannot throw, so they are not allowed in invoke instructions. Thus, if we are inlining a call to an intrinsic function into an invoke site, we don't need to turn the call into an invoke! llvm-svn: 11384	2004-02-13 16:47:35 +00:00
Chris Lattner	7db49ce5b4	Intrinsic functions cannot throw llvm-svn: 11383	2004-02-13 16:46:46 +00:00
Chris Lattner	7cbb22abe6	Expose a pass ID that can be 'required' llvm-svn: 11376	2004-02-13 16:16:16 +00:00
Chris Lattner	d4b36cf9bc	Remove obsolete comment. Unreachable blocks will automatically be left at the end of the function. llvm-svn: 11313	2004-02-11 05:20:50 +00:00
Chris Lattner	5add05129e	Add an _embarassingly simple_ implementation of basic block layout. This is more of a testcase for profiling information than anything that should reasonably be used, but it's a starting point. When I have more time I will whip this into better shape. llvm-svn: 11311	2004-02-11 04:53:20 +00:00
Chris Lattner	18d1f19fba	Implement SimplifyCFG/PhiEliminate.ll Having a proper 'select' instruction would allow the elimination of a lot of the special case cruft in this patch, but we don't have one yet. llvm-svn: 11307	2004-02-11 03:36:04 +00:00
Chris Lattner	838b845781	The hasConstantReferences predicate always returns false. llvm-svn: 11301	2004-02-11 01:17:07 +00:00
Chris Lattner	3232bbb9d8	initialization calls now return argc. If the program uses the argc value passed into main, make sure they use the return value of the init call instead of the one passed in. llvm-svn: 11262	2004-02-10 17:41:01 +00:00
Chris Lattner	37d46f4815	Only add the global variable with the abort message if an unwind actually occurs in the program. llvm-svn: 11249	2004-02-09 22:48:47 +00:00
Chris Lattner	e3af6f73ce	Don't depend on auto data conversion llvm-svn: 11229	2004-02-09 05:16:30 +00:00
Chris Lattner	ac6db755c3	Adjust to the changed StructType interface. In particular, getElementTypes() is gone. llvm-svn: 11228	2004-02-09 04:37:31 +00:00
Chris Lattner	fa829be4d3	Start using the new and improve interface to FunctionType arguments llvm-svn: 11224	2004-02-09 04:14:01 +00:00
Chris Lattner	57ea2e3294	The ConstantExpr::getCast call can cause a CPR to be generated. If so, strip it off. llvm-svn: 11213	2004-02-09 00:20:55 +00:00
Misha Brukman	3480e935d0	Fix grammar-o. llvm-svn: 11210	2004-02-08 22:27:33 +00:00
Chris Lattner	3b7f6b2217	Improve compatibility with programs that already have a prototype for 'write', even if it is wierd in some way. llvm-svn: 11207	2004-02-08 22:14:44 +00:00
Chris Lattner	fae8ab3088	rename the "exceptional" destination of an invoke instruction to the 'unwind' dest llvm-svn: 11202	2004-02-08 21:44:31 +00:00
Chris Lattner	56997dd283	Fix PR225: [pruneeh] -pruneeh pass removes invoke instructions it shouldn't llvm-svn: 11200	2004-02-08 21:15:59 +00:00
Chris Lattner	071bc60450	splitBasicBlock "does the right thing" now, no reason to reposition it. llvm-svn: 11199	2004-02-08 20:49:07 +00:00
Chris Lattner	108cadc274	Implement proper invoke/unwind lowering. This fixed PR16 "[lowerinvoke] The -lowerinvoke pass does not insert calls to setjmp/longjmp" llvm-svn: 11195	2004-02-08 19:53:56 +00:00
Chris Lattner	476488e669	Add a call to 'write' right before the call to abort() in the unwind path. This causes the JIT, or LLC'd program to print out a nice message, explaining WHY the program aborted. llvm-svn: 11184	2004-02-08 07:30:29 +00:00
Chris Lattner	2dd1c8d8ce	Fix another dominator update bug. These bugs keep getting exposed because GCSE keeps finding more code motion opportunities now that the dominators are correct! llvm-svn: 11142	2004-02-05 23:20:59 +00:00
Chris Lattner	c0c953f0bc	Fix bug updating dominators llvm-svn: 11140	2004-02-05 22:33:26 +00:00
Chris Lattner	f978c421e5	Add debug output llvm-svn: 11139	2004-02-05 22:33:19 +00:00
Chris Lattner	14ab84a483	Fix PR223: Loopsimplify incorrectly updates dominator information The problem is that the dominator update code didn't "realize" that it's possible for the newly inserted basic block to dominate anything. Because it IS possible, stuff was getting updated wrong. llvm-svn: 11137	2004-02-05 21:12:24 +00:00
Chris Lattner	39ad6f2772	Minor speedup, don't query ValueMap each time through the loop llvm-svn: 11123	2004-02-04 21:44:26 +00:00
Chris Lattner	6f8865bf9f	Two changes: 1. Don't scan to the end of alloca instructions in the caller function to insert inlined allocas, just insert at the top. This saves a lot of time inlining into functions with a lot of allocas. 2. Use splice to move the alloca instructions over, instead of remove/insert. This allows us to transfer a block at a time, and eliminates a bunch of silly symbol table manipulations. This speeds up the inliner on the testcase in PR209 from 1.73s -> 1.04s (67%) llvm-svn: 11118	2004-02-04 21:33:42 +00:00
Chris Lattner	0fa8c7c321	Optimize the case where we are inlining a function that contains only one basic block, and that basic block ends with a return instruction. In this case, we can just splice the cloned "body" of the function directly into the source basic block, avoiding a lot of rearrangement and splitBasicBlock's linear scan over the split block. This speeds up the inliner on the testcase in PR209 from 2.3s to 1.7s, a 35% reduction. llvm-svn: 11116	2004-02-04 04:17:06 +00:00
Chris Lattner	8d414ad035	Adjust to the new BasicBlock ctor, which requires a function parameter llvm-svn: 11114	2004-02-04 03:58:28 +00:00
Chris Lattner	0ff9da5fed	Remove unneeded code now that splitBasicBlock does the "right thing" llvm-svn: 11111	2004-02-04 03:21:51 +00:00
Chris Lattner	18ef3fda57	More refactoring. Move alloca instructions and handle invoke instructions before we delete the original call site, allowing slight simplifications of code, but nothing exciting. llvm-svn: 11109	2004-02-04 02:51:48 +00:00
Chris Lattner	9fc977eac4	Move the cloning of the function body much earlier in the inlinefunction process. The only optimization we did so far is to avoid creating a PHI node, then immediately destroying it in the common case where the callee has one return statement. Instead, we just don't create the return value. This has no noticable performance impact, but paves the way for future improvements. llvm-svn: 11108	2004-02-04 01:41:09 +00:00
Chris Lattner	a6578ef318	Give CloneBasicBlock an optional function argument to specify which function to add the cloned block to. This allows the block to be added to the function immediately, and all of the instructions to be immediately added to the function symbol table, which speeds up the inliner from 3.7 -> 3.38s on the PR209. llvm-svn: 11107	2004-02-04 01:19:43 +00:00
Chris Lattner	ae51cae111	Bunch up all locally used allocas by the block they are allocated in, and process them all as a group. This speeds up SRoA/mem2reg from 28.46s to 0.62s on the testcase from PR209. llvm-svn: 11100	2004-02-03 22:34:12 +00:00
Chris Lattner	3784188620	Handle extremely trivial cases extremely efficiently. This speeds up SRoA/mem2reg from 41.2s to 27.5s on the testcase in PR209. llvm-svn: 11099	2004-02-03 22:00:33 +00:00
Chris Lattner	c2f0aa58df	Disable (x - (y - z)) => (x + (z - y)) optimization for floating point. llvm-svn: 11083	2004-02-02 20:09:56 +00:00
Chris Lattner	cacd30b957	Update comment llvm-svn: 11082	2004-02-02 20:09:22 +00:00
Brian Gaeke	6204e75c4a	Make deadarghaX0r warning louder. (I just love typing haX0r. haX0r haX0r haX0r.) llvm-svn: 11079	2004-02-02 19:32:27 +00:00
Chris Lattner	ed9b12c31a	Disable tail duplication in any "hard" cases, where it might break SSA form. llvm-svn: 11052	2004-02-01 06:32:28 +00:00
Chris Lattner	7c91a6176c	Fix the count of the number of instructions removed llvm-svn: 11049	2004-02-01 05:15:07 +00:00
Misha Brukman	bf43787f33	Hyphenate `target-dependent' llvm-svn: 11003	2004-01-28 20:43:01 +00:00
Chris Lattner	1f7942fe7d	Fix InstCombine/2004-01-13-InstCombineInvokePHI.ll, which also fixes lots of C++ programs in Shootout-C++, including lists1 and moments, etc llvm-svn: 10845	2004-01-14 06:06:08 +00:00
Chris Lattner	6b052f2154	Clean up #includes llvm-svn: 10799	2004-01-12 19:56:36 +00:00
Chris Lattner	fcf21a75b0	Fix bug in previous checkin llvm-svn: 10798	2004-01-12 19:47:05 +00:00
Chris Lattner	c1e7cc0fbe	Eliminate use of ConstantHandling and ConstantExpr::getShift interfaces llvm-svn: 10796	2004-01-12 19:35:11 +00:00
Chris Lattner	d7ccc9e5a5	Add header file I accidentally removed in the shuffle llvm-svn: 10795	2004-01-12 19:15:20 +00:00
Chris Lattner	c9fb4a3b89	Remove use of the ConstantHandling interfaces llvm-svn: 10793	2004-01-12 19:12:50 +00:00
Chris Lattner	429963742e	Remove use of ConstantExpr::getShift llvm-svn: 10792	2004-01-12 19:10:58 +00:00
Chris Lattner	1b7d4d7b63	Don't use ConstantExpr::getShift anymore llvm-svn: 10791	2004-01-12 19:08:43 +00:00
Chris Lattner	2853a7ed22	Remove use of ConstantHandling llvm-svn: 10789	2004-01-12 18:35:03 +00:00
Chris Lattner	118a76cb2f	Remove unneeded #include llvm-svn: 10788	2004-01-12 18:33:54 +00:00
Chris Lattner	fc6c859a0c	Move llvm::ConstantFoldInstruction from VMCore to here, next to ConstantFoldTerminator llvm-svn: 10785	2004-01-12 18:25:22 +00:00
Chris Lattner	81d8822396	Remove uses of ConstantHandling itf llvm-svn: 10783	2004-01-12 18:12:44 +00:00
Chris Lattner	0fe5b32c01	Use constantexprs for casts. Eliminate use of the ConstantHandling interfaces llvm-svn: 10779	2004-01-12 17:43:40 +00:00
Chris Lattner	fe992d4332	Fix fairly severe bug in my last checking where we treated all unfoldable constants as being "true" when evaluating branches. This was introduced because we now create constantexprs for the constants instead of failing the fold. llvm-svn: 10778	2004-01-12 17:40:36 +00:00
Chris Lattner	49f74522ec	* Implement minor performance optimization for the getelementptr case * Implement SCCP of load instructions, implementing Transforms/SCCP/loadtest.ll This allows us to fold expressions like "foo"[2], even if the pointer is only a conditional constant. llvm-svn: 10767	2004-01-12 04:29:41 +00:00
Chris Lattner	7e8af38637	Do not hack on volatile loads. I'm not sure what the point of a volatile load from constant memory is, but lets not take chances. llvm-svn: 10765	2004-01-12 04:13:56 +00:00
Chris Lattner	05fe6847a8	Implement SCCP/phitest.ll llvm-svn: 10763	2004-01-12 03:57:30 +00:00
Chris Lattner	fafa2ff2d6	Implement Transforms/ScalarRepl/phinodepromote.ll, which is an important case that the C/C++ front-end generates. llvm-svn: 10761	2004-01-12 01:18:32 +00:00
Chris Lattner	3bcecb92f3	Update obsolete comments Fix iterator invalidation problems which was causing -mstrip to miss some entries, and read free'd memory. This shrinks the symbol table of 254.gap from 333 to 284 bytes! :) llvm-svn: 10751	2004-01-10 21:36:49 +00:00
Chris Lattner	df3c342a4c	Finegrainify namespacification llvm-svn: 10727	2004-01-09 06:12:26 +00:00
Chris Lattner	fdf788eebd	Remove dependence on structure index type. s/MT/FT llvm-svn: 10726	2004-01-09 06:02:51 +00:00
Chris Lattner	49525f8cf4	Finegrainify namespacification llvm-svn: 10725	2004-01-09 06:02:20 +00:00
Chris Lattner	ff66958154	Finegrainify namespacification add flags for PR82 llvm-svn: 10724	2004-01-09 05:53:38 +00:00
Chris Lattner	9cc1a0e40d	Inching towards fixing PR82 llvm-svn: 10722	2004-01-09 05:44:50 +00:00
Chris Lattner	59d2d7fc33	Improve encapsulation in the Loop and LoopInfo classes by eliminating the getSubLoops/getTopLevelLoops methods, replacing them with iterator-based accessors. llvm-svn: 10714	2004-01-08 00:09:44 +00:00
Chris Lattner	56db5e98c8	Merging constants can cause further room for improvement. Iterate until we converge llvm-svn: 10618	2003-12-28 07:19:08 +00:00
Chris Lattner	30513e0a3a	rename ClassifyExpression -> ClassifyExpr llvm-svn: 10592	2003-12-23 08:04:08 +00:00
Chris Lattner	7e755e443f	More minor non-functional changes. This now computes the exit condition, though it doesn't do anything with it. llvm-svn: 10590	2003-12-23 07:47:09 +00:00
Chris Lattner	93bfb6c741	Remove extraneous #include finegrainify namespacification llvm-svn: 10589	2003-12-23 07:43:38 +00:00
Chris Lattner	c2ee05427e	Fix memory corruption bug PR193 llvm-svn: 10586	2003-12-22 23:49:36 +00:00
Chris Lattner	a02d5aa6ce	Don't mind me, I'm just refactoring away. This patch makes room for LFTR, but contains no functionality changes. llvm-svn: 10583	2003-12-22 09:53:29 +00:00
Chris Lattner	6449dcefbc	Implement IndVarsSimplify/pointer-indvars.ll, transforming pointer arithmetic into "array subscripts" llvm-svn: 10580	2003-12-22 05:02:01 +00:00
Chris Lattner	d3678bc7c5	Fix PR194 llvm-svn: 10573	2003-12-22 03:58:44 +00:00
Chris Lattner	fc7bdac1b3	Fix ADCE/2003-12-19-MergeReturn.llx llvm-svn: 10539	2003-12-19 09:08:34 +00:00
Chris Lattner	918460190f	Remove the wierd "Operands" loop, by traversing basicblocks in reverse order llvm-svn: 10536	2003-12-19 08:18:16 +00:00
Chris Lattner	547192d688	Implement LICM/sink_multiple.ll, by sinking all possible instructions in the loop before hoisting any. llvm-svn: 10534	2003-12-19 07:22:45 +00:00
Chris Lattner	031a3f8cc7	Generalize a special case to fix PR187 llvm-svn: 10531	2003-12-19 06:27:08 +00:00
Chris Lattner	91daeb5431	Factor code out into the Utils library llvm-svn: 10530	2003-12-19 05:58:40 +00:00
Chris Lattner	04efa4b155	Add new function llvm-svn: 10529	2003-12-19 05:56:28 +00:00
John Criswell	b22e9b4b35	Reverted back to previous revision - this was previously merged according to the CVS log messages. llvm-svn: 10517	2003-12-18 17:19:19 +00:00
John Criswell	86a3a48697	Merged in RELEASE_11. llvm-svn: 10516	2003-12-18 16:43:17 +00:00
Chris Lattner	9e2b42a0c8	When we delete instructions from the loop, make sure to remove them from the AliasSetTracker as well. llvm-svn: 10507	2003-12-18 08:12:32 +00:00
Chris Lattner	6c08bb8b8e	Fix for PR185 & IndVarsSimplify/2003-12-15-Crash.llx llvm-svn: 10473	2003-12-15 17:34:02 +00:00
Chris Lattner	884e824534	Refactor code just a little bit, allowing us to implement TailCallElim/return_constant.ll llvm-svn: 10467	2003-12-14 23:57:39 +00:00
Chris Lattner	d1c371c32c	Do not promote volatile alias sets into registers llvm-svn: 10458	2003-12-14 04:52:31 +00:00
Chris Lattner	34399dda2d	Fix LICM/2003-12-11-SinkingToPHI.ll, and quite possibly all of the other known problems in the universe. llvm-svn: 10409	2003-12-11 22:23:32 +00:00
Chris Lattner	027253b0d5	verifyFunction depends on dominator info, which levelraise does not declare that it needs. This is pretty scary code! This fixes Regression.Transforms.LevelRaise.2002-07-16-SourceAndDestCrash Regression.Transforms.LevelRaise.2002-07-31-AssertionFailure llvm-svn: 10406	2003-12-11 21:47:37 +00:00
Chris Lattner	6281fd3ead	Fix bug: LICM/sink_multiple_exits.ll Thanks for pointing this out John :) llvm-svn: 10387	2003-12-10 22:35:56 +00:00
Chris Lattner	55c2113b7b	Don't allow dead instructions to stop sinking early. llvm-svn: 10386	2003-12-10 20:43:29 +00:00
Chris Lattner	713907e2b8	Fix bug: IndVarsSimplify/2003-12-10-RemoveInstrCrash.llx llvm-svn: 10385	2003-12-10 20:43:04 +00:00
Chris Lattner	7e5bd59da2	Finegrainify namespacification Fix bug: LowerInvoke/2003-12-10-Crash.llx llvm-svn: 10382	2003-12-10 20:22:42 +00:00
Chris Lattner	ccd9f3c1f8	Finegrainify namespacification Reorder #includes Implement: IndVarsSimplify/2003-12-10-IndVarDeadCode.ll llvm-svn: 10376	2003-12-10 18:06:47 +00:00
Chris Lattner	7710f2f49e	Finegrainify namespacification Fix bug: LoopSimplify/2003-12-10-ExitBlocksProblem.ll llvm-svn: 10373	2003-12-10 17:20:35 +00:00
Chris Lattner	6364314a6e	Simplify code llvm-svn: 10371	2003-12-10 16:58:24 +00:00
Chris Lattner	48b4b852b4	Avoid performing two identical lookups when one will suffice llvm-svn: 10370	2003-12-10 16:57:24 +00:00
Chris Lattner	edda1af35a	Make LICM itself a bit more efficient, and make the generated code more efficient too: don't insert a store in every exit block, because a particular block may be exited to more than once by a loop llvm-svn: 10369	2003-12-10 15:56:24 +00:00
Chris Lattner	aaaea51090	Implement instruction sinking out of loops. This still can do a little bit better job, but this is the majority of the work. This implements LICM/sink*.ll llvm-svn: 10358	2003-12-10 06:41:05 +00:00
Chris Lattner	6c237bcdf2	Do not insert one entry PHI nodes in split exit blocks! llvm-svn: 10348	2003-12-09 23:12:55 +00:00
Chris Lattner	65c1193d55	Refactor code a little bit, eliminating the gratuitous InstVisitor, which should make subsequent changes simpler. This also allows us to hoist vaarg and vanext instructions llvm-svn: 10342	2003-12-09 19:32:44 +00:00
Chris Lattner	c05176843e	Fine grainify namespacification Code cleanups Make LICM::SafeToHoist marginally more efficient llvm-svn: 10341	2003-12-09 17:18:00 +00:00
Chris Lattner	50663a1a78	Implement: TailCallElim/accum_recursion_constant_arg.ll Also make sure to clean up any PHI nodes that are inserted which are pointless. llvm-svn: 10333	2003-12-08 23:37:35 +00:00
Chris Lattner	198e620752	Implement: test/Regression/Transforms/TailCallElim/accum_recursion.ll We now insert accumulator variables as necessary to eliminate tail recursion more aggressively. This is still fairly limited, but allows us to transform fib/factorial, and other functions into nice happy loops. :) llvm-svn: 10332	2003-12-08 23:19:26 +00:00
Chris Lattner	a7b6f3ab9c	Cleanup and restructure the code to make it easier to read and maintain. The only functionality change is that we now implement: Regression/Transforms/TailCallElim/intervening-inst.ll Which is really kinda pointless, because it means that trivially dead code does not interfere with -tce, but trivially dead code probably wouldn't be around anytime when this pass is run anyway. The point of including this change it to support other more aggressive transformations when we have the analysis capabilities to do so. llvm-svn: 10312	2003-12-08 05:34:54 +00:00
Chris Lattner	771804b541	Implement RaiseAllocations/FreeCastConstantExpr.ll llvm-svn: 10305	2003-12-07 01:42:08 +00:00
Chris Lattner	8427bffb9a	* Finegrainify namespacification * Transform: free <ty>* (cast <ty2>* X to <ty>) into free <ty2> X llvm-svn: 10303	2003-12-07 01:24:23 +00:00
Chris Lattner	40d2aeb28f	Finegrainify namespacification Fix regressions ScalarRepl/basictest.ll & arraytest.ll llvm-svn: 10287	2003-12-02 17:43:55 +00:00
Chris Lattner	8384f97ee4	Fix test: Transforms/LevelRaise/2003-11-28-IllegalTypeConversion.ll Some gep generalization changes llvm-svn: 10252	2003-11-29 05:31:25 +00:00
Chris Lattner	52310702a1	Do not use index type to determine what it is indexing into! llvm-svn: 10226	2003-11-25 21:09:18 +00:00
Chris Lattner	28ebb3e0a6	Delete dead line llvm-svn: 10164	2003-11-22 02:26:17 +00:00
Chris Lattner	f40cdbe856	Fix bug: Transforms/PruneEH/2003-11-21-PHIUpdate.llx llvm-svn: 10163	2003-11-22 02:20:36 +00:00
Chris Lattner	4cc2cc5c58	Do not crash when deleting a region with a dead invoke instruction llvm-svn: 10161	2003-11-22 02:13:08 +00:00
Chris Lattner	1ad805977d	Finegrainify namespacification The module stripping pass should not strip symbols on external globals llvm-svn: 10157	2003-11-22 01:29:35 +00:00
Chris Lattner	61b3f20bf1	Considering that CI is not even IN SCOPE here, I wooda thought the compiler would have caught this. sigh llvm-svn: 10142	2003-11-21 21:57:29 +00:00
Chris Lattner	f52e03c79e	Finegrainify namespacification llvm-svn: 10138	2003-11-21 21:54:22 +00:00
Chris Lattner	456031eed7	Get rid of using decls, finegrainify namespacification llvm-svn: 10137	2003-11-21 21:52:10 +00:00
Chris Lattner	51c28a5c1b	* Finegrainify namespacification * Make the cost metric for passing constants in as arguments to functions MUCH more accurate, by actually estimating the amount of code that will be constant propagated away. llvm-svn: 10136	2003-11-21 21:46:09 +00:00
Chris Lattner	a82f131abb	Finegrainify namespacification Print out the costs for functions that AREN'T inlined as well llvm-svn: 10135	2003-11-21 21:45:31 +00:00
Chris Lattner	a29600046d	Minor cleanups and simplifications llvm-svn: 10127	2003-11-21 16:52:05 +00:00
Chris Lattner	8791e26de1	* Finegrainify namespacification * Implement FuncResolve/2003-11-20-BogusResolveWarning.ll ... which eliminates a large number of annoying warnings. I know misha will miss them though! llvm-svn: 10123	2003-11-20 21:21:31 +00:00
Chris Lattner	2af517281d	Start using the nicer terminator auto-insertion API llvm-svn: 10111	2003-11-20 18:25:24 +00:00
Chris Lattner	63a0ccff44	Spew symbolic types! llvm-svn: 10110	2003-11-20 18:23:14 +00:00
Chris Lattner	18e5d5228a	When spewing out warnings during function resolution, do not vomit out pages and pages of non-symbolic types. llvm-svn: 10109	2003-11-20 18:19:35 +00:00
Misha Brukman	4f7ce560d5	This file was somehow missing a top-level comment line. llvm-svn: 10055	2003-11-17 19:35:17 +00:00
Chris Lattner	841dd53555	Fix PR116 llvm-svn: 10032	2003-11-16 21:39:27 +00:00
Chris Lattner	d76fe4ea7d	Implement feature: InstCombine/2003-11-13-ConstExprCastCall.ll llvm-svn: 9981	2003-11-13 19:17:02 +00:00
Brian Gaeke	960707c335	Put all LLVM code into the llvm namespace, as per bug 109. llvm-svn: 9903	2003-11-11 22:41:34 +00:00
Chris Lattner	1e6d3053f2	Reorganize code for locality, improve comments llvm-svn: 9857	2003-11-10 04:42:42 +00:00
Chris Lattner	4474336166	Adjust to new critical edge interface llvm-svn: 9853	2003-11-10 04:10:50 +00:00
Chris Lattner	984e11792f	Do NOT inline self recursive calls into other functions. This is causing the pool allocator no end of trouble, and doesn't make a lot of sense anyway. This does not solve the problem with mutually recursive functions, but they are much less common. llvm-svn: 9828	2003-11-09 05:05:36 +00:00
Chris Lattner	d61abe82d3	Untypo llvm-svn: 9827	2003-11-09 05:04:25 +00:00
Misha Brukman	ad03afcb34	Declare FunctionPasses as such so that they can be used in FunctionPassManager. llvm-svn: 9768	2003-11-07 17:20:18 +00:00
Chris Lattner	38cd27e450	Various cleanups and efficiency improvements llvm-svn: 9753	2003-11-06 19:46:29 +00:00
Chris Lattner	b0a4b49b23	Fix bug: PR93 llvm-svn: 9752	2003-11-06 19:18:49 +00:00
Chris Lattner	4e1b467594	Fix the problem with running cleanups in bugpoint: We were deleting arguments of intrinsic functions, causing the verifier to fail. llvm-svn: 9745	2003-11-05 21:53:41 +00:00
Chris Lattner	9e60aced2e	Split behavior into two pieces llvm-svn: 9741	2003-11-05 21:43:02 +00:00
Chris Lattner	8055fb3afa	Yet more fixes for constant expr shifts llvm-svn: 9739	2003-11-05 20:43:58 +00:00
Chris Lattner	ba55bd37fe	Further fixes for PR93 llvm-svn: 9738	2003-11-05 20:37:01 +00:00
Chris Lattner	7c94d1171a	Fix flawed logic that was breaking several SPEC benchmarks, including gzip and crafty. llvm-svn: 9731	2003-11-05 17:31:36 +00:00
Chris Lattner	813ec04735	Be gcc 3.4 clean llvm-svn: 9725	2003-11-05 06:12:18 +00:00
Chris Lattner	8f2f598024	Fix bug with previous implementation: - // ~(c-X) == X-(c-1) == X+(-c+1) + // ~(c-X) == X-c-1 == X+(-c-1) Implement: C - ~X == X + (1+C) llvm-svn: 9715	2003-11-05 01:06:05 +00:00
Chris Lattner	e580666532	Minor cleanup, plus implement InstCombine/xor.ll:test17 llvm-svn: 9711	2003-11-04 23:50:51 +00:00
Chris Lattner	0f68fa6569	Implement InstCombine/xor.ll:test(15|16) llvm-svn: 9708	2003-11-04 23:37:10 +00:00
John Criswell	81587e798a	Checking in Chris's suggestions: Added assert() to ensure symbol table is well formed. Added code to remember the value that was found; resolving types can change the symbol table and invalidate the value of the iterator. Added comments to the ResolveTypes() function (mainly for my own benefit). Please feel free to correct the comments if they are not accurate. llvm-svn: 9693	2003-11-04 15:22:26 +00:00
Chris Lattner	6444c37488	Implement InstCombine/cast-set.ll:test6[a]. This improves code generated for a hot function in em3d llvm-svn: 9673	2003-11-03 05:17:03 +00:00
Chris Lattner	1693079e92	Implement InstCombine/cast-set.ll: test1, test2, test7 llvm-svn: 9670	2003-11-03 04:25:02 +00:00
Chris Lattner	af7893203b	Fix bug with zero sized casts llvm-svn: 9667	2003-11-03 01:29:41 +00:00

4096 Commits