llvm-project

Commit Graph

Author	SHA1	Message	Date
Chris Lattner	e4ed42a426	Refactor some code into a function llvm-svn: 23603	2005-10-03 01:04:44 +00:00
Chris Lattner	360928dbed	This break is bogus and I have no idea why it was there. Basically it prevents memoizing code when IV's are used by phinodes outside of loops. In a simple example, we were getting this code before (note that r6 and r7 are isomorphic IV's): li r6, 0 or r7, r6, r6 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 or r2, r7, r7 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r2, r7, 1 addi r7, r7, 1 addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit Now we get: li r6, 0 LBB_test_3: ; no_exit or r2, r6, r6 lwz r6, 0(r3) cmpw cr0, r6, r5 beq cr0, LBB_test_6 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r2, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit this was noticed in em3d. llvm-svn: 23602	2005-10-03 00:37:33 +00:00
Chris Lattner	8fcce170cf	when checking if we should move a split edge block outside of a loop, check the presplit pred, not the post-split pred. This was causing us to make the wrong decision in some cases, leaving the critical edge block in the loop. llvm-svn: 23601	2005-10-03 00:31:52 +00:00
Jeff Cohen	f8a5e5ae6e	Fix VC++ warnings. llvm-svn: 23579	2005-10-01 03:57:14 +00:00
Chris Lattner	a554c9470b	Insert stores after phi nodes in the normal dest. This fixes LowerInvoke/2005-08-03-InvokeWithPHI.ll llvm-svn: 23525	2005-09-29 17:44:20 +00:00
Chris Lattner	87ef943a4c	Fold isascii into a simple comparison. This speeds up 197.parser by 7.4%, bringing the LLC time down to the CBE time. llvm-svn: 23521	2005-09-29 06:17:27 +00:00
Chris Lattner	5f6035feb0	remove a bunch of unneeded stuff, or self evident comments llvm-svn: 23519	2005-09-29 06:16:11 +00:00
Chris Lattner	c244e7c178	Implement a couple of memcmp folds from the todo list llvm-svn: 23517	2005-09-29 04:54:20 +00:00
Chris Lattner	ea7214b23d	Constant fold llvm.sqrt llvm-svn: 23487	2005-09-28 01:34:32 +00:00
Chris Lattner	3b63bb375c	add a note about a way to improve this code further, that I won't be getting to right now. llvm-svn: 23485	2005-09-27 22:44:59 +00:00
Chris Lattner	eb953f0ef8	Fix a regression in my previous patch, fixing GlobalOpt/2005-09-27-Crash.ll and PR632. llvm-svn: 23484	2005-09-27 22:28:11 +00:00
Chris Lattner	e285f5ed8f	Avoid spilling stack slots... to stack slots. llvm-svn: 23478	2005-09-27 21:33:12 +00:00
Chris Lattner	87eb249300	Completely rewrite 'correct' eh support. This changes how setjmp insertion is performed so it is only at most once per function that contains an invoke instead of once per invoke in the function. This patch has the following perks: 1. It fixes PR631, which complains about slowness. 2. If fixes PR240, which complains about non-volatile vars being live across setjmp/longjmps. 3. It improves (but does not fix) the jmpbuf alignment issue on itanium by not forcing the jmpbufs to always be 8-bytes off the alignment of the structure. 4. It speeds up 253.perlbmk from 338s to 13.70s (a 25x improvement!), making us now about 4% faster than GCC. Further improvements are also possible. llvm-svn: 23477	2005-09-27 21:18:17 +00:00
Chris Lattner	92233d2175	Make the pass name simpler llvm-svn: 23476	2005-09-27 21:10:32 +00:00
Chris Lattner	16cd356fb2	allow demotion to volatile values, add support for invoke llvm-svn: 23473	2005-09-27 19:39:00 +00:00
Chris Lattner	3d27e7f27f	Add support for external calls that we know how to constant fold. This implements ctor-list-opt.ll:CTOR8 llvm-svn: 23465	2005-09-27 05:02:43 +00:00
Chris Lattner	29b2780c8a	Fix a bug where we would evaluate stores into linkonce objects which could be potentially replaced at link-time. llvm-svn: 23463	2005-09-27 04:50:03 +00:00
Chris Lattner	65a3a0918f	Implement support for static constructors with calls in them. This is useful because gccas runs globalopt before inlining. This implements ctor-list-opt.ll:CTOR7 llvm-svn: 23462	2005-09-27 04:45:34 +00:00
Chris Lattner	da1889b778	Refactor this code a bit, no functionality changes. llvm-svn: 23460	2005-09-27 04:27:01 +00:00
Chris Lattner	f2f89af69a	Remove some dead code. ctor evaluation subsumes empty ctor elim llvm-svn: 23453	2005-09-26 20:38:20 +00:00
Chris Lattner	6bf2cd5735	Add support for alloca, implementing ctor-list-opt.ll:CTOR6 llvm-svn: 23452	2005-09-26 17:07:09 +00:00
Chris Lattner	46d9ff081d	Add a debug printout, fix a crash on kc++ llvm-svn: 23450	2005-09-26 07:34:35 +00:00
Chris Lattner	46af55e0e4	Implement loads/stores through GEP's of globals. This implements ctor-list-opt.ll:CTOR5. llvm-svn: 23449	2005-09-26 06:52:44 +00:00
Chris Lattner	61ff32cd70	Replace TraverseGEPInitializer with ConstantFoldLoadThroughGEPConstantExpr llvm-svn: 23447	2005-09-26 05:34:07 +00:00
Chris Lattner	02ae21e1e0	Eliminate GetGEPGlobalInitializer in favor of the more powerful ConstantFoldLoadThroughGEPConstantExpr function in the utils lib. llvm-svn: 23446	2005-09-26 05:28:52 +00:00
Chris Lattner	0b011ec8e2	Factor the GetGEPGlobalInitializer out of this pass and into Transforms/Utils as ConstantFoldLoadThroughGEPConstantExpr. llvm-svn: 23445	2005-09-26 05:28:06 +00:00
Chris Lattner	c13c7b9376	Move the ConstantFoldLoadThroughGEPConstantExpr function out of the InstCombine pass. llvm-svn: 23444	2005-09-26 05:27:10 +00:00
Chris Lattner	b009663e27	add a comment llvm-svn: 23442	2005-09-26 05:16:34 +00:00
Chris Lattner	4b05c322d5	Add support for getelementptr, load, and correctly reject volatile stores. llvm-svn: 23441	2005-09-26 05:15:37 +00:00
Chris Lattner	3e9ea5ffec	Add support for br/brcond/switch and phi llvm-svn: 23439	2005-09-26 04:57:38 +00:00
Chris Lattner	99e23fa74c	Add a simple interpreter to this code, allowing us to statically evaluate global ctors that are simple enough. This implements ctor-list-opt.ll:CTOR2. llvm-svn: 23437	2005-09-26 04:44:35 +00:00
Chris Lattner	696beefabb	factor some code into a InstallGlobalCtors method, add comments. No functionality change. llvm-svn: 23435	2005-09-26 02:31:18 +00:00
Chris Lattner	838bdc1836	Make the global opt optimizer work on modules with a null terminator, by accepting the null even with a non-65535 init prio llvm-svn: 23434	2005-09-26 02:19:27 +00:00
Chris Lattner	41b6a5a693	Factor this code out into a few methods. Implement the start of global ctor optimization. It is currently smart enough to remove the global ctor for cases like this: struct foo { foo() {} } x; ... saving a bit of startup time for the program. llvm-svn: 23433	2005-09-26 01:43:45 +00:00
Chris Lattner	f487768062	Fix some logic I broke that caused a regression on SimplifyLibCalls/2005-05-20-sprintf-crash.ll llvm-svn: 23430	2005-09-25 07:06:48 +00:00
Chris Lattner	0b3557f54a	Move MaskedValueIsZero up. Match a bunch of idioms for sign extensions, implementing InstCombine/signext.ll llvm-svn: 23428	2005-09-24 23:43:33 +00:00
Chris Lattner	175463a165	Simplify this code a bit by relying on recursive simplification. Support sprintf("%s", P)'s that have uses. s/hasNUses(0)/use_empty()/ llvm-svn: 23425	2005-09-24 22:17:06 +00:00
Chris Lattner	499e33646e	remove some debugging code llvm-svn: 23411	2005-09-23 18:49:09 +00:00
Chris Lattner	c59a371d45	Fold two consequtive branches that share a common destination between them. This implements SimplifyCFG/branch-fold.ll, and is useful on ?:/min/max heavy code llvm-svn: 23410	2005-09-23 18:47:20 +00:00
Chris Lattner	3a978bf66d	simplify some logic further llvm-svn: 23408	2005-09-23 07:23:18 +00:00
Chris Lattner	cc14ebc17b	pull a bunch of logic out of SimplifyCFG into a helper fn llvm-svn: 23407	2005-09-23 06:39:30 +00:00
Chris Lattner	6c70106053	Start threading across blocks with code in them, so long as the code does not define a value that is used outside of it's block. This catches many more simplifications, e.g. 854 in 176.gcc, 137 in vpr, etc. This implements branch-phi-thread.ll:test3.ll llvm-svn: 23397	2005-09-20 01:48:40 +00:00
Chris Lattner	f0bd8d0107	Implement merging of blocks with the same condition if the block has multiple predecessors. This implements branch-phi-thread.ll::test1 llvm-svn: 23395	2005-09-20 00:43:16 +00:00
Chris Lattner	049cb4482f	Reject a case we don't handle yet llvm-svn: 23393	2005-09-19 23:57:04 +00:00
Chris Lattner	a160924d57	remove debugging code :-/ llvm-svn: 23392	2005-09-19 23:50:15 +00:00
Chris Lattner	748f903046	Implement SimplifyCFG/branch-phi-thread.ll, the most trivial case of threading control across branches with determined outcomes. More generality to follow. This triggers a couple thousand times in specint. llvm-svn: 23391	2005-09-19 23:49:37 +00:00
Chris Lattner	b4b2530a1a	Refactor this code a bit and make it more general. This now compiles: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus2 (unsigned int x) { b.j += x; } To: _plus2: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) slwi r3, r3, 6 add r3, r4, r3 rlwimi r3, r4, 0, 26, 14 stw r3, 0(r2) blr instead of: _plus2: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) rlwinm r5, r4, 26, 21, 31 add r3, r5, r3 rlwimi r4, r3, 6, 15, 25 stw r4, 0(r2) blr by eliminating an 'and'. I'm pretty sure this is as small as we can go :) llvm-svn: 23386	2005-09-18 07:22:02 +00:00
Chris Lattner	797dee7705	Compile struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus2 (unsigned int x) { b.j += x; } to: plus2: mov %EAX, DWORD PTR [b] mov %ECX, %EAX and %ECX, 131008 mov %EDX, DWORD PTR [%ESP + 4] shl %EDX, 6 add %EDX, %ECX and %EDX, 131008 and %EAX, -131009 or %EDX, %EAX mov DWORD PTR [b], %EDX ret instead of: plus2: mov %EAX, DWORD PTR [b] mov %ECX, %EAX shr %ECX, 6 and %ECX, 2047 add %ECX, DWORD PTR [%ESP + 4] shl %ECX, 6 and %ECX, 131008 and %EAX, -131009 or %ECX, %EAX mov DWORD PTR [b], %ECX ret llvm-svn: 23385	2005-09-18 06:30:59 +00:00
Chris Lattner	01f56c68e9	Generalize this transform, using MaskedValueIsZero, allowing us to compile: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus3 (unsigned int x) { b.k += x; } To: plus3: mov %EAX, DWORD PTR [%ESP + 4] shl %EAX, 17 add DWORD PTR [b], %EAX ret instead of: plus3: mov %EAX, DWORD PTR [%ESP + 4] shl %EAX, 17 mov %ECX, DWORD PTR [b] add %EAX, %ECX and %EAX, -131072 and %ECX, 131071 or %ECX, %EAX mov DWORD PTR [b], %ECX ret llvm-svn: 23384	2005-09-18 06:02:59 +00:00
Chris Lattner	4ebc8ab4e0	fix typeo llvm-svn: 23383	2005-09-18 05:25:20 +00:00
Chris Lattner	e5b23a6d67	Remove unintentionally committed code llvm-svn: 23382	2005-09-18 05:12:51 +00:00
Chris Lattner	27cb9dbd35	implement shift.ll:test25. This compiles: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus3 (unsigned int x) { b.k += x; } to: _plus3: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r3, 0(r2) rlwinm r4, r3, 0, 0, 14 add r4, r4, r3 rlwimi r4, r3, 0, 15, 31 stw r4, 0(r2) blr instead of: _plus3: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) srwi r5, r4, 17 add r3, r5, r3 slwi r3, r3, 17 rlwimi r3, r4, 0, 15, 31 stw r3, 0(r2) blr llvm-svn: 23381	2005-09-18 05:12:10 +00:00
Chris Lattner	af517574ce	Implement add.ll:test29. Codegening: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus1 (unsigned int x) { b.i += x; } as: _plus1: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) add r3, r4, r3 rlwimi r3, r4, 0, 0, 25 stw r3, 0(r2) blr instead of: _plus1: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) rlwinm r5, r4, 0, 26, 31 add r3, r5, r3 rlwimi r3, r4, 0, 0, 25 stw r3, 0(r2) blr llvm-svn: 23379	2005-09-18 04:24:45 +00:00
Chris Lattner	027eaf01cf	remove debug output llvm-svn: 23377	2005-09-18 03:50:25 +00:00
Chris Lattner	1521298993	Implement or.ll:test21. This teaches instcombine to be able to turn this: struct { unsigned int bit0:1; unsigned int ubyte:31; } sdata; void foo() { sdata.ubyte++; } into this: foo: add DWORD PTR [sdata], 2 ret instead of this: foo: mov %EAX, DWORD PTR [sdata] mov %ECX, %EAX add %ECX, 2 and %ECX, -2 and %EAX, 1 or %EAX, %ECX mov DWORD PTR [sdata], %EAX ret llvm-svn: 23376	2005-09-18 03:42:07 +00:00
Chris Lattner	a393e4d4b3	Fix the regression last night compiling povray llvm-svn: 23348	2005-09-14 17:32:56 +00:00
Chris Lattner	2a8932960d	Add a simple xform to simplify array accesses with casts in the way. This is useful for 178.galgel where resolution of dope vectors (by the optimizer) causes the scales to become apparent. llvm-svn: 23328	2005-09-13 18:36:04 +00:00
Chris Lattner	fd018c8dfe	Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI. This fixes up a dot-product loop in galgel, speeding it up from 18.47s to 16.13s. llvm-svn: 23327	2005-09-13 02:09:55 +00:00
Chris Lattner	567b81f0d2	Add a helper function, allowing us to simplify some code a bit, changing indentation, no functionality change llvm-svn: 23325	2005-09-13 00:40:14 +00:00
Chris Lattner	219175c84d	Implement a simple xform to turn code like this: if () { store A -> P; } else { store B -> P; } into a PHI node with one store, in the most trival case. This implements load.ll:test10. llvm-svn: 23324	2005-09-12 23:23:25 +00:00
Chris Lattner	e0bfdf1485	Another load-peephole optimization: do gcse when two loads are next to each other. This implements InstCombine/load.ll:test9 llvm-svn: 23322	2005-09-12 22:21:03 +00:00
Chris Lattner	b990f7d8ed	Implement a trivial form of store->load forwarding where the store and the load are exactly consequtive. This is picked up by other passes, but this triggers thousands of times in fortran programs that use static locals (and is thus a compile-time speedup). llvm-svn: 23320	2005-09-12 22:00:15 +00:00
Chris Lattner	8048b85e8f	Fix a regression from last night, which caused this pass to create invalid code for IV uses outside of loops that are not dominated by the latch block. We should only convert these uses to use the post-inc value if they ARE dominated by the latch block. Also use a new LoopInfo method to simplify some code. This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll llvm-svn: 23318	2005-09-12 17:11:27 +00:00
Chris Lattner	a67648396a	_test: li r2, 0 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r2, 1 stw r2, 0(r4) blr [zion ~/llvm]$ cat > ~/xx Uses of IV's outside of the loop should use hte post-incremented version of the IV, not the preincremented version. This helps many loops (e.g. in sixtrack) which used to generate code like this (this is the code from the dont-hoist-simple-loop-constants.ll testcase): _test: li r2, 0 ** IV starts at 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 Copy for loop exit li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 IV+2 cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 IV+2 stw r2, 0(r4) blr And now generated code like this: _test: li r2, 1 * IV starts at 1 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 * IV.postinc + 0 blt cr0, LBB_test_1 LBB_test_2: ; loopexit.2.loopexit stw r2, 0(r4) * IV.postinc + 0 blr llvm-svn: 23313	2005-09-12 06:04:47 +00:00
Chris Lattner	530fe6ab30	implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll. We used to emit this code for it: _test: li r2, 1 ;; Value tying up a register for the whole loop li r5, 0 LBB_test_1: ; no_exit.2 or r6, r5, r5 li r5, 0 stw r5, 0(r3) addi r5, r6, 1 addi r3, r3, 4 add r7, r2, r5 ;; should be addi r7, r5, 1 cmpwi cr0, r7, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r6, 2 stw r2, 0(r4) blr now we emit this: _test: li r2, 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 ;; whoa, fold those adds! cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 stw r2, 0(r4) blr more improvement coming. llvm-svn: 23306	2005-09-10 01:18:45 +00:00
Chris Lattner	b5e381a8cf	Fix a problem that Dan Berlin noticed, where reassociation would not succeed in building maximal expressions before simplifying them. In particular, i cases like this: X-(A+B+X) the code would consider A+B+X to be a maximal expression (not understanding that the single use '-' would be turned into a + later), simplify it (a noop) then later get simplified again. Each of these simplify steps is where the cost of reassociation comes from, so this patch should speed up the already fast pass a bit. Thanks to Dan for noticing this! llvm-svn: 23214	2005-09-02 07:07:58 +00:00
Chris Lattner	9fe263aa75	Avoid creating garbage instructions, just move the old add instruction to where we need it when converting -(A+B+C) -> -A + -B + -C. llvm-svn: 23213	2005-09-02 06:38:04 +00:00
Chris Lattner	d1325da091	add some assertions and fix problems where reassociate could access the Ops vector out of range llvm-svn: 23211	2005-09-02 05:23:22 +00:00
Chris Lattner	8ca5b2a6d2	Fix Regression/Transforms/Reassociate/2005-08-24-Crash.ll llvm-svn: 23019	2005-08-24 17:55:32 +00:00
Chris Lattner	4201cd1bbc	Transform floor((double)FLT) -> (double)floorf(FLT), implementing Regression/Transforms/SimplifyLibCalls/floor.ll. This triggers 19 times in 177.mesa. llvm-svn: 23017	2005-08-24 17:22:17 +00:00
Chris Lattner	ea7dfd53d6	Fix Transforms/LoopStrengthReduce/2005-08-17-OutOfLoopVariant.ll, a crash on 177.mesa llvm-svn: 22843	2005-08-17 21:22:41 +00:00
Chris Lattner	2bf7cb5213	Use a new helper to split critical edges, making the code simpler. Do not claim to not change the CFG. We do change the cfg to split critical edges. This isn't causing us a problem now, but could likely do so in the future. llvm-svn: 22824	2005-08-17 06:35:16 +00:00
Chris Lattner	5cf983ee0f	Fix a bad case in gzip where we put lots of things in registers across the loop, because a IV-dependent value was used outside of the loop and didn't have immediate-folding capability llvm-svn: 22798	2005-08-16 00:38:11 +00:00
Chris Lattner	47d3ec3525	Ooops, don't forget to clear this. The real inner loop is now: .LBB_foo_3: ; no_exit.1 lfd f2, 0(r9) lfd f3, 8(r9) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r9) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfd f2, 0(r9) addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22782	2005-08-13 07:42:01 +00:00
Chris Lattner	5949d49032	Recursively scan scev expressions for common subexpressions. This allows us to handle nested loops much better, for example, by being able to tell that these two expressions: {( 8 + ( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp 12)}<loopentry.1> {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> Have the following common part that can be shared: {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> This allows us to codegen an important inner loop in 168.wupwise as: .LBB_foo_4: ; no_exit.1 lfd f2, 16(r9) fmul f3, f0, f2 fmul f2, f1, f2 fadd f4, f3, f2 stfd f4, 8(r9) fsub f2, f3, f2 stfd f2, 16(r9) addi r8, r8, 1 addi r9, r9, 16 cmpw cr0, r8, r4 ble .LBB_foo_4 ; no_exit.1 instead of: .LBB_foo_3: ; no_exit.1 lfdx f2, r6, r9 add r10, r6, r9 lfd f3, 8(r10) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r10) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfdx f2, r6, r9 addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22781	2005-08-13 07:27:18 +00:00
Chris Lattner	89c1dfc733	Teach SplitCriticalEdge to update LoopInfo if it is alive. This fixes a problem in LoopStrengthReduction, where it would split critical edges then confused itself with outdated loop information. llvm-svn: 22776	2005-08-13 01:38:43 +00:00
Chris Lattner	79396539d3	remove dead code. The exit block list is computed on demand, thus does not need to be updated. This code is a relic from when it did. llvm-svn: 22775	2005-08-13 01:30:36 +00:00
Chris Lattner	8447b49526	When splitting critical edges, make sure not to leave the new block in the middle of the loop. This turns a critical loop in gzip into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 bne .LBB_test_8 ; loopentry.loopexit_crit_edge .LBB_test_2: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 bne .LBB_test_7 ; shortcirc_next.0.loopexit_crit_edge .LBB_test_3: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 bne .LBB_test_6 ; shortcirc_next.1.loopexit_crit_edge .LBB_test_4: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry instead of this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_3: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 beq .LBB_test_5 ; shortcirc_next.1 .LBB_test_4: ; shortcirc_next.0.loopexit_crit_edge add r2, r11, r27 add r8, r12, r27 b .LBB_test_9 ; loopexit .LBB_test_5: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 beq .LBB_test_7 ; shortcirc_next.2 .LBB_test_6: ; shortcirc_next.1.loopexit_crit_edge add r2, r9, r27 add r8, r10, r27 b .LBB_test_9 ; loopexit .LBB_test_7: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry Next up, improve the code for the loop. llvm-svn: 22769	2005-08-12 22:22:17 +00:00
Chris Lattner	4fec86d348	Fix a FIXME: if we are inserting code for a PHI argument, split the critical edge so that the code is not always executed for both operands. This prevents LSR from inserting code into loops whose exit blocks contain PHI uses of IV expressions (which are outside of loops). On gzip, for example, we turn this ugly code: .LBB_test_1: ; loopentry add r27, r3, r28 lhz r27, 3(r27) add r26, r4, r28 lhz r26, 3(r26) add r25, r30, r28 ;; Only live if exiting the loop add r24, r29, r28 ;; Only live if exiting the loop cmpw cr0, r27, r26 bne .LBB_test_5 ; loopexit into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_2: ; shortcirc_next.0 ... blt .LBB_test_1 into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_t_3: ; shortcirc_next.0 .LBB_test_3: ; shortcirc_next.0 ... blt .LBB_test_1 Next step: get the block out of the loop so that the loop is all fall-throughs again. llvm-svn: 22766	2005-08-12 22:06:11 +00:00
Chris Lattner	b7ebe65c56	Change break critical edges to not remove, then insert, PHI node entries. Instead, just update the BB in-place. This is both faster, and it prevents split-critical-edges from shuffling the PHI argument list unneccesarily. llvm-svn: 22765	2005-08-12 21:58:07 +00:00
Chris Lattner	62df798919	remove some trickiness that broke yacr2 and some other programs last night llvm-svn: 22751	2005-08-10 17:15:20 +00:00
Chris Lattner	f83ce5faee	Make loop-simplify produce better loops by turning PHI nodes like X = phi [X, Y] into just Y. This often occurs when it seperates loops that have collapsed loop headers. This implements LoopSimplify/phi-node-simplify.ll llvm-svn: 22746	2005-08-10 02:07:32 +00:00
Chris Lattner	677d85784a	Allow indvar simplify to canonicalize ANY affine IV, not just affine IVs with constant stride. This implements Transforms/IndVarsSimplify/variable-stride-ivs.ll llvm-svn: 22744	2005-08-10 01:12:06 +00:00
Chris Lattner	edff91a49a	Teach LSR to strength reduce IVs that have a loop-invariant but non-constant stride. For code like this: void foo(float a, float b, int n, int stride_a, int stride_b) { int i; for (i=0; i<n; i++) a[istride_a] = b[istride_b]; } we now emit: .LBB_foo2_2: ; no_exit lfs f0, 0(r4) stfs f0, 0(r3) addi r7, r7, 1 add r4, r2, r4 add r3, r6, r3 cmpw cr0, r7, r5 blt .LBB_foo2_2 ; no_exit instead of: .LBB_foo_2: ; no_exit mullw r8, r2, r7 ;; multiply! slwi r8, r8, 2 lfsx f0, r4, r8 mullw r8, r2, r6 ;; multiply! slwi r8, r8, 2 stfsx f0, r3, r8 addi r2, r2, 1 cmpw cr0, r2, r5 blt .LBB_foo_2 ; no_exit loops with variable strides occur pretty often. For example, in SPECFP2K there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp, 56 in 168.wupwise, 36 in 172.mgrid. Now we can allow indvars to turn functions written like this: void foo2(float a, float b, int n, int stride_a, int stride_b) { int i, ai = 0, bi = 0; for (i=0; i<n; i++) { a[ai] = b[bi]; ai += stride_a; bi += stride_b; } } into code like the above for better analysis. With this patch, they generate identical code. llvm-svn: 22740	2005-08-10 00:45:21 +00:00
Chris Lattner	dde7dc525e	Fix Regression/Transforms/LoopStrengthReduce/phi_node_update_multiple_preds.ll by being more careful about updating PHI nodes llvm-svn: 22739	2005-08-10 00:35:32 +00:00
Chris Lattner	c6c4d99a21	Fix some 80 column violations. Once we compute the evolution for a GEP, tell SE about it. This allows users of the GEP to know it, if the users are not direct. This allows us to compile this testcase: void fbSolidFillmmx(int w, unsigned char d) { while (w >= 64) { (unsigned long long ) (d + 0) = 0; (unsigned long long ) (d + 8) = 0; (unsigned long long ) (d + 16) = 0; (unsigned long long ) (d + 24) = 0; (unsigned long long ) (d + 32) = 0; (unsigned long long ) (d + 40) = 0; (unsigned long long ) (d + 48) = 0; (unsigned long long *) (d + 56) = 0; w -= 64; d += 64; } } into: .LBB_fbSolidFillmmx_2: ; no_exit li r2, 0 stw r2, 0(r4) stw r2, 4(r4) stw r2, 8(r4) stw r2, 12(r4) stw r2, 16(r4) stw r2, 20(r4) stw r2, 24(r4) stw r2, 28(r4) stw r2, 32(r4) stw r2, 36(r4) stw r2, 40(r4) stw r2, 44(r4) stw r2, 48(r4) stw r2, 52(r4) stw r2, 56(r4) stw r2, 60(r4) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit instead of: .LBB_fbSolidFillmmx_2: ; no_exit li r11, 0 stw r11, 0(r4) stw r11, 4(r4) stwx r11, r10, r4 add r12, r10, r4 stw r11, 4(r12) stwx r11, r9, r4 add r12, r9, r4 stw r11, 4(r12) stwx r11, r8, r4 add r12, r8, r4 stw r11, 4(r12) stwx r11, r7, r4 add r12, r7, r4 stw r11, 4(r12) stwx r11, r6, r4 add r12, r6, r4 stw r11, 4(r12) stwx r11, r5, r4 add r12, r5, r4 stw r11, 4(r12) stwx r11, r2, r4 add r12, r2, r4 stw r11, 4(r12) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit llvm-svn: 22737	2005-08-09 23:39:36 +00:00
Chris Lattner	02742710f3	SCEVAddExpr::get() of an empty list is invalid. llvm-svn: 22724	2005-08-09 01:13:47 +00:00
Chris Lattner	a091ff1764	Implement: LoopStrengthReduce/share_ivs.ll Two changes: * Only insert one PHI node for each stride. Other values are live in values. This cannot introduce higher register pressure than the previous approach, and can take advantage of reg+reg addressing modes. * Factor common base values out of uses before moving values from the base to the immediate fields. This improves codegen by starting the stride-specific PHI node out at a common place for each IV use. As an example, we used to generate this for a loop in swim: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfd f0, 0(r8) stfd f0, 0(r3) lfd f0, 0(r6) stfd f0, 0(r7) lfd f0, 0(r2) stfd f0, 0(r5) addi r9, r9, 1 addi r2, r2, 8 addi r5, r5, 8 addi r6, r6, 8 addi r7, r7, 8 addi r8, r8, 8 addi r3, r3, 8 cmpw cr0, r9, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 now we emit: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfdx f0, r8, r2 stfdx f0, r9, r2 lfdx f0, r5, r2 stfdx f0, r7, r2 lfdx f0, r3, r2 stfdx f0, r6, r2 addi r10, r10, 1 addi r2, r2, 8 cmpw cr0, r10, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 As another more dramatic example, we used to emit this: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfd f0, 8(r21) lfd f4, 8(r3) lfd f5, 8(r27) lfd f6, 8(r22) lfd f7, 8(r5) lfd f8, 8(r6) lfd f9, 8(r30) lfd f10, 8(r11) lfd f11, 8(r12) fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfd f0, 8(r4) lfd f0, 8(r25) lfd f5, 8(r26) lfd f6, 8(r23) lfd f9, 8(r28) lfd f10, 8(r10) lfd f12, 8(r9) lfd f13, 8(r29) fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfd f0, 8(r24) lfd f0, 8(r8) fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfd f0, 8(r2) addi r20, r20, 1 addi r2, r2, 8 addi r8, r8, 8 addi r10, r10, 8 addi r12, r12, 8 addi r6, r6, 8 addi r29, r29, 8 addi r28, r28, 8 addi r26, r26, 8 addi r25, r25, 8 addi r24, r24, 8 addi r5, r5, 8 addi r23, r23, 8 addi r22, r22, 8 addi r3, r3, 8 addi r9, r9, 8 addi r11, r11, 8 addi r30, r30, 8 addi r27, r27, 8 addi r21, r21, 8 addi r4, r4, 8 cmpw cr0, r20, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 we now emit: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfdx f0, r21, r20 lfdx f4, r3, r20 lfdx f5, r27, r20 lfdx f6, r22, r20 lfdx f7, r5, r20 lfdx f8, r6, r20 lfdx f9, r30, r20 lfdx f10, r11, r20 lfdx f11, r12, r20 fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfdx f0, r4, r20 lfdx f0, r25, r20 lfdx f5, r26, r20 lfdx f6, r23, r20 lfdx f9, r28, r20 lfdx f10, r10, r20 lfdx f12, r9, r20 lfdx f13, r29, r20 fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfdx f0, r24, r20 lfdx f0, r8, r20 fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfdx f0, r2, r20 addi r19, r19, 1 addi r20, r20, 8 cmpw cr0, r19, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 llvm-svn: 22722	2005-08-09 00:18:09 +00:00
Chris Lattner	37c24cc98c	Suck the base value out of the UsersToProcess vector into the BasedUser class to simplify the code. Fuse two loops. llvm-svn: 22721	2005-08-08 22:56:21 +00:00
Chris Lattner	37ed895bf1	Split MoveLoopVariantsToImediateField out from MoveImmediateValues. The first is a correctness thing, and the later is an optzn thing. This also is needed to support a future change. llvm-svn: 22720	2005-08-08 22:32:34 +00:00
Chris Lattner	9f269e40c9	Use the new 'moveBefore' method to simplify some code. Really, which is easier to understand? :) llvm-svn: 22706	2005-08-08 19:11:57 +00:00
Chris Lattner	14203e85b2	Not all constants are legal immediates in load/store instructions. llvm-svn: 22704	2005-08-08 06:25:50 +00:00
Chris Lattner	c70bbc0c41	Implement LoopStrengthReduce/share_code_in_preheader.ll by having one rewriter for all code inserted into the preheader, which is never flushed. llvm-svn: 22702	2005-08-08 05:47:49 +00:00
Chris Lattner	9bfa6f8784	Implement a simple optimization for the termination condition of the loop. The termination condition actually wants to use the post-incremented value of the loop, not a new indvar with an unusual base. On PPC, for example, this allows us to compile LoopStrengthReduce/exit_compare_live_range.ll to: _foo: li r2, 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r2, r2, 1 cmpw cr0, r2, r4 bne .LBB_foo_1 ; no_exit blr instead of: _foo: li r2, 1 ;; IV starts at 1, not 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r5, r2, 1 cmpw cr0, r2, r4 or r2, r5, r5 ;; Reg-reg copy, extra live range bne .LBB_foo_1 ; no_exit blr This implements LoopStrengthReduce/exit_compare_live_range.ll llvm-svn: 22699	2005-08-08 05:28:22 +00:00
Chris Lattner	579b20b747	All stats are "Number of ..." llvm-svn: 22694	2005-08-07 20:02:04 +00:00
Chris Lattner	2c14cf7b74	Add some simple folds that occur in bitfield cases. Fix a minor bug in isHighOnes, where it would consider 0 to have high ones. llvm-svn: 22693	2005-08-07 07:03:10 +00:00
Chris Lattner	134ebd0801	Fix typoCVS: ---------------------------------------------------------------------- llvm-svn: 22692	2005-08-07 07:00:52 +00:00
Chris Lattner	f4dd8c445c	* Use the new PHINode::hasConstantValue method to simplify some code * Teach this code to move allocas out of the loop when tail call eliminating a call marked 'tail'. This implements TailCallElim/move_alloca_for_tail_call.ll * Do not perform this transformation if a call is marked 'tail' and if there are allocas that we cannot move out of the loop in #2. Doing so would increase the stack usage of the function. This implements fixes PR615 and TailCallElim/dont-tce-tail-marked-call.ll. llvm-svn: 22690	2005-08-07 04:27:41 +00:00
Chris Lattner	11e7a5eda7	Make sure to clean CastedPointers after casts are potentially deleted. This fixes LSR crashes on 301.apsi, 191.fma3d, and 189.lucas llvm-svn: 22673	2005-08-05 01:30:11 +00:00
Chris Lattner	9f9c260b8c	now that hasConstantValue defaults to only returning values that dominate the PHI node, this ugly code can vanish. llvm-svn: 22672	2005-08-05 01:04:30 +00:00
Chris Lattner	257efb2ad3	This code can handle non-dominating instructions llvm-svn: 22667	2005-08-05 00:57:45 +00:00
Nate Begeman	b392321cae	Fix a fixme in CondPropagate.cpp by moving a PhiNode optimization into BasicBlock's removePredecessor routine. This requires shuffling around the definition and implementation of hasContantValue from Utils.h,cpp into Instructions.h,cpp llvm-svn: 22664	2005-08-04 23:24:19 +00:00
Chris Lattner	45f8b6e7aa	Modify how immediates are removed from base expressions to deal with the fact that the symbolic evaluator is not always able to use subtraction to remove expressions. This makes the code faster, and fixes the last crash on 178.galgel. Finally, add a statistic to see how many phi nodes are inserted. On 178.galgel, we get the follow stats: 2562 loop-reduce - Number of PHIs inserted 3927 loop-reduce - Number of GEPs strength reduced llvm-svn: 22662	2005-08-04 22:34:05 +00:00
Chris Lattner	a6d7c355bc	* Refactor some code into a new BasedUser::RewriteInstructionToUseNewBase method. * Fix a crash on 178.galgel, where we would insert expressions before PHI nodes instead of into the PHI node predecessor blocks. llvm-svn: 22657	2005-08-04 20:03:32 +00:00
Chris Lattner	0f7c0fa2a7	Fix a case that caused this to crash on 178.galgel llvm-svn: 22653	2005-08-04 19:26:19 +00:00
Chris Lattner	acc42c4df1	Teach LSR about loop-variant expressions, such as loops like this: for (i = 0; i < N; ++i) A[i][foo()] = 0; here we still want to strength reduce the A[i] part, even though foo() is l-v. This also simplifies some of the 'CanReduce' logic. This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll llvm-svn: 22652	2005-08-04 19:08:16 +00:00
Nate Begeman	456044b724	Remove some more dead code. llvm-svn: 22650	2005-08-04 18:13:56 +00:00
Chris Lattner	eaf24725b2	Refactor this code substantially with the following improvements: 1. We only analyze instructions once, guaranteed 2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with something much simpler. The next step is to handle expressions that are not all indvar+loop-invariant values (e.g. handling indvar+loopvariant). llvm-svn: 22649	2005-08-04 17:40:30 +00:00
Chris Lattner	6f286b760f	refactor some code llvm-svn: 22643	2005-08-04 01:19:13 +00:00
Chris Lattner	6510749050	invert to if's to make the logic simpler llvm-svn: 22641	2005-08-04 00:40:47 +00:00
Chris Lattner	a0102fbc4f	When processing outer loops and we find uses of an IV in inner loops, make sure to handle the use, just don't recurse into it. This permits us to generate this code for a simple nested loop case: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r29, 44(r1) stw r30, 40(r1) mflr r11 stw r11, 56(r1) lis r2, ha16(L_A$non_lazy_ptr) lwz r30, lo16(L_A$non_lazy_ptr)(r2) li r29, 1 .LBB_foo_1: ; no_exit.0 bl L_bar$stub li r2, 1 or r3, r30, r30 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r3) stfd f0, 0(r3) addi r4, r2, 1 addi r3, r3, 8 cmpwi cr0, r2, 100 or r2, r4, r4 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r30, r30, 800 addi r2, r29, 1 cmpwi cr0, r29, 100 or r29, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 40(r1) lwz r29, 44(r1) lwz r1, 0(r1) blr instead of this: _foo: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r28, 44(r1) ;; uses an extra register. stw r29, 40(r1) stw r30, 36(r1) mflr r11 stw r11, 56(r1) li r30, 1 li r29, 0 or r28, r29, r29 .LBB_foo_1: ; no_exit.0 bl L_bar$stub mulli r2, r28, 800 ;; unstrength-reduced multiply lis r3, ha16(L_A$non_lazy_ptr) ;; loop invariant address computation lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 mulli r4, r29, 800 ;; unstrength-reduced multiply addi r3, r3, 8 add r3, r4, r3 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 ;; multiple stride 8 IV's addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r28, r28, 1 ;;; Many IV's with stride 1 addi r29, r29, 1 addi r2, r30, 1 cmpwi cr0, r30, 100 or r30, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 36(r1) lwz r29, 40(r1) lwz r28, 44(r1) lwz r1, 0(r1) blr llvm-svn: 22640	2005-08-04 00:14:11 +00:00
Chris Lattner	fc62470466	Teach loop-reduce to see into nested loops, to pull out immediate values pushed down by SCEV. In a nested loop case, this allows us to emit this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 li r3, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r2) ;; Uses offset of 8 instead of 0 stfd f0, 0(r2) addi r4, r3, 1 addi r2, r2, 8 cmpwi cr0, r3, 100 or r3, r4, r4 bne .LBB_foo_2 ; no_exit.1 instead of this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 addi r3, r3, 8 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 llvm-svn: 22639	2005-08-03 23:44:42 +00:00
Chris Lattner	bb78c97e24	improve debug output llvm-svn: 22638	2005-08-03 23:30:08 +00:00
Chris Lattner	db23c74e5e	Move from Stage 0 to Stage 1. Only emit one PHI node for IV uses with identical bases and strides (after moving foldable immediates to the load/store instruction). This implements LoopStrengthReduce/dont_insert_redundant_ops.ll, allowing us to generate this PPC code for test1: or r30, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r30) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop instead of this code: or r30, r3, r3 or r29, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r29) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 ;; Two iv's with step of 8 addi r29, r29, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop llvm-svn: 22635	2005-08-03 22:51:21 +00:00
Chris Lattner	430d0022df	Rename IVUse to IVUsersOfOneStride, use a struct instead of a pair to unify some parallel vectors and get field names more descriptive than "first" and "second". This isn't lisp afterall :) llvm-svn: 22633	2005-08-03 22:21:05 +00:00
Chris Lattner	84e9baa925	Fix a nasty dangling pointer issue. The ScalarEvolution pass would keep a map from instruction* to SCEVHandles. When we delete instructions, we have to tell it about it. We would run into nasty cases where new instructions were reallocated at old instruction addresses and get the old map values. Bad bad bad :( llvm-svn: 22632	2005-08-03 21:36:09 +00:00
Chris Lattner	3de05cc930	The correct fix for PR612, which also fixes Transforms/LowerInvoke/2005-08-03-InvokeWithPHIUse.ll llvm-svn: 22628	2005-08-03 18:51:44 +00:00
Chris Lattner	f8a81a9886	When inserting code, make sure not to insert it before PHI nodes. This fixes PR612 and Transforms/LowerInvoke/2005-08-03-InvokeWithPHI.ll llvm-svn: 22626	2005-08-03 18:34:29 +00:00
Chris Lattner	d683bdd0f8	Fix Transforms/SimplifyCFG/2005-08-03-PHIFactorCrash.ll, a problem that occurred while bugpointing another testcase llvm-svn: 22621	2005-08-03 17:59:45 +00:00
Chris Lattner	2dbf1960ff	Finally, add the required constraint checks to fix Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll the right way llvm-svn: 22615	2005-08-03 00:59:12 +00:00
Chris Lattner	908036942c	Simplify some code, add the correct pred checks llvm-svn: 22613	2005-08-03 00:38:27 +00:00
Chris Lattner	982b75c061	Refactor code out of PropagatePredecessorsForPHIs, turning it into a pure function with no side-effects llvm-svn: 22612	2005-08-03 00:29:26 +00:00
Chris Lattner	1f047fd513	use splice instead of remove/insert to avoid some symtab operations llvm-svn: 22611	2005-08-03 00:23:42 +00:00
Chris Lattner	76dc204488	move two functions up in the file, use SafeToMergeTerminators to eliminate some duplicated code llvm-svn: 22610	2005-08-03 00:19:45 +00:00
Chris Lattner	733d6704ce	Rip some code out of the main SimplifyCFG function into a subfunction and call it from the only place it is live. No functionality changes. llvm-svn: 22609	2005-08-03 00:11:16 +00:00
Chris Lattner	ac594de8dc	Disable this patch: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050801/027345.html This breaks real programs and only fixes an obscure regression testcase. A real fix is in development. llvm-svn: 22606	2005-08-02 23:31:38 +00:00
Chris Lattner	eee90f7eb4	Change a place to use an arbitrary value instead of null, when possible llvm-svn: 22605	2005-08-02 23:29:23 +00:00
Chris Lattner	22d00a8e90	Update to use the new MathExtras.h support for log2 computation. Patch contributed by Jim Laskey! llvm-svn: 22592	2005-08-02 19:16:58 +00:00
Chris Lattner	351b891cbc	Like the comment says, do not insert cast instructions before phi nodes llvm-svn: 22586	2005-08-02 03:31:14 +00:00
Chris Lattner	4fd3e16cbd	This code was very close, but not quite right. It did not take into consideration the case where a reference in an unreachable block could occur. This fixes Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll, something I ran into while bugpoint'ing another pass. llvm-svn: 22584	2005-08-02 03:24:05 +00:00
Chris Lattner	75a44e154e	add a comment, make a check more lenient llvm-svn: 22581	2005-08-02 02:52:02 +00:00
Chris Lattner	dcce49e006	Simplify for loop, clear a per-loop map after processing each loop llvm-svn: 22580	2005-08-02 02:44:31 +00:00
Chris Lattner	9ef1294210	Add a comment Make LSR ignore GEP's that have loop variant base values, as we currently cannot codegen them llvm-svn: 22576	2005-08-02 01:32:29 +00:00
Chris Lattner	564900e5e5	Fix an iterator invalidation problem llvm-svn: 22575	2005-08-02 00:41:11 +00:00
Chris Lattner	e17c5d0e59	ConstantInt::get only works for arguments < 128. SimplifyLibCalls probably has to be audited to make sure it does not make this mistake elsewhere. Also, if this code knows that the type will be unsigned, obviously one arm of this is dead. Reid, can you take a look into this further? llvm-svn: 22566	2005-08-01 16:52:50 +00:00
Jeff Cohen	546fd5944e	Keep tabs and trailing spaces out. llvm-svn: 22565	2005-07-30 18:33:25 +00:00
Jeff Cohen	c500991055	Fix VC++ build problems. llvm-svn: 22564	2005-07-30 18:22:27 +00:00
Nate Begeman	17a0e2afea	Ack, typo llvm-svn: 22560	2005-07-30 00:21:31 +00:00
Nate Begeman	e68bcd1946	Commit a new LoopStrengthReduce pass that can use scalar evolutions and target data to decide which loop induction variables to strength reduce and how to do so. This work is mostly by Chris Lattner, with tweaks by me to get it working on some of MultiSource. llvm-svn: 22558	2005-07-30 00:15:07 +00:00
Nate Begeman	2bca4d9b7b	Break SCEVExpander out of IndVarSimplify into its own .h/.cpp file so that other passes may use it. llvm-svn: 22557	2005-07-30 00:12:19 +00:00
Jeff Cohen	5f4ef3c5a8	Eliminate all remaining tabs and trailing spaces. llvm-svn: 22523	2005-07-27 06:12:32 +00:00
Chris Lattner	31d0ac2414	ConvertibleToGEP always returns 0, remove some old crufty code which is actually dead because of this! llvm-svn: 22515	2005-07-26 16:38:28 +00:00
Chris Lattner	18aa4d8196	Do not let MaskedValueIsZero consider undef to be zero, for reasons explained in the comment. This fixes UnitTests/2003-09-18-BitFieldTest on darwin llvm-svn: 22483	2005-07-20 18:49:28 +00:00
Chris Lattner	247aef884c	When transforming &A[i] < &A[j] -> i < j, make sure to perform the comparison as a signed compare. This patch may fix PR597, but is correct in any case. llvm-svn: 22465	2005-07-18 23:07:33 +00:00
Chris Lattner	4ed40f7c6f	Fix a problem that instcombine would hit when dealing with unreachable code. Because the instcombine has to scan the entire function when it starts up to begin with, we might as well do it in DFO so we can nuke unreachable code. This fixes: Transforms/InstCombine/2005-07-07-DeadPHILoop.ll llvm-svn: 22348	2005-07-07 20:40:38 +00:00
Chris Lattner	937c71f2b3	Fix PR590 and Transforms/Mem2Reg/2005-06-30-ReadBeforeWrite.ll. The optimization for locally used allocas was not safe for allocas that were read before they were written. This change disables that optimization in that case. llvm-svn: 22318	2005-06-30 07:29:44 +00:00
John Criswell	810b4f8d55	Doh! Forgot to LLVMify the style. llvm-svn: 22312	2005-06-29 15:57:50 +00:00
John Criswell	4642afdcc1	Basic fix for PR#591; don't convert an fprintf() to an fwrite() if there is a mismatch in their character type pointers (i.e. fprintf() prints an array of ubytes while fwrite() takes an array of sbytes). We can probably do better than this (such as casting the ubyte to an sbyte). llvm-svn: 22310	2005-06-29 15:03:18 +00:00
Chris Lattner	9610c6f287	add a debug type llvm-svn: 22277	2005-06-24 16:00:46 +00:00
Andrew Lenharth	cf52eb2b99	prevent va_arg from being hoisted from a loop llvm-svn: 22265	2005-06-20 13:36:33 +00:00
Andrew Lenharth	d4b103107e	prevent DCE of vaarg intrinsics. This should take care of most regressions llvm-svn: 22263	2005-06-19 14:41:20 +00:00
Andrew Lenharth	9144ec4764	core changes for varargs llvm-svn: 22254	2005-06-18 18:34:52 +00:00
Reid Spencer	a7828baa3c	Fix a problem with the strcmp optimization checking the wrong string and not casting to the correct type. llvm-svn: 22250	2005-06-18 17:46:28 +00:00
Reid Spencer	4fdd96c4e0	Clean up some uninitialized variables and missing return statements that GCC 4.0.0 compiler (sometimes incorrectly) warns about under release build. llvm-svn: 22249	2005-06-18 17:37:34 +00:00
Chris Lattner	2ceb6ee576	This is not true: (X != 13 \| X < 15) -> X < 15 It is actually always true. This fixes PR586 and Transforms/InstCombine/2005-06-16-SetCCOrSetCCMiscompile.ll llvm-svn: 22236	2005-06-17 03:59:17 +00:00
Chris Lattner	73bcba5f61	Don't crash when dealing with INTMIN. This fixes PR585 and Transforms/InstCombine/2005-06-16-RangeCrash.ll llvm-svn: 22234	2005-06-17 02:05:55 +00:00
Chris Lattner	5e735294bf	Don't crash on: X = phi (X, X). This fixes PR584 and Transforms/SimplifyCFG/2005-06-16-PHICrash.ll llvm-svn: 22232	2005-06-17 01:45:53 +00:00
Chris Lattner	c53cb9d3ff	avoid constructing out of range shift amounts. llvm-svn: 22230	2005-06-17 01:29:28 +00:00
Chris Lattner	89dc4f16f5	Fix PR583 and testcase Transforms/InstCombine/2005-06-15-DivSelectCrash.ll llvm-svn: 22227	2005-06-16 04:55:52 +00:00
Chris Lattner	252a845e30	Fix PR571, removing code that does just the WRONG thing :) llvm-svn: 22225	2005-06-16 03:00:08 +00:00
Chris Lattner	104002bee3	Fix a bug in my previous patch. Do not get the shift amount type (which is always ubyte, get the type being shifted). This unbreaks espresso llvm-svn: 22224	2005-06-16 01:52:07 +00:00
Chris Lattner	d48b127aea	Fix PR575, patch provided by John Mellor-Crummey. Thanks! llvm-svn: 22223	2005-06-15 22:49:30 +00:00
Chris Lattner	df81539278	Fix PR582. The rewriter can move casts around, which invalidated the BB iterator. This fixes Transforms/IndVarsSimplify/2005-06-15-InstMoveCrash.ll llvm-svn: 22221	2005-06-15 21:29:31 +00:00
Chris Lattner	50bdfcb045	Do not promote globals only used by main to locals if there are constantexprs or other uses hanging off of them. llvm-svn: 22219	2005-06-15 21:11:48 +00:00
Chris Lattner	19b57f55aa	Fix PR577 and testcase InstCombine/2005-06-15-ShiftSetCCCrash.ll. Do not perform undefined out of range shifts. llvm-svn: 22217	2005-06-15 20:53:31 +00:00
Reid Spencer	a299d6f701	Put the hack back in that removes features, causes regressions to fail, but allows test programs to succeed. Actual fix for this is forthcoming. llvm-svn: 22213	2005-06-15 18:25:30 +00:00
Reid Spencer	6d231e55fa	Unbreak several InstCombine regression checks introduced by a hack to fix the bzip2 test. A better hack is needed. llvm-svn: 22209	2005-06-13 06:41:26 +00:00
Chris Lattner	1609a541cd	Fix a 64-bit problem, passing (int)0 through ... instead of (void*)0 llvm-svn: 22206	2005-06-09 03:32:54 +00:00
Chris Lattner	fbc45f10d0	Fix a problem on 64-bit targets where we passed (int)0 through ... instead of (void*)0. llvm-svn: 22205	2005-06-09 02:59:00 +00:00
Andrew Lenharth	ffe65458e7	hack to fix bzip2 (bug 571) llvm-svn: 22192	2005-06-04 12:43:56 +00:00
Reid Spencer	9fbad13dd7	Make the registration hash_map static. No other module needs it. Also, document what its for a little better. llvm-svn: 22164	2005-05-21 01:27:04 +00:00
Reid Spencer	0b13cdabae	Adjust the file comment to read a little easier. llvm-svn: 22163	2005-05-21 00:57:44 +00:00
Reid Spencer	45bb4afc79	Make sure ... arguments are casted to sbyte* where needed. llvm-svn: 22162	2005-05-21 00:39:30 +00:00
Reid Spencer	895af9ef24	Add a "brief" comment for CastToCStr llvm-svn: 22161	2005-05-21 00:23:23 +00:00
Chris Lattner	f8053cee7c	Fix mismatched type problem that crashed on cases like this: sprintf(P, "%s", X); Where X is not an sbyte*. This fixes the bug JohnMC reported on llvm-bugs. llvm-svn: 22159	2005-05-20 22:22:25 +00:00
Chris Lattner	19f9f32a5c	Fix Transforms/SimplifyCFG/switch-simplify-crash.ll llvm-svn: 22158	2005-05-20 22:19:54 +00:00
Chris Lattner	05deb04cb0	teach the inliner about coldcc and noreturn functions llvm-svn: 22113	2005-05-18 04:30:33 +00:00
Reid Spencer	74305a6233	Don't look for __builtin_ffs, we'll never see it from llvm-gcc and there's not reason to include it for other front ends. llvm-svn: 22070	2005-05-15 21:27:34 +00:00
Reid Spencer	17f7784c5d	Provide this optimization as well: ffs(x) -> (x == 0 ? 0 : 1+llvm.cttz(x)) llvm-svn: 22068	2005-05-15 21:19:45 +00:00
Reid Spencer	3de98ee643	Duh .. you actually have to #include Config/config.h before you can test for one of the values that it defines! llvm-svn: 22058	2005-05-15 17:20:47 +00:00
Reid Spencer	b195fcd5ef	Changes for ffs lib call simplification: * Check for availability of ffsll call in configure script * Support ffs, ffsl, and ffsll conversion to constant value if the argument is constant. llvm-svn: 22027	2005-05-14 16:42:52 +00:00
Chris Lattner	403d1c204c	Preserve calling conv when hacking on calls llvm-svn: 22025	2005-05-14 12:28:32 +00:00
Chris Lattner	05c703ea85	preserve calling conventions when hacking on code llvm-svn: 22024	2005-05-14 12:25:32 +00:00
Chris Lattner	bcefcf8552	Make sure to preserve the calling convention when changing an invoke into a call. This fixes Prolangs-C++/deriv2, kimwitu++, and Misc-C++/bigfib on X86 with -enable-x86-fastcc. llvm-svn: 22023	2005-05-14 12:21:56 +00:00
Chris Lattner	61d9d81770	calling a function with the wrong CC is undefined, turn it into an unreachable instruction. This is useful for catching optimizers that don't preserve calling conventions llvm-svn: 21928	2005-05-13 07:09:09 +00:00
Chris Lattner	ca968393ab	When lowering invokes to calls, amke sure to preserve the calling conv. This fixes Ptrdist/anagram with x86 llcbeta llvm-svn: 21925	2005-05-13 06:27:02 +00:00
Chris Lattner	ae186e012c	Prefer int 0 instead of long 0 for GEP arguments. llvm-svn: 21924	2005-05-13 06:10:12 +00:00
Chris Lattner	31c667e234	Fix Reassociate/shifttest.ll llvm-svn: 21839	2005-05-10 03:39:25 +00:00
Chris Lattner	bfc796f622	If a function contains no allocas, all of the calls in it are trivially suitable for tail calls. llvm-svn: 21836	2005-05-09 23:51:13 +00:00
Chris Lattner	b62f5082c5	implement and.ll:test33 llvm-svn: 21809	2005-05-09 04:58:36 +00:00
Chris Lattner	d0525a29d1	Preserve calling conventions when doing IPO llvm-svn: 21798	2005-05-09 01:05:50 +00:00
Chris Lattner	21d1dde72a	wrap long lines, preserve calling conventions when cloning functions and turning calls into invokes llvm-svn: 21797	2005-05-09 01:04:34 +00:00
Chris Lattner	a4c8022caf	Convert non-address taken functions with C calling conventions to fastcc. llvm-svn: 21791	2005-05-08 22:18:06 +00:00
Chris Lattner	df3332660f	Implement Reassociate/mul-neg-add.ll llvm-svn: 21788	2005-05-08 21:41:35 +00:00
Chris Lattner	c4f8e2b0ed	Bail out earlier llvm-svn: 21786	2005-05-08 21:33:47 +00:00
Chris Lattner	877b114037	Teach reassociate that 0-X === X*-1 llvm-svn: 21785	2005-05-08 21:28:52 +00:00
Chris Lattner	9f284e0a3c	Fix PR557 and basictest[34].ll. This makes reassociate realize that loads should be treated as unmovable, and gives distinct ranks to distinct values defined in the same basic block, allowing reassociate to do its thing. llvm-svn: 21783	2005-05-08 20:57:04 +00:00
Chris Lattner	9187f3905e	Add debugging information llvm-svn: 21781	2005-05-08 20:09:57 +00:00
Chris Lattner	08582be283	eliminate gotos llvm-svn: 21780	2005-05-08 19:48:43 +00:00
Chris Lattner	5847e5e10c	Improve reassociation handling of inverses, implementing inverses.ll. llvm-svn: 21778	2005-05-08 18:59:37 +00:00
Chris Lattner	4922118dc4	clean up and modernize this pass. llvm-svn: 21776	2005-05-08 18:45:26 +00:00
Chris Lattner	b18dbbfff5	Strength reduce SAR into SHR if there is no way sign bits could be shifted in. This tends to get cases like this: X = cast ubyte to int Y = shr int X, ... Tested by: shift.ll:test24 llvm-svn: 21775	2005-05-08 17:34:56 +00:00
Chris Lattner	e1850b86b6	Refactor some code llvm-svn: 21772	2005-05-08 00:19:31 +00:00
Chris Lattner	6e2086d7e4	Handle some simple cases where we can see that values get annihilated. llvm-svn: 21771	2005-05-08 00:08:33 +00:00
Chris Lattner	4294cec0f1	Fix a miscompilation of crafty by clobbering the "A" variable. llvm-svn: 21770	2005-05-07 23:49:08 +00:00
Chris Lattner	1e5065052a	Rewrite the guts of the reassociate pass to be more efficient and logical. Instead of trying to do local reassociation tweaks at each level, only process an expression tree once (at its root). This does not improve the reassociation pass in any real way. llvm-svn: 21768	2005-05-07 21:59:39 +00:00
Reid Spencer	170ae7ff70	* Add two strlen optimizations: strlen(x) != 0 -> x != 0 strlen(x) == 0 -> x == 0 * Change nested statistics to use style of other LLVM statistics so that only the name of the optimization (simplify-libcalls) is used as the statistic name, and the description indicates which specific all is optimized. Cuts down on some redundancy and saves a few bytes of space. * Make note of stpcpy optimization that could be done. llvm-svn: 21766	2005-05-07 20:15:59 +00:00
Reid Spencer	4f01a822b4	Don't increment the counter unless the debug flag is set. llvm-svn: 21762	2005-05-07 04:59:45 +00:00
Chris Lattner	cea579932d	Convert shifts to muls to assist reassociation. This implements Reassociate/shifttest.ll llvm-svn: 21761	2005-05-07 04:24:13 +00:00
Chris Lattner	f43e974abd	Simplify the code and rearrange it. No major functionality changes here. llvm-svn: 21759	2005-05-07 04:08:02 +00:00
Chris Lattner	7effa0ed06	BAD typeo which caused many testsuite failures last night. Note to self, do not change code after testing it without retesting! llvm-svn: 21741	2005-05-06 17:13:16 +00:00
Chris Lattner	6aacb0f9da	Preserve tail marker llvm-svn: 21737	2005-05-06 06:48:21 +00:00
Chris Lattner	9f3dced2c7	Implement Transforms/Inline/inline-tail.ll llvm-svn: 21736	2005-05-06 06:47:52 +00:00
Chris Lattner	324d2eedb2	preserve the tail marker llvm-svn: 21734	2005-05-06 06:46:58 +00:00
Chris Lattner	53db546b97	Wrap long lines llvm-svn: 21720	2005-05-06 05:34:40 +00:00
Chris Lattner	a36d525741	DCE intrinsic instructions without side effects. llvm-svn: 21719	2005-05-06 05:27:34 +00:00
Chris Lattner	ef298a3b8a	Teach instcombine propagate zeroness through shl instructions, implementing and.ll:test31 llvm-svn: 21717	2005-05-06 04:53:20 +00:00
Chris Lattner	873804168e	Implement shift.ll:test23. If we are shifting right then immediately truncating the result, turn signed shift rights into unsigned shift rights if possible. This leads to later simplification and happens often in 176.gcc. For example, this testcase: struct xxx { unsigned int code : 8; }; enum codes { A, B, C, D, E, F }; int foo(struct xxx P) { if ((enum codes)P->code == A) bar(); } used to be compiled to: int %foo(%struct.xxx %P) { %tmp.1 = getelementptr %struct.xxx* %P, int 0, uint 0 ; <uint> [#uses=1] %tmp.2 = load uint %tmp.1 ; <uint> [#uses=1] %tmp.3 = cast uint %tmp.2 to int ; <int> [#uses=1] %tmp.4 = shl int %tmp.3, ubyte 24 ; <int> [#uses=1] %tmp.5 = shr int %tmp.4, ubyte 24 ; <int> [#uses=1] %tmp.6 = cast int %tmp.5 to sbyte ; <sbyte> [#uses=1] %tmp.8 = seteq sbyte %tmp.6, 0 ; <bool> [#uses=1] br bool %tmp.8, label %then, label %UnifiedReturnBlock Now it is compiled to: %tmp.1 = getelementptr %struct.xxx* %P, int 0, uint 0 ; <uint> [#uses=1] %tmp.2 = load uint %tmp.1 ; <uint> [#uses=1] %tmp.2 = cast uint %tmp.2 to sbyte ; <sbyte> [#uses=1] %tmp.8 = seteq sbyte %tmp.2, 0 ; <bool> [#uses=1] br bool %tmp.8, label %then, label %UnifiedReturnBlock which is the difference between this: foo: subl $4, %esp movl 8(%esp), %eax movl (%eax), %eax shll $24, %eax sarl $24, %eax testb %al, %al jne .LBBfoo_2 and this: foo: subl $4, %esp movl 8(%esp), %eax movl (%eax), %eax testb %al, %al jne .LBBfoo_2 This occurs 3243 times total in the External tests, 215x in povray, 6x in each f2c'd program, 1451x in 176.gcc, 7x in crafty, 20x in perl, 25x in gap, 3x in m88ksim, 25x in ijpeg. Maybe this will cause a little jump on gcc tommorow :) llvm-svn: 21715	2005-05-06 04:18:52 +00:00
Chris Lattner	7208616ec0	Implement xor.ll:test22 llvm-svn: 21713	2005-05-06 02:07:39 +00:00
Chris Lattner	4c2d3781aa	implement and.ll:test30 and set.ll:test21 llvm-svn: 21712	2005-05-06 01:53:19 +00:00
Chris Lattner	dd1e562ec3	implement or.ll:test20 llvm-svn: 21709	2005-05-06 00:58:50 +00:00
Chris Lattner	807aa20f67	Fix a bug compimling Ruby, fixing this testcase: LowerSetJmp/2005-05-05-OldUses.ll llvm-svn: 21696	2005-05-05 15:47:43 +00:00
Chris Lattner	809dfac421	Instcombine: cast (X != 0) to int, cast (X == 1) to int -> X iff X has only the low bit set. This implements set.ll:test20. This triggers 2x on povray, 9x on mesa, 11x on gcc, 2x on crafty, 1x on eon, 6x on perlbmk and 11x on m88ksim. It allows us to compile these two functions into the same code: struct s { unsigned int bit : 1; }; unsigned foo(struct s p) { if (p->bit) return 1; else return 0; } unsigned bar(struct s p) { return p->bit; } llvm-svn: 21690	2005-05-04 19:10:26 +00:00
Reid Spencer	282d057485	Implement the IsDigitOptimization for simplifying calls to the isdigit library function: isdigit(chr) -> 0 or 1 if chr is constant isdigit(chr) -> chr - '0' <= 9 otherwise Although there are many calls to isdigit in llvm-test, most of them are compiled away by macros leaving only this: 2 MultiSource/Applications/hexxagon llvm-svn: 21688	2005-05-04 18:58:28 +00:00
Reid Spencer	1e520fd661	* Correct the function prototypes for some of the functions to match the actual spec (int -> uint) * Add the ability to get/cache the strlen function prototype. * Make sure generated values are appropriately named for debugging purposes * Add the SPrintFOptimiation for 4 casts of sprintf optimization: sprintf(str,cstr) -> llvm.memcpy(str,cstr) (if cstr has no %) sprintf(str,"") -> store sbyte 0, str sprintf(str,"%s",src) -> llvm.memcpy(str,src) (if src is constant) sprintf(str,"%c",chr) -> store chr, str ; store sbyte 0, str+1 The sprintf optimization didn't fire as much as I had hoped: 2 MultiSource/Applications/SPASS 5 MultiSource/Benchmarks/McCat/18-imp 22 MultiSource/Benchmarks/Prolangs-C/TimberWolfMC 1 MultiSource/Benchmarks/Prolangs-C/assembler 6 MultiSource/Benchmarks/Prolangs-C/unix-smail 2 MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec llvm-svn: 21679	2005-05-04 03:20:21 +00:00
Reid Spencer	38cabd7265	Implement optimizations for the strchr and llvm.memset library calls. Neither of these activated as many times as was hoped: strchr: 9 MultiSource/Applications/siod 1 MultiSource/Applications/d 2 MultiSource/Prolangs-C/archie-client 1 External/SPEC/CINT2000/176.gcc/176.gcc llvm.memset: no hits llvm-svn: 21669	2005-05-03 07:23:44 +00:00
Reid Spencer	95d8efdfcf	Avoid garbage output in the statistics display by ensuring that the strings passed to Statistic's constructor are not destructable. The stats are printed during static destruction and the SimplifyLibCalls module was getting destructed before the statistics. llvm-svn: 21661	2005-05-03 02:54:54 +00:00
Reid Spencer	49fa070401	Add the StrNCmpOptimization which is similar to strcmp. Unfortunately, this optimization didn't trigger on any llvm-test tests. llvm-svn: 21660	2005-05-03 01:43:45 +00:00
Reid Spencer	2d5c7beebd	Implement the fprintf optimization which converts calls like this: fprintf(F,"hello") -> fwrite("hello",strlen("hello"),1,F) fprintf(F,"%s","hello") -> fwrite("hello",strlen("hello"),1,F) fprintf(F,"%c",'x') -> fputc('c',F) This optimization fires severals times in llvm-test: 313 MultiSource/Applications/Burg 302 MultiSource/Benchmarks/Prolangs-C/TimberWolfMC 189 MultiSource/Benchmarks/Prolangs-C/mybison 175 MultiSource/Benchmarks/Prolangs-C/football 130 MultiSource/Benchmarks/Prolangs-C/unix-tbl llvm-svn: 21657	2005-05-02 23:59:26 +00:00
John Criswell	f42ed7bdaf	Fixed a comment. llvm-svn: 21653	2005-05-02 14:47:42 +00:00
Chris Lattner	a816eee427	Implement getelementptr.ll:test11 llvm-svn: 21647	2005-05-01 04:42:15 +00:00
Chris Lattner	a9d84e3388	Check for volatile loads only once. Implement load.ll:test7 llvm-svn: 21645	2005-05-01 04:24:53 +00:00
Reid Spencer	16449a9eb0	Fix a comment that stated the wrong thing. llvm-svn: 21638	2005-04-30 06:45:47 +00:00
Reid Spencer	4c444fe007	* Don't depend on "guessing" what a FILE* is, just require that the actual type be obtained from a CallInst we're optimizing. * Make it possible for getConstantStringLength to return the ConstantArray that it extracts in case the content is needed by an Optimization. * Implement the strcmp optimization * Implement the toascii optimization This pass is now firing several to many times in the following MultiSource tests: Applications/Burg - 7 (strcat,strcpy) Applications/siod - 13 (strcat,strcpy,strlen) Applications/spiff - 120 (exit,fputs,strcat,strcpy,strlen) Applications/treecc - 66 (exit,fputs,strcat,strcpy) Applications/kimwitu++ - 34 (strcmp,strcpy,strlen) Applications/SPASS - 588 (exit,fputs,strcat,strcpy,strlen) llvm-svn: 21626	2005-04-30 03:17:54 +00:00
Reid Spencer	9361697f93	Implement the optimizations for "pow" and "fputs" library calls. llvm-svn: 21618	2005-04-29 09:39:47 +00:00
Reid Spencer	c968ea0495	Remove optimizations that don't require both operands to be constant. These are moved to simplify-libcalls pass. llvm-svn: 21614	2005-04-29 05:55:35 +00:00
Jeff Cohen	4bc952f703	Consistently use 'class' to silence VC++ llvm-svn: 21612	2005-04-29 03:05:44 +00:00
Reid Spencer	ed55a6b5e0	* Add constant folding for additional floating point library calls such as sinh, cosh, etc. * Make the name comparisons for the fp libcalls a little more efficient by switching on the first character of the name before doing comparisons. llvm-svn: 21611	2005-04-28 23:01:59 +00:00
Reid Spencer	16983ca865	Remove from the TODO list those optimizations that are already handled by constant folding implemented in lib/Transforms/Utils/Local.cpp. llvm-svn: 21604	2005-04-28 18:05:16 +00:00
Reid Spencer	649ac283e4	Document additional libcall transformations that need to be written. Help Wanted! There's a lot of them to write. llvm-svn: 21603	2005-04-28 04:40:06 +00:00
Reid Spencer	7ddcfb3375	Doxygenate. llvm-svn: 21602	2005-04-27 21:29:20 +00:00
Chris Lattner	36ffb1ff37	remove 'statement with no effect' warning llvm-svn: 21600	2005-04-27 20:12:17 +00:00
Reid Spencer	08b4940509	More Cleanup: * Name the instructions by appending to name of original * Factor common part out of a switch statement. llvm-svn: 21597	2005-04-27 17:46:54 +00:00
Reid Spencer	e249a82e73	This is a cleanup commit: * Correct stale documentation in a few places * Re-order the file to better associate things and reduce line count * Make the pass thread safe by caching the Function* objects needed by the optimizers in the pass object instead of globally. * Provide the SimplifyLibCalls pass object to the optimizer classes so they can access cached Function* objects and TargetData info * Make sure the pass resets its cache if the Module passed to runOnModule changes * Rename CallOptimizer LibCallOptimization. All the classes are named Optimization while the objects are Optimizer. * Don't cache Function* in the optimizer objects because they could be used by multiple PassManager's running in multiple threads * Add an optimization for strcpy which is similar to strcat * Add a "TODO" list at the end of the file for ideas on additional libcall optimizations that could be added (get ideas from other compilers). Sorry for the huge diff. Its mostly reorganization of code. That won't happen again as I believe the design and infrastructure for this pass is now done or close to it. llvm-svn: 21589	2005-04-27 07:54:40 +00:00
Chris Lattner	93f4e9dd26	detect functions that never return, and turn the instruction following a call to them into an 'unreachable' instruction. This triggers a bunch of times, particularly on gcc: gzip: 36 gcc: 601 eon: 12 bzip: 38 llvm-svn: 21587	2005-04-27 04:52:23 +00:00
Reid Spencer	dc11db68b6	Prefix the debug statistics so they group together. llvm-svn: 21583	2005-04-27 00:20:23 +00:00
Reid Spencer	e95a647b2a	In debug builds, make a statistic for each kind of call optimization. This helps track down what gets triggered in the pass so its easier to identify good test cases. llvm-svn: 21582	2005-04-27 00:05:45 +00:00
Chris Lattner	7f4f773e9f	This analysis doesn't take 'throwing' into consideration, it looks at 'unwinding' llvm-svn: 21581	2005-04-26 23:53:25 +00:00
Reid Spencer	f9d4be187f	Fix up the debug statement to actually use a newline .. radical concept. llvm-svn: 21580	2005-04-26 23:07:08 +00:00
Reid Spencer	18b998192f	Uh, this isn't argpromotion. llvm-svn: 21579	2005-04-26 23:05:17 +00:00
Reid Spencer	2bc7a4f82a	Add some debugging output so we can tell which calls are getting triggered llvm-svn: 21578	2005-04-26 23:02:16 +00:00
Reid Spencer	f8c03d9db6	No, seriously folks, memcpy really does return void. llvm-svn: 21575	2005-04-26 22:49:48 +00:00
Reid Spencer	aaca170867	memcpy returns void!!!!! llvm-svn: 21574	2005-04-26 22:46:23 +00:00
Reid Spencer	4855ebf622	Fix some bugs found by running on llvm-test: * MemCpyOptimization can only be optimized if the 3rd and 4th arguments are constants and we weren't checking for that. * The result of llvm.memcpy (and llvm.memmove) is void* not sbyte*, put in a cast. llvm-svn: 21570	2005-04-26 19:55:57 +00:00
Reid Spencer	bb92b4fdfb	Changes From Review Feedback: * Have the SimplifyLibCalls pass acquire the TargetData and pass it down to the optimization classes so they can use it to make better choices for the signatures of functions, etc. * Rearrange the code a little so the utility functions are closer to their usage and keep the core of the pass near the top of the files. * Adjust the StrLen pass to get/use the correct prototype depending on the TargetData::getIntPtrType() result. The result of strlen is size_t which could be either uint or ulong depending on the platform. * Clean up some coding nits (cast vs. dyn_cast, remove redundant items from a switch, etc.) * Implement the MemMoveOptimization as a twin of MemCpyOptimization (they only differ in name). llvm-svn: 21569	2005-04-26 19:13:17 +00:00
Chris Lattner	bd43b9db9d	Fix the compile failures from last night. llvm-svn: 21565	2005-04-26 14:40:41 +00:00
Reid Spencer	b4f7b83dce	* Merge get_GVInitializer and getCharArrayLength into a single function named getConstantStringLength. This is the common part of StrCpy and StrLen optimizations and probably several others, yet to be written. It performs all the validity checks for looking at constant arrays that are supposed to be null-terminated strings and then computes the actual length of the string. * Implement the MemCpyOptimization class. This just turns memcpy of 1, 2, 4 and 8 byte data blocks that are properly aligned on those boundaries into a load and a store. Much more could be done here but alignment restrictions and lack of knowledge of the target instruction set prevent use from doing significantly more. That will have to be delegated to the code generators as they lower llvm.memcpy calls. llvm-svn: 21562	2005-04-26 07:45:18 +00:00
Reid Spencer	76dab9a523	* Implement StrLenOptimization * Factor out commonalities between StrLenOptimization and StrCatOptimization * Make sure that signatures return sbyte* not void* llvm-svn: 21559	2005-04-26 05:24:00 +00:00
Reid Spencer	8ee5aacc38	Incorporate feedback from Chris: * Change signatures of OptimizeCall and ValidateCalledFunction so they are non-const, allowing the optimization object to be modified. This is in support of caching things used across multiple calls. * Provide two functions for constructing and caching function types * Modify the StrCatOptimization to cache Function objects for strlen and llvm.memcpy so it doesn't regenerate them on each call site. Make sure these are invalidated each time we start the pass. * Handle both a GEP Instruction and a GEP ConstantExpr * Add additional checks to make sure we really are dealing with an arary of sbyte and that all the element initializers are ConstantInt or ConstantExpr that reduce to ConstantInt. * Make sure the GlobalVariable is constant! * Don't use ConstantArray::getString as it can fail and it doesn't give us the right thing. We must check for null bytes in the middle of the array. * Use llvm.memcpy instead of memcpy so we can factor alignment into it. * Don't use void* types in signatures, replace with sbyte* instead. llvm-svn: 21555	2005-04-26 03:26:15 +00:00
Reid Spencer	fe91dfec91	Changes due to code review and new implementation: * Don't use std::string for the function names, const char* will suffice * Allow each CallOptimizer to validate the function signature before doing anything * Repeatedly loop over the functions until an iteration produces no more optimizations. This allows one optimization to insert a call that is optimized by another optimization. * Implement the ConstantArray portion of the StrCatOptimization * Provide a template for the MemCpyOptimization * Make ExitInMainOptimization split the block, not delete everything after the return instruction. (This covers revision 1.3 and 1.4, as the 1.3 comments were botched) llvm-svn: 21548	2005-04-25 21:20:38 +00:00
Reid Spencer	f2534c7291	Lots of changes based on review and new functionality: * Use a llvm-svn: 21546	2005-04-25 21:11:48 +00:00
Chris Lattner	a21bf8d1be	implement getelementptr.ll:test10 llvm-svn: 21541	2005-04-25 20:17:30 +00:00
Reid Spencer	9bbaa2ab7f	Post-Review Cleanup: * Fix comments at top of file * Change algorithm for running the call optimizations from nn to something closer to n. Use a hash_map to store and lookup the optimizations since there will eventually (or potentially) be a large number of them. This gets lookup based on the name of the function to O(1). Each CallOptimizer now has a std::string member named func_name that tracks the name of the function that it applies to. It is this string that is entered into the hash_map for fast comparison against the function names encountered in the module. * Cleanup some style issues pertaining to iterator invalidation * Don't pass the Function pointer to the OptimizeCall function because if the optimization needs it, it can get it from the CallInst passed in. * Add the skeleton for a new CallOptimizer, StrCatOptimizer which will eventually replace strcat's of constant strings with direct copies. llvm-svn: 21526	2005-04-25 03:59:26 +00:00
Reid Spencer	39a762d149	A new pass to provide specific optimizations for certain well-known library calls. The pass visits all external functions in the module and determines if such function calls can be optimized. The optimizations are specific to the library calls involved. This initial version only optimizes calls to exit(3) when they occur in main(): it changes them to ret instructions. llvm-svn: 21522	2005-04-25 02:53:12 +00:00
Chris Lattner	2f1457fd83	Eliminate cases where we could << by 64, which is undefined in C. llvm-svn: 21500	2005-04-24 17:46:05 +00:00
Chris Lattner	d6f636a340	Implement xor.ll:test21: select (not C), A, B -> select C, B, A llvm-svn: 21495	2005-04-24 07:30:14 +00:00
Chris Lattner	d1f46d3bf9	Use getPrimitiveSizeInBits() instead of getPrimitiveSize()*8 Completely rework the 'setcc (cast x to larger), y' code. This code has the advantage of implementing setcc.ll:test19 (being more general than the previous code) and being correct in all cases. This allows us to unxfail 2004-11-27-SetCCForCastLargerAndConstant.ll, and close PR454. llvm-svn: 21491	2005-04-24 06:59:08 +00:00
Jeff Cohen	82639853c0	Eliminate tabs and trailing spaces llvm-svn: 21480	2005-04-23 21:38:35 +00:00
Chris Lattner	77c32c34d7	Generalize the setcc -> PHI and Select folding optimizations to work with any constant RHS, not just a constant integer RHS. This implements select.ll:test17 llvm-svn: 21470	2005-04-23 15:31:55 +00:00
Misha Brukman	b1c9317bb4	Remove trailing whitespace llvm-svn: 21427	2005-04-21 23:48:37 +00:00
Chris Lattner	a3159af703	Fix a bug where we would not promote calls to invokes if they occured in the same block as the setjmp. Thanks to Greg Pettyjohn for noticing this! llvm-svn: 21403	2005-04-21 16:46:46 +00:00
Chris Lattner	7ceb081f3f	Improve doxygen documentation, patch contributed by Evan Jones! llvm-svn: 21393	2005-04-21 16:04:49 +00:00
Chris Lattner	374e659466	Instcombine this: %shortcirc_val = select bool %tmp.1, bool true, bool %tmp.4 ; <bool> [#uses=1] %tmp.6 = cast bool %shortcirc_val to int ; <int> [#uses=1] into this: %shortcirc_val = or bool %tmp.1, %tmp.4 ; <bool> [#uses=1] %tmp.6 = cast bool %shortcirc_val to int ; <int> [#uses=1] not this: %tmp.4.cast = cast bool %tmp.4 to int ; <int> [#uses=1] %tmp.6 = select bool %tmp.1, int 1, int %tmp.4.cast ; <int> [#uses=1] llvm-svn: 21389	2005-04-21 05:43:13 +00:00
Chris Lattner	b38b443b15	Teach simplifycfg that setcc is cheap and non-trapping, so that it can convert this: %tmp.1 = seteq int %i, 0 ; <bool> [#uses=1] br bool %tmp.1, label %shortcirc_done, label %shortcirc_next shortcirc_next: ; preds = %entry %tmp.4 = seteq int %j, 0 ; <bool> [#uses=1] br label %shortcirc_done shortcirc_done: ; preds = %shortcirc_next, %entry %shortcirc_val = phi bool [ %tmp.4, %shortcirc_next ], [ true, %entry ] ; <bool> [#uses=1] to this: %tmp.1 = seteq int %i, 0 ; <bool> [#uses=1] %tmp.4 = seteq int %j, 0 ; <bool> [#uses=1] %shortcirc_val = select bool %tmp.1, bool true, bool %tmp.4 ; <bool> [#uses=1] ... which is later simplified by instcombine into an or. llvm-svn: 21388	2005-04-21 05:31:13 +00:00
Chris Lattner	8cb10a1775	Wrap some long lines. Make IPSCCP strip off dead constant exprs that are using functions, making them appear as though their address is taken. This allows us to propagate some more pool descriptors, lowering the overhead of pool alloc. llvm-svn: 21363	2005-04-19 19:16:19 +00:00
Chris Lattner	5c219469a0	Eliminate a broken transformation, fixing PR548 llvm-svn: 21354	2005-04-19 06:04:18 +00:00
Chris Lattner	ee84413730	silence a bogus warning llvm-svn: 21320	2005-04-18 05:26:21 +00:00
Chris Lattner	16a50fd0a0	a new simple pass, which will be extended to be more useful in the future. This pass forward branches through conditions when it can show that the conditions is either always true or false for a predecessor. This currently only handles the most simple cases of this, but is successful at threading across 2489 branches and 65 switch instructions in 176.gcc, which isn't bad. llvm-svn: 21306	2005-04-15 19:28:32 +00:00
Chris Lattner	95f16a3ac4	Get rid of this for_each loop llvm-svn: 21253	2005-04-12 18:51:33 +00:00
Chris Lattner	4236261930	Fix bug: InstCombine/2005-05-07-UDivSelectCrash.ll llvm-svn: 21152	2005-04-08 04:03:26 +00:00
Chris Lattner	4706046e68	Implement the following xforms: (X-Y)-X --> -Y A + (B - A) --> B (B - A) + A --> B llvm-svn: 21138	2005-04-07 17:14:51 +00:00
Chris Lattner	c7f3c1a00e	Implement InstCombine/add.ll:test28, transforming C1-(X+C2) --> (C1-C2)-X. This occurs several dozen times in specint2k, particularly in crafty and gcc apparently. llvm-svn: 21136	2005-04-07 16:28:01 +00:00
Chris Lattner	a9be4490d8	Transform X-(X+Y) == -Y and X-(Y+X) == -Y llvm-svn: 21134	2005-04-07 16:15:25 +00:00
Chris Lattner	ecfa9b5810	disable this transformation in the one obscure case that really pessimizes pointer analysis. llvm-svn: 20916	2005-03-29 06:37:47 +00:00
Alkis Evlogimenos	9ead0d7b4c	Rename createPromoteMemoryToRegister() to createPromoteMemoryToRegisterPass() to be consistent with other pass creation functions. llvm-svn: 20885	2005-03-28 02:01:12 +00:00
Chris Lattner	514e843e89	Enhance loopsimplify to preserve alias analysis instead of clobbering it. This prevents crashes on some programs when using -ds-aa -licm. llvm-svn: 20831	2005-03-25 06:37:22 +00:00
Chris Lattner	faf7791fea	Fix a bug where LICM was not updating AA information properly when sinking a pointer value out of a loop causing it to be duplicated. llvm-svn: 20828	2005-03-25 00:22:36 +00:00
Chris Lattner	1c790bf656	enable -debug-only=licm llvm-svn: 20788	2005-03-23 21:00:12 +00:00
Chris Lattner	7b9020a059	Fix the missing symbols problem Bill was hitting. Patch contributed by Bill Wendling!! llvm-svn: 20649	2005-03-17 15:38:16 +00:00
Chris Lattner	6cb4559369	stop using method. llvm-svn: 20603	2005-03-15 05:19:49 +00:00
Chris Lattner	531f9e92d4	This mega patch converts us from using Function::a{iterator\|begin\|end} to using Function::arg_{iterator\|begin\|end}. Likewise Module::g* -> Module::global_*. This patch is contributed by Gabor Greif, thanks! llvm-svn: 20597	2005-03-15 04:54:21 +00:00
Chris Lattner	8c79559443	fix a bug where we thought arguments were constants :( llvm-svn: 20506	2005-03-06 22:52:29 +00:00
Chris Lattner	2ce303b406	Fix Regression/Transforms/LoopStrengthReduce/dont_insert_redundant_ops.ll, hopefully not breaking too many other things. llvm-svn: 20505	2005-03-06 22:36:12 +00:00
Chris Lattner	45403e5052	implement Transforms/LoopStrengthReduce/invariant_value_first_arg.ll llvm-svn: 20501	2005-03-06 22:06:22 +00:00
Chris Lattner	d3874fad44	minor simplifications of the code. llvm-svn: 20497	2005-03-06 21:58:22 +00:00
Chris Lattner	dd3ec92085	trivial simplification llvm-svn: 20494	2005-03-06 21:35:38 +00:00
Chris Lattner	238f6df546	Fix a bug where we could corrupt a parent loop's header info if we unrolled a nested loop. This fixes Transforms/LoopUnroll/2005-03-06-BadLoopInfoUpdate.ll and PR532 llvm-svn: 20493	2005-03-06 20:57:32 +00:00
Chris Lattner	1b032f59e7	Make this MUCH faster by avoiding a linear search in the symbol table code. llvm-svn: 20479	2005-03-06 05:42:36 +00:00
Jeff Cohen	4abcea3a69	Reformat comments to fix 80 columns. llvm-svn: 20467	2005-03-05 22:45:40 +00:00
Jeff Cohen	be37fa07fd	Reuse induction variables created for strength-reduced GEPs by other similar GEPs. llvm-svn: 20466	2005-03-05 22:40:34 +00:00
Chris Lattner	6d0a24c608	second argument to Value::setName is now gone. llvm-svn: 20463	2005-03-05 19:05:20 +00:00
Chris Lattner	cfe2822cdf	Do not compute 1ULL << 64, which is undefined. This fixes Ptrdist/ks on the sparc, and testcase Regression/Transforms/InstCombine/2005-03-04-ShiftOverflow.ll llvm-svn: 20445	2005-03-04 23:21:33 +00:00
Jeff Cohen	a2c59b7423	Add support for not strength reducing GEPs where the element size is a small power of two. This emphatically includes the zeroeth power of two. llvm-svn: 20429	2005-03-04 04:04:26 +00:00
Chris Lattner	ef1e989e4f	Add an optional argument to lower to a specific constant value instead of to a "sizeof" expression. llvm-svn: 20414	2005-03-03 01:03:43 +00:00
Jeff Cohen	8ea6f9e821	Fixed the following LSR bugs: * Loop invariant code does not dominate the loop header, but rather the end of the loop preheader. * The base for a reduced GEP isn't a constant unless all of its operands (preceding the induction variable) are constant. * Allow induction variable elimination for the simple case after all. Also made changes recommended by Chris for properly deleting instructions. llvm-svn: 20383	2005-03-01 03:46:11 +00:00
Jeff Cohen	dcaa48b5c4	Fix crash in LSR due to attempt to remove original induction variable. However, for reasons explained in the comments, I also deactivated this code as it needs more thought. llvm-svn: 20367	2005-02-28 00:08:56 +00:00
Jeff Cohen	fd63d3af0d	PHI nodes were incorrectly placed when more than one GEP is reduced in a loop. llvm-svn: 20360	2005-02-27 21:08:04 +00:00
Jeff Cohen	39751c3b7c	First pass at improved Loop Strength Reduction. Still not yet ready for prime time. llvm-svn: 20358	2005-02-27 19:37:07 +00:00
Chris Lattner	7561ca1d15	Teach globalopt how memset/cpy/move affect memory, to allow better optimization. llvm-svn: 20352	2005-02-27 18:58:52 +00:00
Chris Lattner	0ce80cd542	Fix spelling, patch contributed by Gabor Greif! llvm-svn: 20343	2005-02-27 06:18:25 +00:00
Chris Lattner	cc6d75fddf	remove extraneous cast llvm-svn: 20334	2005-02-26 18:33:28 +00:00
Chris Lattner	1cca959e5d	Implement Transforms/SimplifyCFG/switch_thread.ll This does a simple form of "jump threading", which eliminates CFG edges that are provably dead. This triggers 90 times in the external tests, and eliminating CFG edges is always always a good thing! :) llvm-svn: 20300	2005-02-24 06:17:52 +00:00
Chris Lattner	25169caa80	make this more efficient. Scan up to 16 nodes, not the whole list. llvm-svn: 20289	2005-02-23 16:53:04 +00:00
Chris Lattner	52e931b37d	Remove use of bind_obj llvm-svn: 20276	2005-02-22 23:22:58 +00:00
Chris Lattner	7b5d9e2217	Do not mark obviously unreachable blocks live when processing PHI nodes, and handle incomplete control dependences correctly. This fixes: Regression/Transforms/ADCE/dead-phi-edge.ll -> a missed optimization Regression/Transforms/ADCE/dead-phi-edge.ll -> a compiler crash distilled from QT4 llvm-svn: 20227	2005-02-17 19:28:49 +00:00
Chris Lattner	31f3382b3b	Fix the second bug attached to PR504. llvm-svn: 20181	2005-02-14 20:11:45 +00:00
Chris Lattner	e616fea3bc	Fix for testcase Transforms/IndVarsSimplify/2005-02-11-InvokeCrash.ll and PR504. llvm-svn: 20129	2005-02-12 03:26:49 +00:00
Alkis Evlogimenos	c4a44c6b3d	Localize globals if they are only used in main(). This replaces the global with an alloca, which eventually gets promoted into a register. This enables a lot of other optimizations later on. llvm-svn: 20109	2005-02-10 18:36:30 +00:00
Alkis Evlogimenos	346bb20409	Fix crash on MallocInsts of unsized types. llvm-svn: 19988	2005-02-02 04:43:37 +00:00
Chris Lattner	82b42c5d85	API change. llvm-svn: 19959	2005-02-01 01:23:49 +00:00
Chris Lattner	d6a4492f81	Adjust to changes in APIs llvm-svn: 19958	2005-02-01 01:23:31 +00:00
Chris Lattner	f98a7bffb3	Hacks to make this ugly ugly code work with the new use lists. llvm-svn: 19957	2005-02-01 01:22:56 +00:00
Chris Lattner	72684fecf8	Implement InstCombine/cast.ll:test25, a case that occurs many times in spec llvm-svn: 19953	2005-01-31 05:51:45 +00:00
Chris Lattner	31f486c775	Implement the trivial cases in InstCombine/store.ll llvm-svn: 19950	2005-01-31 05:36:43 +00:00
Chris Lattner	fe1b0b8b24	Implement Transforms/InstCombine/cast-load-gep.ll, which allows us to devirtualize 11 indirect calls in perlbmk. llvm-svn: 19947	2005-01-31 04:50:46 +00:00
Chris Lattner	d8e20188c6	Adjust to changes in instruction interfaces. llvm-svn: 19900	2005-01-29 00:39:08 +00:00
Chris Lattner	a3f06fa2dd	Switchinst takes a hint for the number of cases it will have. llvm-svn: 19899	2005-01-29 00:38:45 +00:00
Chris Lattner	a35dfcedd3	switchinst ctor now takes a hint for the number of cases that it will have. llvm-svn: 19898	2005-01-29 00:38:26 +00:00
Chris Lattner	84d3137da7	Adjust Valuehandle to hold its operand directly in it. llvm-svn: 19897	2005-01-29 00:37:36 +00:00
Chris Lattner	cd517ff0c7	* add some DEBUG statements * Properly compile this: struct a {}; int test() { struct a b[2]; if (&b[0] != &b[1]) abort (); return 0; } to 'return 0', not abort(). llvm-svn: 19875	2005-01-28 19:32:01 +00:00
Alkis Evlogimenos	fbd921987f	Add a dependency to the trace library so that it gets pulled in automatically. llvm-svn: 19828	2005-01-25 16:23:57 +00:00
Chris Lattner	9e2c7facb2	Get rid of a several dozen more and instructions in specint. llvm-svn: 19786	2005-01-23 20:26:55 +00:00
Chris Lattner	fc4429e7c1	Handle comparisons of gep instructions that have different typed indices as long as they are the same size. llvm-svn: 19734	2005-01-21 23:06:49 +00:00
Chris Lattner	411336fe04	Add two optimizations. The first folds (X+Y)-X -> Y The second folds operations into selects, e.g. (select C, (X+Y), (Y+Z)) -> (Y+(select C, X, Z) This occurs a few times across spec, e.g. select add/sub mesa: 83 0 povray: 5 2 gcc 4 2 parser 0 22 perlbmk 13 30 twolf 0 3 llvm-svn: 19706	2005-01-19 21:50:18 +00:00
Chris Lattner	a3cc1835ad	Fix 'raise' to work with packed types. Patch by Morten Ofstad. llvm-svn: 19693	2005-01-19 16:16:35 +00:00
Chris Lattner	715364364b	Delete PHI nodes that are not dead but are locked in a cycle of single useness. llvm-svn: 19629	2005-01-17 05:10:15 +00:00
Chris Lattner	03f06f11aa	Move code out of indentation one level to make it easier to read. Disable the xform for < > cases. It turns out that the following is being miscompiled: bool %test(sbyte %S) { %T = cast sbyte %S to uint %V = setgt uint %T, 255 ret bool %V } llvm-svn: 19628	2005-01-17 03:20:02 +00:00
Chris Lattner	51726c47fe	Fix some bugs in an xform added yesterday. This fixes Prolangs-C/allroots. llvm-svn: 19553	2005-01-14 17:35:12 +00:00
Chris Lattner	7aa41cfa88	Fix a compile crash on spiff llvm-svn: 19552	2005-01-14 17:17:59 +00:00
Chris Lattner	4fa89827e2	if two gep comparisons only differ by one index, compare that index directly. This allows us to better optimize begin() -> end() comparisons in common cases. llvm-svn: 19542	2005-01-14 00:20:05 +00:00
Chris Lattner	d35d210ea0	Do not overrun iterators. This fixes a 176.gcc crash llvm-svn: 19541	2005-01-13 23:26:48 +00:00
Chris Lattner	a04c904c4c	Turn select C, (X+Y), (X-Y) --> (X+(select C, Y, (-Y))). This occurs in the 'sim' program and probably elsewhere. In sim, it comes up for cases like this: #define round(x) ((x)>0.0 ? (x)+0.5 : (x)-0.5) double G; void T(double X) { G = round(X); } (it uses the round macro a lot). This changes the LLVM code from: %tmp.1 = setgt double %X, 0.000000e+00 ; <bool> [#uses=1] %tmp.4 = add double %X, 5.000000e-01 ; <double> [#uses=1] %tmp.6 = sub double %X, 5.000000e-01 ; <double> [#uses=1] %mem_tmp.0 = select bool %tmp.1, double %tmp.4, double %tmp.6 store double %mem_tmp.0, double* %G to: %tmp.1 = setgt double %X, 0.000000e+00 ; <bool> [#uses=1] %mem_tmp.0.p = select bool %tmp.1, double 5.000000e-01, double -5.000000e-01 %mem_tmp.0 = add double %mem_tmp.0.p, %X store double %mem_tmp.0, double* %G ret void llvm-svn: 19537	2005-01-13 22:52:24 +00:00
Chris Lattner	81e8417614	Implement an optimization for == and != comparisons like this: _Bool test2(int X, int Y) { return &arr[X][Y] == arr; } instead of generating this: bool %test2(int %X, int %Y) { %tmp.3.idx = mul int %X, 160 ; <int> [#uses=1] %tmp.3.idx1 = shl int %Y, ubyte 2 ; <int> [#uses=1] %tmp.3.offs2 = sub int 0, %tmp.3.idx ; <int> [#uses=1] %tmp.7 = seteq int %tmp.3.idx1, %tmp.3.offs2 ; <bool> [#uses=1] ret bool %tmp.7 } generate this: bool %test2(int %X, int %Y) { seteq int %X, 0 ; <bool>:0 [#uses=1] seteq int %Y, 0 ; <bool>:1 [#uses=1] %tmp.7 = and bool %0, %1 ; <bool> [#uses=1] ret bool %tmp.7 } This idiom occurs in C++ programs when iterating from begin() to end(), in a vector or array. For example, we now compile this: void test(int X, int Y) { for (int i = arr; i != arr+100; ++i) foo(i); } to this: no_exit: ; preds = %entry, %no_exit ... %exitcond = seteq uint %indvar.next, 100 ; <bool> [#uses=1] br bool %exitcond, label %return, label %no_exit instead of this: no_exit: ; preds = %entry, %no_exit ... %inc5 = getelementptr [100 x [40 x int]]* %arr, int 0, int 0, int %inc.rec ; <int> [#uses=1] %tmp.8 = seteq int %inc5, getelementptr ([100 x [40 x int]]* %arr, int 0, int 100, int 0) ; <bool> [#uses=1] %indvar.next = add uint %indvar, 1 ; <uint> [#uses=1] br bool %tmp.8, label %return, label %no_exit llvm-svn: 19536	2005-01-13 22:25:21 +00:00
Chris Lattner	4cb9fa373b	Fix some bugs in code I didn't mean to check in. llvm-svn: 19534	2005-01-13 20:40:58 +00:00
Chris Lattner	0798af33a5	Fix a crash compiling 129.compress llvm-svn: 19533	2005-01-13 20:14:25 +00:00
Reid Spencer	134f02d0c7	Add the LOADABLE_MODULE=1 directive to indicate that this shared library is intended to be a dlopenable module and not a "plain" shared library. llvm-svn: 19456	2005-01-11 04:33:32 +00:00
Jeff Cohen	3e62e7c68b	Apply feedback from Chris. llvm-svn: 19432	2005-01-10 04:23:32 +00:00
Chris Lattner	798e84f59e	Fix VS warnings llvm-svn: 19383	2005-01-08 19:48:40 +00:00
Chris Lattner	46fa04b531	Fix VS warnings. llvm-svn: 19382	2005-01-08 19:45:31 +00:00
Chris Lattner	fdfe3e49fe	Fix uint64_t -> unsigned VS warnings. llvm-svn: 19381	2005-01-08 19:42:22 +00:00

... 5 6 7 8 9 ...

2420 Commits