llvm-project

Commit Graph

Author	SHA1	Message	Date
Chris Lattner	3a978bf66d	simplify some logic further llvm-svn: 23408	2005-09-23 07:23:18 +00:00
Chris Lattner	cc14ebc17b	pull a bunch of logic out of SimplifyCFG into a helper fn llvm-svn: 23407	2005-09-23 06:39:30 +00:00
Chris Lattner	6c70106053	Start threading across blocks with code in them, so long as the code does not define a value that is used outside of its block. This catches many more simplifications, e.g. 854 in 176.gcc, 137 in vpr, etc. This implements branch-phi-thread.ll:test3.ll llvm-svn: 23397	2005-09-20 01:48:40 +00:00
Chris Lattner	f0bd8d0107	Implement merging of blocks with the same condition if the block has multiple predecessors. This implements branch-phi-thread.ll::test1 llvm-svn: 23395	2005-09-20 00:43:16 +00:00
Chris Lattner	049cb4482f	Reject a case we don't handle yet llvm-svn: 23393	2005-09-19 23:57:04 +00:00
Chris Lattner	a160924d57	remove debugging code :-/ llvm-svn: 23392	2005-09-19 23:50:15 +00:00
Chris Lattner	748f903046	Implement SimplifyCFG/branch-phi-thread.ll, the most trivial case of threading control across branches with determined outcomes. More generality to follow. This triggers a couple thousand times in specint. llvm-svn: 23391	2005-09-19 23:49:37 +00:00
Chris Lattner	b4b2530a1a	Refactor this code a bit and make it more general. This now compiles: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus2 (unsigned int x) { b.j += x; } To: _plus2: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) slwi r3, r3, 6 add r3, r4, r3 rlwimi r3, r4, 0, 26, 14 stw r3, 0(r2) blr instead of: _plus2: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) rlwinm r5, r4, 26, 21, 31 add r3, r5, r3 rlwimi r4, r3, 6, 15, 25 stw r4, 0(r2) blr by eliminating an 'and'. I'm pretty sure this is as small as we can go :) llvm-svn: 23386	2005-09-18 07:22:02 +00:00
Chris Lattner	797dee7705	Compile struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus2 (unsigned int x) { b.j += x; } to: plus2: mov %EAX, DWORD PTR [b] mov %ECX, %EAX and %ECX, 131008 mov %EDX, DWORD PTR [%ESP + 4] shl %EDX, 6 add %EDX, %ECX and %EDX, 131008 and %EAX, -131009 or %EDX, %EAX mov DWORD PTR [b], %EDX ret instead of: plus2: mov %EAX, DWORD PTR [b] mov %ECX, %EAX shr %ECX, 6 and %ECX, 2047 add %ECX, DWORD PTR [%ESP + 4] shl %ECX, 6 and %ECX, 131008 and %EAX, -131009 or %ECX, %EAX mov DWORD PTR [b], %ECX ret llvm-svn: 23385	2005-09-18 06:30:59 +00:00
Chris Lattner	01f56c68e9	Generalize this transform, using MaskedValueIsZero, allowing us to compile: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus3 (unsigned int x) { b.k += x; } To: plus3: mov %EAX, DWORD PTR [%ESP + 4] shl %EAX, 17 add DWORD PTR [b], %EAX ret instead of: plus3: mov %EAX, DWORD PTR [%ESP + 4] shl %EAX, 17 mov %ECX, DWORD PTR [b] add %EAX, %ECX and %EAX, -131072 and %ECX, 131071 or %ECX, %EAX mov DWORD PTR [b], %ECX ret llvm-svn: 23384	2005-09-18 06:02:59 +00:00
Chris Lattner	4ebc8ab4e0	fix typo llvm-svn: 23383	2005-09-18 05:25:20 +00:00
Chris Lattner	e5b23a6d67	Remove unintentionally committed code llvm-svn: 23382	2005-09-18 05:12:51 +00:00
Chris Lattner	27cb9dbd35	implement shift.ll:test25. This compiles: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus3 (unsigned int x) { b.k += x; } to: _plus3: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r3, 0(r2) rlwinm r4, r3, 0, 0, 14 add r4, r4, r3 rlwimi r4, r3, 0, 15, 31 stw r4, 0(r2) blr instead of: _plus3: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) srwi r5, r4, 17 add r3, r5, r3 slwi r3, r3, 17 rlwimi r3, r4, 0, 15, 31 stw r3, 0(r2) blr llvm-svn: 23381	2005-09-18 05:12:10 +00:00
Chris Lattner	af517574ce	Implement add.ll:test29. Codegening: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus1 (unsigned int x) { b.i += x; } as: _plus1: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) add r3, r4, r3 rlwimi r3, r4, 0, 0, 25 stw r3, 0(r2) blr instead of: _plus1: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) rlwinm r5, r4, 0, 26, 31 add r3, r5, r3 rlwimi r3, r4, 0, 0, 25 stw r3, 0(r2) blr llvm-svn: 23379	2005-09-18 04:24:45 +00:00
Chris Lattner	027eaf01cf	remove debug output llvm-svn: 23377	2005-09-18 03:50:25 +00:00
Chris Lattner	1521298993	Implement or.ll:test21. This teaches instcombine to be able to turn this: struct { unsigned int bit0:1; unsigned int ubyte:31; } sdata; void foo() { sdata.ubyte++; } into this: foo: add DWORD PTR [sdata], 2 ret instead of this: foo: mov %EAX, DWORD PTR [sdata] mov %ECX, %EAX add %ECX, 2 and %ECX, -2 and %EAX, 1 or %EAX, %ECX mov DWORD PTR [sdata], %EAX ret llvm-svn: 23376	2005-09-18 03:42:07 +00:00
Chris Lattner	a393e4d4b3	Fix the regression last night compiling povray llvm-svn: 23348	2005-09-14 17:32:56 +00:00
Chris Lattner	2a8932960d	Add a simple xform to simplify array accesses with casts in the way. This is useful for 178.galgel where resolution of dope vectors (by the optimizer) causes the scales to become apparent. llvm-svn: 23328	2005-09-13 18:36:04 +00:00
Chris Lattner	fd018c8dfe	Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI. This fixes up a dot-product loop in galgel, speeding it up from 18.47s to 16.13s. llvm-svn: 23327	2005-09-13 02:09:55 +00:00
Chris Lattner	567b81f0d2	Add a helper function, allowing us to simplify some code a bit, changing indentation, no functionality change llvm-svn: 23325	2005-09-13 00:40:14 +00:00
Chris Lattner	219175c84d	Implement a simple xform to turn code like this: if () { store A -> P; } else { store B -> P; } into a PHI node with one store, in the most trivial case. This implements load.ll:test10. llvm-svn: 23324	2005-09-12 23:23:25 +00:00
Chris Lattner	e0bfdf1485	Another load-peephole optimization: do gcse when two loads are next to each other. This implements InstCombine/load.ll:test9 llvm-svn: 23322	2005-09-12 22:21:03 +00:00
Chris Lattner	b990f7d8ed	Implement a trivial form of store->load forwarding where the store and the load are exactly consecutive. This is picked up by other passes, but this triggers thousands of times in fortran programs that use static locals (and is thus a compile-time speedup). llvm-svn: 23320	2005-09-12 22:00:15 +00:00
Chris Lattner	8048b85e8f	Fix a regression from last night, which caused this pass to create invalid code for IV uses outside of loops that are not dominated by the latch block. We should only convert these uses to use the post-inc value if they ARE dominated by the latch block. Also use a new LoopInfo method to simplify some code. This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll llvm-svn: 23318	2005-09-12 17:11:27 +00:00
Chris Lattner	a67648396a	_test: li r2, 0 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r2, 1 stw r2, 0(r4) blr [zion ~/llvm]$ cat > ~/xx Uses of IV's outside of the loop should use the post-incremented version of the IV, not the preincremented version. This helps many loops (e.g. in sixtrack) which used to generate code like this (this is the code from the dont-hoist-simple-loop-constants.ll testcase): _test: li r2, 0 ** IV starts at 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 Copy for loop exit li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 IV+2 cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 IV+2 stw r2, 0(r4) blr And now generate code like this: _test: li r2, 1 * IV starts at 1 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 * IV.postinc + 0 blt cr0, LBB_test_1 LBB_test_2: ; loopexit.2.loopexit stw r2, 0(r4) * IV.postinc + 0 blr llvm-svn: 23313	2005-09-12 06:04:47 +00:00
Chris Lattner	530fe6ab30	implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll. We used to emit this code for it: _test: li r2, 1 ;; Value tying up a register for the whole loop li r5, 0 LBB_test_1: ; no_exit.2 or r6, r5, r5 li r5, 0 stw r5, 0(r3) addi r5, r6, 1 addi r3, r3, 4 add r7, r2, r5 ;; should be addi r7, r5, 1 cmpwi cr0, r7, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r6, 2 stw r2, 0(r4) blr now we emit this: _test: li r2, 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 ;; whoa, fold those adds! cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 stw r2, 0(r4) blr more improvement coming. llvm-svn: 23306	2005-09-10 01:18:45 +00:00
Chris Lattner	b5e381a8cf	Fix a problem that Dan Berlin noticed, where reassociation would not succeed in building maximal expressions before simplifying them. In particular, in cases like this: X-(A+B+X) the code would consider A+B+X to be a maximal expression (not understanding that the single use '-' would be turned into a + later), simplify it (a noop) then later get simplified again. Each of these simplify steps is where the cost of reassociation comes from, so this patch should speed up the already fast pass a bit. Thanks to Dan for noticing this! llvm-svn: 23214	2005-09-02 07:07:58 +00:00
Chris Lattner	9fe263aa75	Avoid creating garbage instructions, just move the old add instruction to where we need it when converting -(A+B+C) -> -A + -B + -C. llvm-svn: 23213	2005-09-02 06:38:04 +00:00
Chris Lattner	d1325da091	add some assertions and fix problems where reassociate could access the Ops vector out of range llvm-svn: 23211	2005-09-02 05:23:22 +00:00
Chris Lattner	8ca5b2a6d2	Fix Regression/Transforms/Reassociate/2005-08-24-Crash.ll llvm-svn: 23019	2005-08-24 17:55:32 +00:00
Chris Lattner	4201cd1bbc	Transform floor((double)FLT) -> (double)floorf(FLT), implementing Regression/Transforms/SimplifyLibCalls/floor.ll. This triggers 19 times in 177.mesa. llvm-svn: 23017	2005-08-24 17:22:17 +00:00
Chris Lattner	ea7dfd53d6	Fix Transforms/LoopStrengthReduce/2005-08-17-OutOfLoopVariant.ll, a crash on 177.mesa llvm-svn: 22843	2005-08-17 21:22:41 +00:00
Chris Lattner	2bf7cb5213	Use a new helper to split critical edges, making the code simpler. Do not claim to not change the CFG. We do change the cfg to split critical edges. This isn't causing us a problem now, but could likely do so in the future. llvm-svn: 22824	2005-08-17 06:35:16 +00:00
Chris Lattner	5cf983ee0f	Fix a bad case in gzip where we put lots of things in registers across the loop, because an IV-dependent value was used outside of the loop and didn't have immediate-folding capability llvm-svn: 22798	2005-08-16 00:38:11 +00:00
Chris Lattner	47d3ec3525	Ooops, don't forget to clear this. The real inner loop is now: .LBB_foo_3: ; no_exit.1 lfd f2, 0(r9) lfd f3, 8(r9) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r9) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfd f2, 0(r9) addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22782	2005-08-13 07:42:01 +00:00
Chris Lattner	5949d49032	Recursively scan scev expressions for common subexpressions. This allows us to handle nested loops much better, for example, by being able to tell that these two expressions: {( 8 + ( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp 12)}<loopentry.1> {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> Have the following common part that can be shared: {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> This allows us to codegen an important inner loop in 168.wupwise as: .LBB_foo_4: ; no_exit.1 lfd f2, 16(r9) fmul f3, f0, f2 fmul f2, f1, f2 fadd f4, f3, f2 stfd f4, 8(r9) fsub f2, f3, f2 stfd f2, 16(r9) addi r8, r8, 1 addi r9, r9, 16 cmpw cr0, r8, r4 ble .LBB_foo_4 ; no_exit.1 instead of: .LBB_foo_3: ; no_exit.1 lfdx f2, r6, r9 add r10, r6, r9 lfd f3, 8(r10) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r10) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfdx f2, r6, r9 addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22781	2005-08-13 07:27:18 +00:00
Chris Lattner	89c1dfc733	Teach SplitCriticalEdge to update LoopInfo if it is alive. This fixes a problem in LoopStrengthReduction, where it would split critical edges then confused itself with outdated loop information. llvm-svn: 22776	2005-08-13 01:38:43 +00:00
Chris Lattner	79396539d3	remove dead code. The exit block list is computed on demand, thus does not need to be updated. This code is a relic from when it did. llvm-svn: 22775	2005-08-13 01:30:36 +00:00
Chris Lattner	8447b49526	When splitting critical edges, make sure not to leave the new block in the middle of the loop. This turns a critical loop in gzip into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 bne .LBB_test_8 ; loopentry.loopexit_crit_edge .LBB_test_2: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 bne .LBB_test_7 ; shortcirc_next.0.loopexit_crit_edge .LBB_test_3: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 bne .LBB_test_6 ; shortcirc_next.1.loopexit_crit_edge .LBB_test_4: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry instead of this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_3: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 beq .LBB_test_5 ; shortcirc_next.1 .LBB_test_4: ; shortcirc_next.0.loopexit_crit_edge add r2, r11, r27 add r8, r12, r27 b .LBB_test_9 ; loopexit .LBB_test_5: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 beq .LBB_test_7 ; shortcirc_next.2 .LBB_test_6: ; shortcirc_next.1.loopexit_crit_edge add r2, r9, r27 add r8, r10, r27 b .LBB_test_9 ; loopexit .LBB_test_7: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry Next up, improve the code for the loop. llvm-svn: 22769	2005-08-12 22:22:17 +00:00
Chris Lattner	4fec86d348	Fix a FIXME: if we are inserting code for a PHI argument, split the critical edge so that the code is not always executed for both operands. This prevents LSR from inserting code into loops whose exit blocks contain PHI uses of IV expressions (which are outside of loops). On gzip, for example, we turn this ugly code: .LBB_test_1: ; loopentry add r27, r3, r28 lhz r27, 3(r27) add r26, r4, r28 lhz r26, 3(r26) add r25, r30, r28 ;; Only live if exiting the loop add r24, r29, r28 ;; Only live if exiting the loop cmpw cr0, r27, r26 bne .LBB_test_5 ; loopexit into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_2: ; shortcirc_next.0 ... blt .LBB_test_1 into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_t_3: ; shortcirc_next.0 .LBB_test_3: ; shortcirc_next.0 ... blt .LBB_test_1 Next step: get the block out of the loop so that the loop is all fall-throughs again. llvm-svn: 22766	2005-08-12 22:06:11 +00:00
Chris Lattner	b7ebe65c56	Change break critical edges to not remove, then insert, PHI node entries. Instead, just update the BB in-place. This is both faster, and it prevents split-critical-edges from shuffling the PHI argument list unnecessarily. llvm-svn: 22765	2005-08-12 21:58:07 +00:00
Chris Lattner	62df798919	remove some trickiness that broke yacr2 and some other programs last night llvm-svn: 22751	2005-08-10 17:15:20 +00:00
Chris Lattner	f83ce5faee	Make loop-simplify produce better loops by turning PHI nodes like X = phi [X, Y] into just Y. This often occurs when it separates loops that have collapsed loop headers. This implements LoopSimplify/phi-node-simplify.ll llvm-svn: 22746	2005-08-10 02:07:32 +00:00
Chris Lattner	677d85784a	Allow indvar simplify to canonicalize ANY affine IV, not just affine IVs with constant stride. This implements Transforms/IndVarsSimplify/variable-stride-ivs.ll llvm-svn: 22744	2005-08-10 01:12:06 +00:00
Chris Lattner	edff91a49a	Teach LSR to strength reduce IVs that have a loop-invariant but non-constant stride. For code like this: void foo(float *a, float *b, int n, int stride_a, int stride_b) { int i; for (i=0; i<n; i++) a[i*stride_a] = b[i*stride_b]; } we now emit: .LBB_foo2_2: ; no_exit lfs f0, 0(r4) stfs f0, 0(r3) addi r7, r7, 1 add r4, r2, r4 add r3, r6, r3 cmpw cr0, r7, r5 blt .LBB_foo2_2 ; no_exit instead of: .LBB_foo_2: ; no_exit mullw r8, r2, r7 ;; multiply! slwi r8, r8, 2 lfsx f0, r4, r8 mullw r8, r2, r6 ;; multiply! slwi r8, r8, 2 stfsx f0, r3, r8 addi r2, r2, 1 cmpw cr0, r2, r5 blt .LBB_foo_2 ; no_exit loops with variable strides occur pretty often. For example, in SPECFP2K there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp, 56 in 168.wupwise, 36 in 172.mgrid. Now we can allow indvars to turn functions written like this: void foo2(float *a, float *b, int n, int stride_a, int stride_b) { int i, ai = 0, bi = 0; for (i=0; i<n; i++) { a[ai] = b[bi]; ai += stride_a; bi += stride_b; } } into code like the above for better analysis. With this patch, they generate identical code. llvm-svn: 22740	2005-08-10 00:45:21 +00:00
Chris Lattner	dde7dc525e	Fix Regression/Transforms/LoopStrengthReduce/phi_node_update_multiple_preds.ll by being more careful about updating PHI nodes llvm-svn: 22739	2005-08-10 00:35:32 +00:00
Chris Lattner	c6c4d99a21	Fix some 80 column violations. Once we compute the evolution for a GEP, tell SE about it. This allows users of the GEP to know it, if the users are not direct. This allows us to compile this testcase: void fbSolidFillmmx(int w, unsigned char *d) { while (w >= 64) { *(unsigned long long *) (d + 0) = 0; *(unsigned long long *) (d + 8) = 0; *(unsigned long long *) (d + 16) = 0; *(unsigned long long *) (d + 24) = 0; *(unsigned long long *) (d + 32) = 0; *(unsigned long long *) (d + 40) = 0; *(unsigned long long *) (d + 48) = 0; *(unsigned long long *) (d + 56) = 0; w -= 64; d += 64; } } into: .LBB_fbSolidFillmmx_2: ; no_exit li r2, 0 stw r2, 0(r4) stw r2, 4(r4) stw r2, 8(r4) stw r2, 12(r4) stw r2, 16(r4) stw r2, 20(r4) stw r2, 24(r4) stw r2, 28(r4) stw r2, 32(r4) stw r2, 36(r4) stw r2, 40(r4) stw r2, 44(r4) stw r2, 48(r4) stw r2, 52(r4) stw r2, 56(r4) stw r2, 60(r4) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit instead of: .LBB_fbSolidFillmmx_2: ; no_exit li r11, 0 stw r11, 0(r4) stw r11, 4(r4) stwx r11, r10, r4 add r12, r10, r4 stw r11, 4(r12) stwx r11, r9, r4 add r12, r9, r4 stw r11, 4(r12) stwx r11, r8, r4 add r12, r8, r4 stw r11, 4(r12) stwx r11, r7, r4 add r12, r7, r4 stw r11, 4(r12) stwx r11, r6, r4 add r12, r6, r4 stw r11, 4(r12) stwx r11, r5, r4 add r12, r5, r4 stw r11, 4(r12) stwx r11, r2, r4 add r12, r2, r4 stw r11, 4(r12) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit llvm-svn: 22737	2005-08-09 23:39:36 +00:00
Chris Lattner	02742710f3	SCEVAddExpr::get() of an empty list is invalid. llvm-svn: 22724	2005-08-09 01:13:47 +00:00
Chris Lattner	a091ff1764	Implement: LoopStrengthReduce/share_ivs.ll Two changes: * Only insert one PHI node for each stride. Other values are live in values. This cannot introduce higher register pressure than the previous approach, and can take advantage of reg+reg addressing modes. * Factor common base values out of uses before moving values from the base to the immediate fields. This improves codegen by starting the stride-specific PHI node out at a common place for each IV use. As an example, we used to generate this for a loop in swim: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfd f0, 0(r8) stfd f0, 0(r3) lfd f0, 0(r6) stfd f0, 0(r7) lfd f0, 0(r2) stfd f0, 0(r5) addi r9, r9, 1 addi r2, r2, 8 addi r5, r5, 8 addi r6, r6, 8 addi r7, r7, 8 addi r8, r8, 8 addi r3, r3, 8 cmpw cr0, r9, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 now we emit: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfdx f0, r8, r2 stfdx f0, r9, r2 lfdx f0, r5, r2 stfdx f0, r7, r2 lfdx f0, r3, r2 stfdx f0, r6, r2 addi r10, r10, 1 addi r2, r2, 8 cmpw cr0, r10, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 As another more dramatic example, we used to emit this: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfd f0, 8(r21) lfd f4, 8(r3) lfd f5, 8(r27) lfd f6, 8(r22) lfd f7, 8(r5) lfd f8, 8(r6) lfd f9, 8(r30) lfd f10, 8(r11) lfd f11, 8(r12) fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfd f0, 8(r4) lfd f0, 8(r25) lfd f5, 8(r26) lfd f6, 8(r23) lfd f9, 8(r28) lfd f10, 8(r10) lfd f12, 8(r9) lfd f13, 8(r29) fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfd f0, 8(r24) lfd f0, 8(r8) fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfd f0, 8(r2) addi r20, r20, 1 addi r2, r2, 8 addi r8, r8, 8 addi r10, r10, 8 addi r12, r12, 8 addi r6, r6, 8 addi r29, r29, 8 addi r28, r28, 8 addi r26, r26, 8 addi r25, r25, 8 addi r24, r24, 8 addi r5, r5, 8 addi r23, r23, 8 addi r22, r22, 8 addi r3, r3, 8 addi r9, r9, 8 addi r11, r11, 8 addi r30, r30, 8 addi r27, r27, 8 addi r21, r21, 8 addi r4, r4, 8 cmpw cr0, r20, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 we now emit: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfdx f0, r21, r20 lfdx f4, r3, r20 lfdx f5, r27, r20 lfdx f6, r22, r20 lfdx f7, r5, r20 lfdx f8, r6, r20 lfdx f9, r30, r20 lfdx f10, r11, r20 lfdx f11, r12, r20 fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfdx f0, r4, r20 lfdx f0, r25, r20 lfdx f5, r26, r20 lfdx f6, r23, r20 lfdx f9, r28, r20 lfdx f10, r10, r20 lfdx f12, r9, r20 lfdx f13, r29, r20 fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfdx f0, r24, r20 lfdx f0, r8, r20 fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfdx f0, r2, r20 addi r19, r19, 1 addi r20, r20, 8 cmpw cr0, r19, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 llvm-svn: 22722	2005-08-09 00:18:09 +00:00
Chris Lattner	37c24cc98c	Suck the base value out of the UsersToProcess vector into the BasedUser class to simplify the code. Fuse two loops. llvm-svn: 22721	2005-08-08 22:56:21 +00:00
Chris Lattner	37ed895bf1	Split MoveLoopVariantsToImediateField out from MoveImmediateValues. The first is a correctness thing, and the latter is an optimization thing. This also is needed to support a future change. llvm-svn: 22720	2005-08-08 22:32:34 +00:00
Chris Lattner	9f269e40c9	Use the new 'moveBefore' method to simplify some code. Really, which is easier to understand? :) llvm-svn: 22706	2005-08-08 19:11:57 +00:00
Chris Lattner	14203e85b2	Not all constants are legal immediates in load/store instructions. llvm-svn: 22704	2005-08-08 06:25:50 +00:00
Chris Lattner	c70bbc0c41	Implement LoopStrengthReduce/share_code_in_preheader.ll by having one rewriter for all code inserted into the preheader, which is never flushed. llvm-svn: 22702	2005-08-08 05:47:49 +00:00
Chris Lattner	9bfa6f8784	Implement a simple optimization for the termination condition of the loop. The termination condition actually wants to use the post-incremented value of the loop, not a new indvar with an unusual base. On PPC, for example, this allows us to compile LoopStrengthReduce/exit_compare_live_range.ll to: _foo: li r2, 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r2, r2, 1 cmpw cr0, r2, r4 bne .LBB_foo_1 ; no_exit blr instead of: _foo: li r2, 1 ;; IV starts at 1, not 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r5, r2, 1 cmpw cr0, r2, r4 or r2, r5, r5 ;; Reg-reg copy, extra live range bne .LBB_foo_1 ; no_exit blr This implements LoopStrengthReduce/exit_compare_live_range.ll llvm-svn: 22699	2005-08-08 05:28:22 +00:00
Chris Lattner	579b20b747	All stats are "Number of ..." llvm-svn: 22694	2005-08-07 20:02:04 +00:00
Chris Lattner	2c14cf7b74	Add some simple folds that occur in bitfield cases. Fix a minor bug in isHighOnes, where it would consider 0 to have high ones. llvm-svn: 22693	2005-08-07 07:03:10 +00:00
Chris Lattner	134ebd0801	Fix typo llvm-svn: 22692	2005-08-07 07:00:52 +00:00
Chris Lattner	f4dd8c445c	* Use the new PHINode::hasConstantValue method to simplify some code * Teach this code to move allocas out of the loop when tail call eliminating a call marked 'tail'. This implements TailCallElim/move_alloca_for_tail_call.ll * Do not perform this transformation if a call is marked 'tail' and if there are allocas that we cannot move out of the loop in #2. Doing so would increase the stack usage of the function. This fixes PR615 and TailCallElim/dont-tce-tail-marked-call.ll. llvm-svn: 22690	2005-08-07 04:27:41 +00:00
Chris Lattner	11e7a5eda7	Make sure to clean CastedPointers after casts are potentially deleted. This fixes LSR crashes on 301.apsi, 191.fma3d, and 189.lucas llvm-svn: 22673	2005-08-05 01:30:11 +00:00
Chris Lattner	9f9c260b8c	now that hasConstantValue defaults to only returning values that dominate the PHI node, this ugly code can vanish. llvm-svn: 22672	2005-08-05 01:04:30 +00:00
Chris Lattner	257efb2ad3	This code can handle non-dominating instructions llvm-svn: 22667	2005-08-05 00:57:45 +00:00
Nate Begeman	b392321cae	Fix a fixme in CondPropagate.cpp by moving a PhiNode optimization into BasicBlock's removePredecessor routine. This requires shuffling around the definition and implementation of hasConstantValue from Utils.h,cpp into Instructions.h,cpp llvm-svn: 22664	2005-08-04 23:24:19 +00:00
Chris Lattner	45f8b6e7aa	Modify how immediates are removed from base expressions to deal with the fact that the symbolic evaluator is not always able to use subtraction to remove expressions. This makes the code faster, and fixes the last crash on 178.galgel. Finally, add a statistic to see how many phi nodes are inserted. On 178.galgel, we get the following stats: 2562 loop-reduce - Number of PHIs inserted 3927 loop-reduce - Number of GEPs strength reduced llvm-svn: 22662	2005-08-04 22:34:05 +00:00
Chris Lattner	a6d7c355bc	* Refactor some code into a new BasedUser::RewriteInstructionToUseNewBase method. * Fix a crash on 178.galgel, where we would insert expressions before PHI nodes instead of into the PHI node predecessor blocks. llvm-svn: 22657	2005-08-04 20:03:32 +00:00
Chris Lattner	0f7c0fa2a7	Fix a case that caused this to crash on 178.galgel llvm-svn: 22653	2005-08-04 19:26:19 +00:00
Chris Lattner	acc42c4df1	Teach LSR about loop-variant expressions, such as loops like this: for (i = 0; i < N; ++i) A[i][foo()] = 0; here we still want to strength reduce the A[i] part, even though foo() is l-v. This also simplifies some of the 'CanReduce' logic. This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll llvm-svn: 22652	2005-08-04 19:08:16 +00:00
Nate Begeman	456044b724	Remove some more dead code. llvm-svn: 22650	2005-08-04 18:13:56 +00:00
Chris Lattner	eaf24725b2	Refactor this code substantially with the following improvements: 1. We only analyze instructions once, guaranteed 2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with something much simpler. The next step is to handle expressions that are not all indvar+loop-invariant values (e.g. handling indvar+loopvariant). llvm-svn: 22649	2005-08-04 17:40:30 +00:00
Chris Lattner	6f286b760f	refactor some code llvm-svn: 22643	2005-08-04 01:19:13 +00:00
Chris Lattner	6510749050	invert two if's to make the logic simpler llvm-svn: 22641	2005-08-04 00:40:47 +00:00
Chris Lattner	a0102fbc4f	When processing outer loops and we find uses of an IV in inner loops, make sure to handle the use, just don't recurse into it. This permits us to generate this code for a simple nested loop case: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r29, 44(r1) stw r30, 40(r1) mflr r11 stw r11, 56(r1) lis r2, ha16(L_A$non_lazy_ptr) lwz r30, lo16(L_A$non_lazy_ptr)(r2) li r29, 1 .LBB_foo_1: ; no_exit.0 bl L_bar$stub li r2, 1 or r3, r30, r30 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r3) stfd f0, 0(r3) addi r4, r2, 1 addi r3, r3, 8 cmpwi cr0, r2, 100 or r2, r4, r4 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r30, r30, 800 addi r2, r29, 1 cmpwi cr0, r29, 100 or r29, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 40(r1) lwz r29, 44(r1) lwz r1, 0(r1) blr instead of this: _foo: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r28, 44(r1) ;; uses an extra register. stw r29, 40(r1) stw r30, 36(r1) mflr r11 stw r11, 56(r1) li r30, 1 li r29, 0 or r28, r29, r29 .LBB_foo_1: ; no_exit.0 bl L_bar$stub mulli r2, r28, 800 ;; unstrength-reduced multiply lis r3, ha16(L_A$non_lazy_ptr) ;; loop invariant address computation lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 mulli r4, r29, 800 ;; unstrength-reduced multiply addi r3, r3, 8 add r3, r4, r3 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 ;; multiple stride 8 IV's addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r28, r28, 1 ;;; Many IV's with stride 1 addi r29, r29, 1 addi r2, r30, 1 cmpwi cr0, r30, 100 or r30, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 36(r1) lwz r29, 40(r1) lwz r28, 44(r1) lwz r1, 0(r1) blr llvm-svn: 22640	2005-08-04 00:14:11 +00:00
Chris Lattner	fc62470466	Teach loop-reduce to see into nested loops, to pull out immediate values pushed down by SCEV. In a nested loop case, this allows us to emit this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 li r3, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r2) ;; Uses offset of 8 instead of 0 stfd f0, 0(r2) addi r4, r3, 1 addi r2, r2, 8 cmpwi cr0, r3, 100 or r3, r4, r4 bne .LBB_foo_2 ; no_exit.1 instead of this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 addi r3, r3, 8 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 llvm-svn: 22639	2005-08-03 23:44:42 +00:00
Chris Lattner	bb78c97e24	improve debug output llvm-svn: 22638	2005-08-03 23:30:08 +00:00
Chris Lattner	db23c74e5e	Move from Stage 0 to Stage 1. Only emit one PHI node for IV uses with identical bases and strides (after moving foldable immediates to the load/store instruction). This implements LoopStrengthReduce/dont_insert_redundant_ops.ll, allowing us to generate this PPC code for test1: or r30, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r30) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop instead of this code: or r30, r3, r3 or r29, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r29) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 ;; Two iv's with step of 8 addi r29, r29, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop llvm-svn: 22635	2005-08-03 22:51:21 +00:00
Chris Lattner	430d0022df	Rename IVUse to IVUsersOfOneStride, use a struct instead of a pair to unify some parallel vectors and get field names more descriptive than "first" and "second". This isn't lisp after all :) llvm-svn: 22633	2005-08-03 22:21:05 +00:00
Chris Lattner	84e9baa925	Fix a nasty dangling pointer issue. The ScalarEvolution pass would keep a map from instruction* to SCEVHandles. When we delete instructions, we have to tell it about it. We would run into nasty cases where new instructions were reallocated at old instruction addresses and get the old map values. Bad bad bad :( llvm-svn: 22632	2005-08-03 21:36:09 +00:00
Chris Lattner	3de05cc930	The correct fix for PR612, which also fixes Transforms/LowerInvoke/2005-08-03-InvokeWithPHIUse.ll llvm-svn: 22628	2005-08-03 18:51:44 +00:00
Chris Lattner	f8a81a9886	When inserting code, make sure not to insert it before PHI nodes. This fixes PR612 and Transforms/LowerInvoke/2005-08-03-InvokeWithPHI.ll llvm-svn: 22626	2005-08-03 18:34:29 +00:00
Chris Lattner	d683bdd0f8	Fix Transforms/SimplifyCFG/2005-08-03-PHIFactorCrash.ll, a problem that occurred while bugpointing another testcase llvm-svn: 22621	2005-08-03 17:59:45 +00:00
Chris Lattner	2dbf1960ff	Finally, add the required constraint checks to fix Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll the right way llvm-svn: 22615	2005-08-03 00:59:12 +00:00
Chris Lattner	908036942c	Simplify some code, add the correct pred checks llvm-svn: 22613	2005-08-03 00:38:27 +00:00
Chris Lattner	982b75c061	Refactor code out of PropagatePredecessorsForPHIs, turning it into a pure function with no side-effects llvm-svn: 22612	2005-08-03 00:29:26 +00:00
Chris Lattner	1f047fd513	use splice instead of remove/insert to avoid some symtab operations llvm-svn: 22611	2005-08-03 00:23:42 +00:00
Chris Lattner	76dc204488	move two functions up in the file, use SafeToMergeTerminators to eliminate some duplicated code llvm-svn: 22610	2005-08-03 00:19:45 +00:00
Chris Lattner	733d6704ce	Rip some code out of the main SimplifyCFG function into a subfunction and call it from the only place it is live. No functionality changes. llvm-svn: 22609	2005-08-03 00:11:16 +00:00
Chris Lattner	ac594de8dc	Disable this patch: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050801/027345.html This breaks real programs and only fixes an obscure regression testcase. A real fix is in development. llvm-svn: 22606	2005-08-02 23:31:38 +00:00
Chris Lattner	eee90f7eb4	Change a place to use an arbitrary value instead of null, when possible llvm-svn: 22605	2005-08-02 23:29:23 +00:00
Chris Lattner	22d00a8e90	Update to use the new MathExtras.h support for log2 computation. Patch contributed by Jim Laskey! llvm-svn: 22592	2005-08-02 19:16:58 +00:00
Chris Lattner	351b891cbc	Like the comment says, do not insert cast instructions before phi nodes llvm-svn: 22586	2005-08-02 03:31:14 +00:00
Chris Lattner	4fd3e16cbd	This code was very close, but not quite right. It did not take into consideration the case where a reference in an unreachable block could occur. This fixes Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll, something I ran into while bugpoint'ing another pass. llvm-svn: 22584	2005-08-02 03:24:05 +00:00
Chris Lattner	75a44e154e	add a comment, make a check more lenient llvm-svn: 22581	2005-08-02 02:52:02 +00:00
Chris Lattner	dcce49e006	Simplify for loop, clear a per-loop map after processing each loop llvm-svn: 22580	2005-08-02 02:44:31 +00:00
Chris Lattner	9ef1294210	Add a comment. Make LSR ignore GEPs that have loop-variant base values, as we currently cannot codegen them llvm-svn: 22576	2005-08-02 01:32:29 +00:00
Chris Lattner	564900e5e5	Fix an iterator invalidation problem llvm-svn: 22575	2005-08-02 00:41:11 +00:00
Chris Lattner	e17c5d0e59	ConstantInt::get only works for arguments < 128. SimplifyLibCalls probably has to be audited to make sure it does not make this mistake elsewhere. Also, if this code knows that the type will be unsigned, obviously one arm of this is dead. Reid, can you take a look into this further? llvm-svn: 22566	2005-08-01 16:52:50 +00:00
Jeff Cohen	546fd5944e	Keep tabs and trailing spaces out. llvm-svn: 22565	2005-07-30 18:33:25 +00:00
Jeff Cohen	c500991055	Fix VC++ build problems. llvm-svn: 22564	2005-07-30 18:22:27 +00:00
Nate Begeman	17a0e2afea	Ack, typo llvm-svn: 22560	2005-07-30 00:21:31 +00:00
Nate Begeman	e68bcd1946	Commit a new LoopStrengthReduce pass that can use scalar evolutions and target data to decide which loop induction variables to strength reduce and how to do so. This work is mostly by Chris Lattner, with tweaks by me to get it working on some of MultiSource. llvm-svn: 22558	2005-07-30 00:15:07 +00:00
Nate Begeman	2bca4d9b7b	Break SCEVExpander out of IndVarSimplify into its own .h/.cpp file so that other passes may use it. llvm-svn: 22557	2005-07-30 00:12:19 +00:00
Jeff Cohen	5f4ef3c5a8	Eliminate all remaining tabs and trailing spaces. llvm-svn: 22523	2005-07-27 06:12:32 +00:00
Chris Lattner	31d0ac2414	ConvertibleToGEP always returns 0, remove some old crufty code which is actually dead because of this! llvm-svn: 22515	2005-07-26 16:38:28 +00:00
Chris Lattner	18aa4d8196	Do not let MaskedValueIsZero consider undef to be zero, for reasons explained in the comment. This fixes UnitTests/2003-09-18-BitFieldTest on darwin llvm-svn: 22483	2005-07-20 18:49:28 +00:00
Chris Lattner	247aef884c	When transforming &A[i] < &A[j] -> i < j, make sure to perform the comparison as a signed compare. This patch may fix PR597, but is correct in any case. llvm-svn: 22465	2005-07-18 23:07:33 +00:00
Chris Lattner	4ed40f7c6f	Fix a problem that instcombine would hit when dealing with unreachable code. Because the instcombine has to scan the entire function when it starts up to begin with, we might as well do it in DFO so we can nuke unreachable code. This fixes: Transforms/InstCombine/2005-07-07-DeadPHILoop.ll llvm-svn: 22348	2005-07-07 20:40:38 +00:00
Chris Lattner	937c71f2b3	Fix PR590 and Transforms/Mem2Reg/2005-06-30-ReadBeforeWrite.ll. The optimization for locally used allocas was not safe for allocas that were read before they were written. This change disables that optimization in that case. llvm-svn: 22318	2005-06-30 07:29:44 +00:00
John Criswell	810b4f8d55	Doh! Forgot to LLVMify the style. llvm-svn: 22312	2005-06-29 15:57:50 +00:00
John Criswell	4642afdcc1	Basic fix for PR#591; don't convert an fprintf() to an fwrite() if there is a mismatch in their character type pointers (i.e. fprintf() prints an array of ubytes while fwrite() takes an array of sbytes). We can probably do better than this (such as casting the ubyte to an sbyte). llvm-svn: 22310	2005-06-29 15:03:18 +00:00
Chris Lattner	9610c6f287	add a debug type llvm-svn: 22277	2005-06-24 16:00:46 +00:00
Andrew Lenharth	cf52eb2b99	prevent va_arg from being hoisted from a loop llvm-svn: 22265	2005-06-20 13:36:33 +00:00
Andrew Lenharth	d4b103107e	prevent DCE of vaarg intrinsics. This should take care of most regressions llvm-svn: 22263	2005-06-19 14:41:20 +00:00
Andrew Lenharth	9144ec4764	core changes for varargs llvm-svn: 22254	2005-06-18 18:34:52 +00:00
Reid Spencer	a7828baa3c	Fix a problem with the strcmp optimization checking the wrong string and not casting to the correct type. llvm-svn: 22250	2005-06-18 17:46:28 +00:00
Reid Spencer	4fdd96c4e0	Clean up some uninitialized variables and missing return statements that GCC 4.0.0 compiler (sometimes incorrectly) warns about under release build. llvm-svn: 22249	2005-06-18 17:37:34 +00:00
Chris Lattner	2ceb6ee576	This is not true: (X != 13 | X < 15) -> X < 15. It is actually always true. This fixes PR586 and Transforms/InstCombine/2005-06-16-SetCCOrSetCCMiscompile.ll llvm-svn: 22236	2005-06-17 03:59:17 +00:00
Chris Lattner	73bcba5f61	Don't crash when dealing with INTMIN. This fixes PR585 and Transforms/InstCombine/2005-06-16-RangeCrash.ll llvm-svn: 22234	2005-06-17 02:05:55 +00:00
Chris Lattner	5e735294bf	Don't crash on: X = phi (X, X). This fixes PR584 and Transforms/SimplifyCFG/2005-06-16-PHICrash.ll llvm-svn: 22232	2005-06-17 01:45:53 +00:00
Chris Lattner	c53cb9d3ff	avoid constructing out of range shift amounts. llvm-svn: 22230	2005-06-17 01:29:28 +00:00
Chris Lattner	89dc4f16f5	Fix PR583 and testcase Transforms/InstCombine/2005-06-15-DivSelectCrash.ll llvm-svn: 22227	2005-06-16 04:55:52 +00:00
Chris Lattner	252a845e30	Fix PR571, removing code that does just the WRONG thing :) llvm-svn: 22225	2005-06-16 03:00:08 +00:00
Chris Lattner	104002bee3	Fix a bug in my previous patch. Do not get the shift amount type (which is always ubyte); get the type being shifted. This unbreaks espresso llvm-svn: 22224	2005-06-16 01:52:07 +00:00
Chris Lattner	d48b127aea	Fix PR575, patch provided by John Mellor-Crummey. Thanks! llvm-svn: 22223	2005-06-15 22:49:30 +00:00
Chris Lattner	df81539278	Fix PR582. The rewriter can move casts around, which invalidated the BB iterator. This fixes Transforms/IndVarsSimplify/2005-06-15-InstMoveCrash.ll llvm-svn: 22221	2005-06-15 21:29:31 +00:00
Chris Lattner	50bdfcb045	Do not promote globals only used by main to locals if there are constantexprs or other uses hanging off of them. llvm-svn: 22219	2005-06-15 21:11:48 +00:00
Chris Lattner	19b57f55aa	Fix PR577 and testcase InstCombine/2005-06-15-ShiftSetCCCrash.ll. Do not perform undefined out of range shifts. llvm-svn: 22217	2005-06-15 20:53:31 +00:00
Reid Spencer	a299d6f701	Put the hack back in that removes features, causes regressions to fail, but allows test programs to succeed. Actual fix for this is forthcoming. llvm-svn: 22213	2005-06-15 18:25:30 +00:00
Reid Spencer	6d231e55fa	Unbreak several InstCombine regression checks introduced by a hack to fix the bzip2 test. A better hack is needed. llvm-svn: 22209	2005-06-13 06:41:26 +00:00
Chris Lattner	1609a541cd	Fix a 64-bit problem, passing (int)0 through ... instead of (void*)0 llvm-svn: 22206	2005-06-09 03:32:54 +00:00
Chris Lattner	fbc45f10d0	Fix a problem on 64-bit targets where we passed (int)0 through ... instead of (void*)0. llvm-svn: 22205	2005-06-09 02:59:00 +00:00
Andrew Lenharth	ffe65458e7	hack to fix bzip2 (bug 571) llvm-svn: 22192	2005-06-04 12:43:56 +00:00
Reid Spencer	9fbad13dd7	Make the registration hash_map static. No other module needs it. Also, document what it's for a little better. llvm-svn: 22164	2005-05-21 01:27:04 +00:00
Reid Spencer	0b13cdabae	Adjust the file comment to read a little easier. llvm-svn: 22163	2005-05-21 00:57:44 +00:00
Reid Spencer	45bb4afc79	Make sure ... arguments are cast to sbyte* where needed. llvm-svn: 22162	2005-05-21 00:39:30 +00:00
Reid Spencer	895af9ef24	Add a "brief" comment for CastToCStr llvm-svn: 22161	2005-05-21 00:23:23 +00:00
Chris Lattner	f8053cee7c	Fix mismatched type problem that crashed on cases like this: sprintf(P, "%s", X); Where X is not an sbyte*. This fixes the bug JohnMC reported on llvm-bugs. llvm-svn: 22159	2005-05-20 22:22:25 +00:00
Chris Lattner	19f9f32a5c	Fix Transforms/SimplifyCFG/switch-simplify-crash.ll llvm-svn: 22158	2005-05-20 22:19:54 +00:00
Chris Lattner	05deb04cb0	teach the inliner about coldcc and noreturn functions llvm-svn: 22113	2005-05-18 04:30:33 +00:00
Reid Spencer	74305a6233	Don't look for __builtin_ffs, we'll never see it from llvm-gcc and there's no reason to include it for other front ends. llvm-svn: 22070	2005-05-15 21:27:34 +00:00
Reid Spencer	17f7784c5d	Provide this optimization as well: ffs(x) -> (x == 0 ? 0 : 1+llvm.cttz(x)) llvm-svn: 22068	2005-05-15 21:19:45 +00:00
Reid Spencer	3de98ee643	Duh .. you actually have to #include Config/config.h before you can test for one of the values that it defines! llvm-svn: 22058	2005-05-15 17:20:47 +00:00
Reid Spencer	b195fcd5ef	Changes for ffs lib call simplification: * Check for availability of ffsll call in configure script * Support ffs, ffsl, and ffsll conversion to constant value if the argument is constant. llvm-svn: 22027	2005-05-14 16:42:52 +00:00
Chris Lattner	403d1c204c	Preserve calling conv when hacking on calls llvm-svn: 22025	2005-05-14 12:28:32 +00:00
Chris Lattner	05c703ea85	preserve calling conventions when hacking on code llvm-svn: 22024	2005-05-14 12:25:32 +00:00
Chris Lattner	bcefcf8552	Make sure to preserve the calling convention when changing an invoke into a call. This fixes Prolangs-C++/deriv2, kimwitu++, and Misc-C++/bigfib on X86 with -enable-x86-fastcc. llvm-svn: 22023	2005-05-14 12:21:56 +00:00
Chris Lattner	61d9d81770	calling a function with the wrong CC is undefined, turn it into an unreachable instruction. This is useful for catching optimizers that don't preserve calling conventions llvm-svn: 21928	2005-05-13 07:09:09 +00:00
Chris Lattner	ca968393ab	When lowering invokes to calls, make sure to preserve the calling conv. This fixes Ptrdist/anagram with x86 llcbeta llvm-svn: 21925	2005-05-13 06:27:02 +00:00
Chris Lattner	ae186e012c	Prefer int 0 instead of long 0 for GEP arguments. llvm-svn: 21924	2005-05-13 06:10:12 +00:00
Chris Lattner	31c667e234	Fix Reassociate/shifttest.ll llvm-svn: 21839	2005-05-10 03:39:25 +00:00
Chris Lattner	bfc796f622	If a function contains no allocas, all of the calls in it are trivially suitable for tail calls. llvm-svn: 21836	2005-05-09 23:51:13 +00:00
Chris Lattner	b62f5082c5	implement and.ll:test33 llvm-svn: 21809	2005-05-09 04:58:36 +00:00
Chris Lattner	d0525a29d1	Preserve calling conventions when doing IPO llvm-svn: 21798	2005-05-09 01:05:50 +00:00
Chris Lattner	21d1dde72a	wrap long lines, preserve calling conventions when cloning functions and turning calls into invokes llvm-svn: 21797	2005-05-09 01:04:34 +00:00
Chris Lattner	a4c8022caf	Convert non-address taken functions with C calling conventions to fastcc. llvm-svn: 21791	2005-05-08 22:18:06 +00:00
Chris Lattner	df3332660f	Implement Reassociate/mul-neg-add.ll llvm-svn: 21788	2005-05-08 21:41:35 +00:00
Chris Lattner	c4f8e2b0ed	Bail out earlier llvm-svn: 21786	2005-05-08 21:33:47 +00:00
Chris Lattner	877b114037	Teach reassociate that 0-X === X*-1 llvm-svn: 21785	2005-05-08 21:28:52 +00:00
Chris Lattner	9f284e0a3c	Fix PR557 and basictest[34].ll. This makes reassociate realize that loads should be treated as unmovable, and gives distinct ranks to distinct values defined in the same basic block, allowing reassociate to do its thing. llvm-svn: 21783	2005-05-08 20:57:04 +00:00
Chris Lattner	9187f3905e	Add debugging information llvm-svn: 21781	2005-05-08 20:09:57 +00:00
Chris Lattner	08582be283	eliminate gotos llvm-svn: 21780	2005-05-08 19:48:43 +00:00
Chris Lattner	5847e5e10c	Improve reassociation handling of inverses, implementing inverses.ll. llvm-svn: 21778	2005-05-08 18:59:37 +00:00
Chris Lattner	4922118dc4	clean up and modernize this pass. llvm-svn: 21776	2005-05-08 18:45:26 +00:00
Chris Lattner	b18dbbfff5	Strength reduce SAR into SHR if there is no way sign bits could be shifted in. This tends to get cases like this: X = cast ubyte to int Y = shr int X, ... Tested by: shift.ll:test24 llvm-svn: 21775	2005-05-08 17:34:56 +00:00
Chris Lattner	e1850b86b6	Refactor some code llvm-svn: 21772	2005-05-08 00:19:31 +00:00
Chris Lattner	6e2086d7e4	Handle some simple cases where we can see that values get annihilated. llvm-svn: 21771	2005-05-08 00:08:33 +00:00
Chris Lattner	4294cec0f1	Fix a miscompilation of crafty by clobbering the "A" variable. llvm-svn: 21770	2005-05-07 23:49:08 +00:00
Chris Lattner	1e5065052a	Rewrite the guts of the reassociate pass to be more efficient and logical. Instead of trying to do local reassociation tweaks at each level, only process an expression tree once (at its root). This does not improve the reassociation pass in any real way. llvm-svn: 21768	2005-05-07 21:59:39 +00:00
Reid Spencer	170ae7ff70	* Add two strlen optimizations: strlen(x) != 0 -> *x != 0 strlen(x) == 0 -> *x == 0 * Change nested statistics to use style of other LLVM statistics so that only the name of the optimization (simplify-libcalls) is used as the statistic name, and the description indicates which specific call is optimized. Cuts down on some redundancy and saves a few bytes of space. * Make note of stpcpy optimization that could be done. llvm-svn: 21766	2005-05-07 20:15:59 +00:00
Reid Spencer	4f01a822b4	Don't increment the counter unless the debug flag is set. llvm-svn: 21762	2005-05-07 04:59:45 +00:00
Chris Lattner	cea579932d	Convert shifts to muls to assist reassociation. This implements Reassociate/shifttest.ll llvm-svn: 21761	2005-05-07 04:24:13 +00:00
Chris Lattner	f43e974abd	Simplify the code and rearrange it. No major functionality changes here. llvm-svn: 21759	2005-05-07 04:08:02 +00:00
Chris Lattner	7effa0ed06	BAD typo which caused many testsuite failures last night. Note to self, do not change code after testing it without retesting! llvm-svn: 21741	2005-05-06 17:13:16 +00:00
Chris Lattner	6aacb0f9da	Preserve tail marker llvm-svn: 21737	2005-05-06 06:48:21 +00:00
Chris Lattner	9f3dced2c7	Implement Transforms/Inline/inline-tail.ll llvm-svn: 21736	2005-05-06 06:47:52 +00:00
Chris Lattner	324d2eedb2	preserve the tail marker llvm-svn: 21734	2005-05-06 06:46:58 +00:00
Chris Lattner	53db546b97	Wrap long lines llvm-svn: 21720	2005-05-06 05:34:40 +00:00
Chris Lattner	a36d525741	DCE intrinsic instructions without side effects. llvm-svn: 21719	2005-05-06 05:27:34 +00:00
Chris Lattner	ef298a3b8a	Teach instcombine to propagate zeroness through shl instructions, implementing and.ll:test31 llvm-svn: 21717	2005-05-06 04:53:20 +00:00
Chris Lattner	873804168e	Implement shift.ll:test23. If we are shifting right then immediately truncating the result, turn signed shift rights into unsigned shift rights if possible. This leads to later simplification and happens often in 176.gcc. For example, this testcase: struct xxx { unsigned int code : 8; }; enum codes { A, B, C, D, E, F }; int foo(struct xxx *P) { if ((enum codes)P->code == A) bar(); } used to be compiled to: int %foo(%struct.xxx* %P) { %tmp.1 = getelementptr %struct.xxx* %P, int 0, uint 0 ; <uint> [#uses=1] %tmp.2 = load uint %tmp.1 ; <uint> [#uses=1] %tmp.3 = cast uint %tmp.2 to int ; <int> [#uses=1] %tmp.4 = shl int %tmp.3, ubyte 24 ; <int> [#uses=1] %tmp.5 = shr int %tmp.4, ubyte 24 ; <int> [#uses=1] %tmp.6 = cast int %tmp.5 to sbyte ; <sbyte> [#uses=1] %tmp.8 = seteq sbyte %tmp.6, 0 ; <bool> [#uses=1] br bool %tmp.8, label %then, label %UnifiedReturnBlock Now it is compiled to: %tmp.1 = getelementptr %struct.xxx* %P, int 0, uint 0 ; <uint> [#uses=1] %tmp.2 = load uint %tmp.1 ; <uint> [#uses=1] %tmp.2 = cast uint %tmp.2 to sbyte ; <sbyte> [#uses=1] %tmp.8 = seteq sbyte %tmp.2, 0 ; <bool> [#uses=1] br bool %tmp.8, label %then, label %UnifiedReturnBlock which is the difference between this: foo: subl $4, %esp movl 8(%esp), %eax movl (%eax), %eax shll $24, %eax sarl $24, %eax testb %al, %al jne .LBBfoo_2 and this: foo: subl $4, %esp movl 8(%esp), %eax movl (%eax), %eax testb %al, %al jne .LBBfoo_2 This occurs 3243 times total in the External tests, 215x in povray, 6x in each f2c'd program, 1451x in 176.gcc, 7x in crafty, 20x in perl, 25x in gap, 3x in m88ksim, 25x in ijpeg. Maybe this will cause a little jump on gcc tomorrow :) llvm-svn: 21715	2005-05-06 04:18:52 +00:00
Chris Lattner	7208616ec0	Implement xor.ll:test22 llvm-svn: 21713	2005-05-06 02:07:39 +00:00
Chris Lattner	4c2d3781aa	implement and.ll:test30 and set.ll:test21 llvm-svn: 21712	2005-05-06 01:53:19 +00:00
Chris Lattner	dd1e562ec3	implement or.ll:test20 llvm-svn: 21709	2005-05-06 00:58:50 +00:00
Chris Lattner	807aa20f67	Fix a bug compiling Ruby, fixing this testcase: LowerSetJmp/2005-05-05-OldUses.ll llvm-svn: 21696	2005-05-05 15:47:43 +00:00
Chris Lattner	809dfac421	Instcombine: cast (X != 0) to int, cast (X == 1) to int -> X iff X has only the low bit set. This implements set.ll:test20. This triggers 2x on povray, 9x on mesa, 11x on gcc, 2x on crafty, 1x on eon, 6x on perlbmk and 11x on m88ksim. It allows us to compile these two functions into the same code: struct s { unsigned int bit : 1; }; unsigned foo(struct s *p) { if (p->bit) return 1; else return 0; } unsigned bar(struct s *p) { return p->bit; } llvm-svn: 21690	2005-05-04 19:10:26 +00:00
Reid Spencer	282d057485	Implement the IsDigitOptimization for simplifying calls to the isdigit library function: isdigit(chr) -> 0 or 1 if chr is constant isdigit(chr) -> chr - '0' <= 9 otherwise Although there are many calls to isdigit in llvm-test, most of them are compiled away by macros leaving only this: 2 MultiSource/Applications/hexxagon llvm-svn: 21688	2005-05-04 18:58:28 +00:00
Reid Spencer	1e520fd661	* Correct the function prototypes for some of the functions to match the actual spec (int -> uint) * Add the ability to get/cache the strlen function prototype. * Make sure generated values are appropriately named for debugging purposes * Add the SPrintFOptimization for 4 cases of sprintf optimization: sprintf(str,cstr) -> llvm.memcpy(str,cstr) (if cstr has no %) sprintf(str,"") -> store sbyte 0, str sprintf(str,"%s",src) -> llvm.memcpy(str,src) (if src is constant) sprintf(str,"%c",chr) -> store chr, str ; store sbyte 0, str+1 The sprintf optimization didn't fire as much as I had hoped: 2 MultiSource/Applications/SPASS 5 MultiSource/Benchmarks/McCat/18-imp 22 MultiSource/Benchmarks/Prolangs-C/TimberWolfMC 1 MultiSource/Benchmarks/Prolangs-C/assembler 6 MultiSource/Benchmarks/Prolangs-C/unix-smail 2 MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec llvm-svn: 21679	2005-05-04 03:20:21 +00:00
Reid Spencer	38cabd7265	Implement optimizations for the strchr and llvm.memset library calls. Neither of these activated as many times as was hoped: strchr: 9 MultiSource/Applications/siod 1 MultiSource/Applications/d 2 MultiSource/Prolangs-C/archie-client 1 External/SPEC/CINT2000/176.gcc/176.gcc llvm.memset: no hits llvm-svn: 21669	2005-05-03 07:23:44 +00:00
Reid Spencer	95d8efdfcf	Avoid garbage output in the statistics display by ensuring that the strings passed to Statistic's constructor are not destructable. The stats are printed during static destruction and the SimplifyLibCalls module was getting destructed before the statistics. llvm-svn: 21661	2005-05-03 02:54:54 +00:00
Reid Spencer	49fa070401	Add the StrNCmpOptimization which is similar to strcmp. Unfortunately, this optimization didn't trigger on any llvm-test tests. llvm-svn: 21660	2005-05-03 01:43:45 +00:00
Reid Spencer	2d5c7beebd	Implement the fprintf optimization which converts calls like this: fprintf(F,"hello") -> fwrite("hello",strlen("hello"),1,F) fprintf(F,"%s","hello") -> fwrite("hello",strlen("hello"),1,F) fprintf(F,"%c",'x') -> fputc('x',F) This optimization fires several times in llvm-test: 313 MultiSource/Applications/Burg 302 MultiSource/Benchmarks/Prolangs-C/TimberWolfMC 189 MultiSource/Benchmarks/Prolangs-C/mybison 175 MultiSource/Benchmarks/Prolangs-C/football 130 MultiSource/Benchmarks/Prolangs-C/unix-tbl llvm-svn: 21657	2005-05-02 23:59:26 +00:00
John Criswell	f42ed7bdaf	Fixed a comment. llvm-svn: 21653	2005-05-02 14:47:42 +00:00
Chris Lattner	a816eee427	Implement getelementptr.ll:test11 llvm-svn: 21647	2005-05-01 04:42:15 +00:00
Chris Lattner	a9d84e3388	Check for volatile loads only once. Implement load.ll:test7 llvm-svn: 21645	2005-05-01 04:24:53 +00:00
Reid Spencer	16449a9eb0	Fix a comment that stated the wrong thing. llvm-svn: 21638	2005-04-30 06:45:47 +00:00
Reid Spencer	4c444fe007	* Don't depend on "guessing" what a FILE* is, just require that the actual type be obtained from a CallInst we're optimizing. * Make it possible for getConstantStringLength to return the ConstantArray that it extracts in case the content is needed by an Optimization. * Implement the strcmp optimization * Implement the toascii optimization This pass is now firing several to many times in the following MultiSource tests: Applications/Burg - 7 (strcat,strcpy) Applications/siod - 13 (strcat,strcpy,strlen) Applications/spiff - 120 (exit,fputs,strcat,strcpy,strlen) Applications/treecc - 66 (exit,fputs,strcat,strcpy) Applications/kimwitu++ - 34 (strcmp,strcpy,strlen) Applications/SPASS - 588 (exit,fputs,strcat,strcpy,strlen) llvm-svn: 21626	2005-04-30 03:17:54 +00:00
Reid Spencer	9361697f93	Implement the optimizations for "pow" and "fputs" library calls. llvm-svn: 21618	2005-04-29 09:39:47 +00:00
Reid Spencer	c968ea0495	Remove optimizations that don't require both operands to be constant. These are moved to simplify-libcalls pass. llvm-svn: 21614	2005-04-29 05:55:35 +00:00
Jeff Cohen	4bc952f703	Consistently use 'class' to silence VC++ llvm-svn: 21612	2005-04-29 03:05:44 +00:00
Reid Spencer	ed55a6b5e0	* Add constant folding for additional floating point library calls such as sinh, cosh, etc. * Make the name comparisons for the fp libcalls a little more efficient by switching on the first character of the name before doing comparisons. llvm-svn: 21611	2005-04-28 23:01:59 +00:00
Reid Spencer	16983ca865	Remove from the TODO list those optimizations that are already handled by constant folding implemented in lib/Transforms/Utils/Local.cpp. llvm-svn: 21604	2005-04-28 18:05:16 +00:00
Reid Spencer	649ac283e4	Document additional libcall transformations that need to be written. Help Wanted! There's a lot of them to write. llvm-svn: 21603	2005-04-28 04:40:06 +00:00
Reid Spencer	7ddcfb3375	Doxygenate. llvm-svn: 21602	2005-04-27 21:29:20 +00:00
Chris Lattner	36ffb1ff37	remove 'statement with no effect' warning llvm-svn: 21600	2005-04-27 20:12:17 +00:00
Reid Spencer	08b4940509	More Cleanup: * Name the instructions by appending to name of original * Factor common part out of a switch statement. llvm-svn: 21597	2005-04-27 17:46:54 +00:00
Reid Spencer	e249a82e73	This is a cleanup commit: * Correct stale documentation in a few places * Re-order the file to better associate things and reduce line count * Make the pass thread safe by caching the Function* objects needed by the optimizers in the pass object instead of globally. * Provide the SimplifyLibCalls pass object to the optimizer classes so they can access cached Function* objects and TargetData info * Make sure the pass resets its cache if the Module passed to runOnModule changes * Rename CallOptimizer to LibCallOptimization. All the classes are named Optimization while the objects are Optimizer. * Don't cache Function* in the optimizer objects because they could be used by multiple PassManager's running in multiple threads * Add an optimization for strcpy which is similar to strcat * Add a "TODO" list at the end of the file for ideas on additional libcall optimizations that could be added (get ideas from other compilers). Sorry for the huge diff. It's mostly reorganization of code. That won't happen again as I believe the design and infrastructure for this pass is now done or close to it. llvm-svn: 21589	2005-04-27 07:54:40 +00:00
Chris Lattner	93f4e9dd26	detect functions that never return, and turn the instruction following a call to them into an 'unreachable' instruction. This triggers a bunch of times, particularly on gcc: gzip: 36 gcc: 601 eon: 12 bzip: 38 llvm-svn: 21587	2005-04-27 04:52:23 +00:00
Reid Spencer	dc11db68b6	Prefix the debug statistics so they group together. llvm-svn: 21583	2005-04-27 00:20:23 +00:00
Reid Spencer	e95a647b2a	In debug builds, make a statistic for each kind of call optimization. This helps track down what gets triggered in the pass so it's easier to identify good test cases. llvm-svn: 21582	2005-04-27 00:05:45 +00:00
Chris Lattner	7f4f773e9f	This analysis doesn't take 'throwing' into consideration, it looks at 'unwinding' llvm-svn: 21581	2005-04-26 23:53:25 +00:00
Reid Spencer	f9d4be187f	Fix up the debug statement to actually use a newline .. radical concept. llvm-svn: 21580	2005-04-26 23:07:08 +00:00
Reid Spencer	18b998192f	Uh, this isn't argpromotion. llvm-svn: 21579	2005-04-26 23:05:17 +00:00
Reid Spencer	2bc7a4f82a	Add some debugging output so we can tell which calls are getting triggered llvm-svn: 21578	2005-04-26 23:02:16 +00:00
Reid Spencer	f8c03d9db6	No, seriously folks, memcpy really does return void. llvm-svn: 21575	2005-04-26 22:49:48 +00:00
Reid Spencer	aaca170867	memcpy returns void!!!!! llvm-svn: 21574	2005-04-26 22:46:23 +00:00
Reid Spencer	4855ebf622	Fix some bugs found by running on llvm-test: * MemCpyOptimization can only be optimized if the 3rd and 4th arguments are constants and we weren't checking for that. * The result of llvm.memcpy (and llvm.memmove) is void* not sbyte*, put in a cast. llvm-svn: 21570	2005-04-26 19:55:57 +00:00
Reid Spencer	bb92b4fdfb	Changes From Review Feedback: * Have the SimplifyLibCalls pass acquire the TargetData and pass it down to the optimization classes so they can use it to make better choices for the signatures of functions, etc. * Rearrange the code a little so the utility functions are closer to their usage and keep the core of the pass near the top of the files. * Adjust the StrLen pass to get/use the correct prototype depending on the TargetData::getIntPtrType() result. The result of strlen is size_t which could be either uint or ulong depending on the platform. * Clean up some coding nits (cast vs. dyn_cast, remove redundant items from a switch, etc.) * Implement the MemMoveOptimization as a twin of MemCpyOptimization (they only differ in name). llvm-svn: 21569	2005-04-26 19:13:17 +00:00
Chris Lattner	bd43b9db9d	Fix the compile failures from last night. llvm-svn: 21565	2005-04-26 14:40:41 +00:00
Reid Spencer	b4f7b83dce	* Merge get_GVInitializer and getCharArrayLength into a single function named getConstantStringLength. This is the common part of StrCpy and StrLen optimizations and probably several others, yet to be written. It performs all the validity checks for looking at constant arrays that are supposed to be null-terminated strings and then computes the actual length of the string. * Implement the MemCpyOptimization class. This just turns memcpy of 1, 2, 4 and 8 byte data blocks that are properly aligned on those boundaries into a load and a store. Much more could be done here but alignment restrictions and lack of knowledge of the target instruction set prevent us from doing significantly more. That will have to be delegated to the code generators as they lower llvm.memcpy calls. llvm-svn: 21562	2005-04-26 07:45:18 +00:00
Reid Spencer	76dab9a523	* Implement StrLenOptimization * Factor out commonalities between StrLenOptimization and StrCatOptimization * Make sure that signatures return sbyte* not void* llvm-svn: 21559	2005-04-26 05:24:00 +00:00
Reid Spencer	8ee5aacc38	Incorporate feedback from Chris: * Change signatures of OptimizeCall and ValidateCalledFunction so they are non-const, allowing the optimization object to be modified. This is in support of caching things used across multiple calls. * Provide two functions for constructing and caching function types * Modify the StrCatOptimization to cache Function objects for strlen and llvm.memcpy so it doesn't regenerate them on each call site. Make sure these are invalidated each time we start the pass. * Handle both a GEP Instruction and a GEP ConstantExpr * Add additional checks to make sure we really are dealing with an array of sbyte and that all the element initializers are ConstantInt or ConstantExpr that reduce to ConstantInt. * Make sure the GlobalVariable is constant! * Don't use ConstantArray::getString as it can fail and it doesn't give us the right thing. We must check for null bytes in the middle of the array. * Use llvm.memcpy instead of memcpy so we can factor alignment into it. * Don't use void* types in signatures, replace with sbyte* instead. llvm-svn: 21555	2005-04-26 03:26:15 +00:00
Reid Spencer	fe91dfec91	Changes due to code review and new implementation: * Don't use std::string for the function names, const char* will suffice * Allow each CallOptimizer to validate the function signature before doing anything * Repeatedly loop over the functions until an iteration produces no more optimizations. This allows one optimization to insert a call that is optimized by another optimization. * Implement the ConstantArray portion of the StrCatOptimization * Provide a template for the MemCpyOptimization * Make ExitInMainOptimization split the block, not delete everything after the return instruction. (This covers revision 1.3 and 1.4, as the 1.3 comments were botched) llvm-svn: 21548	2005-04-25 21:20:38 +00:00
Reid Spencer	f2534c7291	Lots of changes based on review and new functionality: * Use a llvm-svn: 21546	2005-04-25 21:11:48 +00:00
Chris Lattner	a21bf8d1be	implement getelementptr.ll:test10 llvm-svn: 21541	2005-04-25 20:17:30 +00:00
Reid Spencer	9bbaa2ab7f	Post-Review Cleanup: * Fix comments at top of file * Change algorithm for running the call optimizations from n*n to something closer to n. Use a hash_map to store and lookup the optimizations since there will eventually (or potentially) be a large number of them. This gets lookup based on the name of the function to O(1). Each CallOptimizer now has a std::string member named func_name that tracks the name of the function that it applies to. It is this string that is entered into the hash_map for fast comparison against the function names encountered in the module. * Cleanup some style issues pertaining to iterator invalidation * Don't pass the Function pointer to the OptimizeCall function because if the optimization needs it, it can get it from the CallInst passed in. * Add the skeleton for a new CallOptimizer, StrCatOptimizer which will eventually replace strcat's of constant strings with direct copies. llvm-svn: 21526	2005-04-25 03:59:26 +00:00
Reid Spencer	39a762d149	A new pass to provide specific optimizations for certain well-known library calls. The pass visits all external functions in the module and determines if such function calls can be optimized. The optimizations are specific to the library calls involved. This initial version only optimizes calls to exit(3) when they occur in main(): it changes them to ret instructions. llvm-svn: 21522	2005-04-25 02:53:12 +00:00
Chris Lattner	2f1457fd83	Eliminate cases where we could << by 64, which is undefined in C. llvm-svn: 21500	2005-04-24 17:46:05 +00:00
Chris Lattner	d6f636a340	Implement xor.ll:test21: select (not C), A, B -> select C, B, A llvm-svn: 21495	2005-04-24 07:30:14 +00:00
Chris Lattner	d1f46d3bf9	Use getPrimitiveSizeInBits() instead of getPrimitiveSize()*8 Completely rework the 'setcc (cast x to larger), y' code. This code has the advantage of implementing setcc.ll:test19 (being more general than the previous code) and being correct in all cases. This allows us to unxfail 2004-11-27-SetCCForCastLargerAndConstant.ll, and close PR454. llvm-svn: 21491	2005-04-24 06:59:08 +00:00
Jeff Cohen	82639853c0	Eliminate tabs and trailing spaces llvm-svn: 21480	2005-04-23 21:38:35 +00:00
Chris Lattner	77c32c34d7	Generalize the setcc -> PHI and Select folding optimizations to work with any constant RHS, not just a constant integer RHS. This implements select.ll:test17 llvm-svn: 21470	2005-04-23 15:31:55 +00:00
Misha Brukman	b1c9317bb4	Remove trailing whitespace llvm-svn: 21427	2005-04-21 23:48:37 +00:00
Chris Lattner	a3159af703	Fix a bug where we would not promote calls to invokes if they occurred in the same block as the setjmp. Thanks to Greg Pettyjohn for noticing this! llvm-svn: 21403	2005-04-21 16:46:46 +00:00
Chris Lattner	7ceb081f3f	Improve doxygen documentation, patch contributed by Evan Jones! llvm-svn: 21393	2005-04-21 16:04:49 +00:00
Chris Lattner	374e659466	Instcombine this: %shortcirc_val = select bool %tmp.1, bool true, bool %tmp.4 ; <bool> [#uses=1] %tmp.6 = cast bool %shortcirc_val to int ; <int> [#uses=1] into this: %shortcirc_val = or bool %tmp.1, %tmp.4 ; <bool> [#uses=1] %tmp.6 = cast bool %shortcirc_val to int ; <int> [#uses=1] not this: %tmp.4.cast = cast bool %tmp.4 to int ; <int> [#uses=1] %tmp.6 = select bool %tmp.1, int 1, int %tmp.4.cast ; <int> [#uses=1] llvm-svn: 21389	2005-04-21 05:43:13 +00:00
Chris Lattner	b38b443b15	Teach simplifycfg that setcc is cheap and non-trapping, so that it can convert this: %tmp.1 = seteq int %i, 0 ; <bool> [#uses=1] br bool %tmp.1, label %shortcirc_done, label %shortcirc_next shortcirc_next: ; preds = %entry %tmp.4 = seteq int %j, 0 ; <bool> [#uses=1] br label %shortcirc_done shortcirc_done: ; preds = %shortcirc_next, %entry %shortcirc_val = phi bool [ %tmp.4, %shortcirc_next ], [ true, %entry ] ; <bool> [#uses=1] to this: %tmp.1 = seteq int %i, 0 ; <bool> [#uses=1] %tmp.4 = seteq int %j, 0 ; <bool> [#uses=1] %shortcirc_val = select bool %tmp.1, bool true, bool %tmp.4 ; <bool> [#uses=1] ... which is later simplified by instcombine into an or. llvm-svn: 21388	2005-04-21 05:31:13 +00:00
Chris Lattner	8cb10a1775	Wrap some long lines. Make IPSCCP strip off dead constant exprs that are using functions, making them appear as though their address is taken. This allows us to propagate some more pool descriptors, lowering the overhead of pool alloc. llvm-svn: 21363	2005-04-19 19:16:19 +00:00
Chris Lattner	5c219469a0	Eliminate a broken transformation, fixing PR548 llvm-svn: 21354	2005-04-19 06:04:18 +00:00
Chris Lattner	ee84413730	silence a bogus warning llvm-svn: 21320	2005-04-18 05:26:21 +00:00
Chris Lattner	16a50fd0a0	a new simple pass, which will be extended to be more useful in the future. This pass forwards branches through conditions when it can show that the condition is either always true or always false for a predecessor. This currently only handles the most simple cases of this, but is successful at threading across 2489 branches and 65 switch instructions in 176.gcc, which isn't bad. llvm-svn: 21306	2005-04-15 19:28:32 +00:00
Chris Lattner	95f16a3ac4	Get rid of this for_each loop llvm-svn: 21253	2005-04-12 18:51:33 +00:00
Chris Lattner	4236261930	Fix bug: InstCombine/2005-05-07-UDivSelectCrash.ll llvm-svn: 21152	2005-04-08 04:03:26 +00:00
Chris Lattner	4706046e68	Implement the following xforms: (X-Y)-X --> -Y A + (B - A) --> B (B - A) + A --> B llvm-svn: 21138	2005-04-07 17:14:51 +00:00
Chris Lattner	c7f3c1a00e	Implement InstCombine/add.ll:test28, transforming C1-(X+C2) --> (C1-C2)-X. This occurs several dozen times in specint2k, particularly in crafty and gcc apparently. llvm-svn: 21136	2005-04-07 16:28:01 +00:00
Chris Lattner	a9be4490d8	Transform X-(X+Y) == -Y and X-(Y+X) == -Y llvm-svn: 21134	2005-04-07 16:15:25 +00:00
Chris Lattner	ecfa9b5810	disable this transformation in the one obscure case that really pessimizes pointer analysis. llvm-svn: 20916	2005-03-29 06:37:47 +00:00
Alkis Evlogimenos	9ead0d7b4c	Rename createPromoteMemoryToRegister() to createPromoteMemoryToRegisterPass() to be consistent with other pass creation functions. llvm-svn: 20885	2005-03-28 02:01:12 +00:00
Chris Lattner	514e843e89	Enhance loopsimplify to preserve alias analysis instead of clobbering it. This prevents crashes on some programs when using -ds-aa -licm. llvm-svn: 20831	2005-03-25 06:37:22 +00:00
Chris Lattner	faf7791fea	Fix a bug where LICM was not updating AA information properly when sinking a pointer value out of a loop causing it to be duplicated. llvm-svn: 20828	2005-03-25 00:22:36 +00:00
Chris Lattner	1c790bf656	enable -debug-only=licm llvm-svn: 20788	2005-03-23 21:00:12 +00:00
Chris Lattner	7b9020a059	Fix the missing symbols problem Bill was hitting. Patch contributed by Bill Wendling!! llvm-svn: 20649	2005-03-17 15:38:16 +00:00
Chris Lattner	6cb4559369	stop using method. llvm-svn: 20603	2005-03-15 05:19:49 +00:00
Chris Lattner	531f9e92d4	This mega patch converts us from using Function::a{iterator|begin|end} to using Function::arg_{iterator|begin|end}. Likewise Module::g* -> Module::global_*. This patch is contributed by Gabor Greif, thanks! llvm-svn: 20597	2005-03-15 04:54:21 +00:00
Chris Lattner	8c79559443	fix a bug where we thought arguments were constants :( llvm-svn: 20506	2005-03-06 22:52:29 +00:00
Chris Lattner	2ce303b406	Fix Regression/Transforms/LoopStrengthReduce/dont_insert_redundant_ops.ll, hopefully not breaking too many other things. llvm-svn: 20505	2005-03-06 22:36:12 +00:00
Chris Lattner	45403e5052	implement Transforms/LoopStrengthReduce/invariant_value_first_arg.ll llvm-svn: 20501	2005-03-06 22:06:22 +00:00
Chris Lattner	d3874fad44	minor simplifications of the code. llvm-svn: 20497	2005-03-06 21:58:22 +00:00
Chris Lattner	dd3ec92085	trivial simplification llvm-svn: 20494	2005-03-06 21:35:38 +00:00
Chris Lattner	238f6df546	Fix a bug where we could corrupt a parent loop's header info if we unrolled a nested loop. This fixes Transforms/LoopUnroll/2005-03-06-BadLoopInfoUpdate.ll and PR532 llvm-svn: 20493	2005-03-06 20:57:32 +00:00
Chris Lattner	1b032f59e7	Make this MUCH faster by avoiding a linear search in the symbol table code. llvm-svn: 20479	2005-03-06 05:42:36 +00:00
Jeff Cohen	4abcea3a69	Reformat comments to fix 80 columns. llvm-svn: 20467	2005-03-05 22:45:40 +00:00
Jeff Cohen	be37fa07fd	Reuse induction variables created for strength-reduced GEPs by other similar GEPs. llvm-svn: 20466	2005-03-05 22:40:34 +00:00
Chris Lattner	6d0a24c608	second argument to Value::setName is now gone. llvm-svn: 20463	2005-03-05 19:05:20 +00:00
Chris Lattner	cfe2822cdf	Do not compute 1ULL << 64, which is undefined. This fixes Ptrdist/ks on the sparc, and testcase Regression/Transforms/InstCombine/2005-03-04-ShiftOverflow.ll llvm-svn: 20445	2005-03-04 23:21:33 +00:00
Jeff Cohen	a2c59b7423	Add support for not strength reducing GEPs where the element size is a small power of two. This emphatically includes the zeroeth power of two. llvm-svn: 20429	2005-03-04 04:04:26 +00:00
Chris Lattner	ef1e989e4f	Add an optional argument to lower to a specific constant value instead of to a "sizeof" expression. llvm-svn: 20414	2005-03-03 01:03:43 +00:00
Jeff Cohen	8ea6f9e821	Fixed the following LSR bugs: * Loop invariant code does not dominate the loop header, but rather the end of the loop preheader. * The base for a reduced GEP isn't a constant unless all of its operands (preceding the induction variable) are constant. * Allow induction variable elimination for the simple case after all. Also made changes recommended by Chris for properly deleting instructions. llvm-svn: 20383	2005-03-01 03:46:11 +00:00
Jeff Cohen	dcaa48b5c4	Fix crash in LSR due to attempt to remove original induction variable. However, for reasons explained in the comments, I also deactivated this code as it needs more thought. llvm-svn: 20367	2005-02-28 00:08:56 +00:00
Jeff Cohen	fd63d3af0d	PHI nodes were incorrectly placed when more than one GEP is reduced in a loop. llvm-svn: 20360	2005-02-27 21:08:04 +00:00
Jeff Cohen	39751c3b7c	First pass at improved Loop Strength Reduction. Still not yet ready for prime time. llvm-svn: 20358	2005-02-27 19:37:07 +00:00
Chris Lattner	7561ca1d15	Teach globalopt how memset/cpy/move affect memory, to allow better optimization. llvm-svn: 20352	2005-02-27 18:58:52 +00:00
Chris Lattner	0ce80cd542	Fix spelling, patch contributed by Gabor Greif! llvm-svn: 20343	2005-02-27 06:18:25 +00:00
Chris Lattner	cc6d75fddf	remove extraneous cast llvm-svn: 20334	2005-02-26 18:33:28 +00:00
Chris Lattner	1cca959e5d	Implement Transforms/SimplifyCFG/switch_thread.ll This does a simple form of "jump threading", which eliminates CFG edges that are provably dead. This triggers 90 times in the external tests, and eliminating CFG edges is always always a good thing! :) llvm-svn: 20300	2005-02-24 06:17:52 +00:00
Chris Lattner	25169caa80	make this more efficient. Scan up to 16 nodes, not the whole list. llvm-svn: 20289	2005-02-23 16:53:04 +00:00
Chris Lattner	52e931b37d	Remove use of bind_obj llvm-svn: 20276	2005-02-22 23:22:58 +00:00
Chris Lattner	7b5d9e2217	Do not mark obviously unreachable blocks live when processing PHI nodes, and handle incomplete control dependences correctly. This fixes: Regression/Transforms/ADCE/dead-phi-edge.ll -> a missed optimization Regression/Transforms/ADCE/dead-phi-edge.ll -> a compiler crash distilled from QT4 llvm-svn: 20227	2005-02-17 19:28:49 +00:00
Chris Lattner	31f3382b3b	Fix the second bug attached to PR504. llvm-svn: 20181	2005-02-14 20:11:45 +00:00
Chris Lattner	e616fea3bc	Fix for testcase Transforms/IndVarsSimplify/2005-02-11-InvokeCrash.ll and PR504. llvm-svn: 20129	2005-02-12 03:26:49 +00:00
Alkis Evlogimenos	c4a44c6b3d	Localize globals if they are only used in main(). This replaces the global with an alloca, which eventually gets promoted into a register. This enables a lot of other optimizations later on. llvm-svn: 20109	2005-02-10 18:36:30 +00:00
Alkis Evlogimenos	346bb20409	Fix crash on MallocInsts of unsized types. llvm-svn: 19988	2005-02-02 04:43:37 +00:00
Chris Lattner	82b42c5d85	API change. llvm-svn: 19959	2005-02-01 01:23:49 +00:00
Chris Lattner	d6a4492f81	Adjust to changes in APIs llvm-svn: 19958	2005-02-01 01:23:31 +00:00
Chris Lattner	f98a7bffb3	Hacks to make this ugly ugly code work with the new use lists. llvm-svn: 19957	2005-02-01 01:22:56 +00:00
Chris Lattner	72684fecf8	Implement InstCombine/cast.ll:test25, a case that occurs many times in spec llvm-svn: 19953	2005-01-31 05:51:45 +00:00
Chris Lattner	31f486c775	Implement the trivial cases in InstCombine/store.ll llvm-svn: 19950	2005-01-31 05:36:43 +00:00
Chris Lattner	fe1b0b8b24	Implement Transforms/InstCombine/cast-load-gep.ll, which allows us to devirtualize 11 indirect calls in perlbmk. llvm-svn: 19947	2005-01-31 04:50:46 +00:00
Chris Lattner	d8e20188c6	Adjust to changes in instruction interfaces. llvm-svn: 19900	2005-01-29 00:39:08 +00:00
Chris Lattner	a3f06fa2dd	Switchinst takes a hint for the number of cases it will have. llvm-svn: 19899	2005-01-29 00:38:45 +00:00
Chris Lattner	a35dfcedd3	switchinst ctor now takes a hint for the number of cases that it will have. llvm-svn: 19898	2005-01-29 00:38:26 +00:00
Chris Lattner	84d3137da7	Adjust Valuehandle to hold its operand directly in it. llvm-svn: 19897	2005-01-29 00:37:36 +00:00
Chris Lattner	cd517ff0c7	* add some DEBUG statements * Properly compile this: struct a {}; int test() { struct a b[2]; if (&b[0] != &b[1]) abort (); return 0; } to 'return 0', not abort(). llvm-svn: 19875	2005-01-28 19:32:01 +00:00
Alkis Evlogimenos	fbd921987f	Add a dependency to the trace library so that it gets pulled in automatically. llvm-svn: 19828	2005-01-25 16:23:57 +00:00
Chris Lattner	9e2c7facb2	Get rid of several dozen more 'and' instructions in specint. llvm-svn: 19786	2005-01-23 20:26:55 +00:00
Chris Lattner	fc4429e7c1	Handle comparisons of gep instructions that have different typed indices as long as they are the same size. llvm-svn: 19734	2005-01-21 23:06:49 +00:00
Chris Lattner	411336fe04	Add two optimizations. The first folds (X+Y)-X -> Y The second folds operations into selects, e.g. (select C, (X+Y), (Y+Z)) -> (Y+(select C, X, Z)) This occurs a few times across spec, e.g. select add/sub mesa: 83 0 povray: 5 2 gcc 4 2 parser 0 22 perlbmk 13 30 twolf 0 3 llvm-svn: 19706	2005-01-19 21:50:18 +00:00
Chris Lattner	a3cc1835ad	Fix 'raise' to work with packed types. Patch by Morten Ofstad. llvm-svn: 19693	2005-01-19 16:16:35 +00:00
Chris Lattner	715364364b	Delete PHI nodes that are not dead but are locked in a cycle of single useness. llvm-svn: 19629	2005-01-17 05:10:15 +00:00
Chris Lattner	03f06f11aa	Move code out of indentation one level to make it easier to read. Disable the xform for < > cases. It turns out that the following is being miscompiled: bool %test(sbyte %S) { %T = cast sbyte %S to uint %V = setgt uint %T, 255 ret bool %V } llvm-svn: 19628	2005-01-17 03:20:02 +00:00
Chris Lattner	51726c47fe	Fix some bugs in an xform added yesterday. This fixes Prolangs-C/allroots. llvm-svn: 19553	2005-01-14 17:35:12 +00:00
Chris Lattner	7aa41cfa88	Fix a compile crash on spiff llvm-svn: 19552	2005-01-14 17:17:59 +00:00
Chris Lattner	4fa89827e2	if two gep comparisons only differ by one index, compare that index directly. This allows us to better optimize begin() -> end() comparisons in common cases. llvm-svn: 19542	2005-01-14 00:20:05 +00:00
Chris Lattner	d35d210ea0	Do not overrun iterators. This fixes a 176.gcc crash llvm-svn: 19541	2005-01-13 23:26:48 +00:00
Chris Lattner	a04c904c4c	Turn select C, (X+Y), (X-Y) --> (X+(select C, Y, (-Y))). This occurs in the 'sim' program and probably elsewhere. In sim, it comes up for cases like this: #define round(x) ((x)>0.0 ? (x)+0.5 : (x)-0.5) double G; void T(double X) { G = round(X); } (it uses the round macro a lot). This changes the LLVM code from: %tmp.1 = setgt double %X, 0.000000e+00 ; <bool> [#uses=1] %tmp.4 = add double %X, 5.000000e-01 ; <double> [#uses=1] %tmp.6 = sub double %X, 5.000000e-01 ; <double> [#uses=1] %mem_tmp.0 = select bool %tmp.1, double %tmp.4, double %tmp.6 store double %mem_tmp.0, double* %G to: %tmp.1 = setgt double %X, 0.000000e+00 ; <bool> [#uses=1] %mem_tmp.0.p = select bool %tmp.1, double 5.000000e-01, double -5.000000e-01 %mem_tmp.0 = add double %mem_tmp.0.p, %X store double %mem_tmp.0, double* %G ret void llvm-svn: 19537	2005-01-13 22:52:24 +00:00
Chris Lattner	81e8417614	Implement an optimization for == and != comparisons like this: _Bool test2(int X, int Y) { return &arr[X][Y] == arr; } instead of generating this: bool %test2(int %X, int %Y) { %tmp.3.idx = mul int %X, 160 ; <int> [#uses=1] %tmp.3.idx1 = shl int %Y, ubyte 2 ; <int> [#uses=1] %tmp.3.offs2 = sub int 0, %tmp.3.idx ; <int> [#uses=1] %tmp.7 = seteq int %tmp.3.idx1, %tmp.3.offs2 ; <bool> [#uses=1] ret bool %tmp.7 } generate this: bool %test2(int %X, int %Y) { seteq int %X, 0 ; <bool>:0 [#uses=1] seteq int %Y, 0 ; <bool>:1 [#uses=1] %tmp.7 = and bool %0, %1 ; <bool> [#uses=1] ret bool %tmp.7 } This idiom occurs in C++ programs when iterating from begin() to end(), in a vector or array. For example, we now compile this: void test(int X, int Y) { for (int i = arr; i != arr+100; ++i) foo(i); } to this: no_exit: ; preds = %entry, %no_exit ... %exitcond = seteq uint %indvar.next, 100 ; <bool> [#uses=1] br bool %exitcond, label %return, label %no_exit instead of this: no_exit: ; preds = %entry, %no_exit ... %inc5 = getelementptr [100 x [40 x int]]* %arr, int 0, int 0, int %inc.rec ; <int> [#uses=1] %tmp.8 = seteq int %inc5, getelementptr ([100 x [40 x int]]* %arr, int 0, int 100, int 0) ; <bool> [#uses=1] %indvar.next = add uint %indvar, 1 ; <uint> [#uses=1] br bool %tmp.8, label %return, label %no_exit llvm-svn: 19536	2005-01-13 22:25:21 +00:00
Chris Lattner	4cb9fa373b	Fix some bugs in code I didn't mean to check in. llvm-svn: 19534	2005-01-13 20:40:58 +00:00
Chris Lattner	0798af33a5	Fix a crash compiling 129.compress llvm-svn: 19533	2005-01-13 20:14:25 +00:00
Reid Spencer	134f02d0c7	Add the LOADABLE_MODULE=1 directive to indicate that this shared library is intended to be a dlopenable module and not a "plain" shared library. llvm-svn: 19456	2005-01-11 04:33:32 +00:00
Jeff Cohen	3e62e7c68b	Apply feedback from Chris. llvm-svn: 19432	2005-01-10 04:23:32 +00:00
Chris Lattner	798e84f59e	Fix VS warnings llvm-svn: 19383	2005-01-08 19:48:40 +00:00
Chris Lattner	46fa04b531	Fix VS warnings. llvm-svn: 19382	2005-01-08 19:45:31 +00:00
Chris Lattner	fdfe3e49fe	Fix uint64_t -> unsigned VS warnings. llvm-svn: 19381	2005-01-08 19:42:22 +00:00
Chris Lattner	47f395cd85	Silence VS warnings. llvm-svn: 19380	2005-01-08 19:37:20 +00:00
Chris Lattner	ce274ce93d	Silence warnings llvm-svn: 19379	2005-01-08 19:34:41 +00:00
Jeff Cohen	677babc4d4	Add more missing createXxxPass functions. llvm-svn: 19370	2005-01-08 17:21:40 +00:00
Misha Brukman	417ca179a9	Convert tabs to spaces llvm-svn: 19320	2005-01-07 07:05:34 +00:00
Jeff Cohen	9a7ac16214	Add missing createXxxPass functions llvm-svn: 19319	2005-01-07 06:57:28 +00:00
Jeff Cohen	844410b48e	Add missing include llvm-svn: 19315	2005-01-07 05:42:13 +00:00
Jeff Cohen	eca0d0f2da	Put createLoopUnswitchPass() into proper namespace llvm-svn: 19306	2005-01-06 05:47:18 +00:00
Jeff Cohen	27595a4aec	Add missing include llvm-svn: 19305	2005-01-06 05:46:44 +00:00
Chris Lattner	86102b8ad5	This is a bulk commit that implements the following primary improvements: * We can now fold cast instructions into select instructions that have at least one constant operand. * We now optimize expressions more aggressively based on bits that are known to be zero. These optimizations occur a lot in code that uses bitfields even in simple ways. * We now turn more cast-cast sequences into AND instructions. Before we would only do this if it if all types were unsigned. Now only the middle type needs to be unsigned (guaranteeing a zero extend). * We transform sign extensions into zero extensions in several cases. This corresponds to these test/Regression/Transforms/InstCombine testcases: 2004-11-22-Missed-and-fold.ll and.ll: test28-29 cast.ll: test21-24 and-or-and.ll cast-cast-to-and.ll zeroext-and-reduce.ll llvm-svn: 19220	2005-01-01 16:22:27 +00:00
Chris Lattner	3215bb6049	Implement SimplifyCFG/DeadSetCC.ll SimplifyCFG is one of those passes that we use for final cleanup: it should not rely on other passes to clean up its garbage. This fixes the "why are trivially dead setcc's in the output of gccas" problem. llvm-svn: 19212	2005-01-01 16:02:12 +00:00
Chris Lattner	13516fe2e7	Fix PR491 and testcase Transforms/DeadStoreElimination/2004-12-28-PartialStore.ll llvm-svn: 19180	2004-12-29 04:36:02 +00:00
Chris Lattner	b17f3e13ec	Adjust to new interfaces llvm-svn: 18958	2004-12-15 07:22:25 +00:00
Chris Lattner	9ad0d55025	Constant exprs are not efficiently negatable in practice. This disables turning X - (constantexpr) into X + (-constantexpr) among other things. llvm-svn: 18935	2004-12-14 20:08:06 +00:00
Brian Gaeke	f9639d2a74	Fix link error in PPC optimized build of 'opt'. llvm-svn: 18913	2004-12-13 21:28:39 +00:00
Chris Lattner	8f430a3b59	Get rid of getSizeOf, using ConstantExpr::getSizeOf instead. Do not insert a prototype for malloc of: void* malloc(uint): on 64-bit targets this is not correct. Instead, prototype it as void *malloc(...), and pass the correct intptr_t through the "...". Finally, fix Regression/CodeGen/SparcV9/2004-12-13-MallocCrash.ll, by not forming constantexpr casts from pointer to uint. llvm-svn: 18908	2004-12-13 20:00:02 +00:00
Chris Lattner	a199e3c1e2	Change indentation of a whole bunch of code, no real changes here. llvm-svn: 18843	2004-12-12 23:49:37 +00:00
Chris Lattner	14d07db44d	More substantial simplifications and speedups. This makes ADCE about 20% faster in some cases. llvm-svn: 18842	2004-12-12 23:40:17 +00:00
Chris Lattner	9115eb3024	More minor microoptimizations llvm-svn: 18841	2004-12-12 22:44:30 +00:00
Chris Lattner	d4298781c1	Remove some more set operations llvm-svn: 18840	2004-12-12 22:22:18 +00:00
Chris Lattner	a538439bf0	Reduce number of set operations. llvm-svn: 18839	2004-12-12 22:16:13 +00:00
Chris Lattner	bf5b7cf638	Optimize div/rem + select combinations more. In particular, implement div.ll:test10 and rem.ll:test4. llvm-svn: 18838	2004-12-12 21:48:58 +00:00
Chris Lattner	745196a5fc	Properly implement copying of a global, fixing the 255.vortex & povray failures from last night. llvm-svn: 18832	2004-12-12 19:34:41 +00:00
Chris Lattner	88deefa303	Simplify code and do not invalidate iterators. This fixes a crash compiling TimberWolfMC that was exposed due to recent optimizer changes. llvm-svn: 18831	2004-12-12 18:23:20 +00:00
Chris Lattner	1cbd5be7a1	Though the previous xform applies to literally dozens (hundreds?) of variables in SPEC, the subsequent optimizations that we are after don't play well with FP values, so disable this xform for them. Really we just don't want stuff like: double G; (always 0 or 412312.312) = G; turning into: bool G_b; = G_b ? 412312.312 : 0; We'd rather just do the load. -Chris llvm-svn: 18819	2004-12-12 06:03:06 +00:00
Chris Lattner	40e4cec9ee	If a variable can only hold two values, and is not already a bool, shrink it down to actually BE a bool. This allows simple value range propagation stuff to work harder, deleting comparisons in bzip2 in some hot loops. This implements GlobalOpt/integer-bool.ll, which is the essence of the loop condition distilled into a testcase. llvm-svn: 18817	2004-12-12 05:53:50 +00:00
Chris Lattner	cbc0161d1f	If one side of and/or is known to be 0/-1, it doesn't matter if the other side is overdefined. This allows us to fold conditions like: if (X < Y || Y > Z) in some cases. llvm-svn: 18807	2004-12-11 23:15:19 +00:00
Chris Lattner	263b0a1669	Only count if we actually made a change. llvm-svn: 18800	2004-12-11 17:00:14 +00:00
Chris Lattner	ffefea0772	The split bb is really the exit of the old function llvm-svn: 18799	2004-12-11 16:59:54 +00:00
Chris Lattner	2f687fd9d6	Two bug fixes: 1. Actually increment the Statistic for the GV elim optzn 2. When resolving undef branches, only resolve branches in executable blocks, avoiding marking a bunch of completely dead blocks live. This has a big impact on the quality of the generated code. With this patch, we positively rip up vortex, compiling Ut_MoveBytes to a single memcpy call. In vortex we get this: 12 ipsccp - Number of globals found to be constant 986 ipsccp - Number of arguments constant propagated 1378 ipsccp - Number of basic blocks unreachable 8919 ipsccp - Number of instructions removed llvm-svn: 18796	2004-12-11 06:05:53 +00:00
Chris Lattner	8525ebe465	Do not delete the entry block to a function. llvm-svn: 18795	2004-12-11 05:32:19 +00:00
Chris Lattner	91dbae6fee	Implement Transforms/SCCP/ipsccp-gvar.ll, by tracking values stored to non-address-taken global variables. llvm-svn: 18790	2004-12-11 05:15:59 +00:00
Chris Lattner	99e1295645	Fix a bug where we could delete dead invoke instructions with uses. In functions where we fully constant prop the return value, replace all ret instructions with 'ret undef'. llvm-svn: 18786	2004-12-11 02:53:57 +00:00
Chris Lattner	bae4b64553	Implement SCCP/ipsccp-conditional.ll, by totally deleting dead blocks. llvm-svn: 18781	2004-12-10 22:29:08 +00:00
Chris Lattner	7285f43836	Fix SCCP/2004-12-10-UndefBranchBug.ll llvm-svn: 18776	2004-12-10 20:41:50 +00:00
Chris Lattner	4fc998da2e	Fix Regression/Transforms/SimplifyCFG/2004-12-10-SimplifyCFGCrash.ll, and the failure on make_dparser last night. llvm-svn: 18766	2004-12-10 17:42:31 +00:00
Chris Lattner	b439464c61	This is the initial implementation of IPSCCP, as requested by Brian. This implements SCCP/ipsccp-basic.ll, rips apart Olden/mst (as described in PR415), and does other nice things. There is still more to come with this, but it's a start. llvm-svn: 18752	2004-12-10 08:02:06 +00:00
Chris Lattner	36d39cecb4	note to self: Do not check in debugging code! llvm-svn: 18693	2004-12-09 07:15:52 +00:00
Chris Lattner	f17a2fb849	Implement trivial sinking for load instructions. This causes us to sink 567 loads in spec llvm-svn: 18692	2004-12-09 07:14:34 +00:00
Chris Lattner	39c98bb31c	Do extremely simple sinking of instructions when they are only used in a successor block. This turns cases like this: x = a op b if (c) { use x } into: if (c) { x = a op b use x } This triggers 3965 times in spec, and is tested by Regression/Transforms/InstCombine/sink_instruction.ll This appears to expose a bug in the X86 backend for 177.mesa, which I'm looking in to. llvm-svn: 18677	2004-12-08 23:43:58 +00:00
Alkis Evlogimenos	a1291a0679	Fix this regression and remove the XFAIL from this test. llvm-svn: 18674	2004-12-08 23:10:30 +00:00
Chris Lattner	8f30caf549	Fix Transforms/InstCombine/2004-12-08-RemInfiniteLoop.ll llvm-svn: 18670	2004-12-08 22:20:34 +00:00
Chris Lattner	674ce86cd0	Add support for compilers without argument dependent name lookup, contributed by Bjørn Wennberg llvm-svn: 18627	2004-12-08 16:12:20 +00:00
Chris Lattner	407000c497	Remove unneeded class qualifier, contributed by Bjørn Wennberg llvm-svn: 18625	2004-12-08 16:05:02 +00:00
Reid Spencer	9273d480ad	For PR387:\ Add doInitialization method to avoid overloaded virtuals llvm-svn: 18602	2004-12-07 08:11:36 +00:00
Chris Lattner	9019e5cfa0	Implement stripping of debug symbols, making the --strip-debug options in gccas/gccld more than just a noop. llvm-svn: 18456	2004-12-03 16:22:08 +00:00
Chris Lattner	e8ebcb3300	Initial reimplementation of the -strip pass, with a stub for implementing -S llvm-svn: 18440	2004-12-02 21:25:03 +00:00
Chris Lattner	a4c9808603	This pass is moving to lib IPO llvm-svn: 18439	2004-12-02 21:24:40 +00:00
Chris Lattner	c0677c081d	Implement a FIXME by checking to make sure that a malloc is not being used in scary and unknown ways before we promote it. This fixes the miscompilation of 188.ammp that has been plaguing us since a globalopt patch went in. Thanks a ton to Tanya for helping me diagnose the problem! llvm-svn: 18418	2004-12-02 07:11:07 +00:00
Chris Lattner	3b18139b3c	Fix a minor bug where we set a var to initialized on malloc, not on store. This doesn't fix anything that I'm aware of, just noticed it by inspection llvm-svn: 18417	2004-12-02 06:25:58 +00:00
Chris Lattner	951673a94c	This pass is completely broken. llvm-svn: 18387	2004-11-30 17:09:06 +00:00
Chris Lattner	019445715e	Squelch warning llvm-svn: 18381	2004-11-30 07:47:34 +00:00
Chris Lattner	868ae13dc0	Fix test/Regression/Transforms/LICM/2004-09-14-AliasAnalysisInvalidate.llx This only fails on darwin or on X86 under valgrind. llvm-svn: 18377	2004-11-30 07:01:15 +00:00
Chris Lattner	fd8cbc257e	Alkis noticed that this variable is dead. Thanks! llvm-svn: 18369	2004-11-30 04:01:44 +00:00
Chris Lattner	389cfac0d1	If we have something like this: if (x) { code ... } else { code ... } Turn it into: code if (x) { ... } else { ... } This reduces code size and in some common cases allows us to completely eliminate the conditional. This turns several if/then/else blocks in loops into straightline code in 179.art, turning the loops into single basic blocks (good for modsched even!). Maybe now brg will leave me alone ;-) llvm-svn: 18366	2004-11-30 00:29:14 +00:00
Chris Lattner	6e455608e2	Allow hoisting loads of globals and alloca's in conditionals. llvm-svn: 18363	2004-11-29 21:26:12 +00:00
Reid Spencer	279fa256a2	Fix for PR454: * Make sure we handle signed to unsigned conversion correctly * Move this visitSetCondInst case to its own method. llvm-svn: 18312	2004-11-28 21:31:15 +00:00
Chris Lattner	6ea2888832	Make DSE potentially more aggressive by being more specific about alloca sizes. llvm-svn: 18309	2004-11-28 20:44:37 +00:00
Chris Lattner	14f3cdc227	Implement Regression/Transforms/InstCombine/getelementptr_cast.ll, which occurs many times in crafty llvm-svn: 18273	2004-11-27 17:55:46 +00:00
Chris Lattner	b137409926	Provide size information when checking to see if we can LICM a load, this allows us to hoist more loads in some cases. llvm-svn: 18265	2004-11-26 21:20:09 +00:00
Chris Lattner	540e5f92b4	Do not count debugger intrinsics in size estimation. llvm-svn: 18110	2004-11-22 17:23:57 +00:00
Chris Lattner	79e87e39eb	Ignore debugger intrinsics when doing inlining size computations. llvm-svn: 18109	2004-11-22 17:21:44 +00:00
Chris Lattner	6d048a0d32	Do not consider debug intrinsics in the size computations for loop unrolling. Patch contributed by Michael McCracken! llvm-svn: 18108	2004-11-22 17:18:36 +00:00
Misha Brukman	72a57c3259	Allow constructor parameter to override aggregating args; fix spacing llvm-svn: 18028	2004-11-20 02:20:27 +00:00
Chris Lattner	446948e094	Fix the exposed prototype for the lower packed pass, thanks to Morten Ofstad. llvm-svn: 17996	2004-11-19 16:49:34 +00:00
Chris Lattner	d137be2d0d	CPR is dead. llvm-svn: 17992	2004-11-19 16:24:57 +00:00
Chris Lattner	953075442d	Delete stoppoints that occur for the same source line. llvm-svn: 17970	2004-11-18 21:41:39 +00:00
Chris Lattner	c08ac110df	Check in hook that I forgot llvm-svn: 17956	2004-11-18 17:24:20 +00:00
Chris Lattner	27af257ea0	Do not delete dead invoke instructions! llvm-svn: 17897	2004-11-16 16:32:28 +00:00
Reid Spencer	9339638e9c	Remove unused variable for compilation by VC++. Patch contributed by Morten Ofstad. llvm-svn: 17830	2004-11-15 17:29:41 +00:00
Chris Lattner	1890f94413	Minor cleanups. There is no reason for SCCP to derive from instvisitor anymore. llvm-svn: 17825	2004-11-15 07:15:04 +00:00
Chris Lattner	9a038a3a5e	Count more accurately llvm-svn: 17824	2004-11-15 07:02:42 +00:00
Chris Lattner	97013636cd	Quiet warnings on the persephone tester llvm-svn: 17821	2004-11-15 05:54:07 +00:00
Chris Lattner	d18c16b842	Two minor improvements: 1. Speedup getValueState by having it not consider Arguments. It's better to just add them before we start SCCP'ing. 2. SCCP can delete the contents of dead blocks. No really, it's ok! This reduces the size of the IR for subsequent passes, even though simplifycfg would do the same job. In practice, simplifycfg does not run until much later than sccp in gccas llvm-svn: 17820	2004-11-15 05:45:33 +00:00
Chris Lattner	4f0316229c	rename InstValue to LatticeValue, as it holds for more than instructions. llvm-svn: 17818	2004-11-15 05:03:30 +00:00
Chris Lattner	074be1f6e4	Substantially refactor the SCCP class into an SCCP pass and an SCCPSolver class. The only changes are minor: * Do not try to SCCP instructions that return void in the rewrite loop. This is silly and foolhardy, wasting a map lookup and adding an entry to the map which is never used. * If we decide something has an undefined value, rewrite it to undef, potentially leading to further simplifications. llvm-svn: 17816	2004-11-15 04:44:20 +00:00
Chris Lattner	28eeb73f2f	If a global is just loaded and restored, realize that it is not changing value. This allows us to turn more globals into constants and eliminate them. This patch implements GlobalOpt/load-store-global.llx. Note that this patch speeds up 255.vortex from: Output/255.vortex.out-cbe.time:program 7.640000 Output/255.vortex.out-llc.time:program 9.810000 to: Output/255.vortex.out-cbe.time:program 7.250000 Output/255.vortex.out-llc.time:program 9.490000 Which isn't bad at all! llvm-svn: 17746	2004-11-14 20:50:30 +00:00
Chris Lattner	46dd5a6304	This optimization makes MANY phi nodes that all have the same incoming value. If this happens, detect it early instead of relying on instcombine to notice it later. This can be a big speedup, because PHI nodes can have many incoming values. llvm-svn: 17741	2004-11-14 19:29:34 +00:00
Chris Lattner	7515cabe2a	Implement instcombine/phi.ll:test6 - pulling operations through PHI nodes. This exposes subsequent optimization possibilities and reduces code size. This triggers 1423 times in spec. llvm-svn: 17740	2004-11-14 19:13:23 +00:00
Chris Lattner	15ff1e1885	Transform this: %X = alloca ... %Y = alloca ... X == Y into false. This allows us to simplify some stuff in eon (and probably many other C++ programs) where operator= was checking for self assignment. Folding this allows us to SROA several additional structs. llvm-svn: 17735	2004-11-14 07:33:16 +00:00
Chris Lattner	5a8b003a09	Remove note to self llvm-svn: 17734	2004-11-14 06:57:47 +00:00
Chris Lattner	af555adc15	If a function always returns a constant, replace all calls sites with that constant value. This makes the return value dead and allows for simplification in the caller. This implements IPConstantProp/return-constant.ll This triggers several dozen times throughout SPEC. llvm-svn: 17730	2004-11-14 06:10:11 +00:00
Chris Lattner	fe3f4e6ebd	Teach SROA how to promote an array index that is variable, if the dimension of the array is just two. This occurs 8 times in gcc, 6 times in crafty, and 12 times in 099.go. This implements ScalarRepl/sroa_two.ll llvm-svn: 17727	2004-11-14 05:00:19 +00:00
Chris Lattner	8881912d71	Rearrange some code, no functionality changes. llvm-svn: 17724	2004-11-14 04:24:28 +00:00
Chris Lattner	9fa7f0ae0a	Remove debugging code llvm-svn: 17719	2004-11-13 23:32:53 +00:00
Chris Lattner	244031d306	Argument promotion transforms functions to unconditionally load their argument pointers. This is only valid to do if the function already unconditionally loaded an argument or if the pointer passed in is known to be valid. Make sure to do the required checks. This fixed ArgumentPromotion/control-flow.ll and the Burg program. llvm-svn: 17718	2004-11-13 23:31:34 +00:00
Chris Lattner	8c3e7b92af	Simplify handling of shifts to be the same as we do for adds. Add support for (X * C1) + (X * C2) (where * can be mul or shl), allowing us to fold: Y+Y+Y+Y+Y+Y+Y+Y into %tmp.8 = shl long %Y, ubyte 3 ; <long> [#uses=1] instead of %tmp.4 = shl long %Y, ubyte 2 ; <long> [#uses=1] %tmp.12 = shl long %Y, ubyte 2 ; <long> [#uses=1] %tmp.8 = add long %tmp.4, %tmp.12 ; <long> [#uses=1] This implements add.ll:test25 Also add support for (X*C1)-(X*C2) -> X*(C1-C2), implementing sub.ll:test18 llvm-svn: 17704	2004-11-13 19:50:12 +00:00
Chris Lattner	4efe20a103	Fold: (X + (X << C2)) --> X * ((1 << C2) + 1) ((X << C2) + X) --> X * ((1 << C2) + 1) This means that we now canonicalize "Y+Y+Y" into: %tmp.2 = mul long %Y, 3 ; <long> [#uses=1] instead of: %tmp.10 = shl long %Y, ubyte 1 ; <long> [#uses=1] %tmp.6 = add long %Y, %tmp.10 ; <long> [#uses=1] llvm-svn: 17701	2004-11-13 19:31:40 +00:00
Chris Lattner	2858e17538	Lazily create the abort message, so only translation units that use unwind will actually get it. llvm-svn: 17700	2004-11-13 19:07:32 +00:00
Chris Lattner	9b0291b18d	Fix: CodeExtractor/2004-11-12-InvokeExtract.ll llvm-svn: 17699	2004-11-13 00:06:45 +00:00
Chris Lattner	5bcca6058a	Fix a bug where the code extractor would get a bit confused handling invoke instructions, setting DefBlock to a block it did not have dom info for. llvm-svn: 17697	2004-11-12 23:50:44 +00:00
Chris Lattner	5c1d84c769	Simplify handling of constant initializers llvm-svn: 17696	2004-11-12 22:42:57 +00:00
Chris Lattner	9621dfab3f	Actually, leave the check in. This prevents us from counting dead arguments as IPCP opportunities. llvm-svn: 17680	2004-11-11 07:47:54 +00:00
Chris Lattner	5fa696f8e4	Fix bug: IPConstantProp/deadarg.ll llvm-svn: 17679	2004-11-11 07:46:29 +00:00
Chris Lattner	c1d24cd859	Make IP Constant prop more aggressive about handling self recursive calls. This implements IPConstantProp/recursion.ll llvm-svn: 17666	2004-11-10 19:43:59 +00:00
Chris Lattner	0d3773d8b1	Do not let dead constant expressions hanging off of functions prevent IPCP. This allows elimination of a bunch of global pool descriptor args from programs being pool allocated (and is also generally useful!) llvm-svn: 17657	2004-11-09 20:47:30 +00:00
Chris Lattner	436285e75d	Change this back so that I get stable numbers to reflect the change from the nightly testers llvm-svn: 17646	2004-11-09 08:05:23 +00:00
Chris Lattner	1f0a97c6cb	Fix bug: 2004-11-08-FreeUseCrash.ll llvm-svn: 17642	2004-11-09 05:10:56 +00:00
Chris Lattner	49fa1ecd04	VERY large functions that are only called from one place are not really exciting to inline. Only inline medium or small sized functions with a single call site. llvm-svn: 17588	2004-11-07 21:46:47 +00:00
Chris Lattner	595016d090	This is V9 specific, move it there. llvm-svn: 17545	2004-11-07 00:39:26 +00:00
Chris Lattner	3c670cb65a	Remove dead vars llvm-svn: 17482	2004-11-05 04:46:22 +00:00
Chris Lattner	33eb909939	Fix some warnings on VC++ llvm-svn: 17481	2004-11-05 04:45:43 +00:00
Chris Lattner	96f6616479	* Rearrange code slightly * Disable broken transforms for simplifying (setcc (cast X to larger), CI) where CC is not != or == llvm-svn: 17422	2004-11-02 03:50:32 +00:00
Chris Lattner	8af7424920	Speed up the tail duplication pass on the testcase below from 68.2s to 1.23s: #define CL0(a) case a: f(); goto c; #define CL1(a) CL0(a##0) CL0(a##1) CL0(a##2) CL0(a##3) CL0(a##4) CL0(a##5) \ CL0(a##6) CL0(a##7) CL0(a##8) CL0(a##9) #define CL2(a) CL1(a##0) CL1(a##1) CL1(a##2) CL1(a##3) CL1(a##4) CL1(a##5) \ CL1(a##6) CL1(a##7) CL1(a##8) CL1(a##9) #define CL3(a) CL2(a##0) CL2(a##1) CL2(a##2) CL2(a##3) CL2(a##4) CL2(a##5) \ CL2(a##6) CL2(a##7) CL2(a##8) CL2(a##9) #define CL4(a) CL3(a##0) CL3(a##1) CL3(a##2) CL3(a##3) CL3(a##4) CL3(a##5) \ CL3(a##6) CL3(a##7) CL3(a##8) CL3(a##9) void f(); void a() { int b; c: switch (b) { CL4(1) } } This comes from GCC PR 15524 llvm-svn: 17390	2004-11-01 07:05:07 +00:00
Chris Lattner	93d1e39f3e	Do not compute the predecessor list for a block unless we need it. This speeds up simplifycfg on this program, from 44.87s to 0.29s (with a profiled build): #define CL0(a) case a: goto c; #define CL1(a) CL0(a##0) CL0(a##1) CL0(a##2) CL0(a##3) CL0(a##4) CL0(a##5) \ CL0(a##6) CL0(a##7) CL0(a##8) CL0(a##9) #define CL2(a) CL1(a##0) CL1(a##1) CL1(a##2) CL1(a##3) CL1(a##4) CL1(a##5) \ CL1(a##6) CL1(a##7) CL1(a##8) CL1(a##9) #define CL3(a) CL2(a##0) CL2(a##1) CL2(a##2) CL2(a##3) CL2(a##4) CL2(a##5) \ CL2(a##6) CL2(a##7) CL2(a##8) CL2(a##9) #define CL4(a) CL3(a##0) CL3(a##1) CL3(a##2) CL3(a##3) CL3(a##4) CL3(a##5) \ CL3(a##6) CL3(a##7) CL3(a##8) CL3(a##9) void f(); void a() { int b; c: switch (b) { CL4(1) } } This testcase is contrived to expose N^2 behavior, but this patch should speedup simplifycfg on any programs that use large switch statements. This testcase comes from GCC PR17895. llvm-svn: 17389	2004-11-01 06:53:58 +00:00
Reid Spencer	57cbe39d1e	Change Library Names Not To Conflict With Others When Installed llvm-svn: 17286	2004-10-27 23:18:45 +00:00
Chris Lattner	7dfc2d29ac	Convert 'struct' to 'class' in various places to adhere to the coding standards and work better with VC++. Patch contributed by Morten Ofstad! llvm-svn: 17281	2004-10-27 16:14:51 +00:00
Chris Lattner	70c2039b39	Hrm, this code was severely botched. As it turns out, this patch: http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041018/019708.html exposed ANOTHER latent bug in this xform, which caused Prolangs-C/bison to fill the zion nightly tester disk up and make the tester barf. This is obviously not a good thing, so lets fix this bug shall we? :) llvm-svn: 17276	2004-10-27 05:57:15 +00:00
Chris Lattner	845afe9b20	Initialize with the correct constant type llvm-svn: 17270	2004-10-27 03:55:24 +00:00
Chris Lattner	d57638c4a7	Fix compatibility with MSVC, patch by Morten Ofstad llvm-svn: 17218	2004-10-25 18:45:16 +00:00
Reid Spencer	fad217c847	Eliminate compilation warning on uninitialized variable. llvm-svn: 17163	2004-10-22 16:10:39 +00:00
Chris Lattner	fe9abf92de	* empty log message * llvm-svn: 17161	2004-10-22 06:43:28 +00:00
Chris Lattner	5c3c21e10a	Fix a bug Nate noticed, where we miscompiled a simple testcase llvm-svn: 17157	2004-10-22 04:53:16 +00:00
Reid Spencer	c1c320c335	We won't use automake llvm-svn: 17155	2004-10-22 03:35:04 +00:00
Brian Gaeke	c9d8b4d45c	Explain what this pass does. llvm-svn: 17146	2004-10-20 19:38:58 +00:00
Chris Lattner	257b284038	Hrm, some people complain when the compiler cheerfully tells them what it's doing... I guess they're right. llvm-svn: 17142	2004-10-19 06:33:16 +00:00
Reid Spencer	6a11a75f31	Initial automake generated Makefile template llvm-svn: 17136	2004-10-18 23:55:41 +00:00
Nate Begeman	b18121e6a9	Initial implementation of the strength reduction for GEP instructions in loops. This optimization is not turned on by default yet, but may be run with the opt tool's -loop-reduce flag. There are many FIXMEs listed in the code that will make it far more applicable to a wide range of code, but you have to start somewhere :) This limited version currently triggers on the following tests in the MultiSource directory: pcompress2: 7 times cfrac: 5 times anagram: 2 times ks: 6 times yacr2: 2 times llvm-svn: 17134	2004-10-18 21:08:22 +00:00
Chris Lattner	88a8a329c3	Get this file compiling with VC++, patch contributed by Morten Ofstad. Thanks Morten! llvm-svn: 17125	2004-10-18 15:43:46 +00:00
Reid Spencer	ce0783318b	Correction to allow compilation with Visual C++. Patch contributed by Morten Ofstad. Thanks Morten! llvm-svn: 17123	2004-10-18 14:38:48 +00:00
Chris Lattner	5edb2f32d0	Simplify code by deleting instructions that precede unreachable instructions. Simplify code by simplifying terminators that branch to blocks that start with an unreachable instruction. llvm-svn: 17116	2004-10-18 04:07:22 +00:00
Chris Lattner	a67dd32004	Turn store -> null/undef into the LLVM unreachable instruction! This simple change hacks off 10K of bytecode from perlbmk (.5%) even though the front-end is not generating them yet and we are not optimizing the resultant code. This isn't too bad. llvm-svn: 17111	2004-10-18 03:00:50 +00:00
Chris Lattner	8ba9ec9bbb	Turn things with obviously undefined semantics into 'store -> null' llvm-svn: 17110	2004-10-18 02:59:09 +00:00
Chris Lattner	3b92f17165	My friend the invoke instruction does not dominate all basic blocks if it occurs in the entry node of a function llvm-svn: 17109	2004-10-18 01:48:31 +00:00
Chris Lattner	34ae670706	Fix a bug that occurs when the constant value is the result of an invoke. In particular, invoke ret values are only live in the normal dest of the invoke not in the unwind dest. llvm-svn: 17108	2004-10-18 01:21:17 +00:00
Chris Lattner	6a792feb02	Getting ADCE to interact well with unreachable instructions seems like a nontrivial exercise that I'm not interested in tackling right now. Just punt and treat them like unwind's. This 'fixes' test/Regression/Transforms/ADCE/unreachable-function.ll llvm-svn: 17106	2004-10-17 23:45:06 +00:00
Chris Lattner	6e79e55aea	Fix Regression/Transforms/Inline/2004-10-17-InlineFunctionWithoutReturn.ll If a function had no return instruction in it, and the result of the inlined call instruction was used, we would crash. llvm-svn: 17104	2004-10-17 23:21:07 +00:00
Chris Lattner	107c15c33d	Remove printout, realize that instructions in the entry block dominate all other blocks. llvm-svn: 17099	2004-10-17 21:31:34 +00:00
Chris Lattner	215c7ebaa6	When inserting PHI nodes, don't insert any phi nodes that are obviously unnecessary. This allows us to delete several hundred phi nodes of the form PHI(x,x,x,undef) from 253.perlbmk and probably other programs as well. This implements Mem2Reg/UndefValuesMerge.ll llvm-svn: 17098	2004-10-17 21:25:56 +00:00
Chris Lattner	96db59e48a	Enhance hasConstantValue to ignore undef values in phi nodes. This allows it to think that PHI[4, undef] == 4. llvm-svn: 17096	2004-10-17 21:23:26 +00:00
Chris Lattner	e29d634a94	hasConstantValue will soon return instructions that don't dominate the PHI node, so prepare for this. llvm-svn: 17095	2004-10-17 21:22:38 +00:00
Chris Lattner	67f0545daf	Fix a type violation llvm-svn: 17069	2004-10-16 23:28:04 +00:00
Chris Lattner	684c5c6587	Kill the bogon that slipped into my buffer before I committed. llvm-svn: 17067	2004-10-16 19:46:33 +00:00
Chris Lattner	6580e09fef	Implement InstCombine/getelementptr.ll:test9, which is the source of many ugly and giant constant exprs in some programs. llvm-svn: 17066	2004-10-16 19:44:59 +00:00
Chris Lattner	98e541457b	Add support for unreachable llvm-svn: 17056	2004-10-16 18:21:33 +00:00
Chris Lattner	81a7a23494	Optimize instructions involving undef values. For example X+undef == undef. llvm-svn: 17047	2004-10-16 18:11:37 +00:00
Chris Lattner	7e6d4a12b5	Add support for UndefValue llvm-svn: 17046	2004-10-16 18:10:31 +00:00
Chris Lattner	c0e2e82477	When promoting mem2reg, make uninitialized values become undef instead of 0. llvm-svn: 17045	2004-10-16 18:10:06 +00:00
Chris Lattner	646354bae1	Handle undef values as undefined on the constant lattice ignore unreachable instructions llvm-svn: 17044	2004-10-16 18:09:41 +00:00
Chris Lattner	6ac3ef950d	Add note llvm-svn: 17043	2004-10-16 18:09:25 +00:00


2481 Commits