5144 Commits

Author SHA1 Message Date
Gabor Greif 5d8f7e0cc7 eliminate warning
llvm-svn: 42892
2007-10-12 07:44:54 +00:00
Chris Lattner d8675e4915 Fix some 80 column violations.
Fix DecomposeSimpleLinearExpr to handle simple constants better.
Don't nuke gep(bitcast(allocation)) if the bitcast(allocation) will
fold the allocation.  This fixes PR1728 and Instcombine/malloc3.ll

llvm-svn: 42891
2007-10-12 05:30:59 +00:00
Devang Patel 899cc56612 Lower memcpy if it makes sense.
llvm-svn: 42864
2007-10-11 17:21:57 +00:00
Devang Patel 2af23f976b Do not walk invalid iterator.
llvm-svn: 42812
2007-10-09 21:31:36 +00:00
Devang Patel a69f987b66 Fix bug in updating dominance frontier after loop
unswitch when frontier includes basic blocks that 
are not inside loop.

llvm-svn: 42654
2007-10-05 22:29:34 +00:00
Devang Patel 3574759d85 Fix 80 col violation.
llvm-svn: 42591
2007-10-03 21:17:43 +00:00
Devang Patel e192e32577 Refactor code in a separate method.
llvm-svn: 42590
2007-10-03 21:16:08 +00:00
Dan Gohman c731c97fac Use empty() member functions when that's what's being tested for instead
of comparing begin() and end().

llvm-svn: 42585
2007-10-03 19:26:29 +00:00
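The cleanup in r42585 above swaps an iterator comparison for the container's own emptiness query. A minimal sketch of the two spellings follows; the helper names are hypothetical and are not taken from the LLVM sources.

```cpp
#include <cassert>
#include <vector>

// Older spelling: the container is empty when both iterators coincide.
bool isEmptyViaIterators(const std::vector<int> &v) {
  return v.begin() == v.end();
}

// Preferred spelling: ask the container directly, which states the intent.
bool isEmptyViaMember(const std::vector<int> &v) {
  return v.empty();
}

// Typical use: guard an access that requires at least one element.
int firstElement(const std::vector<int> &v) {
  assert(!v.empty() && "firstElement requires a non-empty vector");
  return v.front();
}
```

Both predicates return the same answer for any standard container; the member function simply reads more directly.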
Dale Johannesen 9d559cfff5 Tone down an overzealous optimization.
llvm-svn: 42582
2007-10-03 17:45:27 +00:00
Tanya Lattner 30f65fe4a7 Fix PR1719, by not marking llvm.global.annotations internal.
llvm-svn: 42578
2007-10-03 17:05:40 +00:00
Chris Lattner d66e0cd6c0 Fix PR1719, by not marking llvm.noinline internal.
llvm-svn: 42565
2007-10-03 03:59:15 +00:00
Dale Johannesen b6c05b1f90 Fix stride computations for long double arrays.
llvm-svn: 42508
2007-10-01 23:08:35 +00:00
Devang Patel 2a60ff1aeb Relax unsafe use check. If there is one unconditional use inside the loop then it is safe to promote value even if there is another conditional use inside the loop.
llvm-svn: 42493
2007-10-01 18:12:58 +00:00
Dale Johannesen 6bf69ed3cc minor long double related changes
llvm-svn: 42439
2007-09-28 18:06:58 +00:00
Dale Johannesen 1d1d0e7735 Don't do SRA for unions with long double fields.
Fixes a SWB crash.

llvm-svn: 42422
2007-09-28 00:21:38 +00:00
Devang Patel 7bba386f72 Handle multiple induction variables.
This fixes PR714.

llvm-svn: 42309
2007-09-25 18:24:48 +00:00
Devang Patel 440d13b55b Do not reserve DOM check for GetElementPtrInst.
llvm-svn: 42306
2007-09-25 17:55:50 +00:00
Devang Patel 5e1651d270 doh..
llvm-svn: 42300
2007-09-25 17:43:08 +00:00
Devang Patel 87d7e8ebcb Add transformation to update loop iteration space. Now,
for (i=A; i<N; i++) {
  if (i < X && i > Y)
    do_something();
}

is transformed into

U=min(N,X); L=max(A,Y);
for (i=L;i<U;i++)
  do_something();

llvm-svn: 42299
2007-09-25 17:31:19 +00:00
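The iteration-space update described in r42299 above can be checked with a small standalone sketch. The function names are hypothetical, and the sketch is not the pass's implementation. One assumption is made explicit: because the guard uses a strict `i > Y`, the lower bound here is taken as max(A, Y + 1), which the commit message abbreviates as max(A, Y).

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Original form: every iteration in [a, n) evaluates the guard.
std::vector<int> guardedLoop(int a, int n, int x, int y) {
  std::vector<int> visited;
  for (int i = a; i < n; ++i)
    if (i < x && i > y)
      visited.push_back(i);
  return visited;
}

// Transformed form: the bounds are narrowed once and the guard disappears.
std::vector<int> narrowedLoop(int a, int n, int x, int y) {
  assert(y != INT_MAX && "y + 1 would overflow");
  const int upper = std::min(n, x);    // i < n and i < x
  const int lower = std::max(a, y + 1); // i >= a and i > y (assumption: strict)
  std::vector<int> visited;
  for (int i = lower; i < upper; ++i)
    visited.push_back(i);
  return visited;
}
```

For example, `guardedLoop(0, 10, 7, 2)` and `narrowedLoop(0, 10, 7, 2)` both visit 3, 4, 5, 6, but the second form never evaluates the guard inside the loop.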
Devang Patel 9e30e1a3be Do not promote null values because it may be unsafe to do so.
llvm-svn: 42270
2007-09-24 20:02:42 +00:00
Dan Gohman 75470c3bf1 explicit keywords.
llvm-svn: 42262
2007-09-24 15:48:49 +00:00
Devang Patel 361e52f39c Fix PR1692
llvm-svn: 42209
2007-09-21 21:18:19 +00:00
Owen Anderson 46da2a6262 Add partial caching of non-local memory dependence queries. This provides a modest
speedup for GVN.

llvm-svn: 42185
2007-09-21 03:53:52 +00:00
Devang Patel 83cc3f8f51 Update aux. info associated with an instruction before erasing instruction.
llvm-svn: 42180
2007-09-20 23:45:50 +00:00
Devang Patel 6117a3b696 Don't increment invalid iterator.
llvm-svn: 42178
2007-09-20 23:01:50 +00:00
Nick Lewycky eae7e7d00b Fix optimization. %x = sub %x, %y does not imply that %y is zero.
llvm-svn: 42157
2007-09-20 00:48:36 +00:00
Devang Patel 464276f831 Avoid unsafe promotion.
llvm-svn: 42149
2007-09-19 20:18:51 +00:00
Duncan Sands d31649bc59 Improve comment.
llvm-svn: 42132
2007-09-19 10:25:38 +00:00
Duncan Sands 56df7dec2b A global variable with external weak linkage can be null, while
an alias could alias such a global variable.

llvm-svn: 42130
2007-09-19 10:10:31 +00:00
Devang Patel 69a55a38ed Relax loop ExitCondition predicate restriction.
llvm-svn: 42122
2007-09-19 00:28:47 +00:00
Devang Patel 455a53b7db Filter loops where split condition's false branch is not empty. For example
for (int i = 0; i < N; ++i) {
  if (i == somevalue)
    dosomething();
  else
    dosomethingelse();
}

llvm-svn: 42121
2007-09-19 00:15:16 +00:00
Devang Patel 4c238c451f Bail out early, before modifying anything.
llvm-svn: 42120
2007-09-19 00:11:01 +00:00
Devang Patel 31f2c8592c Work is incomplete. Loop is not modified at all right now.
llvm-svn: 42119
2007-09-19 00:08:13 +00:00
Devang Patel fcda998ab2 Fix PR1657
llvm-svn: 42075
2007-09-18 01:54:42 +00:00
Devang Patel 267c07b51f Do not eliminate loop when it is invalid to do so. For example,
for (int i = 0; i < N; i++) {
	if (i == XYZ) {
		A;
	} else {
		B;
	}
	C;
	D;
}

llvm-svn: 42058
2007-09-17 21:01:05 +00:00
Devang Patel 712dbe9d13 Skeleton for transformations to truncate loop's iteration space.
llvm-svn: 42054
2007-09-17 20:39:48 +00:00
Devang Patel 9d1af9b63d Fix comment.
llvm-svn: 42048
2007-09-17 20:07:40 +00:00
Chris Lattner 0625bd6472 Merge DenseMapKeyInfo & DenseMapValueInfo into DenseMapInfo
Add a new DenseMapInfo::isEqual method to allow clients to redefine
the equality predicate used when probing the hash table.

llvm-svn: 42042
2007-09-17 18:34:04 +00:00
Dan Gohman 2ac2652779 Instcombine x-((x/y)*y) into a remainder operator.
llvm-svn: 42035
2007-09-17 17:31:57 +00:00
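The identity behind r42035 above holds for C++ signed integers as well, because integer division truncates toward zero. A minimal sketch with hypothetical function names follows; it illustrates the arithmetic only and is not the instcombine code.

```cpp
#include <cassert>

// The long-hand pattern: subtract the largest multiple of y not exceeding x
// in magnitude.
int remainderViaDivision(int x, int y) {
  assert(y != 0 && "division by zero is undefined");
  return x - (x / y) * y;
}

// The single operation the pattern can be folded into.
int remainderViaOperator(int x, int y) {
  assert(y != 0 && "remainder by zero is undefined");
  return x % y;
}
```

Both functions agree for every pair where the division itself is defined, so `remainderViaDivision(-7, 3)` and `remainderViaOperator(-7, 3)` both yield -1. The pair `INT_MIN` and `-1` overflows in either form and is left out of this sketch.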
Duncan Sands 6d5da71288 Factor the trampoline transformation into a subroutine.
llvm-svn: 42021
2007-09-17 10:26:40 +00:00
Owen Anderson 4cd516b50b Be more careful when constant-folding PHI nodes.
llvm-svn: 41998
2007-09-16 08:04:16 +00:00
Owen Anderson 8d0cb881e5 Remove RLE. It is subsumed by GVN.
llvm-svn: 41968
2007-09-14 22:33:52 +00:00
Dale Johannesen 98d3a08d8f Remove the assumption that FP's are either float or
double from some of the many places in the optimizers
it appears, and do something reasonable with x86
long double.
Make APInt::dump() public, remove newline, use it to
dump ConstantSDNode's.
Allow APFloats in FoldingSet.
Expand X86 backend handling of long doubles (conversions
to/from int, mostly).

llvm-svn: 41967
2007-09-14 22:26:36 +00:00
Chris Lattner 5d13fb538f Fix a logic error in ValueIsOnlyUsedLocallyOrStoredToOneGlobal that caused
miscompilation of 188.ammp.  Reject select and bitcast in 
ValueIsOnlyUsedLocallyOrStoredToOneGlobal because RewriteHeapSROALoadUser can't handle it.

llvm-svn: 41950
2007-09-14 03:41:21 +00:00
Chris Lattner d9111b88d1 silence a bogus gcc warning.
llvm-svn: 41949
2007-09-14 03:07:24 +00:00
Bill Wendling 264d4813c7 Temporarily reverting r41817
(http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070910/053370.html). It's
causing SPASS to fail.

llvm-svn: 41938
2007-09-14 01:13:55 +00:00
Chris Lattner 011f91b5b2 Teach GlobalLoadUsesSimpleEnoughForHeapSRA and the SROA rewriter how to handle
a limited form of PHI nodes.  This finally fixes PR1639, speeding 179.art up
from 7.84s to 3.13s on PPC.

llvm-svn: 41933
2007-09-13 21:31:36 +00:00
Chris Lattner ba98f89388 be tolerant of PHI nodes when rewriting heap SROA code. This is a step
along the way of PR1639

llvm-svn: 41930
2007-09-13 18:00:31 +00:00
Chris Lattner f315d4f1a7 refactor some code, no functionality change. On the path to PR1639
llvm-svn: 41929
2007-09-13 17:29:05 +00:00
Chris Lattner 6eed0e7366 Make ValueIsOnlyUsedLocallyOrStoredToOneGlobal smart enough to see through
bitcasts and phis.  This is a step to fixing PR1639.

llvm-svn: 41928
2007-09-13 16:37:20 +00:00
Chris Lattner 2d2892ee6e Make AllUsesOfLoadedValueWillTrapIfNull strong enough to see through PHI
nodes.  This is the first step of the fix for PR1639.

llvm-svn: 41927
2007-09-13 16:30:19 +00:00
Chris Lattner 7b412cb823 Change llvm.gcroot to not init the root to null at runtime, this prevents
using it for live-in values etc.

llvm-svn: 41879
2007-09-12 17:53:10 +00:00
Duncan Sands 9204663bcb Turn calls to trampolines into calls to the underlying
nested function.

llvm-svn: 41844
2007-09-11 14:35:41 +00:00
Devang Patel 7ed6eb8992 Avoid negative logic.
llvm-svn: 41829
2007-09-11 01:10:45 +00:00
Devang Patel 8c95373ced Refactor code into a separate method.
llvm-svn: 41826
2007-09-11 00:42:56 +00:00
Devang Patel d67479b6ee Clear split info object.
llvm-svn: 41823
2007-09-11 00:23:56 +00:00
Devang Patel a28a7f1b2d Split condition does not have to be ICmpInst in all cases.
llvm-svn: 41822
2007-09-11 00:12:56 +00:00
Devang Patel f4202e91f8 Check all terminators inside loop.
llvm-svn: 41821
2007-09-10 23:57:58 +00:00
Chris Lattner e804567cd8 remove some dead code, this is handled by constant folding.
llvm-svn: 41819
2007-09-10 23:46:29 +00:00
Devang Patel 2181b8e86a Swap exit condition operands if it works.
llvm-svn: 41817
2007-09-10 23:34:06 +00:00
Chris Lattner c75cbe6473 Prevent tailcallelim from breaking "recursive" calls to builtins.
llvm-svn: 41804
2007-09-10 20:58:55 +00:00
Devang Patel f8ab0a9acc Filter exit conditions which are not yet handled.
llvm-svn: 41800
2007-09-10 18:33:42 +00:00
Devang Patel d7409fdce5 Require SCEV before LCSSA.
llvm-svn: 41798
2007-09-10 18:08:23 +00:00
Chris Lattner 85a51e0060 Don't zap back to back volatile load/stores
llvm-svn: 41759
2007-09-07 05:33:03 +00:00
Dale Johannesen bed9dc423c Next round of APFloat changes.
Use APFloat in UpgradeParser and AsmParser.
Change all references to ConstantFP to use the
APFloat interface rather than double.  Remove
the ConstantFP double interfaces.
Use APFloat functions for constant folding arithmetic
and comparisons.
(There are still way too many places APFloat is
just a wrapper around host float/double, but we're
getting there.)

llvm-svn: 41747
2007-09-06 18:13:44 +00:00
Nick Lewycky 0c5c47944a Use isTrueWhenEqual. Thanks Chris!
llvm-svn: 41741
2007-09-06 02:40:25 +00:00
Nick Lewycky b0b066eaaa When the two operands of an icmp are equal, there are five possible predicates
that would make the icmp true. Fixes PR1637.

llvm-svn: 41740
2007-09-06 01:10:22 +00:00
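The five predicates mentioned in r41740 above are the ones that include equality: eq, uge, ule, sge, and sle. The sketch below models the ten integer comparison predicates with a hypothetical enum. `isTrueWhenEqual` here is a standalone illustration named after the helper mentioned in r41741, not LLVM's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the ten integer comparison predicates.
enum class Predicate { EQ, NE, UGT, UGE, ULT, ULE, SGT, SGE, SLT, SLE };

// True when comparing a value against itself with this predicate yields true.
bool isTrueWhenEqual(Predicate p) {
  switch (p) {
  case Predicate::EQ:
  case Predicate::UGE:
  case Predicate::ULE:
  case Predicate::SGE:
  case Predicate::SLE:
    return true;
  case Predicate::NE:
  case Predicate::UGT:
  case Predicate::ULT:
  case Predicate::SGT:
  case Predicate::SLT:
    return false;
  }
  assert(false && "unknown predicate");
  return false;
}

// Reference evaluation on 32-bit values, used to cross-check the table above.
bool evaluate(Predicate p, std::int32_t a, std::int32_t b) {
  const std::uint32_t ua = static_cast<std::uint32_t>(a);
  const std::uint32_t ub = static_cast<std::uint32_t>(b);
  switch (p) {
  case Predicate::EQ:  return a == b;
  case Predicate::NE:  return a != b;
  case Predicate::UGT: return ua > ub;
  case Predicate::UGE: return ua >= ub;
  case Predicate::ULT: return ua < ub;
  case Predicate::ULE: return ua <= ub;
  case Predicate::SGT: return a > b;
  case Predicate::SGE: return a >= b;
  case Predicate::SLT: return a < b;
  case Predicate::SLE: return a <= b;
  }
  assert(false && "unknown predicate");
  return false;
}
```

For any value `v`, `evaluate(p, v, v)` matches `isTrueWhenEqual(p)`, which is what lets a comparison of identical operands fold to a constant.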
Chuck Rose III 2320323647 Forgot to obey 80 column rule. Fixing that.
llvm-svn: 41725
2007-09-05 20:36:41 +00:00
Chuck Rose III e58572233d Added default parameters to GetElementPtrInst constructor call. Visual Studio 2k5 was getting confused and was unable to compile it. Suspected compiler error.
llvm-svn: 41721
2007-09-05 16:54:38 +00:00
Devang Patel f6ef552f3d Insert cloned loop basic blocks before original loop header.
llvm-svn: 41713
2007-09-04 20:46:35 +00:00
David Greene c656cbb8c2 Update GEP constructors to use an iterator interface to fix
GLIBCXX_DEBUG issues.

llvm-svn: 41697
2007-09-04 15:46:09 +00:00
Anton Korobeynikov 35322d745c Silence warning while compiling with gcc 4.2
llvm-svn: 41676
2007-09-02 22:11:14 +00:00
Evan Cheng ffac17a223 Fix a gcroot lowering bug.
llvm-svn: 41668
2007-09-01 02:00:51 +00:00
Chris Lattner 0e258b8518 Cut off crazy computation. This helps PR1622 slightly.
llvm-svn: 41522
2007-08-28 04:23:55 +00:00
Devang Patel d2456a171d Use simpler test to filter loops.
llvm-svn: 41516
2007-08-27 21:34:31 +00:00
David Greene 703623d571 Update InvokeInst to work like CallInst
llvm-svn: 41506
2007-08-27 19:04:21 +00:00
Dan Gohman 71eaf62e5f Change comments to refer to @malloc and @free instead of %malloc and %free.
llvm-svn: 41488
2007-08-27 16:11:48 +00:00
Anton Korobeynikov 24fb6b2f8c Don't promote volatile loads/stores. This is needed (for example) to handle setjmp/longjmp properly.
This fixes PR1520.

llvm-svn: 41461
2007-08-26 21:43:30 +00:00
Owen Anderson 2b9ec7ff33 Don't DSE volatile stores.
llvm-svn: 41456
2007-08-26 21:14:47 +00:00
Devang Patel 6114751544 Move exit condition and exit branch from exiting block into loop header and dominator info. This avoids execution of a dead iteration. The loop is already filtered at the beginning, so this change is safe.
llvm-svn: 41394
2007-08-25 02:39:24 +00:00
Devang Patel c1ef32ef3d Constant split values need upper bound and lower bound checks, just like any other split value.
llvm-svn: 41389
2007-08-25 01:09:14 +00:00
Devang Patel 4e63e1f5b5 While calculating upper loop bound for first loop and lower loop bound for second loop, take care of edge cases.
llvm-svn: 41387
2007-08-25 00:56:38 +00:00
Devang Patel f5a01bf025 Fix regression that I caused yesterday night while adding logic to select appropriate split condition branch.
llvm-svn: 41365
2007-08-24 19:32:26 +00:00
Devang Patel 4bc9298f2a It is not safe to execute split condition's true branch first all the time. If split
condition predicate is GT or GE then execute false branch first.

llvm-svn: 41358
2007-08-24 06:17:19 +00:00
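The ordering issue in r41358 above can be illustrated with a standalone sketch of index splitting for a greater-than condition. The names, the trace format, and the starting index of 0 are assumptions made for this example, and the code is not the pass's implementation. With `i > s`, low indices take the false branch, so the loop that runs the false branch must come first.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <string>
#include <vector>

// Original loop: the split condition is tested on every iteration.
// "T<i>" records the true branch, "F<i>" the false branch.
std::vector<std::string> originalLoop(int n, int s) {
  std::vector<std::string> trace;
  for (int i = 0; i < n; ++i) {
    if (i > s)
      trace.push_back("T" + std::to_string(i));
    else
      trace.push_back("F" + std::to_string(i));
  }
  return trace;
}

// Split form: one loop per branch, false branch first for a GT predicate.
std::vector<std::string> splitLoops(int n, int s) {
  assert(s != INT_MAX && "s + 1 would overflow");
  // First index at which i > s holds, clamped so neither loop leaves [0, n).
  const int mid = std::min(n, std::max(0, s + 1));
  std::vector<std::string> trace;
  for (int i = 0; i < mid; ++i)
    trace.push_back("F" + std::to_string(i));
  for (int i = mid; i < n; ++i)
    trace.push_back("T" + std::to_string(i));
  return trace;
}
```

The clamping of `mid` covers the edge cases where the split value lies below the start or above the end of the iteration space, in which case one of the two loops runs zero times.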
Devang Patel 4be56a5d12 Reject ICMP_NE as index split condition.
llvm-svn: 41357
2007-08-24 06:02:25 +00:00
Devang Patel 5e46fac6de Tighten up loop filter.
llvm-svn: 41356
2007-08-24 05:36:56 +00:00
Devang Patel 504dc0aaed Remove incomplete cost analysis.
llvm-svn: 41354
2007-08-24 05:21:13 +00:00
Chris Lattner b0f158cfdf rename APInt::toString -> toStringUnsigned for symmetry with toStringSigned()
Add an APSInt::toString() method.

llvm-svn: 41309
2007-08-23 05:15:32 +00:00
Devang Patel 887db2d832 Remove dead code.
llvm-svn: 41295
2007-08-22 21:07:41 +00:00
Devang Patel 6f4f23320d Fix typo.
llvm-svn: 41292
2007-08-22 20:55:18 +00:00
Devang Patel 31206b56d5 Cosmetic change
The "True Loop" and "False Loop" naming used to refer to the two loops
after loop cloning is confusing. Instead just use A_Loop and B_Loop.

llvm-svn: 41287
2007-08-22 19:33:29 +00:00
Devang Patel 90da534987 Refactor loop condition check in a separate function.
llvm-svn: 41282
2007-08-22 18:27:01 +00:00
Devang Patel cd8beb7645 Fix thinko.
Starting value of second loop's induction variable cannot be lower
than starting value of original loop's induction variable.

llvm-svn: 41280
2007-08-22 18:07:47 +00:00
Devang Patel a12000d572 Rename bunch of variables.
llvm-svn: 41250
2007-08-21 21:12:02 +00:00
Devang Patel f98db5e62a Preserve LCSSA.
llvm-svn: 41246
2007-08-21 19:47:46 +00:00
Devang Patel b5933bbbd5 Use SmallVector instead of std::vector.
llvm-svn: 41207
2007-08-21 00:31:24 +00:00
Devang Patel 8f4228d619 s/ExitBlock/ExitingBlock/g
llvm-svn: 41204
2007-08-20 23:51:18 +00:00
Devang Patel 49c4f9a889 Replace induction variable with split value in loop body.
This fixes art miscompile.

llvm-svn: 41195
2007-08-20 20:49:01 +00:00
Devang Patel c2e2d15f45 Do not split loops rejected by processOneIterationLoop().
llvm-svn: 41194
2007-08-20 20:24:15 +00:00
Nick Lewycky bfa9499a88 Oops, remove assert that wasn't meant to be committed.
llvm-svn: 41170
2007-08-18 23:21:28 +00:00
Nick Lewycky 5b5b1ab9e0 Never insert duplicate edges.
llvm-svn: 41169
2007-08-18 23:18:03 +00:00
Nick Lewycky a0d49dac26 Typo.
llvm-svn: 41168
2007-08-18 15:08:56 +00:00
Devang Patel 1282b6e181 Avoid splitting loops where two split condition branches are not independent.
llvm-svn: 41148
2007-08-18 00:00:32 +00:00
Devang Patel d1fcfcc76c When one branch of condition is eliminated then head of the other
branch is not necessarily the immediate dominator of the merge block in all cases.

llvm-svn: 41144
2007-08-17 21:59:16 +00:00
Owen Anderson f5023a7a84 Factor out some code into a helper function.
llvm-svn: 41131
2007-08-16 22:51:56 +00:00
Owen Anderson 221a43604e Add some more comments to GVN.
llvm-svn: 41129
2007-08-16 22:02:55 +00:00
Devang Patel 3640e78057 Dominance frontier is now required.
llvm-svn: 41096
2007-08-15 03:34:53 +00:00
Devang Patel b81bcbde09 Cleanup removeBlocks.
Use dominance frontier to fixup incoming edges of successor blocks not dominated by DeadBB.
Use df_iterator to walk and delete basic blocks dominated by DeadBB.

llvm-svn: 41095
2007-08-15 03:31:47 +00:00
Reid Spencer 0db035567c Remove unneeded header file.
llvm-svn: 41094
2007-08-15 03:01:04 +00:00
Devang Patel f55b79fa71 Avoid triangle loops.
llvm-svn: 41093
2007-08-15 02:14:55 +00:00
Devang Patel 22c7993ecf Break infinite loop.
llvm-svn: 41091
2007-08-14 23:59:17 +00:00
Devang Patel 7cad917160 Avoid nested loops at the moment.
llvm-svn: 41090
2007-08-14 23:53:57 +00:00
Devang Patel 33ba97d747 Fix dominance frontier update while removing blocks.
llvm-svn: 41082
2007-08-14 18:35:57 +00:00
Owen Anderson bc271a02fd Eliminate PHI nodes with constant values during normal GVN processing, even when
they're not related to eliminating a load.

llvm-svn: 41081
2007-08-14 18:33:27 +00:00
Owen Anderson 398602a6eb Be more aggressive in pruning unnecessary PHI nodes when doing PHI construction.
llvm-svn: 41080
2007-08-14 18:16:29 +00:00
Owen Anderson 676070d503 Make GVN iterative.
llvm-svn: 41078
2007-08-14 18:04:11 +00:00
Owen Anderson a7b220f23a Fix a case where GVN was failing to return true when it had, in fact, modified
the function.

llvm-svn: 41077
2007-08-14 17:59:48 +00:00
Devang Patel dbe8497d45 Handle last value assignments.
llvm-svn: 41063
2007-08-14 01:30:57 +00:00
Devang Patel f74ccbb4e8 StartValue is already calculated.
llvm-svn: 41062
2007-08-14 00:15:45 +00:00
Devang Patel 948653915f Preserve simple analysis.
llvm-svn: 41054
2007-08-13 22:22:13 +00:00
Devang Patel b8a41bb4f1 Preserve dominator info.
llvm-svn: 41053
2007-08-13 22:13:24 +00:00
Devang Patel da48cf40db If NewBB dominates DestBB then DestBB is not part of NewBB's dominance frontier.
llvm-svn: 41051
2007-08-13 21:59:17 +00:00
Devang Patel f258578206 Split loops and do CFG cleanup.
llvm-svn: 41029
2007-08-12 07:02:51 +00:00
Reid Spencer 9f90f965de Remove unused variables.
llvm-svn: 41028
2007-08-12 04:45:36 +00:00
Chris Lattner 99c8ee2977 Transform a load from an undef/zero global into an undef/global even if we
have complex pointer manipulation going on.  This allows us to compile
stuff like this:

__m128i foo(__m128i x){
                static const unsigned int c_0[4] = { 0, 0, 0, 0 };
                __m128i v_Zero = _mm_loadu_si128((__m128i*)c_0);
                x  = _mm_unpacklo_epi8(x,  v_Zero);
                return x;
}

into:

_foo:
        xorps   %xmm1, %xmm1
        punpcklbw       %xmm1, %xmm0
        ret

llvm-svn: 41022
2007-08-11 18:48:48 +00:00
Devang Patel f417c2cc34 Clone loop.
llvm-svn: 40998
2007-08-10 18:07:13 +00:00
Devang Patel aa36a43908 Add utility to clone loops.
llvm-svn: 40997
2007-08-10 17:59:47 +00:00
Devang Patel 9a4761464f Remove unnecessary duplication.
llvm-svn: 40979
2007-08-10 00:59:03 +00:00
Devang Patel 7bdf4531bb Calculate exit and start value of true loop and false loop respectively.
llvm-svn: 40978
2007-08-10 00:53:35 +00:00
Devang Patel 67af6cd7ea ExitCondition and Induction variable are loop constraints
not split condition constraints.

llvm-svn: 40977
2007-08-10 00:33:50 +00:00
Chris Lattner a8e4b4bc7b when we see an unaligned load from an insufficiently aligned global or
alloca, increase the alignment of the load, turning it into an aligned load.

This allows us to compile:

#include <xmmintrin.h>
__m128i foo(__m128i x){
  static const unsigned int c_0[4] = { 0, 0, 0, 0 };
  __m128i v_Zero = _mm_loadu_si128((__m128i*)c_0);
  x  = _mm_unpacklo_epi8(x,  v_Zero);
  return x;
}

into:

_foo:
	punpcklbw	_c_0.5944, %xmm0
	ret
	.data
	.lcomm	_c_0.5944,16,4		# c_0.5944

instead of:

_foo:
	movdqu	_c_0.5944, %xmm1
	punpcklbw	%xmm1, %xmm0
	ret
	.data
	.lcomm	_c_0.5944,16,2		# c_0.5944

llvm-svn: 40971
2007-08-09 19:05:49 +00:00
Owen Anderson 9b1cc8cac0 Make NonLocal and None const in the right way. :-)
llvm-svn: 40961
2007-08-09 04:42:44 +00:00
Devang Patel 42e3e5bec1 Traverse loop blocks' terminators to find split candidates.
llvm-svn: 40960
2007-08-09 01:39:01 +00:00
Devang Patel 0183c797c4 Add cost analysis.
llvm-svn: 40952
2007-08-08 22:25:28 +00:00
Devang Patel 0e34ee25ab Preserve dom info while processing one iteration loop.
llvm-svn: 40947
2007-08-08 21:39:47 +00:00
Owen Anderson b84d3b1c92 Change the None and NonLocal markers in memdep to be const.
llvm-svn: 40946
2007-08-08 21:39:39 +00:00
Devang Patel 8abc5c82b7 Clear split info.
llvm-svn: 40944
2007-08-08 21:18:27 +00:00
Devang Patel 593bf9ceb3 Handle multiple split conditions.
llvm-svn: 40941
2007-08-08 21:02:17 +00:00
Owen Anderson 680862880d Global values also don't undead-ify pointers in our dead alloca's set.
llvm-svn: 40936
2007-08-08 19:12:31 +00:00
Owen Anderson ddf4aee543 Make handleEndBlock significantly faster with one trivial improvement,
and one hack to avoid hitting a bad case when the alias analysis is imprecise.

llvm-svn: 40935
2007-08-08 18:38:28 +00:00
Owen Anderson 50df9685b0 Small improvement: if a function doesn't access memory, we don't need to scan
it for potentially undeading pointers.

llvm-svn: 40933
2007-08-08 17:58:56 +00:00
Owen Anderson 52aaabf74d Add some comments, remove a dead argument, and simplify some control flow.
No functionality change.

llvm-svn: 40932
2007-08-08 17:50:09 +00:00
Owen Anderson b17ab03081 A few more small cleanups.
llvm-svn: 40922
2007-08-08 06:06:02 +00:00
Owen Anderson 0aecf0ebef First round of cleanups from Chris' feedback.
llvm-svn: 40919
2007-08-08 04:52:29 +00:00
Devang Patel 68de1ae816 Embrace patch review feedback.
llvm-svn: 40915
2007-08-08 01:51:27 +00:00
Devang Patel c7e53bdcfd Fix new compare instruction's signedness. Caught by Chris during review.
llvm-svn: 40912
2007-08-07 23:17:52 +00:00
Owen Anderson 0cc1a76283 Don't insert nearly as many redundant phi nodes.
llvm-svn: 40909
2007-08-07 23:12:31 +00:00
Devang Patel 19211b6528 Use eraseFromParent().
llvm-svn: 40903
2007-08-07 17:45:35 +00:00
David Greene bacdbaa0da Fix comment typo
llvm-svn: 40898
2007-08-07 16:52:03 +00:00
David Greene 816a190cdf Fix GLIBCXX_DEBUG error triggered by incrementing erased iterator.
llvm-svn: 40897
2007-08-07 16:44:38 +00:00
Devang Patel c70106cb30 Begin loop index split pass.
llvm-svn: 40883
2007-08-07 00:25:56 +00:00
Nick Lewycky 8052019a20 It's safe to fold not of fcmp.
llvm-svn: 40870
2007-08-06 20:04:16 +00:00
David Greene 77b2accbca Make this code more efficient.
llvm-svn: 40861
2007-08-06 15:09:17 +00:00
Chris Lattner c7ba225705 remove some dead lines
llvm-svn: 40859
2007-08-06 06:21:06 +00:00
Reid Spencer d959cfc882 Silence some warnings from doxygen about @param argument name not matching the
actual argument name of the documented function.

llvm-svn: 40851
2007-08-05 19:35:22 +00:00
Chris Lattner f0da7975ea at the end of instcombine, explicitly clear WorklistMap.
This shrinks it down to something small.  On the testcase
from PR1432, this speeds up instcombine from 0.7959s to 0.5000s,
(59%)

llvm-svn: 40840
2007-08-05 08:47:58 +00:00
Chris Lattner edce70d2fe rewrite the code used to construct pruned SSA form with the IDF method.
In the old way, we computed and inserted phi nodes for the whole IDF of 
the definitions of the alloca, then computed which ones were dead and
removed them.

In the new method, we first compute the region where the value is live,
and use that information to only insert phi nodes that are live.  This
eliminates the need to compute liveness later, and stops the algorithm
from inserting a bunch of phis which it then later removes.

This speeds up the testcase in PR1432 from 2.00s to 0.15s (14x) in a
release build and 6.84s->0.50s (14x) in a debug build.

llvm-svn: 40825
2007-08-04 22:50:14 +00:00
Chris Lattner d91576b01e Factor out a whole bunch of code into its own method.
llvm-svn: 40824
2007-08-04 21:14:29 +00:00
Chris Lattner 4e1b4140eb Use getNumPreds(BB) instead of computing them manually. This is a very small but
measurable speedup.

llvm-svn: 40823
2007-08-04 21:06:15 +00:00
Chris Lattner b6a4ba808b Change the rename pass to be "tail recursive", only adding N-1 successors
to the worklist, and handling the last one with a 'tail call'.  This speeds
up PR1432 from 2.0578s to 2.0012s (2.8%)

llvm-svn: 40822
2007-08-04 20:40:27 +00:00
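The worklist change in r40822 above follows a general pattern: push all successors but one, then continue directly with the remaining one instead of pushing and immediately popping it. The sketch below applies that pattern to a plain graph traversal. The graph representation and the function name are assumptions for illustration, and the code is not the mem2reg rename pass itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Visits every node reachable from `entry` and returns the visit order.
// succs[n] lists the successors of node n.
std::vector<int> visitReachable(const std::vector<std::vector<int>> &succs,
                                int entry) {
  assert(entry >= 0 && static_cast<std::size_t>(entry) < succs.size());
  std::vector<bool> visited(succs.size(), false);
  std::vector<int> order;
  std::vector<int> worklist{entry};

  while (!worklist.empty()) {
    int node = worklist.back();
    worklist.pop_back();

    // Inner loop plays the role of the "tail call": keep following one
    // successor without going through the worklist.
    while (!visited[node]) {
      visited[node] = true;
      order.push_back(node);

      const std::vector<int> &next = succs[node];
      if (next.empty())
        break;

      // Queue N-1 successors for later...
      for (std::size_t i = 1; i < next.size(); ++i)
        worklist.push_back(next[i]);
      // ...and handle the remaining one right away.
      node = next[0];
    }
  }
  return order;
}
```

On a diamond-shaped graph with edges 0 to 1, 0 to 2, 1 to 3, and 2 to 3, the traversal visits 0, 1, 3, 2 and pushes only node 2 onto the worklist after the entry.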
Chris Lattner 840259c8d3 cache computation of #preds for a BB. This speeds up
mem2reg from 2.0742->2.0522s on PR1432.

llvm-svn: 40821
2007-08-04 20:24:50 +00:00
Chris Lattner 050bac4bed reserve operand space for phi nodes when we insert them.
llvm-svn: 40820
2007-08-04 20:14:34 +00:00
Chris Lattner 9318785df5 use continue to avoid nesting, no functionality change.
llvm-svn: 40819
2007-08-04 20:07:06 +00:00
Chris Lattner 6b04ecbaf9 Promoting allocas with the 'single store' fastpath is
faster than with the 'local to a block' fastpath.  This speeds
up PR1432 from 2.1232 to 2.0686s (2.6%)

llvm-svn: 40818
2007-08-04 20:03:23 +00:00
Chris Lattner 4a930f9444 When PromoteLocallyUsedAllocas promoted allocas, it didn't remember
to increment NumLocalPromoted, and didn't actually delete the
dead alloca, leading to an extra iteration of mem2reg.

llvm-svn: 40817
2007-08-04 20:01:43 +00:00
Chris Lattner 63c039780c std::map -> DenseMap
llvm-svn: 40816
2007-08-04 19:52:20 +00:00
Nick Lewycky 20f0811fc0 Clean up comments, fix up some confusing code logic.
Predsimplify fails llvm-gcc bootstrap.

llvm-svn: 40815
2007-08-04 18:45:32 +00:00
Chris Lattner 7d382f7680 fix a logic bug where we wouldn't promote single store allocas if the
stored value was a non-instruction value.  Doh.

This increases the # single store allocas from 8982 to 9026, and
speeds up mem2reg on the testcase in PR1432 from 2.17 to 2.13s.

llvm-svn: 40813
2007-08-04 02:45:02 +00:00
Chris Lattner 1b215f0661 When we do the single-store optimization, delete both the store
and the alloca so they don't get reprocessed.

This speeds up PR1432 from 2.20s to 2.17s.

llvm-svn: 40812
2007-08-04 02:38:38 +00:00
Chris Lattner 862f125457 Three improvements:
1. Check for revisiting a block before checking domination, which is faster.
  2. If the stored value isn't an instruction, we don't have to check for domination.
  3. If we have a value used in the same block more than once, make sure to remove the
     block from the UsingBlocks vector.  Not doing so forces us to go through the slow
     path for the alloca.

The combination of these improvements increases the number of allocas on the fastpath
from 8935 to 8982 on PR1432.  This speeds it up from 2.90s to 2.20s (31%)

llvm-svn: 40811
2007-08-04 02:32:22 +00:00
Chris Lattner ae1e00eb36 switch from using a std::set to using a SmallPtrSet. This speeds up the
testcase in PR1432 from 6.33s to 2.90s (2.22x)

llvm-svn: 40810
2007-08-04 02:21:22 +00:00
Chris Lattner 9181801bb7 In mem2reg, when handling the single-store case, make sure to remove
a using block from the list if we handle it.  Not doing this caused us
to not be able to promote (with the fast path) allocas which have uses (whoops).

This increases the # allocas hitting this fastpath from 4042 to 8935 on the
testcase in PR1432, speeding up mem2reg by 2.6x

llvm-svn: 40809
2007-08-04 02:15:24 +00:00
Chandler Carruth 7132e00de7 This is the patch to provide clean intrinsic function overloading support in LLVM. It cleans up the intrinsic definitions and generally smooths the process for more complicated intrinsic writing. It will be used by the upcoming atomic intrinsics as well as vector and float intrinsics in the future.
This also changes the syntax for llvm.bswap, llvm.part.set, llvm.part.select, and llvm.ct* intrinsics. They are automatically upgraded by both the LLVM ASM reader and the bitcode reader. The test cases have been updated, with special tests added to ensure the automatic upgrading is supported.

llvm-svn: 40807
2007-08-04 01:51:18 +00:00
Chris Lattner 886a41a007 split rewriting of single-store allocas into its own
method.

llvm-svn: 40806
2007-08-04 01:47:41 +00:00
Chris Lattner 3cede09c67 refactor some code to shrink PromoteMem2Reg::run a bit
llvm-svn: 40805
2007-08-04 01:41:18 +00:00
Chris Lattner d524537fe9 add a typedef, no other change.
llvm-svn: 40804
2007-08-04 01:19:38 +00:00
Chris Lattner df138be527 avoid an unneeded vector copy. This speeds up mem2reg on the testcase
in PR1432 by 6%

llvm-svn: 40803
2007-08-04 01:07:49 +00:00
Chris Lattner fd838f0770 make RenamePassWorkList a local var instead of an ivar.
llvm-svn: 40802
2007-08-04 01:04:40 +00:00
Owen Anderson 2d19aae4ca Fix a subtle miscompilation. This allows 197.parser to be compiled correctly.
llvm-svn: 40791
2007-08-03 19:59:35 +00:00
Owen Anderson 774761c503 Fix a subtle iterator invalidation bug in a recursive algorithm.
llvm-svn: 40776
2007-08-03 11:03:26 +00:00
Chris Lattner 1f70816c73 Fix an accidental commit.
llvm-svn: 40758
2007-08-02 21:33:36 +00:00
Owen Anderson a8ba659976 Fix 80 col. violations.
llvm-svn: 40751
2007-08-02 18:20:52 +00:00
Owen Anderson 9699a6ea03 Fix 80 col. violations.
llvm-svn: 40750
2007-08-02 18:16:06 +00:00
Owen Anderson e3590584b9 Fix 80 col. violations.
llvm-svn: 40749
2007-08-02 18:11:11 +00:00
Owen Anderson 0ac1fc8ac1 Fix a bug that was causing several miscompilations on SPEC.
llvm-svn: 40746
2007-08-02 17:56:05 +00:00
Chris Lattner dc2cf228ce Replacing a cast with another one does not reduce the number of
casts in the input.

llvm-svn: 40741
2007-08-02 17:23:38 +00:00
Chris Lattner 222b214be7 Disable an xform that causes an infinite loop. This fixes PR1594
llvm-svn: 40739
2007-08-02 16:56:32 +00:00
Chris Lattner 2740694450 wrap some long lines. Major offenders that are left include
gvn, gvnpre, dse, and predsimplify.  To see these, use:

  make check-line-length

llvm-svn: 40738
2007-08-02 16:53:43 +00:00
Devang Patel a882328e61 Update dominator info for the middle blocks created while splitting
exit edge to preserve LCSSA.

Fix dominance frontier update during loop unswitch. This fixes PR 1589, again

llvm-svn: 40737
2007-08-02 15:25:57 +00:00
Chris Lattner b0418fc607 Enhance instcombine to be more aggressive about folding casts of
operations of casts.  This implements InstCombine/zext-fold.ll

llvm-svn: 40726
2007-08-02 06:11:14 +00:00
Chris Lattner d7cb625a9e Fix PR1575 and test/Transforms/CondProp/2007-08-01-InvalidRead.ll
llvm-svn: 40720
2007-08-02 04:47:05 +00:00
Devang Patel 34890b2f27 Undo previous check-in.
llvm-svn: 40698
2007-08-01 23:24:50 +00:00
Devang Patel 561b0c29a3 Update dominator info for the middle blocks created while splitting
exit edge to preserve LCSSA.

Fix dominance frontier update during loop unswitch. This fixes PR 1589.

llvm-svn: 40695
2007-08-01 22:23:50 +00:00
Owen Anderson c321e5e272 Make non-local memdep not be recursive, and fix a bug on 403.gcc that this exposed.
llvm-svn: 40692
2007-08-01 22:01:54 +00:00
Dan Gohman 34d442f274 More explicit keywords.
llvm-svn: 40673
2007-08-01 15:32:29 +00:00
Owen Anderson 10e52eddb3 Rename FastDSE to just DSE.
llvm-svn: 40668
2007-08-01 06:36:51 +00:00
Owen Anderson e4a374812b Move FastDSE in to DeadStoreElimination.
llvm-svn: 40667
2007-08-01 06:30:51 +00:00
Owen Anderson 4894e6d8bc Remove old DSE.
llvm-svn: 40666
2007-08-01 06:30:10 +00:00
David Greene 17a5dfe6f7 New CallInst interface to address GLIBCXX_DEBUG errors caused by
indexing an empty std::vector.

Updates to all clients.

llvm-svn: 40660
2007-08-01 03:43:44 +00:00
Owen Anderson 10ffa860d8 Don't let the memory allocator outsmart GVN. ;-)
llvm-svn: 40655
2007-07-31 23:27:13 +00:00
Owen Anderson 2464f4f048 Fix a failure I accidentally caused in my last commit by mishandling the
removal of redundant phis.

llvm-svn: 40650
2007-07-31 20:18:28 +00:00
Lauro Ramos Venancio 549e775e67 Fix a bug in GetKnownAlignment of packed structs.
llvm-svn: 40649
2007-07-31 20:13:21 +00:00
Owen Anderson d58fa6b09f Fix a misoptimization in aha.
llvm-svn: 40642
2007-07-31 17:43:14 +00:00
Dan Gohman 8c4da37b1f Use SCEVExpander::InsertCastOfTo instead of calling new IntToPtrInst
directly, because the insert point used by the SCEVExpander may vary
from what LSR originally computes.

llvm-svn: 40641
2007-07-31 17:22:27 +00:00
Devang Patel d8b1ceb5b4 Add note.
llvm-svn: 40638
2007-07-31 16:52:25 +00:00
Devang Patel d491198000 Loop unswitch preserves dom info.
Use simple analysis interface to preserve analysis info maintained by other loop passes.

llvm-svn: 40627
2007-07-31 08:03:26 +00:00
Devang Patel b98a097ae9 Implement Simple Analysis interfaces - cloneBasicBlockAnalysis and deleteAnalysisValue.
llvm-svn: 40626
2007-07-31 08:01:41 +00:00
Devang Patel 7d165e1d84 If loop can be unswitched again, then do it yourself.
llvm-svn: 40609
2007-07-30 23:07:10 +00:00
Owen Anderson 850138157e Avoid potential iterator invalidation problems.
llvm-svn: 40607
2007-07-30 21:26:39 +00:00
Devang Patel 14fae50666 Remove dead code.
llvm-svn: 40606
2007-07-30 21:10:44 +00:00
Devang Patel c5e340eded LCSSA preserves dom info.
llvm-svn: 40604
2007-07-30 20:23:45 +00:00
Devang Patel 698852561c Loop Rotation pass preserves dominator tree and frontier.
llvm-svn: 40603
2007-07-30 20:22:53 +00:00
Devang Patel bb97ac4dce LICM preserves scalar evolution and dom frontier.
llvm-svn: 40602
2007-07-30 20:19:59 +00:00
Reid Spencer dff9d69cfb Fix a typo/thinko.
llvm-svn: 40599
2007-07-30 19:53:57 +00:00
Owen Anderson 212d5c27f6 Use more caching when computing non-local dependence. This makes bzip2 not
use up the entire 32-bit address space.

llvm-svn: 40596
2007-07-30 17:29:24 +00:00
Owen Anderson d66e285b2e Fix a bug caused by indiscriminately asking for the dominators of a predecessor.
llvm-svn: 40595
2007-07-30 16:57:08 +00:00
Devang Patel e3206cb425 Use SmallPtrSet.
llvm-svn: 40560
2007-07-27 18:34:27 +00:00
Chuck Rose III 1a39a2d13d VStudio compiler errors and placing Function*->ExFunc map under ManagedStatic control.
This commit fixes two things.  One is a pair of VStudio compiler errors stemming from variables
which were defined within the for loop statement and also within the body of the for loop.  I fixed these
by renaming one of the two variables.  Additionally, I've made the Function*->ExFunc map in
ExternalFunctions.cpp a ManagedStatic object, so that cleanup will be done on llvm_shutdown.  In repeated
uses of the interpreter, where the same Function* address may get used for completely different functions,
this was causing a crash.

llvm-svn: 40558
2007-07-27 18:26:35 +00:00
Devang Patel a51e0a3d8d Fix thinko. Update return status appropriately.
llvm-svn: 40546
2007-07-26 20:21:42 +00:00
Owen Anderson dbf23ccaa0 Fix a couple more bugs in the phi construction by pulling in code that does
almost the same things from LCSSA.

llvm-svn: 40540
2007-07-26 18:26:51 +00:00
Dan Gohman 6e853bc73f Move the GET_SIDE_EFFECT_INFO logic from isInstructionTriviallyDead
to Instruction::mayWriteToMemory, fixing a FIXME, and helping
various places that call mayWriteToMemory directly.

llvm-svn: 40533
2007-07-26 16:06:08 +00:00
Dan Gohman eb47d9213c Remove a bogus return statement, what appears to have been a pasto
from Relation::contradicts in Relation::incorporate.

llvm-svn: 40531
2007-07-26 15:29:35 +00:00
Owen Anderson 3b8cc30a61 Fix what is _hopefully_ the last corner case for loops.
llvm-svn: 40503
2007-07-25 23:54:42 +00:00
Owen Anderson 8707412593 My last commit was not correct for nested loops. Fix it, and add a testcase for it.
llvm-svn: 40498
2007-07-25 22:19:40 +00:00
Owen Anderson 3c67004d47 Fix an infinite loop on 300.twolf.
llvm-svn: 40497
2007-07-25 22:03:06 +00:00
Owen Anderson 7bf26ee444 Fix a bug that was causing GVN to crash on 252.eon.
llvm-svn: 40494
2007-07-25 21:13:41 +00:00
Owen Anderson 5e5599b7ce Add basic support for performing whole-function RLE.
Note: This has not yet been thoroughly tested.  Use at your own risk.

llvm-svn: 40489
2007-07-25 19:57:03 +00:00
Devang Patel 33227115b9 Add BasicInliner interface.
This interface allows clients to inline a bunch of functions with module
level call graph information.

llvm-svn: 40486
2007-07-25 18:00:25 +00:00
Owen Anderson ab6ec2eac2 Add a GVN pass, using the value numbering code I developed for GVNPRE and the
load elimination code from RedundantLoadElimination.

llvm-svn: 40469
2007-07-24 17:55:58 +00:00
Owen Anderson 9baaaa52e6 Rename a lot of things to change FastDLE to RedundantLoadElimination.
llvm-svn: 40457
2007-07-24 00:17:04 +00:00
Owen Anderson 7292a4a93f Rename FastDLE as RedundantLoadElimination.
llvm-svn: 40456
2007-07-24 00:08:38 +00:00
Owen Anderson 5e68f0c93d Don't delete volatile loads. Doing so is not safe.
llvm-svn: 40448
2007-07-23 22:05:54 +00:00
Owen Anderson 6aba721425 Add FastDLE, the load-elimination counterpart of FastDSE.
llvm-svn: 40445
2007-07-23 21:48:08 +00:00
Owen Anderson 5a201baba9 Fix file header.
llvm-svn: 40440
2007-07-23 18:30:37 +00:00
Chris Lattner 4512cd2cab completely remove a transformation that is unsafe in the face of
undefs.

llvm-svn: 40439
2007-07-23 17:10:17 +00:00
Devang Patel 5e39293e62 Apply temporary work around to fix llvm mis-compilation
reported in PR 1556.

llvm-svn: 40133
2007-07-21 00:34:29 +00:00
Chris Lattner d82e4a19cc this xform is already done by the constant folder.
llvm-svn: 40124
2007-07-20 22:06:41 +00:00
Dan Gohman e31a61eeca Optimize alignment of loads and stores.
llvm-svn: 40102
2007-07-20 16:34:21 +00:00
Duncan Sands 2be91fcdd8 Place SCCPSolver also in the anonymous namespace. This
pacifies g++-4.2.

llvm-svn: 40089
2007-07-20 08:56:21 +00:00
Owen Anderson 5bd6c3f2c4 Fix a bug where we were marking GEP expressions with the wrong opcode.
llvm-svn: 40085
2007-07-20 08:19:20 +00:00
Owen Anderson f9e6542969 Make val_replace fail early, which reduces the time to optimize 403.gcc to 14.8s.
llvm-svn: 40064
2007-07-19 19:57:13 +00:00
Devang Patel a273d1cd3a Verify loop info.
llvm-svn: 40062
2007-07-19 18:02:32 +00:00
Owen Anderson 6aa17f1def Use SmallVector and DenseMap in even more places.
With this, the time to optimize 403.gcc is down to 15.1s.

llvm-svn: 40042
2007-07-19 06:37:56 +00:00
Owen Anderson 75a244d6eb Change ValueTable to use a DenseMap for mapping expressions to value numbers.
This results in a slight speedup for 403.gcc.

llvm-svn: 40040
2007-07-19 06:13:15 +00:00
Owen Anderson 6a4ff8549b Move some sets and maps to SmallPtrSet and DenseMap respectively. This
reduces the time to optimize 403.gcc from 17.6s to 16.4s.

llvm-svn: 40036
2007-07-19 03:32:44 +00:00
Devang Patel 186e0d8b0a After a basic block is split into two parts,
second part dominates all the blocks dominated
by original basic block. And first part dominates
second part.

llvm-svn: 40035
2007-07-19 02:29:24 +00:00
Devang Patel de5901523c Now this temp. fix is not required.
llvm-svn: 40034
2007-07-19 02:22:21 +00:00
Devang Patel 8a1d1ac925 Fix typo.
llvm-svn: 40025
2007-07-18 23:50:19 +00:00
Devang Patel bb8ea8cefc Fix dominator info update to accommodate CFG changes.
This fixes PR1559.

llvm-svn: 40024
2007-07-18 23:48:20 +00:00
Owen Anderson 09f86993bd Take advantage of undefined behavior if the source program tries to GEP
beyond the end of an alloca to make FastDSE faster and more aggressive.

llvm-svn: 39945
2007-07-16 23:34:39 +00:00
Owen Anderson 7fcaaadf1c Add support for walking up memory def chains, which enables finding many more
dead stores on 400.perlbench.

llvm-svn: 39929
2007-07-16 21:52:50 +00:00
Reid Spencer 3363f4ad96 Return Undef if the block has no dominator. This was required to allow
llvm-gcc build to succeed. Without this change it fails in libstdc++
compilation. This causes no regressions in dejagnu tests. However, 
someone who knows this code better might want to review it.

llvm-svn: 39924
2007-07-16 21:03:44 +00:00
Dan Gohman 06c60b6032 Fix comments about vectors to use the current wording.
llvm-svn: 39921
2007-07-16 14:29:03 +00:00
Chris Lattner 640fd5124d Repair a regression in Transforms/InstCombine/mul.ll that Reid noticed.
llvm-svn: 39896
2007-07-16 04:15:34 +00:00
Nick Lewycky b7c0c8a350 Start adding and cleaning up comments.
llvm-svn: 39894
2007-07-16 02:58:37 +00:00
Chris Lattner d4fef8dbca Implement shift-simplify.ll:test[45].
First teach instcombine that sign bit checks only demand the 
sign bit, this allows simplify demanded bits to hack on 
expressions better.

Second, teach instcombine that ashr is useless if only the
sign bit is demanded.

llvm-svn: 39880
2007-07-15 20:54:51 +00:00
Chris Lattner 06205d5567 Implement shift-simplify.ll:test3, turning:
(X << 31) <s 0  --> (X&1) != 0

This happens dozens of times in the CFE.

llvm-svn: 39879
2007-07-15 20:42:37 +00:00
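The rewrite in the entry above can be checked outside the compiler. The following is a minimal sketch, assuming a 32-bit value; the function names are hypothetical and are not taken from instcombine. It models the signed "less than zero" test as a sign-bit check so that it stays well defined on unsigned types.

```cpp
#include <cstdint>

// Original form: (X << 31) <s 0.
// Shifting left by 31 moves bit 0 into the sign position; a signed
// "less than zero" comparison is then true exactly when bit 31 is set.
bool signTestAfterShift(uint32_t x) {
  uint32_t shifted = x << 31;
  return (shifted & 0x80000000u) != 0;
}

// Simplified form: (X & 1) != 0.
bool lowBitTest(uint32_t x) {
  return (x & 1u) != 0;
}

// True when both forms give the same answer for this input.
bool formsAgree(uint32_t x) {
  return signTestAfterShift(x) == lowBitTest(x);
}
```

Both forms depend only on bit 0 of the input, which is why the shift can be dropped.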
Nick Lewycky 39519f5c41 Use maximal intersection algorithm exclusively. Fixes miscompile bug.
llvm-svn: 39852
2007-07-14 04:28:04 +00:00
Devang Patel 4cd1413f15 Make LCSSA a loop pass.
llvm-svn: 39844
2007-07-13 23:57:11 +00:00
Owen Anderson d975efab16 Handle GEPs with all-zero indices in the same way we handle pointer-pointer bitcasts. Also, fix a potential infinite loop.
This brings FastDSE to parity with old DSE on 175.vpr.

llvm-svn: 39839
2007-07-13 22:50:48 +00:00
Devang Patel 29ccf8ba52 Disable claims to preserve analysis until open issues are resolved.
llvm-svn: 39834
2007-07-13 21:53:42 +00:00
Owen Anderson 9c9ef21432 Be more aggressive in removing dead stores, and in removing instructions trivially dead after DSE.
This drastically improves the effect of FastDSE on kimwitu++.

llvm-svn: 39819
2007-07-13 18:26:26 +00:00
Owen Anderson 32c4a05dd4 Reimplement removing stores to allocas at the end of a function. This should be safe now.
llvm-svn: 39790
2007-07-12 21:41:30 +00:00
Owen Anderson d4451dee1e Make the condition-checking for free with non-trivial dependencies more correct.
llvm-svn: 39789
2007-07-12 18:08:51 +00:00
Owen Anderson 5e06995b3d Remove the end-block handling code. It was unsafe, and making it safe would have resulted in falling back to the slow DSE case. I need to think some more about the right way to handle this.
llvm-svn: 39788
2007-07-12 17:52:20 +00:00
Gabor Greif b8bca52c7d checked in as obvious,
thanks Benoit Boissinot!

llvm-svn: 39774
2007-07-12 13:31:38 +00:00
Owen Anderson 1e1bace52b Let MemoryDependenceAnalysis take care of updating AliasAnalysis.
llvm-svn: 39769
2007-07-12 00:06:21 +00:00
Devang Patel fac4d1f014 Preserve analysis info.
llvm-svn: 39767
2007-07-11 23:47:28 +00:00
Owen Anderson aa07172340 Handle the case where an entire structure is freed, and its dependency is a store to a field within
that structure.

Also, refactor the runOnBasicBlock() function, splitting some of the special cases into separate functions.

llvm-svn: 39762
2007-07-11 23:19:17 +00:00
Owen Anderson 1441470be8 Add support for eliminating stores to stack-allocated memory locations at the end
of a function.

llvm-svn: 39754
2007-07-11 21:06:56 +00:00
Owen Anderson e720144837 Handle eliminating stores that occur right before a free.
llvm-svn: 39753
2007-07-11 20:38:34 +00:00
Owen Anderson bf971aafb6 Clean up a few things based on Chris' feedback.
llvm-svn: 39747
2007-07-11 19:03:09 +00:00
Tanya Lattner ccecbcd779 Adding ability to demote phi to stack.
llvm-svn: 39744
2007-07-11 18:41:34 +00:00
Owen Anderson 5e72db3f7f Add FastDSE, a new algorithm for doing dead store elimination. This algorithm is not as accurate
as the current DSE, but it is only a linear scan over each block, rather than quadratic.  Eventually
(once it has been improved somewhat), this will replace the current DSE.

NOTE: This has not yet been extensively tested.
llvm-svn: 38517
2007-07-11 00:46:18 +00:00
Owen Anderson 084d3c2e2f Make the pass registration static.
llvm-svn: 38508
2007-07-10 20:20:19 +00:00
Anton Korobeynikov 76547349c1 During module cloning copy aliases too. This fixes PR1544
llvm-svn: 38505
2007-07-10 19:07:35 +00:00
Nick Lewycky e635cc43c6 Update the ValueRanges interface to use value numbers instead of Value*s.
llvm-svn: 38483
2007-07-10 03:28:21 +00:00
Owen Anderson 4c4b238448 Move some key maps from std::map to DenseMap. This improves the time to optimize Anton's testcase from 17.5s
to 15.7s.

llvm-svn: 38480
2007-07-10 00:27:22 +00:00
Owen Anderson 41c2cab873 Use a cheaper test, delaying calling find_leader() until we know that it's necessary. This improves
the time to optimize Anton's testcase from 21.1s to 17.6s.

llvm-svn: 38479
2007-07-10 00:09:25 +00:00
Owen Anderson 7ee197ecf2 Add an assertion if find_leader fails.
llvm-svn: 38477
2007-07-09 23:57:18 +00:00
Owen Anderson effc7a7d16 Take advantage of the new fast SmallPtrSet assignment operator when propagating AVAIL_OUT sets.
This reduces the time to optimize Anton's testcase from 31.2s to 21.s!

llvm-svn: 38475
2007-07-09 22:29:50 +00:00
Devang Patel e8ec7661ea Expose struct size threshold to allow users to tweak their own setting.
llvm-svn: 38472
2007-07-09 21:19:23 +00:00
Owen Anderson 56b01eb3d9 Fix a comment.
llvm-svn: 38459
2007-07-09 16:43:55 +00:00
Owen Anderson 267ba45249 Improve a hotspot that was making build_sets() slower by calling lookup() too
often.  This improves Anton's testcase from 36s to 32s.

llvm-svn: 38441
2007-07-09 07:56:55 +00:00
Owen Anderson 1c83b5d999 Start using a set representation that remembers the set of value numbers represented
in the set.  For the moment, this results in a slight performance decrease, but
it lays the groundwork for future improvements.

llvm-svn: 38439
2007-07-09 06:50:06 +00:00
Owen Anderson 8b99e0ab20 Fix an error where ANTIC_OUT was ending up with more than one expression of
the same value number.  This fixes an infinite loop on 444.namd.

llvm-svn: 37967
2007-07-07 20:13:57 +00:00
Nick Lewycky 9b2252c6f0 Back out Devang's fix for PR1320 because it causes PR1542.
llvm-svn: 37966
2007-07-07 16:23:34 +00:00
Devang Patel 12358b4827 These routines are now available as part of basic block utilities.
llvm-svn: 37955
2007-07-06 22:03:47 +00:00
Devang Patel 86d0ea973d Request DominanceFrontier in advance.
llvm-svn: 37954
2007-07-06 21:43:22 +00:00
Devang Patel 3ee408264b Preserve various analysis info.
llvm-svn: 37953
2007-07-06 21:40:13 +00:00
Devang Patel d7767cc2a7 Add SplitEdge and SplitBlock utility routines.
llvm-svn: 37952
2007-07-06 21:39:20 +00:00
Owen Anderson 7d4bbc1c0c Be more aggressive in the heuristic. This mostly exposes more opportunities
for the GVN part of GVNPRE to apply.

llvm-svn: 37951
2007-07-06 20:29:43 +00:00
Owen Anderson 3c3dd902ec Achieve what the incorrect test was trying to do by simply requiring that all
critical edges be split before we begin.

llvm-svn: 37949
2007-07-06 18:12:36 +00:00
Owen Anderson bcdd7ec4c9 Remove an incorrect check.
llvm-svn: 37948
2007-07-06 16:52:47 +00:00
Zhou Sheng 1ee941dac4 Correct a typo.
llvm-svn: 37936
2007-07-06 06:01:16 +00:00
Owen Anderson 02e9698293 Fix a bunch of issues found in a testcase from 400.perlbench.
llvm-svn: 37929
2007-07-05 23:11:26 +00:00
Nick Lewycky 73dd692173 Break "variable canonicalization" out of InequalityGraph and into its own class
"ValueNumbering".

llvm-svn: 37881
2007-07-05 03:15:00 +00:00
Owen Anderson ca1a184fd8 Fix another bug, this time in PREing select instructions.
llvm-svn: 37878
2007-07-04 22:33:23 +00:00
Owen Anderson cd94fc982a Fix a typo that was killing GVNPRE of select instructions.
llvm-svn: 37871
2007-07-04 18:26:18 +00:00
Owen Anderson 664e260a9c Fix an error in phi translation of GEPs that was causing failures.
llvm-svn: 37868
2007-07-04 04:51:16 +00:00
Owen Anderson 2e4b6feac2 Add support for performing GVNPRE on GEP instructions.
llvm-svn: 37862
2007-07-03 23:51:19 +00:00
Owen Anderson b9a494aea3 Add functionality to value number GEP instructions. This also provides the infrastructure that will
be used for function calls.  NOTE: This does not yet do any transformation of GEPs or function calls.

llvm-svn: 37860
2007-07-03 22:50:56 +00:00
Owen Anderson 6b958c72bd Make the unary operator case a bit faster, since casts are the only kind of unary operation.
llvm-svn: 37857
2007-07-03 19:01:42 +00:00
Owen Anderson 59bd053fc5 Add support for performing GVNPRE on cast instructions, and add a testcase for this.
llvm-svn: 37856
2007-07-03 18:37:08 +00:00
Devang Patel 0975c6d7f9 Preserve DominanceFrontier.
llvm-svn: 37820
2007-06-29 23:11:49 +00:00
David Greene 1e2a12019f Fix reference to iterator invalidated by an erase operation. Uncovered
by _GLIBCXX_DEBUG.

llvm-svn: 37796
2007-06-29 02:53:16 +00:00
Devang Patel 9feb7f5846 Do not filter loop if candidate branch is in loop header.
llvm-svn: 37792
2007-06-29 01:39:53 +00:00
Owen Anderson 67799d4ffb Add support for value numbering (but not actually optimizing) cast instructions.
llvm-svn: 37789
2007-06-29 00:51:03 +00:00
Owen Anderson c738f7ca42 Add a type field to expressions in preparation for performing GVNPRE on casts.
llvm-svn: 37788
2007-06-29 00:40:05 +00:00
Owen Anderson 8a9fa5d081 Add support for performing GVNPRE on select instructions. This fixes test/Transforms/GVNPRE/select.ll.
llvm-svn: 37783
2007-06-28 23:51:21 +00:00
Devang Patel 6ba5ad482f - Undo previous check and allow loop switch for condition that is not inside
loop.
- Avoid loop unswitch for loop header branch.
- While cloning dominators fix typo and handle self dominating blocks.

llvm-svn: 37772
2007-06-28 02:05:46 +00:00
Devang Patel 3304e469f7 Update LoopUnswitch pass to preserve DominatorTree.
llvm-svn: 37771
2007-06-28 00:49:00 +00:00
Devang Patel 3c723c8db7 If a condition is not inside a loop then the condition is a suitable
loop unswitch candidate for the loop.

llvm-svn: 37770
2007-06-28 00:44:10 +00:00
Owen Anderson e02da55cc8 Make many sets a much more reasonable size. This decreases the time to optimize
Anton's testcase from 35.5s to 34.7s.

llvm-svn: 37769
2007-06-28 00:34:34 +00:00
Owen Anderson 7dae8efcf2 Use cached information that has already been computed to make clean() simpler and faster. This is a small speedup on most cases.
llvm-svn: 37761
2007-06-27 17:38:29 +00:00
Owen Anderson 0eb265729a Fold a lot of code into two cases: binary instructions and ternary instructions.
This saves many lines of code duplication.  No functionality change.

llvm-svn: 37759
2007-06-27 17:03:03 +00:00
Zhou Sheng 8d438858c8 Fix a bug.
llvm-svn: 37751
2007-06-27 09:50:26 +00:00
Owen Anderson b6a39fcb21 Add support for performing GVNPRE on the three vector-specific operations.
llvm-svn: 37745
2007-06-27 04:10:46 +00:00
Owen Anderson 5477c54aa0 1. Correct some comments and clean up some dead code.
2. When calculating ANTIC_IN, only iterate the changed blocks.  For most average
inputs this is a small speedup, but for cases with unusual CFGs, this can be a significant win.

llvm-svn: 37742
2007-06-26 23:29:41 +00:00
Chris Lattner ea5c4bd51c fix Transforms/Inline/2007-06-25-WeakInline.ll by not inlining functions
with weak linkage.

llvm-svn: 37723
2007-06-25 21:50:09 +00:00
Owen Anderson 43ca4b48f1 Use the built-in postorder iterators rather than computing a postorder walk by hand.
llvm-svn: 37721
2007-06-25 18:25:31 +00:00
Owen Anderson 191eb06352 1) Fix an issue with non-deterministic iteration order in phi_translate
2) Remove some maximal-set computing code that is no longer used.
3) Use a post-order CFG traversal to compute ANTIC_IN instead of a postdom traversal.
This causes the ANTIC_IN calculation to converge much faster.  Thanks to Daniel Berlin for suggesting this.

With this patch, the time to optimize 403.gcc decreased from 17.5s to 7.5s, and Anton's huge
testcase decreased from 62 minutes to 38 seconds.

llvm-svn: 37714
2007-06-25 05:41:12 +00:00
Nick Lewycky 8735f44104 Fix value ranges.
llvm-svn: 37713
2007-06-24 20:14:22 +00:00
Owen Anderson 7fb6da8e4d Fix a silly mistake that was causing failures.
llvm-svn: 37712
2007-06-24 08:42:24 +00:00
Nick Lewycky 0f986fdbfa Remove tabs.
llvm-svn: 37710
2007-06-24 04:40:16 +00:00
Nick Lewycky 26e25d340e Remove use of ETForest. Also cleaned up issues around unreachable basic
blocks, and optimizing within one basic block.

llvm-svn: 37709
2007-06-24 04:36:20 +00:00
Owen Anderson 49409f6501 Rework topo_sort to eliminate some behavior that scaled terribly. This reduces the time to optimize 403.gcc from 18.2s to 17.5s,
and has an even larger effect on larger testcases.

llvm-svn: 37708
2007-06-22 21:31:16 +00:00
Owen Anderson 21a1131565 Perform fewer set insertions while calculating ANTIC_IN. This reduces the amount of time to optimize 403.gcc from 21.9s to 18.2s.
llvm-svn: 37707
2007-06-22 18:27:04 +00:00
Owen Anderson 92c7b22e1a Remove some code that I was using for collecting performance information that should not have been committed.
llvm-svn: 37706
2007-06-22 17:04:40 +00:00
Owen Anderson f6e21871ad Avoid excessive calls to find_leader when calculating AVAIL_OUT. This reduces the time to optimize 403.gcc from 23.5s to 21.9s.
llvm-svn: 37702
2007-06-22 03:14:03 +00:00
Owen Anderson d50a29d613 Reserve space in vectors before topologically sorting into them. This improves the time to optimize 403.gcc from 28s to 23.5s.
llvm-svn: 37699
2007-06-22 00:43:22 +00:00
Owen Anderson 28a2d449fa Make a bunch of optimizations for compile time to GVNPRE, including smarter set unions, deferring blocks rather than computing maximal sets, and smarter use of sets. With these enhancements, the time to optimize 253.perlbmk goes from 5.3s to 2.7s.
llvm-svn: 37698
2007-06-22 00:20:30 +00:00
Chris Lattner fb032b176b Significantly improve the documentation of the instcombine divide/compare
transformation.  Also, keep track of which end of the integer interval overflows
occur on.  This fixes Transforms/InstCombine/2007-06-21-DivCompareMiscomp.ll
and rdar://5278853, a miscompilation of perl.

llvm-svn: 37692
2007-06-21 18:11:19 +00:00
Owen Anderson 2ff912bf33 Change lots of sets from std::set to SmallPtrSet. This reduces the time required to optimize 253.perlbmk from 10.9s to 5.3s.
llvm-svn: 37690
2007-06-21 17:57:53 +00:00
Devang Patel d5258a23a5 Move code to update dominator information after basic block is split
from LoopSimplify.cpp to Dominator.cpp

llvm-svn: 37689
2007-06-21 17:23:45 +00:00
Owen Anderson 27876a3ff9 Eliminate a redundant check. This speeds up optimization of 253.perlbmk from 13.5 seconds to 10.9 seconds.
llvm-svn: 37683
2007-06-21 01:59:05 +00:00
Owen Anderson fd5683ad7a Comment-ize the functions in GVNPRE.
llvm-svn: 37681
2007-06-21 00:19:05 +00:00
Chris Lattner 3bbec59e8b refactor a bunch of code out of visitICmpInstWithInstAndIntCst into its own
routine.

llvm-svn: 37679
2007-06-20 23:46:26 +00:00
Owen Anderson 06c1e585c9 Split runOnFunction into many smaller functions. This makes it easier to get accurate performance analysis of GVNPRE.
llvm-svn: 37678
2007-06-20 22:10:02 +00:00
Owen Anderson b0714bb7bb Make GVNPRE accurately report whether it modified the function or not.
llvm-svn: 37673
2007-06-20 18:30:20 +00:00
Owen Anderson 7b0fb44ca9 Get rid of an unneeded helper function.
llvm-svn: 37670
2007-06-20 00:43:33 +00:00
Owen Anderson 1ad2c10215 Use a DenseMap instead of an std::map for the value numbering. This reduces the time to optimize lencod on a PPC Debug build from ~300s to ~140s.
llvm-svn: 37668
2007-06-19 23:23:54 +00:00
Owen Anderson 2320d430bd Make dependsOnInvoke much more specific in what it tests, which in turn makes it much faster to run. This reduces the time to optimize lencod with a debug build on PPC from ~450s to ~300s.
llvm-svn: 37667
2007-06-19 23:07:16 +00:00
Tanya Lattner c655839d71 Moved Inliner.h to include/llvm/Transforms/IPO/InlinerPass.h
llvm-svn: 37666
2007-06-19 22:31:52 +00:00
Tanya Lattner ab11b1c702 Inliner pass header file was moved.
llvm-svn: 37665
2007-06-19 22:29:50 +00:00
Dan Gohman 32f53bbd85 Rename ScalarEvolution::deleteInstructionFromRecords to
deleteValueFromRecords and loosen the types to allow it to accept
Value* instead of just Instruction*, since this is what
ScalarEvolution uses internally anyway. This allows more flexibility
for future uses.

llvm-svn: 37657
2007-06-19 14:28:31 +00:00
Owen Anderson 1370faf889 Handle constants in phi nodes properly. This fixes test/Transforms/GVNPRE/2007-06-18-ConstantInPhi.ll
llvm-svn: 37655
2007-06-19 07:35:36 +00:00
Chris Lattner 09a33a4f64 silence a bogus warning Duraid ran into.
llvm-svn: 37649
2007-06-19 05:43:49 +00:00
Owen Anderson 91c54950b3 Be careful to erase values from all of the appropriate sets when they're not needed anymore. This fixes a few more memory-related issues.
llvm-svn: 37647
2007-06-19 05:37:32 +00:00
Owen Anderson b9cbaed623 Remember to clear the maximal sets between functions.
Thanks to Nicholas for valgrinding this.

llvm-svn: 37646
2007-06-19 04:32:55 +00:00
Owen Anderson b56fba0c5a Refactor GVNPRE to use a much smarter method of uniquing value sets, and centralize a lot of the value numbering information. No functionality change.
llvm-svn: 37645
2007-06-19 03:31:41 +00:00
Owen Anderson dd998e1913 Cache the results of dependsOnInvoke()
llvm-svn: 37622
2007-06-18 04:42:29 +00:00
Owen Anderson f1c04e1ddb Fix indentation.
llvm-svn: 37621
2007-06-18 04:31:21 +00:00
Owen Anderson b364b413af Don't perform an expensive check if it's not necessary.
llvm-svn: 37620
2007-06-18 04:30:44 +00:00
Owen Anderson 658f2c4881 Fix test/Transforms/GVNPRE/2007-06-15-InvokeInst.ll by ignoring all instructions that depend on invokes.
llvm-svn: 37610
2007-06-16 00:26:54 +00:00
Dan Gohman 203a035251 Use SCEVConstant::get instead of SCEVUnknown::get to create an
integer constant SCEV.

llvm-svn: 37596
2007-06-15 18:00:55 +00:00
Owen Anderson acaed06827 Fix test/Transforms/GVNPRE/2007-06-15-Looping.ll
llvm-svn: 37595
2007-06-15 17:55:15 +00:00
Dan Gohman cb9e09ad57 Add a SCEV class and supporting code for sign-extend expressions.
This created an ambiguity for expandInTy to decide when to use
sign-extension or zero-extension, but it turns out that most of its callers
don't actually need a type conversion, now that LLVM types don't have
explicit signedness. Drop expandInTy in favor of plain expand, and change
the few places that actually need a type conversion to do it themselves.

llvm-svn: 37591
2007-06-15 14:38:12 +00:00
Chris Lattner 373389260f Generalize many transforms to work on ~ of vectors in addition to ~ of
integer ops.  This implements Transforms/InstCombine/and-or-not.ll
test3/test4, and finishes off PR1510

llvm-svn: 37589
2007-06-15 06:23:19 +00:00
Chris Lattner 481e28b1f5 Implement two xforms:
1. ~(~X | Y) === (X & ~Y)
2. (A|B) & ~(A&B) -> A^B

This allows us to transform  ~(~(a|b) | (a&b)) -> a^b.

This implements PR1510 for scalar values.

llvm-svn: 37584
2007-06-15 05:58:24 +00:00
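The two identities in the entry above, and the combined rewrite they enable, can be checked with ordinary bitwise arithmetic. This is a minimal sketch on 32-bit unsigned values; the function names are hypothetical and this is not the instcombine implementation.

```cpp
#include <cstdint>

// Identity 1: ~(~X | Y) == X & ~Y   (De Morgan's law)
bool identity1Holds(uint32_t x, uint32_t y) {
  return static_cast<uint32_t>(~(~x | y)) == (x & ~y);
}

// Identity 2: (A | B) & ~(A & B) == A ^ B
// "In at least one operand, but not in both" is exclusive or.
bool identity2Holds(uint32_t a, uint32_t b) {
  return ((a | b) & ~(a & b)) == (a ^ b);
}

// Combined: ~(~(a | b) | (a & b)) == a ^ b.
// Identity 1 with X = (a | b), Y = (a & b) gives the left side of
// identity 2, which in turn reduces to a ^ b.
bool combinedHolds(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(~(~(a | b) | (a & b))) == (a ^ b);
}

// Exhaustively checks all pairs of 8-bit patterns. Bitwise operators act
// on each bit position independently, so this covers every per-bit case.
bool allBytePairsAgree() {
  for (uint32_t a = 0; a < 256; ++a)
    for (uint32_t b = 0; b < 256; ++b)
      if (!identity1Holds(a, b) || !identity2Holds(a, b) || !combinedHolds(a, b))
        return false;
  return true;
}
```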
Chris Lattner f14e5175ed delete some obviously dead vector operations, which deletes a few thousand
operations from Duraid's example.

llvm-svn: 37582
2007-06-15 05:26:55 +00:00
Owen Anderson 4036ad485f Fix test/Transforms/GVNPRE/2007-06-12-PhiTranslate.ll
llvm-svn: 37564
2007-06-12 22:43:57 +00:00
Owen Anderson 4276984012 Refactor some code, and fix test/Transforms/GVNPRE/2007-06-12-NoExit.ll by being more careful when using
post-dominator information.

llvm-svn: 37556
2007-06-12 16:57:50 +00:00
Dale Johannesen edfec0b515 Sink CmpInst's to their uses to reduce register pressure.
llvm-svn: 37554
2007-06-12 16:50:17 +00:00
Owen Anderson a75dd4dc56 Fix a few more bugs, including an instance of walking in reverse topological rather than topological order. This
fixes a testcase extracted from llvm-test.

llvm-svn: 37550
2007-06-12 00:50:47 +00:00
Devang Patel 78b9c68164 Add and use DominatorTreeBase::findNearestCommonDominator().
llvm-svn: 37545
2007-06-11 23:31:22 +00:00
Devang Patel 536ac4dca7 Simplify.
llvm-svn: 37542
2007-06-11 21:45:31 +00:00
Devang Patel d18054afcf simplify
llvm-svn: 37541
2007-06-11 21:25:31 +00:00
Devang Patel ab2eee89a4 Simplify. Dominator Tree is required so always available.
llvm-svn: 37540
2007-06-11 21:18:00 +00:00
Owen Anderson d184c18074 Handle functions with multiple exit blocks properly.
llvm-svn: 37539
2007-06-11 16:25:17 +00:00
Owen Anderson 223718c40e Perform PRE of comparison operators.
llvm-svn: 37536
2007-06-09 18:35:31 +00:00
Owen Anderson 7d76b2a774 Collect statistics from GVN-PRE.
llvm-svn: 37530
2007-06-08 22:02:36 +00:00
Owen Anderson b232efaf48 Fix typo in a comment.
llvm-svn: 37526
2007-06-08 20:57:08 +00:00
Owen Anderson 55994f2453 Fix a bug that was causing the elimination phase not to replace values when it should be.
With this patch, GVN-PRE now correctly optimizes the example from the thesis.

Many thanks to Daniel Berlin for helping me find errors in this.

llvm-svn: 37525
2007-06-08 20:44:02 +00:00
Owen Anderson 2e5efc30c2 Small bugfix, and const-ify some methods (Thanks, Bill).
llvm-svn: 37513
2007-06-08 01:52:45 +00:00
Devang Patel becc466451 Update LoopSimplify to require and preserve DominatorTree only.
Now LoopSimplify does not require nor preserve ETForest.

llvm-svn: 37512
2007-06-08 01:50:32 +00:00
Owen Anderson be80240b29 Add partial redundancy elimination.
llvm-svn: 37510
2007-06-08 01:03:01 +00:00
Devang Patel 8ecffa996a Do not preserve ETForest.
llvm-svn: 37506
2007-06-08 00:02:08 +00:00
Devang Patel 3f4c6fe7e8 Do not require ETForest. Now it is unused by LICM.
llvm-svn: 37502
2007-06-07 22:21:15 +00:00
Devang Patel cf470e5255 Do not use ETForest as well as DominatorTree. DominatorTree is sufficient.
llvm-svn: 37501
2007-06-07 22:17:16 +00:00
Devang Patel fc7fdef7d2 Use DominatorTree instead of ETForest.
This allows faster immediate dominator walk.

llvm-svn: 37500
2007-06-07 21:57:03 +00:00
Devang Patel df6355ccf8 Use DominatorTree instead of ETForest.
llvm-svn: 37499
2007-06-07 21:42:15 +00:00
Devang Patel fb582f8dda Use DominatorTree instead of ETForest.
llvm-svn: 37498
2007-06-07 21:35:27 +00:00
Devang Patel 5b8a5516e4 Use DominatorTree instead of ETForest.
llvm-svn: 37495
2007-06-07 18:45:06 +00:00
Devang Patel 593e766fb5 Use DominatorTree instead of ETForest.
llvm-svn: 37494
2007-06-07 18:40:55 +00:00
Devang Patel af41e4a192 Maintain ETNode as part of DomTreeNode.
This adds redundancy for now.

llvm-svn: 37492
2007-06-07 17:47:21 +00:00
Tanya Lattner 5801c23e05 Formatting fixes.
llvm-svn: 37491
2007-06-07 17:12:16 +00:00
Tanya Lattner cb90f1d881 Instruct the inliner to obey the noinline attribute. Add test case.
llvm-svn: 37481
2007-06-06 21:59:26 +00:00
Chris Lattner 34404e3247 simplify this code and fix PR1493, now that llvm-gcc3 is dead.
llvm-svn: 37478
2007-06-06 20:51:41 +00:00
Lauro Ramos Venancio 368e8872db Fix PR1499.
llvm-svn: 37472
2007-06-06 17:08:48 +00:00
Nick Lewycky 91ed6efc24 Inform ScalarEvolutions that we're deleting Values.
This is the obviously correct part of the fix for PR1487.

llvm-svn: 37457
2007-06-06 03:51:56 +00:00
Owen Anderson 634a063c1d Add simple full redundancy elimination.
llvm-svn: 37455
2007-06-06 01:27:49 +00:00
Chris Lattner 1b7b6e76ec Fix PR1495 and CodeGen/X86/2007-06-05-LSR-Dominator.ll
llvm-svn: 37454
2007-06-06 01:23:55 +00:00
Devang Patel 506310d3dd Avoid non-trivial loop unswitching while optimizing for size.
llvm-svn: 37446
2007-06-06 00:21:03 +00:00
Owen Anderson ddbe430732 Fix a misunderstanding of the algorithm. Really, we should be tracking values
and expressions separately.  We can get around this, however, by only keeping
opaque values in TMP_GEN.

llvm-svn: 37443
2007-06-05 23:46:12 +00:00
Owen Anderson c84720913a Don't leak memory.
llvm-svn: 37442
2007-06-05 22:11:49 +00:00
Owen Anderson 9b89e4b561 Fix a small bug, some 80 cols violations, and add some more debugging output.
llvm-svn: 37436
2007-06-05 17:31:23 +00:00
Dan Gohman 151169df1e Allow insertelement, extractelement, and shufflevector to be hoisted/sunk
by LICM.

llvm-svn: 37435
2007-06-05 16:05:55 +00:00
Bill Wendling 6357bf20fa Patches by Chuck Rose to unbreak V Studio builds.
Thanks Chuck!

llvm-svn: 37428
2007-06-04 23:52:59 +00:00
Devang Patel b3adb9876a s/ETNode::getChildren/ETNode::getETNodeChildren/g
llvm-svn: 37426
2007-06-04 23:45:02 +00:00
Owen Anderson 3c9d8eef21 Don't use std::set_difference when the two sets are sorted differently. Compute
the difference manually instead.

This allows GVNPRE to produce correct analysis for the example in the GVNPRE
paper.

llvm-svn: 37425
2007-06-04 23:34:56 +00:00
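The problem named in the entry above is general: std::set_difference requires both input ranges to be sorted by the same ordering, and the result is unspecified otherwise. A minimal sketch of the manual alternative follows, using plain int sets with different comparators. The helper name manualDifference and the element type are illustrative assumptions, not the GVNPRE code.

```cpp
#include <functional>
#include <set>
#include <vector>

// Returns the elements of `a` that are not in `b`. Unlike
// std::set_difference, this does not require `a` and `b` to share an
// ordering: each membership test uses b's own comparator via find().
template <class SetA, class SetB>
std::vector<int> manualDifference(const SetA& a, const SetB& b) {
  std::vector<int> out;
  for (int v : a) {
    if (b.find(v) == b.end())
      out.push_back(v);
  }
  return out;  // elements appear in a's iteration order
}
```

For example, with an ascending `std::set<int>` holding 1 to 4 and a descending `std::set<int, std::greater<int>>` holding 2 and 4, the helper yields the elements 1 and 3, at a cost of one lookup in `b` per element of `a`.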
Owen Anderson 3df5299f94 Fix a bunch of small bugs, and improve the debugging output significantly.
llvm-svn: 37424
2007-06-04 23:28:33 +00:00
Chris Lattner d7897d40b6 When rebuilding constant structs, make sure to honor the isPacked bit.
This fixes PR1491 and GlobalOpt/2007-06-04-PackedStruct.ll

llvm-svn: 37423
2007-06-04 22:23:42 +00:00
Owen Anderson 38b6b22a41 Make phi_translate correct.
llvm-svn: 37418
2007-06-04 18:05:26 +00:00
Devang Patel ebc5b96735 s/DominatorTree::createNewNode/DominatorTree::addNewBlock/g
llvm-svn: 37415
2007-06-04 16:43:25 +00:00
Devang Patel a89566aefd Add basic block level interface to change immediate dominator
and create new node.

llvm-svn: 37414
2007-06-04 16:22:33 +00:00
Devang Patel bdd1aaef10 s/llvm::DominatorTreeBase::DomTreeNode/llvm::DomTreeNode/g
llvm-svn: 37407
2007-06-04 00:32:22 +00:00
Owen Anderson 0eca9aad10 Don't use the custom comparator where it's not necessary.
llvm-svn: 37406
2007-06-03 22:02:14 +00:00
Devang Patel 0e8aa7b69a s/DominatorTreeBase::Node/DominatorTreeBase::DomTreeNode/g
llvm-svn: 37403
2007-06-03 06:26:14 +00:00
Owen Anderson 46499645db Remove an unused method.
llvm-svn: 37402
2007-06-03 05:58:25 +00:00
Owen Anderson 0b68cda302 There's no need to have an Expression class... Value works just as well! This simplifies a lot of code.
llvm-svn: 37401
2007-06-03 05:55:58 +00:00
Devang Patel ac54a62fd2 Insert new instructions in AliasSet.
llvm-svn: 37390
2007-06-01 22:15:31 +00:00
Owen Anderson 48e93f2ce9 clean() needs to process things in topological order.
llvm-svn: 37389
2007-06-01 22:00:37 +00:00
Owen Anderson 4c89142466 Fix Expression comparison, which in turn fixes a value numbering error.
llvm-svn: 37386
2007-06-01 17:34:47 +00:00
Owen Anderson 331bf6a959 Add a topological sort function.
llvm-svn: 37376
2007-05-31 22:44:11 +00:00
Owen Anderson 81d156e16f Attempt to fix up phi_translate.
llvm-svn: 37366
2007-05-31 00:42:15 +00:00
Devang Patel 9b3b35d14f Fix typo.
llvm-svn: 37360
2007-05-30 15:29:37 +00:00
Chris Lattner 8767920f20 Fix Transforms/ScalarRepl/2007-05-29-MemcpyPreserve.ll and the second
half of PR1421, by not decimating structs with holes that are the source and
destination of a memcpy.

llvm-svn: 37358
2007-05-30 06:11:23 +00:00
Owen Anderson 4b0c1859fd Fix a typo
llvm-svn: 37350
2007-05-29 23:34:14 +00:00
Owen Anderson 0c4230724c Re-fix a bug, where I was now being too aggressive.
llvm-svn: 37348
2007-05-29 23:26:30 +00:00
Owen Anderson 4a6ec8fb57 Use proper debugging facilities so other people don't have to look at my commented-out
debugging lines.

llvm-svn: 37347
2007-05-29 23:15:21 +00:00
Owen Anderson f11bdc7637 Comment debug code out that I accidentally uncommented last time.
llvm-svn: 37346
2007-05-29 22:43:03 +00:00
Owen Anderson ac83a3e4ff Add a place where I missed using the maximal set. Note that using the maximal
set this way is _SLOW_.  Somewhere down the line, I'll look at speeding it up.

llvm-svn: 37345
2007-05-29 22:35:41 +00:00
Owen Anderson 5fba6c19b2 Very first part of a GVN-PRE implementation. It currently performs a bunch of analysis, and nothing more. It is also quite slow for the moment. However,
it should give a sense of what's going on.

llvm-svn: 37343
2007-05-29 21:53:49 +00:00
Chris Lattner 80c94a4a04 Fix PR1446 by not scalarrepl'ing giant structures.
llvm-svn: 37326
2007-05-24 18:43:04 +00:00
Dan Gohman 30978078bf Minor comment cleanups.
llvm-svn: 37321
2007-05-24 14:36:04 +00:00
Chris Lattner f79577d314 fix a miscompilation when passing a float through varargs
llvm-svn: 37297
2007-05-23 01:17:04 +00:00
Chris Lattner a655a157a0 Fix Transforms/InstCombine/2007-05-18-CastFoldBug.ll, a bug that devastates
objc code due to the way the FE lowers objc message sends.

llvm-svn: 37256
2007-05-19 06:51:32 +00:00
Chris Lattner e8bd53c36a Handle negative strides much more optimally. This compiles X86/lsr-negative-stride.ll
into:

_t:
        movl 8(%esp), %ecx
        movl 4(%esp), %eax
        cmpl %ecx, %eax
        je LBB1_3       #bb17
LBB1_1: #bb
        cmpl %ecx, %eax
        jg LBB1_4       #cond_true
LBB1_2: #cond_false
        subl %eax, %ecx
        cmpl %ecx, %eax
        jne LBB1_1      #bb
LBB1_3: #bb17
        ret
LBB1_4: #cond_true
        subl %ecx, %eax
        cmpl %ecx, %eax
        jne LBB1_1      #bb
        jmp LBB1_3      #bb17

instead of:

_t:
        subl $4, %esp
        movl %esi, (%esp)
        movl 12(%esp), %ecx
        movl 8(%esp), %eax
        cmpl %ecx, %eax
        je LBB1_4       #bb17
LBB1_1: #bb.outer
        movl %ecx, %edx
        negl %edx
LBB1_2: #bb
        cmpl %ecx, %eax
        jle LBB1_5      #cond_false
LBB1_3: #cond_true
        addl %edx, %eax
        cmpl %ecx, %eax
        jne LBB1_2      #bb
LBB1_4: #bb17
        movl (%esp), %esi
        addl $4, %esp
        ret
LBB1_5: #cond_false
        movl %ecx, %edx
        subl %eax, %edx
        movl %eax, %esi
        addl %esi, %esi
        cmpl %ecx, %esi
        je LBB1_4       #bb17
LBB1_6: #cond_false.bb.outer_crit_edge
        movl %edx, %ecx
        jmp LBB1_1      #bb.outer

llvm-svn: 37252
2007-05-19 01:22:21 +00:00
Devang Patel 2c30a37a5c Fix PR1431
Test case at Transformations/SCCP/2007-05-16-InvokeCrash.ll

llvm-svn: 37185
2007-05-17 22:10:15 +00:00
Chris Lattner 66ad6fac2f selects can also reach here
llvm-svn: 37081
2007-05-15 06:42:04 +00:00
Chris Lattner 234f96daa8 Fix Transforms/InstCombine/2007-05-14-Crash.ll
llvm-svn: 37057
2007-05-15 00:16:00 +00:00
Dan Gohman 8d40e4d965 Correct a few comments.
llvm-svn: 37034
2007-05-14 14:31:17 +00:00
Chris Lattner cea37beb52 Fix Transforms/GlobalOpt/2007-05-13-Crash.ll
llvm-svn: 37020
2007-05-13 21:28:07 +00:00
Chris Lattner 1480e16596 significantly improve debug output of lsr
llvm-svn: 36996
2007-05-11 22:40:34 +00:00
Dan Gohman b5650ebd6a Fix typos.
llvm-svn: 36994
2007-05-11 21:10:54 +00:00
Dan Gohman 2980d9da45 This patch extends the LoopUnroll pass to be able to unroll loops
with unknown trip counts. This is left off by default, and a
command-line option enables it. It also begins to separate loop
unrolling into a utility routine; eventually it might be made usable
from other passes.

It currently works by inserting conditional branches between each
unrolled iteration, unless it proves that the trip count is a
multiple of a constant integer > 1, which it currently only does in
the rare case that the trip count expression is a Mul operator with
a ConstantInt operand. Eventually this information might be provided
by other sources, for example by a pass that peels/splits the loop
for this purpose.

llvm-svn: 36990
2007-05-11 20:53:41 +00:00
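The scheme described in r36990 (an exit test between each unrolled copy of the body when the trip count is unknown) can be pictured at the source level. This is a hand-written sketch, not output of the LoopUnroll pass; sumRolled and sumUnrolledBy4 are hypothetical names and the unroll factor of 4 is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference loop: the trip count is v.size(), unknown at compile time.
int sumRolled(const std::vector<int>& v) {
  int sum = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    sum += v[i];
  return sum;
}

// Unrolled by 4 with a conditional exit between each copy of the body,
// so any trip count is handled correctly.
int sumUnrolledBy4(const std::vector<int>& v) {
  int sum = 0;
  const std::size_t n = v.size();
  std::size_t i = 0;
  while (i < n) {
    sum += v[i++];
    if (i == n) break;
    sum += v[i++];
    if (i == n) break;
    sum += v[i++];
    if (i == n) break;
    sum += v[i++];
  }
  return sum;
}
```

If the trip count were known to be a multiple of 4, the three intermediate exit tests could be dropped, which is the case the commit message says the pass tries to prove.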
Chris Lattner 600db3eb96 fix regressions from my previous checkin, including
Transforms/InstCombine/2006-12-08-ICmp-Combining.ll

llvm-svn: 36989
2007-05-11 16:58:45 +00:00
Chris Lattner fe2b44de9f fix Transforms/InstCombine/2007-05-10-icmp-or.ll
llvm-svn: 36984
2007-05-11 05:55:56 +00:00
Devang Patel 9557247412 Fix PR1333
Testcases :
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070507/049451.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070507/049452.html

llvm-svn: 36955
2007-05-09 08:24:12 +00:00
Dan Gohman 2e1f804764 Fix various whitespace inconsistencies.
llvm-svn: 36936
2007-05-08 15:19:19 +00:00
Dan Gohman 49d08a57f5 Correct the comment for ApproximateLoopSize to reflect what it actually does.
llvm-svn: 36935
2007-05-08 15:14:19 +00:00
Dale Johannesen 86e1dcf530 Don't generate branch to entry block.
llvm-svn: 36917
2007-05-08 01:01:04 +00:00
Chris Lattner 3b6f75cb2f Fix PR1395, by passing the ID correctly
llvm-svn: 36894
2007-05-06 23:13:56 +00:00
Nick Lewycky e7da2d6ac3 Fix typo in comment.
llvm-svn: 36873
2007-05-06 13:37:16 +00:00
Chris Lattner 9b35b3e863 Fix a bug in my previous patch
llvm-svn: 36857
2007-05-06 07:24:03 +00:00
Chris Lattner 5aa73fe34c Implement Transforms/InstCombine/cast_ptr.ll
llvm-svn: 36809
2007-05-05 22:41:33 +00:00
Chris Lattner 361e981415 wrap long lines
llvm-svn: 36807
2007-05-05 22:32:24 +00:00
Chris Lattner 1077d2a30d Fix Transforms/LoopUnroll/2007-05-05-UnrollMiscomp.ll and PR1385.
If we have a LCSSA, only modify the input value if the inval was defined
by an instruction in the loop.  If defined by something before the loop,
it is still valid.

llvm-svn: 36784
2007-05-05 18:49:57 +00:00
Chris Lattner 57d89a5a89 make a temporary for *SI, no functionality change.
llvm-svn: 36782
2007-05-05 18:36:36 +00:00
Chris Lattner 5c827bda0d Fix InstCombine/2007-05-04-Crash.ll and PR1384
llvm-svn: 36775
2007-05-05 01:59:31 +00:00
Dan Gohman 2bcbd5b7ca Use IntrinsicInst to test for prefetch instructions, which is ever so
slightly nicer than using CallInst with an extra check; thanks Chris.

llvm-svn: 36743
2007-05-04 14:59:09 +00:00
Dan Gohman 3fbb18d1b6 Allow strength reduction to make use of addressing modes for the
address operand in a prefetch intrinsic.

llvm-svn: 36713
2007-05-03 23:20:33 +00:00
Devang Patel 8c78a0bff0 Drop 'const'
llvm-svn: 36662
2007-05-03 01:11:54 +00:00
Devang Patel e95c6ad802 Use 'static const char' instead of 'static const int'.
Due to darwin gcc bug, one version of darwin linker coalesces
static const int, which defeats PassID-based pass identification.

llvm-svn: 36652
2007-05-02 21:39:20 +00:00
Lauro Ramos Venancio 41223586a2 Fix build error.
llvm-svn: 36648
2007-05-02 20:37:47 +00:00
Devang Patel 09f162ca6a Do not use typeinfo to identify pass in pass manager.
llvm-svn: 36632
2007-05-01 21:15:47 +00:00
Anton Korobeynikov 546ea7ea88 Implement review feedback
llvm-svn: 36564
2007-04-29 18:02:48 +00:00
Anton Korobeynikov b18f8f85e9 Implement review feedback. Aliasees can be either GlobalValue's or
bitcasts of them.

llvm-svn: 36537
2007-04-28 13:45:00 +00:00
Chris Lattner 089e35cc57 fix a bug triggered by 403.gcc
llvm-svn: 36527
2007-04-28 05:27:36 +00:00
Chris Lattner 6e880871e9 Fix several latent bugs in EmitGEPOffset that didn't manifest with its
previous clients.  This fixes MallocBench/gs

llvm-svn: 36525
2007-04-28 04:52:43 +00:00
Chris Lattner c753800800 uhn zap cvs
llvm-svn: 36523
2007-04-28 03:50:56 +00:00
Chris Lattner acbf6a401d Implement PR1345 and Transforms/InstCombine/bitcast-gep.ll
llvm-svn: 36521
2007-04-28 00:57:34 +00:00
Chris Lattner 1db224db92 refactor some code relating to pointer cast xforms, pulling it out of the codepath
for unrelated casts.

llvm-svn: 36511
2007-04-27 17:44:50 +00:00
Zhou Sheng 3178736d50 Using APInt more efficiently.
llvm-svn: 36475
2007-04-26 16:42:07 +00:00
Devang Patel d3ccc073a2 Mem2Reg does not need TargetData.
llvm-svn: 36444
2007-04-25 18:32:35 +00:00
Devang Patel 073be55d8e Remove unused function argument.
llvm-svn: 36441
2007-04-25 17:15:20 +00:00
Anton Korobeynikov a97b694c82 Implement aliases. This fixes PR1017 and its dependent bugs. CFE part
will follow.

llvm-svn: 36435
2007-04-25 14:27:10 +00:00
Chris Lattner 827cb98a0a If an alloca only has two types of uses: 1) reads 2) a memcpy/memmove that
copies from a constant global, then we can change the reads to read from the
global instead of from the alloca.  This eliminates the alloca and the memcpy,
and promotes secondary optimizations (because the loads are now loads from
a constant global).

This is important for a common C idiom:

void foo() {
   int A[] = {1,2,3,4,5,6,7,8,9...};
   ... only reads of A ...
}

For some reason, people forget to mark the array static or const.

This triggers on these multisource benchmarks:
JM/ldecode: block_pos, [3 x [4 x [4 x i32]]]
FreeBench/mason: m, [18 x i32], inlined 4 times
MiBench/office-stringsearch: search_strings, [1332 x i8*]
MiBench/office-stringsearch: find_strings, [1333 x i8*]
Prolangs-C++/city: dirs, [9 x i8*], inlined 4 places

and these spec benchmarks:
177.mesa: message, [8 x [32 x i8]]
186.crafty: bias_rl45, [64 x i32]
186.crafty: diag_sq, [64 x i32]
186.crafty: empty, [9 x i8]
186.crafty: xlate, [15 x i8]
186.crafty: status, [13 x i8]
186.crafty: bdinfo, [25 x i8]
445.gobmk: routines, [16 x i8*]
458.sjeng: piece_rep, [14 x i8*]
458.sjeng: t, [13 x i32], inlined 4 places.
464.h264ref: block8x8_idx, [3 x [4 x [4 x i32]]]
464.h264ref: block_pos, [3 x [4 x [4 x i32]]]
464.h264ref: j_off_tab, [12 x i32]

This implements Transforms/ScalarRepl/memcpy-from-global.ll

llvm-svn: 36429
2007-04-25 06:40:51 +00:00
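The idiom targeted by r36429 can be written out in full to show what the transformation recovers. This is only a source-level illustration with hypothetical function names; the optimization itself works on the alloca and memcpy in LLVM IR, not on source code.

```cpp
#include <cassert>

// The idiom from the commit message: a local array initialized from
// constants and only ever read. Without the optimization, each call may
// copy the initializer into a fresh stack slot before the read.
int lookupLocal(int i) {
  int A[] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
  return A[i];
}

// What the programmer could have written by hand: the table lives in
// constant storage and reads go straight to it.
int lookupStaticConst(int i) {
  static const int A[] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
  return A[i];
}
```

Both functions return the same value for every valid index, which is why redirecting the reads to the constant global is safe when the array is never written.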
Chris Lattner 31e5addb67 refactor the SROA code out into its own method, no functionality change.
llvm-svn: 36426
2007-04-25 05:02:56 +00:00
Owen Anderson 510fefcd8a Undo my previous changes. Since my approach to this problem is being revised,
this approach is no longer appropriate.

llvm-svn: 36421
2007-04-25 04:18:54 +00:00
Devang Patel d3208523b2 Fix
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070423/048376.html

llvm-svn: 36417
2007-04-25 00:37:04 +00:00
Owen Anderson c24701ed7f Rollback some changes that adversely affected performance. I'm currently rethinking
my approach to this, so hopefully I'll find a way to do this without making this slower.

llvm-svn: 36392
2007-04-24 06:40:39 +00:00
Devang Patel 38bc86f057 Fix
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070423/048333.html

llvm-svn: 36380
2007-04-23 22:42:03 +00:00
Owen Anderson 64995e1b3f Make PredicateSimplifier not use DominatorTree.
llvm-svn: 36300
2007-04-21 07:38:12 +00:00
Owen Anderson 2965adb849 Fix a comment.
llvm-svn: 36299
2007-04-21 07:12:44 +00:00
Jeff Cohen 5959f42498 Comment out usage of write() for now.
llvm-svn: 36287
2007-04-20 22:40:10 +00:00
Devang Patel 83a3adcc3f Avoid recursion.
llvm-svn: 36272
2007-04-20 20:04:37 +00:00
Owen Anderson 2da606c757 Move more passes to using ETForest instead of DominatorTree.
llvm-svn: 36271
2007-04-20 06:27:13 +00:00
Zhou Sheng aafe4e216e Make use of ConstantInt::isZero instead of ConstantInt::isNullValue.
llvm-svn: 36261
2007-04-19 05:39:12 +00:00
Zhou Sheng 82fcf3cb5f Make the operations of APInt variables more efficient.
llvm-svn: 36260
2007-04-19 05:35:00 +00:00
Evan Cheng db9b65d67a Revert Owen's last check-in. This is breaking Mac OS X / PPC llvm-gcc bootstrap.
llvm-svn: 36258
2007-04-18 22:39:00 +00:00
Owen Anderson 9421f03959 Revert changes that caused breakage.
llvm-svn: 36255
2007-04-18 06:46:57 +00:00
Owen Anderson 9a6091dec1 Switch more uses of DominatorTree over to ETForest.
llvm-svn: 36254
2007-04-18 05:43:13 +00:00
Owen Anderson 550e8db9c7 Use ETForest instead of DominatorTree.
llvm-svn: 36252
2007-04-18 05:25:43 +00:00
Owen Anderson fc40d446c9 Use ETForest instead of DominatorTree.
llvm-svn: 36249
2007-04-18 04:55:33 +00:00
Owen Anderson 08293fd6d1 Use new ETForest accessor.
llvm-svn: 36248
2007-04-18 04:46:35 +00:00
Owen Anderson f38f2f2394 Use ETForest instead of DominatorTree.
llvm-svn: 36247
2007-04-18 04:39:32 +00:00
Dan Gohman 2ce1116b33 Spell doFinalization right, so that it is a proper virtual override and
gets called.

llvm-svn: 36208
2007-04-17 18:21:36 +00:00
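The class of bug fixed in r36208 is easy to reproduce: a misspelled method name declares a new function instead of overriding the virtual one, so the base version runs. The sketch below uses hypothetical classes (PassBase, MisspelledPass, FixedPass), not LLVM's Pass hierarchy, and relies on the C++11 override keyword, which did not exist when the commit was made.

```cpp
#include <cassert>

// Hypothetical stand-in for a pass base class with a virtual hook.
struct PassBase {
  virtual ~PassBase() = default;
  virtual bool doFinalization() { return false; }
};

struct MisspelledPass : PassBase {
  // Typo in the name: this is a new, unrelated method, so calls made
  // through PassBase never reach it.
  bool doFinalisation() { return true; }
};

struct FixedPass : PassBase {
  // `override` makes the compiler reject the declaration if the name or
  // signature does not match a virtual function in the base class.
  bool doFinalization() override { return true; }
};

// Calls the hook the way a pass manager would: through the base class.
bool finalize(PassBase& p) { return p.doFinalization(); }
```

Marking MisspelledPass::doFinalisation with override would turn the silent misbehavior into a compile error.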
Chris Lattner 233f97ac6a remove use of BasicBlock::getNext
llvm-svn: 36205
2007-04-17 18:09:47 +00:00
Chris Lattner 24e2d9ca03 remove use of BasicBlock::getNext
llvm-svn: 36202
2007-04-17 17:54:12 +00:00
Chris Lattner cd9bda71a0 eliminate use of Instruction::getNext()
llvm-svn: 36200
2007-04-17 17:51:03 +00:00
Chris Lattner 77a3edcb92 remove use of Instruction::getNext
llvm-svn: 36199
2007-04-17 17:47:54 +00:00
Devang Patel abdff3fecd Fix
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070416/047888.html

llvm-svn: 36182
2007-04-16 23:03:45 +00:00
Anton Korobeynikov fb80151c42 Removed tabs everywhere except autogenerated & external files. Add make
target for tabs checking.

llvm-svn: 36146
2007-04-16 18:10:23 +00:00
Chris Lattner 343c88cdb9 Fix PR1335 and Transforms/Inline/2007-04-15-InlineEH.ll
llvm-svn: 36090
2007-04-15 21:38:06 +00:00
Owen Anderson f35a1dbc7a Remove ImmediateDominator analysis. The same information can be obtained from DomTree. A lot of code for
constructing ImmediateDominator is now folded into DomTree construction.

This is part of the ongoing work for PR217.

llvm-svn: 36063
2007-04-15 08:47:27 +00:00
Chris Lattner f8a7bf317e fix SimplifyLibCalls/IsDigit.ll
llvm-svn: 36047
2007-04-15 05:38:40 +00:00
Chris Lattner 4a6e0cbd41 Extend store merging to support the 'if/then' version in addition to if/then/else.
This sinks the two stores in this example into a single store in cond_next.  In this
case, it allows elimination of the load as well:

        store double 0.000000e+00, double* @s.3060
        %tmp3 = fcmp ogt double %tmp1, 5.000000e-01             ; <i1> [#uses=1]
        br i1 %tmp3, label %cond_true, label %cond_next
cond_true:              ; preds = %entry
        store double 1.000000e+00, double* @s.3060
        br label %cond_next
cond_next:              ; preds = %entry, %cond_true
        %tmp6 = load double* @s.3060            ; <double> [#uses=1]

This implements Transforms/InstCombine/store-merge.ll:test2

llvm-svn: 36040
2007-04-15 01:02:18 +00:00
Chris Lattner 14a251b937 refactor some code, no functionality change.
llvm-svn: 36037
2007-04-15 00:07:55 +00:00
Chris Lattner 28d921d04f fix long lines
llvm-svn: 36031
2007-04-14 23:32:02 +00:00
Chris Lattner 7bfdd0abe1 Implement Transforms/InstCombine/vec_extract_elt.ll, transforming:
define i32 @test(float %f) {
        %tmp7 = insertelement <4 x float> undef, float %f, i32 0
        %tmp17 = bitcast <4 x float> %tmp7 to <4 x i32>
        %tmp19 = extractelement <4 x i32> %tmp17, i32 0
        ret i32 %tmp19
}

into:

define i32 @test(float %f) {
        %tmp19 = bitcast float %f to i32                ; <i32> [#uses=1]
        ret i32 %tmp19
}

On PPC, this is the difference between:

_test:
        mfspr r2, 256
        oris r3, r2, 8192
        mtspr 256, r3
        stfs f1, -16(r1)
        addi r3, r1, -16
        addi r4, r1, -32
        lvx v2, 0, r3
        stvx v2, 0, r4
        lwz r3, -32(r1)
        mtspr 256, r2
        blr

and:

_test:
        stfs f1, -4(r1)
        nop
        nop
        nop
        lwz r3, -4(r1)
        blr

llvm-svn: 36025
2007-04-14 23:02:14 +00:00
Chris Lattner b37fb6a0da Implement InstCombine/vec_demanded_elts.ll:test2. This allows us to turn
unsigned test(float f) {
 return _mm_cvtsi128_si32( (__m128i) _mm_set_ss( f*f ));
}

into:

_test:
        movss 4(%esp), %xmm0
        mulss %xmm0, %xmm0
        movd %xmm0, %eax
        ret

instead of:

_test:
        movss 4(%esp), %xmm0
        mulss %xmm0, %xmm0
        xorps %xmm1, %xmm1
        movss %xmm0, %xmm1
        movd %xmm1, %eax
        ret

GCC gets:

_test:
        subl    $28, %esp
        movss   32(%esp), %xmm0
        mulss   %xmm0, %xmm0
        xorps   %xmm1, %xmm1
        movss   %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        movd    %xmm0, 12(%esp)
        movl    12(%esp), %eax
        addl    $28, %esp
        ret

llvm-svn: 36020
2007-04-14 22:29:23 +00:00
Chris Lattner a6b5660209 avoid copying sets and vectors around.
llvm-svn: 36017
2007-04-14 22:10:17 +00:00
Chris Lattner 6f58839b20 avoid iterator invalidation.
llvm-svn: 36002
2007-04-14 18:06:52 +00:00
Jeff Cohen 4bd0fd367a An even better fix.
llvm-svn: 35998
2007-04-14 17:18:29 +00:00
Jeff Cohen 7233aa9369 Fix recent regression that broke several llvm-tests.
llvm-svn: 35996
2007-04-14 16:55:19 +00:00
Chris Lattner 49fa8d2bff Implement a few missing xforms: printf("foo\n") -> puts. printf("x") -> putchar
printf("") -> noop.  Still need to do the xforms for fprintf.

This implements Transforms/SimplifyLibCalls/Printf.ll

llvm-svn: 35984
2007-04-14 01:17:48 +00:00
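The three rewrites listed in r35984 can be summarized as a decision on the format string. The classifier below is a toy model, not the SimplifyLibCalls implementation: the Rewrite enum and classifyPrintf are hypothetical, and the rule that any string containing '%' is left alone is an assumption made here to keep the sketch conservative.

```cpp
#include <cassert>
#include <string>

// Hypothetical labels for the replacement a printf call could receive.
enum class Rewrite { Noop, Putchar, Puts, Keep };

// Classifies a printf call that has a constant format and no other arguments.
Rewrite classifyPrintf(const std::string& fmt) {
  if (fmt.empty())
    return Rewrite::Noop;     // printf("")      -> nothing to print
  if (fmt.find('%') != std::string::npos)
    return Rewrite::Keep;     // assumption: leave real format strings alone
  if (fmt.size() == 1)
    return Rewrite::Putchar;  // printf("x")     -> putchar('x')
  if (fmt.back() == '\n')
    return Rewrite::Puts;     // printf("foo\n") -> puts("foo")
  return Rewrite::Keep;
}
```

The puts case drops the trailing newline from the string because puts appends one itself.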
Chris Lattner 02137eec8f in addition to merging, constantmerge should also delete trivially dead globals,
in order to clean up after simplifylibcalls.

llvm-svn: 35982
2007-04-14 01:11:54 +00:00
Chris Lattner efb33d28c6 Implement PR1201 and test/Transforms/InstCombine/malloc-free-delete.ll
llvm-svn: 35981
2007-04-14 00:20:02 +00:00
Chris Lattner 164b76565b use an accessor to simplify code.
llvm-svn: 35979
2007-04-14 00:17:39 +00:00
Chris Lattner efd3051d60 Now that codegen prepare isn't defeating me, I can finally fix what I set
out to do! :)

This fixes a problem where LSR would insert a bunch of code into each MBB
that uses a particular subexpression (e.g. IV+base+C).  The problem is that
this code cannot be CSE'd back together if inserted into different blocks.

This patch changes LSR to attempt to insert a single copy of this code and
share it, allowing codegenprepare to duplicate the code if it can be sunk
into various addressing modes.  On CodeGen/ARM/lsr-code-insertion.ll,
for example, this gives us code like:

        add r8, r0, r5
        str r6, [r8, #+4]
..
        ble LBB1_4      @cond_next
LBB1_3: @cond_true
        str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
        ldr r6, LCPI1_1
        str r6, [r8, #+4]

instead of:

        add r10, r0, r6
        str r8, [r10, #+4]
...
        ble LBB1_4      @cond_next
LBB1_3: @cond_true
        add r8, r0, r6
        str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
        add r8, r0, r6
        ldr r10, LCPI1_1
        str r10, [r8, #+4]

Besides being smaller and more efficient, this makes it immediately
obvious that it is profitable to predicate LBB1_3 now :)

llvm-svn: 35972
2007-04-13 20:42:26 +00:00
Chris Lattner feee64e997 Completely rewrite addressing-mode related sinking of code. In particular,
this fixes problems where codegenprepare would sink expressions into load/stores
that are not valid, and fixes cases where it would miss important valid ones.

This fixes several serious codesize and perf issues, particularly on targets
with complex addressing modes like arm and x86.  For example, now we compile
CodeGen/X86/isel-sink.ll to:

_test:
        movl 8(%esp), %eax
        movl 4(%esp), %ecx
        cmpl $1233, %eax
        ja LBB1_2       #F
LBB1_1: #T
        movl $4, (%ecx,%eax,4)
        movl $141, %eax
        ret
LBB1_2: #F
        movl (%ecx,%eax,4), %eax
        ret

instead of:

_test:
        movl 8(%esp), %eax
        leal (,%eax,4), %ecx
        addl 4(%esp), %ecx
        cmpl $1233, %eax
        ja LBB1_2       #F
LBB1_1: #T
        movl $4, (%ecx)
        movl $141, %eax
        ret
LBB1_2: #F
        movl (%ecx), %eax
        ret

llvm-svn: 35970
2007-04-13 20:30:56 +00:00
Devang Patel 38705d5494 Remove use of SlowOperationInformer.
llvm-svn: 35967
2007-04-13 18:58:18 +00:00
Devang Patel b730fe57bf Undo previous check-in.
llvm-svn: 35966
2007-04-13 18:35:15 +00:00
Devang Patel f929b86140 Hello uses LLVMSupport.a (SlowOperationInformer)
llvm-svn: 35965
2007-04-13 18:28:23 +00:00
Lauro Ramos Venancio 749e4668e7 Implement the "thread_local" keyword.
llvm-svn: 35950
2007-04-12 18:32:50 +00:00
Reid Spencer c78d122a6a Build Hello by default so it can be used in test cases.
llvm-svn: 35922
2007-04-11 21:03:37 +00:00
Chris Lattner 5ee4d0726a Fix Transforms/ScalarRepl/union-pointer.ll
llvm-svn: 35906
2007-04-11 15:45:25 +00:00
Chris Lattner 74ff60ff84 Turn stuff like:
        icmp slt i32 %X, 0              ; <i1>:0 [#uses=1]
        sext i1 %0 to i32               ; <i32>:1 [#uses=1]

into:

        %X.lobit = ashr i32 %X, 31              ; <i32> [#uses=1]

This implements InstCombine/icmp.ll:test[34]

llvm-svn: 35891
2007-04-11 06:57:46 +00:00
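The equivalence behind r35891 can be checked at the source level: sign-extending the result of a signed "less than zero" test gives the same 32-bit value as an arithmetic shift right by 31. The function names below are hypothetical. Note the assumption that >> on a negative signed value is an arithmetic shift, which C++20 guarantees and earlier standards leave implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors `icmp slt %X, 0` followed by `sext i1 ... to i32`:
// all ones when x is negative, zero otherwise.
std::int32_t signMaskViaCompare(std::int32_t x) {
  return x < 0 ? -1 : 0;
}

// Mirrors `ashr i32 %X, 31`: the sign bit is replicated into every bit.
// Assumes arithmetic right shift of negative values (guaranteed in C++20).
std::int32_t signMaskViaShift(std::int32_t x) {
  return x >> 31;
}
```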
Chris Lattner d0f7942e23 Simplify some comparisons to arithmetic, this implements:
Transforms/InstCombine/icmp.ll

llvm-svn: 35890
2007-04-11 06:53:04 +00:00
Chris Lattner 20f2372a7c canonicalize (x <u 2147483648) -> (x >s -1) and (x >u 2147483647) -> (x <s 0)
llvm-svn: 35886
2007-04-11 06:12:58 +00:00
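The two canonicalizations in r35886 rest on the fact that, for 32-bit values, the unsigned range below 2147483648 is exactly the set of non-negative signed values. The sketch below spells out both sides with hypothetical function names. It assumes the uint32_t to int32_t conversion wraps modulo 2^32, which C++20 guarantees and earlier standards leave implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// x <u 2147483648 and its canonical form x >s -1.
bool ultForm(std::uint32_t x) { return x < 2147483648u; }
bool sgtForm(std::uint32_t x) { return static_cast<std::int32_t>(x) > -1; }

// x >u 2147483647 and its canonical form x <s 0.
bool ugtForm(std::uint32_t x) { return x > 2147483647u; }
bool sltForm(std::uint32_t x) { return static_cast<std::int32_t>(x) < 0; }
```

Both pairs reduce to a test of the sign bit, which is why the signed forms are a convenient canonical shape for later folds.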
Chris Lattner 7ddbff090a fix a miscompilation of:
define i32 @test(i32 %X) {
entry:
        %Y = and i32 %X, 4              ; <i32> [#uses=1]
        icmp eq i32 %Y, 0               ; <i1>:0 [#uses=1]
        sext i1 %0 to i32               ; <i32>:1 [#uses=1]
        ret i32 %1
}

by moving code out of commonIntCastTransforms into visitZExt.  Simplify the
APInt gymnastics in it etc.

llvm-svn: 35885
2007-04-11 05:45:39 +00:00
Chris Lattner 32104034f8 fix a regression introduced by my last patch.
llvm-svn: 35879
2007-04-11 03:27:24 +00:00
Chris Lattner daa012d1fb Simplify SROA conversion to integer in some ways, make it more general in others.
We now tolerate small amounts of undefined behavior, better emulating what
would happen if the transaction actually occurred in memory.  This fixes
SingleSource/UnitTests/2007-04-10-BitfieldTest.c on PPC, at least until
Devang gets a chance to fix the CFE from doing undefined things with bitfields :)

llvm-svn: 35875
2007-04-11 00:57:54 +00:00
Chris Lattner 467b69cabb Strengthen the boundary conditions of this fold, implementing
InstCombine/set.ll:test25

llvm-svn: 35852
2007-04-09 23:52:13 +00:00
Owen Anderson 3c7867935e Re-constify things that don't break the build. Last patch in this
series, I promise.

llvm-svn: 35848
2007-04-09 23:38:18 +00:00
Chris Lattner 3e9690f987 eliminate the last uses of some TLI methods.
llvm-svn: 35844
2007-04-09 23:29:07 +00:00
Owen Anderson f1ca1376d3 Unconst-ify stuff that broke the build.
llvm-svn: 35843
2007-04-09 23:08:26 +00:00
Owen Anderson 5917716146 Const-ify some parameters, and some cosmetic cleanups. No functionality
change.

llvm-svn: 35842
2007-04-09 22:54:50 +00:00
Owen Anderson e0ef5ac6bd Tabs -> Spaces
llvm-svn: 35841
2007-04-09 22:31:43 +00:00
Owen Anderson 83efbc84f7 Improve some _slow_ behavior introduced in my patches the last few days.
llvm-svn: 35839
2007-04-09 22:25:09 +00:00
Chris Lattner 780c009756 switch LSR to use isLegalAddressingMode instead of other simpler hooks
llvm-svn: 35837
2007-04-09 22:20:14 +00:00
Devang Patel bca0d57179 Check _all_ PHINodes.
llvm-svn: 35836
2007-04-09 22:20:10 +00:00
Devang Patel 8eb8eeada9 Insert new pre-header before new header. Original pre-header may
happen to be an entry, in such case, it is not a good idea to
insert new block before entry.

Also fix typo in assertion check.

llvm-svn: 35833
2007-04-09 21:40:43 +00:00
Devang Patel 854197884b Preserve canonical loop form.
llvm-svn: 35829
2007-04-09 20:19:46 +00:00
Reid Spencer 8436cdfda2 Don't link against System or Support library. These things will already
be in the opt tool.

llvm-svn: 35827
2007-04-09 19:17:47 +00:00
Devang Patel b9af5747a5 Do not create new pre-header. Reuse original pre-header.
llvm-svn: 35825
2007-04-09 19:04:21 +00:00
Devang Patel 03d7ae3a74 Simpler for() loops.
llvm-svn: 35822
2007-04-09 17:09:13 +00:00
Devang Patel d6ba41e02d Fix future bug. Of course, Chris spotted this.
Handle Argument or Undef as an incoming PHI value.

llvm-svn: 35821
2007-04-09 16:41:46 +00:00
Devang Patel b28a391a8d More cosmetic changes.
llvm-svn: 35820
2007-04-09 16:21:29 +00:00
Devang Patel 88bc2c6f82 Only cosmetic changes. Zero functionality Change.
llvm-svn: 35819
2007-04-09 16:11:48 +00:00
Chris Lattner a87c9f6114 Fix PR1304 and Transforms/InstCombine/2007-04-08-SingleEltVectorCrash.ll
llvm-svn: 35792
2007-04-09 01:37:55 +00:00
Chris Lattner 4ca9cbb170 Eliminate useless insertelement instructions. This implements
Transforms/InstCombine/vec_insertelt.ll and fixes PR1286.

We now compile the code from that bug into:

_foo:
        movl 4(%esp), %eax
        movdqa (%eax), %xmm0
        movl 8(%esp), %ecx
        psllw (%ecx), %xmm0
        movdqa %xmm0, (%eax)
        ret

instead of:

_foo:
        subl $4, %esp
        movl %ebp, (%esp)
        movl %esp, %ebp
        movl 12(%ebp), %eax
        movdqa (%eax), %xmm0
        #IMPLICIT_DEF %eax
        pinsrw $2, %eax, %xmm0
        xorl %ecx, %ecx
        pinsrw $3, %ecx, %xmm0
        pinsrw $4, %eax, %xmm0
        pinsrw $5, %ecx, %xmm0
        pinsrw $6, %eax, %xmm0
        pinsrw $7, %ecx, %xmm0
        movl 8(%ebp), %eax
        movdqa (%eax), %xmm1
        psllw %xmm0, %xmm1
        movdqa %xmm1, (%eax)
        movl %ebp, %esp
        popl %ebp
        ret

woo :)

llvm-svn: 35788
2007-04-09 01:11:16 +00:00
Owen Anderson ae39ca037a Clean up some of my DomSet-removal changes. Add a new isReachableFromEntry
test to ETForest to factor a common test out of code.

llvm-svn: 35786
2007-04-09 00:52:49 +00:00
Chris Lattner aa8ad10c2f Fix a typo that broke SimplifyLibCalls/SPrintF.ll (pr1315)
llvm-svn: 35768
2007-04-08 18:11:26 +00:00
Chris Lattner c8d3788f71 reenable this xform, whoops :)
llvm-svn: 35765
2007-04-08 08:01:49 +00:00
Chris Lattner 7621a031d8 Fix regression on Instcombine/apint-or2.ll
llvm-svn: 35763
2007-04-08 07:55:22 +00:00
Chris Lattner 1150df9cc4 Generalize the code that handles (A&B)|(A&C) to work where B/C are not constants.
Add a new xform to simplify (A&B)|(~A&C).  This implements InstCombine/or2.ll:test1

llvm-svn: 35760
2007-04-08 07:47:01 +00:00
Chris Lattner 5717981e5d implement a fixme: move optimizations for fwrite out of fputs into a new
fwrite optimizer.

llvm-svn: 35758
2007-04-08 07:00:35 +00:00
Nick Lewycky e6c64466c7 Remove DominatorSet usage from LoopSimplify. Patch from Owen Anderson.
llvm-svn: 35757
2007-04-08 01:04:30 +00:00
Chris Lattner 182a945fb5 Significantly simplify the clients of GetConstantStringInfo, by having it
just return the string itself.

llvm-svn: 35755
2007-04-07 21:58:02 +00:00
Chris Lattner 08c0b8b3c8 Fix problems in the sprintf optimizer
llvm-svn: 35754
2007-04-07 21:17:51 +00:00
Chris Lattner bed184cbcf Change CastToCStr to take a pointer instead of a reference.
Fix some miscompilations in fprintf optimizer.

llvm-svn: 35753
2007-04-07 21:04:50 +00:00
Chris Lattner 898d698d9f Fix an off-by-one error that broke Prolangs/deriv2 with llc on x86
and Prolangs-C/cdecl

llvm-svn: 35749
2007-04-07 20:19:08 +00:00
Owen Anderson f7ebea1b9f Add DomSet back, and revert the changes to LoopSimplify. Apparently the
ETForest updating mechanisms don't work as I thought they did.  These changes
will be reapplied once the issue is worked out.

llvm-svn: 35741
2007-04-07 18:23:27 +00:00
Nick Lewycky d4f51a8ae3 Add support for cast instructions.
llvm-svn: 35734
2007-04-07 15:48:32 +00:00
Owen Anderson 8763ba1b88 Completely purge DomSet. This is the (hopefully) final patch for PR1171.
llvm-svn: 35731
2007-04-07 07:17:27 +00:00
Owen Anderson 706e97049d Completely purge DomSet from LoopSimplify. This is part of the
continuing work on PR1171.

llvm-svn: 35730
2007-04-07 06:56:47 +00:00
Owen Anderson d03a646f06 BreakCriticalEdges does still preserve DominatorTree.
llvm-svn: 35729
2007-04-07 05:57:09 +00:00
Owen Anderson b39d9ca902 Expunge DomSet from BreakCriticalEdges. This is part of the continuing
work for PR 1171.

llvm-svn: 35728
2007-04-07 05:49:29 +00:00
Owen Anderson f095bf3ac4 Expunge DomSet from CodeExtractor. This is part of the continuing work
on PR1171.

llvm-svn: 35726
2007-04-07 05:31:27 +00:00
Nick Lewycky 93f541057b Support NE inequality in ValueRanges.
llvm-svn: 35724
2007-04-07 04:49:12 +00:00
Owen Anderson 910419596e Expunge a bunch of uses of DomSet from LoopSimplify. Many more remain.
This is the beginning of work for PR1171.

llvm-svn: 35720
2007-04-07 04:37:14 +00:00
Nick Lewycky 3bb6de85d1 Cleanup. Refactor out the applying of value ranges to its own method.
llvm-svn: 35719
2007-04-07 03:36:51 +00:00
Nick Lewycky 12d44abe0f Use TargetData to find the size of a type.
llvm-svn: 35718
2007-04-07 03:16:12 +00:00
Nick Lewycky eeb01b41ef Strengthen icmp snuggling by doing 'compare-or-equal-to' to 'compare'
first and then range testing second.

llvm-svn: 35715
2007-04-07 02:30:14 +00:00
Devang Patel f42389ffe5 Add loop rotation pass.
llvm-svn: 35714
2007-04-07 01:25:15 +00:00
Chris Lattner 0f1509511e fix a miscompilation in printf optimizer.
llvm-svn: 35713
2007-04-07 01:18:36 +00:00
Chris Lattner 6a36d636e9 trunc to bool no longer compares against zero
llvm-svn: 35712
2007-04-07 01:03:46 +00:00
Chris Lattner e8829aa9dd cleanups for strlen optimizer
llvm-svn: 35711
2007-04-07 01:02:00 +00:00
Chris Lattner 485b6415b1 Introduce a new ReplaceCallWith method, which simplifies a lot of code.
llvm-svn: 35710
2007-04-07 00:42:32 +00:00
Chris Lattner 6a6c1f1c30 fixes for strcpy optimizer
llvm-svn: 35709
2007-04-07 00:26:18 +00:00
Chris Lattner f9ee647e86 Fix bugs in strncmp.
llvm-svn: 35708
2007-04-07 00:06:57 +00:00
Chris Lattner c9ccc30212 fix 3 miscompilations and several compiler crashes in strcmp optimizer.
llvm-svn: 35707
2007-04-07 00:01:51 +00:00
Chris Lattner 39f0bb9670 Fix several nasty bugs in the strchr optimizer, this fixes
SimplifyLibCalls/2007-04-06-strchr-miscompile.ll and PR1307

llvm-svn: 35706
2007-04-06 23:38:55 +00:00
Chris Lattner 56b7fc7768 clean up strcat optimizer, no functionality change.
llvm-svn: 35704
2007-04-06 22:59:33 +00:00
Chris Lattner 9b2b8abd20 rename getConstantStringLength -> GetConstantStringInfo. Make it return
the start index of the array as well as the length.  No functionality change.

llvm-svn: 35703
2007-04-06 22:54:17 +00:00
Chris Lattner 3dbe65f80a implement Transforms/InstCombine/malloc2.ll and PR1313
llvm-svn: 35700
2007-04-06 18:57:34 +00:00
Chris Lattner 1a9a760318 Fix Transforms/GlobalOpt/2007-04-05-Crash.ll
llvm-svn: 35689
2007-04-05 21:09:42 +00:00
Chris Lattner 108083edff Use a worklist-driven algorithm instead of a recursive one.
llvm-svn: 35680
2007-04-05 01:27:02 +00:00
Dale Johannesen 7c2001d014 Prevent transformConstExprCastCall from generating conversions that assert
elsewhere.

llvm-svn: 35668
2007-04-04 19:16:42 +00:00
Jeff Cohen 5a1c750f31 Fix 2007-04-04-BadFoldBitcastIntoMalloc.ll
llvm-svn: 35665
2007-04-04 16:58:57 +00:00
Duncan Sands f01a47c93c Fix comment.
llvm-svn: 35655
2007-04-04 06:42:45 +00:00
Chris Lattner e5bbb3cb1a Fix a bug I introduced with my patch yesterday which broke Qt (I converted
some constant exprs to apints).

Thanks to Anton for tracking down a small testcase that triggered this!

llvm-svn: 35633
2007-04-03 23:29:39 +00:00
Chris Lattner a74deafb13 reinstate the previous two patches, with a bugfix :)
ldecod now passes.

llvm-svn: 35626
2007-04-03 17:43:25 +00:00
Evan Cheng 7511fa280d Reverting back to 1.723. The last two commits broke JM (and possibly others) on ARM.
llvm-svn: 35620
2007-04-03 08:11:50 +00:00
Chris Lattner 81e0707552 split some code out into a helper function
llvm-svn: 35615
2007-04-03 05:11:24 +00:00
Chris Lattner 64c764cebc Split a whole ton of code out of visitICmpInst into visitICmpInstWithInstAndIntCst.
llvm-svn: 35614
2007-04-03 04:46:52 +00:00
Chris Lattner 8b2ec5f506 Fix PR1253 and xor2.ll:test[01]
llvm-svn: 35612
2007-04-03 01:47:41 +00:00
Chris Lattner f3197a7d53 allow -1 strides to reuse "1" strides.
llvm-svn: 35607
2007-04-02 22:51:58 +00:00
Zhou Sheng 9bc8ab100d 1. Make use of APInt operation instead of using ConstantExpr::getXXX.
2. Use cheaper APInt methods.

llvm-svn: 35594
2007-04-02 13:45:30 +00:00
Zhou Sheng 56cda95658 Use uint32_t for bitwidth instead of unsigned.
llvm-svn: 35593
2007-04-02 08:20:41 +00:00
Chris Lattner 28e0e4e11e Pass the type of the store access, not the type of the store, into the
target hook.  This allows us to codegen a loop as:

LBB1_1: @cond_next
        mov r2, #0
        str r2, [r0, +r3, lsl #2]
        add r3, r3, #1
        cmn r3, #1
        bne LBB1_1      @cond_next

instead of:

LBB1_1: @cond_next
        mov r2, #0
        str r2, [r0], #+4
        add r3, r3, #1
        cmn r3, #1
        bne LBB1_1      @cond_next

This looks the same, but has one fewer induction variable (and therefore,
one fewer register) live in the loop.

llvm-svn: 35592
2007-04-02 06:34:44 +00:00
Chris Lattner 9d5aacee92 Wrap long line
llvm-svn: 35588
2007-04-02 05:48:58 +00:00
Chris Lattner 50490d54f2 use more obvious function name.
llvm-svn: 35587
2007-04-02 05:42:22 +00:00
Chris Lattner b24acc7bee simplify (x+c)^signbit as (x+c+signbit), pointed out by PR1288. This implements
test/Transforms/InstCombine/xor.ll:test28

llvm-svn: 35584
2007-04-02 05:36:22 +00:00
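The identity behind this simplification can be checked exhaustively at a small width. The sketch below is only an illustration and not code from the commit: it assumes 8-bit wrapping arithmetic, and the function name xorSignBitEqualsAdd is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if (x + c) ^ signbit == x + c + signbit for every pair of
// 8-bit values. Adding the sign bit modulo 2^8 can only flip the top bit,
// because any carry out of the top bit is discarded.
bool xorSignBitEqualsAdd() {
  const uint8_t SignBit = 0x80;
  for (unsigned X = 0; X < 256; ++X) {
    for (unsigned C = 0; C < 256; ++C) {
      uint8_t Sum = static_cast<uint8_t>(X + C);
      uint8_t ViaXor = static_cast<uint8_t>(Sum ^ SignBit);
      uint8_t ViaAdd = static_cast<uint8_t>(Sum + SignBit);
      if (ViaXor != ViaAdd)
        return false;
    }
  }
  return true;
}
```

The same argument holds at any bit width, which is why the two constants can be merged into a single add.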
Chris Lattner b7b75145f1 reduce use of std::set
llvm-svn: 35576
2007-04-02 01:44:59 +00:00
Chris Lattner c3748562bd Various passes before isel split edges and do other CFG-restructuring changes.
isel has its own particular features that it wants in the CFG, in order to
reduce the number of times a constant is computed, etc.  Make sure that we
clean up the CFG before doing any other things for isel.  Doing so can
dramatically reduce the number of split edges and reduce the number of
places that constants get computed.  For example, this shrinks
CodeGen/Generic/phi-immediate-factoring.ll from 44 to 37 instructions on X86,
and from 21 to 17 MBB's in the output.  This is primarily a code size win,
not a performance win.

This implements CodeGen/Generic/phi-immediate-factoring.ll and PR1296.

llvm-svn: 35575
2007-04-02 01:35:34 +00:00
Chris Lattner 8fe3cbe6bd print the type of an inserted IV in -debug mode.
llvm-svn: 35563
2007-04-01 22:21:39 +00:00
Chris Lattner c3eeb42809 simplify this code, make it work for ap ints
llvm-svn: 35561
2007-04-01 20:57:36 +00:00
Zhou Sheng 150f3bbab2 Avoid unnecessary APInt construction.
llvm-svn: 35555
2007-04-01 17:13:37 +00:00
Reid Spencer 6bba6c8143 For PR1297:
Support overloaded intrinsics bswap, ctpop, cttz, ctlz.

llvm-svn: 35547
2007-04-01 07:35:23 +00:00
Chris Lattner 0427799531 Fix InstCombine/2007-03-31-InfiniteLoop.ll
llvm-svn: 35536
2007-04-01 05:36:37 +00:00
Chris Lattner f2836d17b6 Split the sdisel code munging stuff out into its own opt-pass, CodeGenPrepare.
llvm-svn: 35528
2007-03-31 04:06:36 +00:00
Zhou Sheng 82c42284f4 Delete dead code.
llvm-svn: 35525
2007-03-31 02:50:26 +00:00
Zhou Sheng 4f16402e0d Use APInt operators to calculate the carry bits, remove this loop.
llvm-svn: 35524
2007-03-31 02:38:39 +00:00
Zhou Sheng fd28a33031 Make sure the use of ConstantInt::getZExtValue() for shift amount safe.
llvm-svn: 35510
2007-03-30 17:20:39 +00:00
Zhou Sheng b25806fa5f 1. Make sure the use of ConstantInt::getZExtValue() for getting shift
amount is safe.
2. Use new method on ConstantInt instead of (? :) operator.
3. Use new method uge() on ConstantInt to simplify codes.

llvm-svn: 35505
2007-03-30 09:29:48 +00:00
Zhou Sheng 5e60a4a6b0 Use APInt operation instead of ConstantExpr::getXX.
llvm-svn: 35503
2007-03-30 05:45:18 +00:00
Zhou Sheng b3a80b1d70 1. Make more use of APInt::getHighBitsSet/getLowBitsSet.
2. Let APInt variable do the binary operation stuff instead of using
   ConstantExpr::getXXX.

llvm-svn: 35450
2007-03-29 08:15:12 +00:00
Zhou Sheng 444af49cc0 Clean up some codes in InstCombiner::SimplifyDemandedBits().
llvm-svn: 35446
2007-03-29 04:45:55 +00:00
Zhou Sheng a4475575c0 Clean up codes in InstCombiner::SimplifyDemandedBits():
1. Line out nested call of APInt::zext/trunc.
2. Make more use of APInt::getHighBitsSet/getLowBitsSet.
3. Use APInt[] operator instead of expression like "APIntVal & SignBit".

llvm-svn: 35444
2007-03-29 02:26:30 +00:00
Zhou Sheng 4961cf1c06 1. Make the APInt variable do the binary operation stuff if possible
instead of using ConstantExpr::getXX.
2. Use constant reference to APInt if possible instead of expensive
   APInt copy.

llvm-svn: 35443
2007-03-29 01:57:21 +00:00
Zhou Sheng 117477e28b Avoid unnecessary APInt construction.
llvm-svn: 35431
2007-03-28 17:38:21 +00:00
Zhou Sheng 23f7a1c947 1. Make more use of getLowBitsSet/getHighBitsSet.
2. Use APInt[] instead of "X & SignBit".
3. Clean up some codes.
4. Make the expression like "ShiftAmt = ShiftAmtC->getZExtValue()" safe.

llvm-svn: 35424
2007-03-28 15:02:20 +00:00
Zhou Sheng 2777a31850 1. Make more use of getLowBitsSet/getHighBitsSet.
2. Make the APInt value do the zext/trunc stuff instead of using
   ConstantExpr::getZExt().

llvm-svn: 35422
2007-03-28 09:19:01 +00:00
Zhou Sheng c2d3309b99 Use UnknownBIts[BitWidth-1] instead of UnknownBIts & SignBits.
llvm-svn: 35418
2007-03-28 05:15:57 +00:00
Zhou Sheng 18570b1f14 Remove unused APInt variable.
llvm-svn: 35414
2007-03-28 03:02:21 +00:00
Zhou Sheng 57e3f7324b Clean up codes in ComputeMaskedBits():
1. Line out nested use of zext/trunc.
2. Make more use of getHighBitsSet/getLowBitsSet.
3. Use APInt[] != 0 instead of "(APInt & SignBit) != 0".

llvm-svn: 35408
2007-03-28 02:19:03 +00:00
Reid Spencer a5c18bf798 For PR1280:
When converting an add/xor/and triplet into a trunc/sext, only do so if the
intermediate integer type is a bitwidth that the targets can handle.

llvm-svn: 35400
2007-03-28 01:36:16 +00:00
Evan Cheng a4ed8a512a Unbreaks non-debug builds.
llvm-svn: 35383
2007-03-27 16:44:48 +00:00
Reid Spencer 54d5b1b8f8 Implement some minor review feedback.
llvm-svn: 35373
2007-03-26 23:58:26 +00:00
Reid Spencer 441486c172 For PR1271:
Fix another incorrectly converted shift mask.

llvm-svn: 35371
2007-03-26 23:45:51 +00:00
Devang Patel 4398e242dd Reduce malloc/free traffic.
llvm-svn: 35370
2007-03-26 23:19:29 +00:00
Chris Lattner d2602d5054 eliminate use of std::set
llvm-svn: 35361
2007-03-26 20:40:50 +00:00
Reid Spencer 755d0e7ffc Get better debug output by having modified instructions print both the
original and new instruction. A slight performance hit with ostringstream
but it is only for debug.
Also, clean up an uninitialized variable warning noticed in a release build.

llvm-svn: 35358
2007-03-26 17:44:01 +00:00
Reid Spencer 769a5a8e0b Get the number of bits to set in a mask correct for a shl/lshr transform.
llvm-svn: 35357
2007-03-26 17:18:58 +00:00
Reid Spencer 50898607a9 For PR1271:
Fix SingleSource/Regression/C/2003-05-21-UnionBitFields.c by changing a
getHighBitsSet call to getLowBitsSet call that was incorrectly converted
from the original lshr constant expression.

llvm-svn: 35348
2007-03-26 05:25:00 +00:00
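The low-mask versus high-mask distinction at issue in the PR1271 fixes can be sketched with plain 32-bit integers. This is an assumption-laden illustration, not the InstCombine code: lowBitsSet and highBitsSet are hypothetical stand-ins for the APInt helpers named in the commit messages, and the shift amount C is assumed to be less than 32.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low N bits set, for 0 <= N <= 32.
uint32_t lowBitsSet(unsigned N) {
  return N >= 32 ? UINT32_C(0xFFFFFFFF) : (UINT32_C(1) << N) - 1;
}

// Mask with the high N bits set, for 0 <= N <= 32.
uint32_t highBitsSet(unsigned N) {
  return N == 0 ? 0 : UINT32_C(0xFFFFFFFF) << (32 - N);
}

// shl followed by a logical shift right clears the top C bits,
// so it equals X & lowBitsSet(32 - C). Assumes C < 32.
uint32_t shlThenLshr(uint32_t X, unsigned C) { return (X << C) >> C; }

// A logical shift right followed by shl clears the bottom C bits,
// so it equals X & highBitsSet(32 - C). Assumes C < 32.
uint32_t lshrThenShl(uint32_t X, unsigned C) { return (X >> C) << C; }
```

Picking the wrong helper for a given shift order yields a mask that keeps the opposite end of the value, which matches the kind of miscompile these commits describe.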
Dale Johannesen e5866e7b89 Look through bitcast when finding IVs. (Chris' patch really.)
llvm-svn: 35347
2007-03-26 03:01:27 +00:00
Reid Spencer 52830327e9 For PR1271:
Remove a use of getLowBitsSet that caused the mask used for replacement of
shl/lshr pairs with an AND instruction to be computed incorrectly. It's not
clear exactly why this is the case. This solves the disappearing shifts
problem, but it doesn't fix Regression/C/2003-05-21-UnionBitFields. It
seems there is more going on.

llvm-svn: 35342
2007-03-25 21:11:44 +00:00
Chris Lattner 9bf53ffaa2 implement Transforms/InstCombine/cast2.ll:test3 and PR1263
llvm-svn: 35341
2007-03-25 20:43:09 +00:00
Reid Spencer 624766f8a2 Some cleanup from review:
* Don't assume shift amounts are <= 64 bits
* Avoid creating an extra APInt in SubOne and AddOne by using -- and ++
* Add another use of getLowBitsSet
* Convert a series of if statements to a switch

llvm-svn: 35339
2007-03-25 19:55:33 +00:00
Reid Spencer 80263aadf3 Refactor several ConstantExpr::getXXX calls with ConstantInt arguments
using the facilities of APInt. While this duplicates a tiny fraction of
the constant folding code, it also makes the code easier to read and
avoids large ConstantExpr overhead for simple, known computations.

llvm-svn: 35335
2007-03-25 05:33:51 +00:00
Zhou Sheng 222d5ebfd2 1. Avoid unnecessary APInt construction if possible.
2. Use isStrictlyPositive() instead of isPositive() in two places where
   they need APInt value > 0 not only >=0.

llvm-svn: 35333
2007-03-25 05:01:29 +00:00
Reid Spencer cd99fbdf3b Make more uses of getHighBitsSet and get rid of some pointless & of an
APInt with its type mask.

llvm-svn: 35325
2007-03-25 04:26:16 +00:00
Reid Spencer d8aad61d4d More APIntification:
* Convert the last use of a uint64_t that should have been an APInt.
* Change ComputeMaskedBits to have a const reference argument for the Mask
  so that recursions don't cause unneeded temporaries. This causes temps
  to be needed in other places (where the mask has to change) but this
  change optimizes for the recursion which is more frequent.
* Remove two instances of &ing a Mask with getAllOnesValue. It's not
  needed any more because APInt is accurate in its bit computations.
* Start using the getLowBitsSet and getHighBitsSet methods on APInt
  instead of shifting. This makes it more clear in the code what is
  going on.

llvm-svn: 35321
2007-03-25 02:03:12 +00:00
Chris Lattner 3a8248f79d fix a regression on vector or instructions.
llvm-svn: 35314
2007-03-24 23:56:43 +00:00
Zhou Sheng e9ebd3f6ba Make some codes more efficient.
llvm-svn: 35297
2007-03-24 15:34:37 +00:00
Reid Spencer a962d18774 For PR1205:
Convert some calls to ConstantInt::getZExtValue() into getValue() and
use APInt facilities in the subsequent computations.

llvm-svn: 35294
2007-03-24 00:42:08 +00:00
Reid Spencer 959a21d3dc For PR1205:
* APIntify visitAdd and visitSelectInst
* Remove unused uint64_t versions of utility functions that have been
  replaced with APInt versions.
This completes most of the changes for APIntification of InstCombine. This
passes llvm-test and llvm/test/Transforms/InstCombine/APInt.

Patch by Zhou Sheng.

llvm-svn: 35287
2007-03-23 21:24:59 +00:00
Reid Spencer 6d39206bc2 For PR1205:
APIntify visitDiv, visitMul and visitRem.

Patch by Zhou Sheng.

llvm-svn: 35283
2007-03-23 20:05:17 +00:00
Chris Lattner 12b89cc148 switch AddReachableCodeToWorklist from being recursive to being iterative.
llvm-svn: 35282
2007-03-23 19:17:18 +00:00
Reid Spencer 6274c72ee1 For PR1205:
APIntify several utility functions supporting logical operators and shift
operators.

Patch by Zhou Sheng.

llvm-svn: 35281
2007-03-23 18:46:34 +00:00
Zhou Sheng 0900993ebc Make the "KnownZero ^ TypeMask" computation just once.
llvm-svn: 35276
2007-03-23 03:13:21 +00:00
Zhou Sheng 755f04b5d7 Simplify the code.
llvm-svn: 35275
2007-03-23 02:39:25 +00:00
Reid Spencer b722f2b110 For PR1205:
APInt support for logical operators in visitAnd, visitOr, and visitXor.

Patch by Zhou Sheng.

llvm-svn: 35273
2007-03-22 22:19:58 +00:00
Reid Spencer 4154e732e6 For PR1205:
* APIntify commonIntCastTransforms
* APIntify visitTrunc
* APIntify visitZExt

Patch by Zhou Sheng.

llvm-svn: 35271
2007-03-22 20:56:53 +00:00
Reid Spencer c3e3b8a32f For PR1205:
* Re-enable the APInt version of MaskedValueIsZero.
* APIntify the Comput{Un}SignedMinMaxValuesFromKnownBits functions
* APIntify visitICmpInst.

llvm-svn: 35270
2007-03-22 20:36:03 +00:00
Dan Gohman dcb291faa4 Change uses of Function::front to Function::getEntryBlock for readability.
llvm-svn: 35265
2007-03-22 16:38:57 +00:00
Nick Lewycky b0da7ed9c8 Fix broken optimization disabled by a logic bug.
Analyze GEPs. If the indices are all zero, transfer whether the pointer is
known to be not null through the GEP.

Add a few more cases for xor and shift instructions.

llvm-svn: 35257
2007-03-22 02:02:51 +00:00
Reid Spencer f40711637f For PR1248:
* Fix some indentation and comments in InsertRangeTest
* Add an "IsSigned" parameter to AddWithOverflow and make it handle signed
  additions. Also, APIntify this function so it works with any bitwidth.
* For the icmp pred ([us]div %X, C1), C2 transforms, exit early if the
  div instruction's RHS is zero.
* Finally, for icmp pred (sdiv %X, C1), -C2, fix an off-by-one error. The
  HiBound needs to be incremented in order to get the range test correct.

llvm-svn: 35247
2007-03-21 23:19:50 +00:00
Dale Johannesen bacf4acf65 do not share old induction variables when this would result in invalid
instructions (that would have to be split later)

llvm-svn: 35227
2007-03-20 21:54:54 +00:00
Jeff Cohen 1baf5c84ab Fix some VC++ warnings.
llvm-svn: 35224
2007-03-20 20:43:18 +00:00
Devang Patel 1758cb50de LoopSimplify::FindPHIToPartitionLoops()
Use ETForest instead of DominatorSet.

llvm-svn: 35221
2007-03-20 20:18:12 +00:00
Zhou Sheng b3949340c8 Simplify isHighOnes().
llvm-svn: 35211
2007-03-20 12:49:06 +00:00
Dale Johannesen e3a02be5f1 use types of loads and stores, not address, in CheckForIVReuse
llvm-svn: 35197
2007-03-20 00:47:50 +00:00
Reid Spencer 6682721316 Make isOneBitSet faster by using APInt::isPowerOf2. Thanks Chris.
llvm-svn: 35194
2007-03-20 00:16:52 +00:00
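A value has exactly one bit set precisely when it is a power of two, which is why the population count can be replaced by a cheaper test. The sketch below shows the usual bit trick on a plain 64-bit integer; it is an illustration under that assumption, not the APInt implementation, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// True when exactly one bit of V is set. Subtracting 1 clears the lowest
// set bit and sets all bits below it, so V & (V - 1) is zero only when V
// had a single set bit. Zero is excluded explicitly.
bool isOneBitSet64(uint64_t V) {
  return V != 0 && (V & (V - 1)) == 0;
}
```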
Reid Spencer cc031a43aa APIntify the isHighOnes utility function.
llvm-svn: 35190
2007-03-19 21:29:50 +00:00
Reid Spencer ef599b0786 Implement isMaxValueMinusOne in terms of APInt instead of uint64_t.
Patch by Sheng Zhou.

llvm-svn: 35188
2007-03-19 21:10:28 +00:00
Reid Spencer 3b93db72b4 Implement isMinValuePlusOne using facilities of APInt instead of uint64_t
Patch by Zhou Sheng.

llvm-svn: 35187
2007-03-19 21:08:07 +00:00
Reid Spencer 129a86792d Implement isOneBitSet in terms of APInt::countPopulation.
llvm-svn: 35186
2007-03-19 21:04:43 +00:00
Reid Spencer 450434ed65 1. Use APInt::getSignBit to reduce clutter (patch by Sheng Zhou)
2. Replace uses of the "isPositive" utility function with APInt::isPositive

llvm-svn: 35185
2007-03-19 20:58:18 +00:00
Reid Spencer 03c31d5bb0 Remove a redundant clause in an if statement.
Patch by Sheng Zhou.

llvm-svn: 35184
2007-03-19 20:47:50 +00:00
Chris Lattner 9c62db7c8c fix ScalarRepl/2007-03-19-CanonicalizeMemcpy.ll
llvm-svn: 35169
2007-03-19 18:25:57 +00:00
Chris Lattner 877a3b424d implement the next chunk of SROA with memset/memcpy's of aggregates. This
implements Transforms/ScalarRepl/memset-aggregate-byte-leader.ll

llvm-svn: 35150
2007-03-19 00:16:43 +00:00
Nick Lewycky db204ecfbc Clean up this code and fix subtract miscompile.
llvm-svn: 35146
2007-03-18 22:58:46 +00:00
Chris Lattner 0741842b3b Implement InstCombine/and-xor-merge.ll:test[12].
Rearrange some code to simplify it now that shifts are binops

llvm-svn: 35145
2007-03-18 22:51:34 +00:00
Nick Lewycky 17d20fd41e Propagate ValueRanges across equality.
Add some more micro-optimizations: x * 0 = 0, a - x = a --> x = 0.

llvm-svn: 35138
2007-03-18 01:09:32 +00:00
Anton Korobeynikov 22f436da42 Silence warning
llvm-svn: 35137
2007-03-17 14:48:06 +00:00
Nick Lewycky 4f73de2b4e Add more comments and update to new asm syntax.
Add new micro-optimizations.

Add icmp predicate snuggling. Given %x ULT 4, "icmp ugt %x, 2" becomes
"icmp eq %x, 3". This doesn't apply in any non-trivial cases yet due to missing
support for NE values in ValueRanges.

llvm-svn: 35119
2007-03-16 02:37:39 +00:00
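The snuggling example in this message (x ULT 4, so "icmp ugt x, 2" becomes "icmp eq x, 3") can be sketched for a simple unsigned half-open range. This is a hypothetical helper written for illustration only; it does not reflect the ValueRanges data structure in the pass and ignores wrapped ranges.

```cpp
#include <cassert>
#include <cstdint>

// Given that X is known to lie in the unsigned range [Lo, Hi), decide
// whether "X ugt C" can hold for exactly one value. If so, store that
// value in Only and return true, so the comparison can become "X eq Only".
bool snuggleUGT(uint32_t Lo, uint32_t Hi, uint32_t C, uint32_t &Only) {
  if (Lo >= Hi)
    return false;               // empty or wrapped range: not handled here
  uint32_t Last = Hi - 1;       // largest value X can take
  if (C >= Last)
    return false;               // no value in the range exceeds C
  // Smallest value in the range that exceeds C. C + 1 cannot overflow
  // because C < Last.
  uint32_t First = (C + 1 > Lo) ? C + 1 : Lo;
  if (First != Last)
    return false;               // more than one candidate remains
  Only = Last;
  return true;
}
```

With Lo = 0, Hi = 4 and C = 2 the only remaining candidate is 3, matching the example in the commit message.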
Zhou Sheng d8c645b0ba ShiftAmt might be equal to zero. Handle this situation.
llvm-svn: 35094
2007-03-14 09:07:33 +00:00
Zhou Sheng b912844554 Enable KnownZero/One.clear().
llvm-svn: 35093
2007-03-14 03:21:24 +00:00
Evan Cheng b5eb932c93 Correct type info for isLegalAddressImmediate() check.
llvm-svn: 35086
2007-03-13 20:34:37 +00:00
Chris Lattner d1bce956b4 ifdef out some dead code.
Fix PR1244 and Transforms/InstCombine/2007-03-13-CompareMerge.ll

llvm-svn: 35082
2007-03-13 14:27:42 +00:00
Zhou Sheng ebe634e662 For expression like
"APInt::getAllOnesValue(ShiftAmt).zextOrCopy(BitWidth)",
to handle ShiftAmt == BitWidth situation, use zextOrCopy() instead of
zext().

llvm-svn: 35080
2007-03-13 06:40:59 +00:00
Zhou Sheng af4341d441 In APInt version ComputeMaskedBits():
1. Ensure VTy, KnownOne and KnownZero have same bitwidth.
  2. Make code more efficient.

llvm-svn: 35078
2007-03-13 02:23:10 +00:00
Evan Cheng 720acdfb31 Use new TargetLowering addressing modes hooks.
llvm-svn: 35072
2007-03-12 23:27:37 +00:00
Jeff Cohen 00227417d2 Unbreak VC++ build. Do not use identifiers starting with _ as they are reserved and
can collide with system defined names.  Windows defines _BB, for example.

llvm-svn: 35066
2007-03-12 17:56:27 +00:00
Reid Spencer 1791f23803 Add an APInt version of SimplifyDemandedBits.
Patch by Zhou Sheng.

llvm-svn: 35064
2007-03-12 17:25:59 +00:00
Reid Spencer d9281784be Add an APInt version of ShrinkDemandedConstant.
Patch by Zhou Sheng.

llvm-svn: 35063
2007-03-12 17:15:10 +00:00
Zhou Sheng be171ee5cd Avoid asserting on "(KnownZero & KnownOne) == 0".
llvm-svn: 35062
2007-03-12 16:54:56 +00:00
Zhou Sheng b3e00c4656 In function ComputeMaskedBits():
1. Replace getSignedMinValue() with getSignBit() for better code readability.
  2. Replace APIntOps::shl() with operator<<= for convenience.
  3. Make APInt construction more effective.

llvm-svn: 35060
2007-03-12 05:44:52 +00:00
Nick Lewycky d9bd0bc3e2 Add value ranges. Currently inefficient in both execution time and
optimization power.

llvm-svn: 35058
2007-03-10 18:12:48 +00:00
Anton Korobeynikov 8a6dc102d3 Use range tests in LowerSwitch, where possible
llvm-svn: 35057
2007-03-10 16:46:28 +00:00
Devang Patel 5f50e61d52 Remove dead comments.
llvm-svn: 35053
2007-03-09 23:41:03 +00:00
Devang Patel bda1250624 Avoid recursion. Use iterative algorithm for RenamePass().
llvm-svn: 35052
2007-03-09 23:39:14 +00:00
Devang Patel 58818c530f Increment iterator now because IVUseShouldUsePostIncValue may remove
User from the list of I users.

llvm-svn: 35051
2007-03-09 21:19:53 +00:00
Zhou Sheng d1eb3d593e Fix a bug in function ComputeMaskedBits().
llvm-svn: 35027
2007-03-08 15:15:18 +00:00
Chris Lattner abd3bff4f2 This appears correct, enable it so we can see perf changes on testers
llvm-svn: 35024
2007-03-08 07:03:55 +00:00
Chris Lattner 9f022d550b Second half of PR1226. This is currently still disabled, until I have a chance to
do the correctness/performance analysis testing.

llvm-svn: 35023
2007-03-08 06:36:54 +00:00
Zhou Sheng 387d7b1a35 Fix a bug in APIntified ComputeMaskedBits().
llvm-svn: 35022
2007-03-08 05:42:00 +00:00
Reid Spencer bb5741fb02 For PR1205:
Provide an APIntified version of MaskedValueIsZero. This will (temporarily)
cause a "defined but not used" message from the compiler. It will be used
in the next patch in this series.

Patch by Sheng Zhou.

llvm-svn: 35019
2007-03-08 01:52:58 +00:00
Reid Spencer aa69640b10 For PR1205:
Add a new ComputeMaskedBits function that is APIntified. We'll slowly
convert things over to use this version. When it's all done, we'll remove
the existing version.

llvm-svn: 35018
2007-03-08 01:46:38 +00:00
Devang Patel 2ac57e1f02 Now IndVarSimplify is a LoopPass.
llvm-svn: 35003
2007-03-07 06:39:01 +00:00
Devang Patel 69730c96db Now LICM is a LoopPass.
llvm-svn: 35001
2007-03-07 04:41:30 +00:00
Devang Patel 9779e56c04 Now LoopUnroll is a LoopPass.
llvm-svn: 34996
2007-03-07 01:38:05 +00:00
Devang Patel 901a27d892 Now LoopUnswitch is a LoopPass.
llvm-svn: 34992
2007-03-07 00:26:10 +00:00
Devang Patel b0743b5d6a Now LoopStrengthReduce is a LoopPass.
llvm-svn: 34984
2007-03-06 21:14:09 +00:00
Reid Spencer 3939b1a274 Remove an unnecessary if statement and adjust indentation.
llvm-svn: 34939
2007-03-05 23:36:13 +00:00
Chris Lattner 66e6a8229a This is the first major step of implementing PR1226. We now successfully
scalarrepl things down to elements, but mem2reg can't promote elements that
are memset/memcpy'd.  Until then, the code is disabled "0 &&".

llvm-svn: 34924
2007-03-05 07:52:57 +00:00
Chris Lattner fe53cf2459 fix a subtle bug that caused an MSVC warning. Thanks to Jeffc for pointing this out.
llvm-svn: 34920
2007-03-05 00:11:19 +00:00
Chris Lattner 5fdded1d2f Add some simplifications for demanded bits, this allows instcombine to turn:
define i64 @test(i64 %A, i32 %B) {
        %tmp12 = zext i32 %B to i64             ; <i64> [#uses=1]
        %tmp3 = shl i64 %tmp12, 32              ; <i64> [#uses=1]
        %tmp5 = add i64 %tmp3, %A               ; <i64> [#uses=1]
        %tmp6 = and i64 %tmp5, 123              ; <i64> [#uses=1]
        ret i64 %tmp6
}

into:

define i64 @test(i64 %A, i32 %B) {
        %tmp6 = and i64 %A, 123         ; <i64> [#uses=1]
        ret i64 %tmp6
}

This implements Transforms/InstCombine/add2.ll:test1

llvm-svn: 34919
2007-03-05 00:02:29 +00:00
Jeff Cohen b622c11f77 Unbreak VC++ build.
llvm-svn: 34917
2007-03-05 00:00:42 +00:00
Chris Lattner ab2f913b68 simplify some code
llvm-svn: 34914
2007-03-04 23:16:36 +00:00
Chris Lattner c33fd469ef minor cleanups
llvm-svn: 34904
2007-03-04 04:50:21 +00:00
Chris Lattner 8258b44b22 Speed up -instcombine by 20% by avoiding a particularly expensive passmgr call.
llvm-svn: 34902
2007-03-04 04:27:24 +00:00
Chris Lattner a5403a587c switch MarkAliveBlocks over to using SmallPtrSet instead of std::set, speeding
up simplifycfg by 20%

llvm-svn: 34901
2007-03-04 04:20:48 +00:00
Chris Lattner d7b4c92cd0 make better use of LCSSA information in RewriteLoopExitValues. Before, we
would scan the entire loop body, then scan all users of instructions in the
loop, looking for users outside the loop.  Now, since we know that the
loop is in LCSSA form, we know that any users outside the loop will be LCSSA
phi nodes.  Just scan them.

This speeds up indvars significantly.

llvm-svn: 34898
2007-03-04 03:43:23 +00:00
Chris Lattner 1f7648efba Implement PR1179/PR1232 and test/Transforms/IndVarsSimplify/loop_evaluate_[234].ll
This makes -indvars require and use LCSSA, updating it as appropriate.

llvm-svn: 34896
2007-03-04 01:00:28 +00:00
Chris Lattner ed30abf0cb Make RewriteLoopExitValues far less nested by using continue in the loop
llvm-svn: 34891
2007-03-03 22:48:48 +00:00
Chris Lattner da1d04a057 my recent change caused a failure in a bswap testcase, because it changed
the order that instcombine processed instructions in the testcase.  The end
result is that instcombine finished with:

define i16 @test1(i16 %a) {
        %tmp = zext i16 %a to i32               ; <i32> [#uses=2]
        %tmp21 = lshr i32 %tmp, 8               ; <i32> [#uses=1]
        %tmp5 = shl i32 %tmp, 8         ; <i32> [#uses=1]
        %tmp.upgrd.32 = or i32 %tmp21, %tmp5            ; <i32> [#uses=1]
        %tmp.upgrd.3 = trunc i32 %tmp.upgrd.32 to i16           ; <i16> [#uses=1]
        ret i16 %tmp.upgrd.3
}

which can't get matched as a bswap.

This patch makes instcombine more sophisticated about removing truncating
casts, allowing it to turn this into:

define i16 @test2(i16 %a) {
        %tmp211 = lshr i16 %a, 8
        %tmp52 = shl i16 %a, 8
        %tmp.upgrd.323 = or i16 %tmp211, %tmp52
        ret i16 %tmp.upgrd.323
}

which then matches as bswap.  This fixes bswap.ll and implements
InstCombine/cast2.ll:test[12].  This also implements cast elimination of
add/sub.

llvm-svn: 34870
2007-03-03 05:27:34 +00:00
Nick Lewycky db42295ff2 Translate bit operations to English.
llvm-svn: 34868
2007-03-03 03:14:40 +00:00
Chris Lattner 960a543037 add a top-level iteration loop to instcombine. This means that it will never
finish without combining something it is capable of.

llvm-svn: 34865
2007-03-03 02:04:50 +00:00
Reid Spencer c34dedf686 APIntify this pass.
llvm-svn: 34863
2007-03-03 00:48:31 +00:00
Reid Spencer 53a3739c80 Finally get this patch right :)
Replace expensive getZExtValue() == 0 calls with isZero() calls.

llvm-svn: 34861
2007-03-02 23:51:25 +00:00
Reid Spencer ba547cbb2a Dang, I've done that twice now! Undo previous commit.
llvm-svn: 34860
2007-03-02 23:37:53 +00:00
Reid Spencer 558990e189 Use more efficient test for one value in a ConstantInt.
llvm-svn: 34859
2007-03-02 23:35:28 +00:00
Reid Spencer 29fe20a98b Guard against huge loop trip counts in an APInt safe way.
llvm-svn: 34858
2007-03-02 23:31:34 +00:00
Reid Spencer dec03a08d6 Make sure debug code is not evaluated in non-debug case.
llvm-svn: 34856
2007-03-02 23:15:21 +00:00
Reid Spencer 1e102971d2 1. Sort switch cases using APInt safe comparison.
2. Make sure debug output of APInt values is safe for all bit widths.

llvm-svn: 34855
2007-03-02 23:05:28 +00:00
Reid Spencer 43376a74af Use APInt safe isOne() method on ConstantInt instead of getZExtValue()==1
llvm-svn: 34854
2007-03-02 23:03:17 +00:00
Reid Spencer bb38d79ad6 Make sorting of ConstantInt be APInt clean through use of ult function.
llvm-svn: 34853
2007-03-02 23:01:14 +00:00
Chris Lattner b15e2b182f Fix a significant algorithm problem with the instcombine worklist. removing
a value from the worklist required scanning the entire worklist to remove all
entries.  We now use a combination map+vector to prevent duplicates from
happening and prevent the scan.  This speeds up instcombine on a large file
from the llvm-gcc bootstrap from 189.7s to 4.84s in a debug build and from
5.04s to 1.37s in a release build.

llvm-svn: 34848
2007-03-02 21:28:56 +00:00
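The map+vector idea described in this message can be sketched as follows. This is a minimal illustration and not the InstCombine worklist itself: the Item type, the class name, and the use of std::unordered_map are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an instruction.
using Item = int;

class Worklist {
  std::vector<const Item *> Entries;                    // nullptr marks a removed slot
  std::unordered_map<const Item *, std::size_t> Index;  // item -> slot in Entries

public:
  // Queue an item unless it is already present, so duplicates never occur.
  void add(const Item *I) {
    if (Index.emplace(I, Entries.size()).second)
      Entries.push_back(I);
  }

  // Remove an item by blanking its slot; no scan of the vector is needed.
  void remove(const Item *I) {
    auto It = Index.find(I);
    if (It == Index.end())
      return;
    Entries[It->second] = nullptr;
    Index.erase(It);
  }

  // Pop the most recently added live item, or nullptr if none remain.
  const Item *pop() {
    while (!Entries.empty()) {
      const Item *I = Entries.back();
      Entries.pop_back();
      if (I) {
        Index.erase(I);
        return I;
      }
    }
    return nullptr;
  }

  bool empty() const { return Index.empty(); }
};
```

Removal costs one map lookup instead of a pass over the whole list, which is the kind of change that would account for the large speedup reported above.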
Chris Lattner 51f5457ad4 minor cleanup
llvm-svn: 34846
2007-03-02 19:59:19 +00:00
Chris Lattner 4bd8cda3f0 switch the inliner from being recursive to being iterative.
llvm-svn: 34832
2007-03-02 03:11:20 +00:00
Reid Spencer 197adfaa0a Reverse a premature committal.
llvm-svn: 34822
2007-03-02 00:31:39 +00:00
Reid Spencer 2e54a15943 Prefer non-virtual calls to ConstantInt::isZero over virtual calls to
Constant::isNullValue() in situations where it is possible.

llvm-svn: 34821
2007-03-02 00:28:52 +00:00
Reid Spencer fa63226751 Although probably not necessary, guard against a potential assertion by
using isNullValue() instead of getZExtValue() == 0.

llvm-svn: 34815
2007-03-01 21:54:37 +00:00
Reid Spencer 17797076ef Use isUnitValue() instead of getZExtValue() == 1 which will prevent an
assert if the ConstantInt's value is large.

llvm-svn: 34814
2007-03-01 21:51:23 +00:00
Reid Spencer 5b0548de77 Use APInt conversion to string so the result is correct regardless of the
bit width of the ConstantInt being converted.

llvm-svn: 34810
2007-03-01 21:00:32 +00:00
Reid Spencer 24f1a0e78f The 64-bit constructor for ConstantInt changes from int64_t to uint64_t.
This caused a warning for construction with -1. Avoid the warning by using
-1ULL instead.

llvm-svn: 34796
2007-03-01 19:33:52 +00:00
Reid Spencer 6a44033465 Remove the "isSigned" parameters from ConstantRange. It turns out they
are not needed as the results are the same with or without it.

Patch by Nicholas Lewycky.

llvm-svn: 34782
2007-03-01 07:54:15 +00:00
Reid Spencer d373b9dc59 For PR1205:
Adjust to changes in ConstantRange interface.

llvm-svn: 34762
2007-02-28 22:03:51 +00:00
Reid Spencer 3a7e9d8e75 For PR1205:
Remove ConstantInt from ConstantRange interface and adjust its users to
compensate.

llvm-svn: 34758
2007-02-28 19:57:34 +00:00
Reid Spencer 56f784d12d For PR1205:
First round of ConstantRange changes. This makes all CR constructors use
only APInt and not use ConstantInt. Clients are adjusted accordingly.

llvm-svn: 34756
2007-02-28 18:57:32 +00:00
Devang Patel 97517ff930 Use efficient container SmallPtrSet
llvm-svn: 34640
2007-02-26 20:22:50 +00:00
Devang Patel 967b84c681 Do not unswitch loop on same value again and again.
llvm-svn: 34638
2007-02-26 19:31:58 +00:00
Chris Lattner c4d8e7e614 Fix InstCombine/2007-02-23-PhiFoldInfLoop.ll and PR1217
llvm-svn: 34546
2007-02-24 01:03:45 +00:00
Chris Lattner 1e48acb858 fix an obscure and tricky bug the inliner can hit sometimes.
llvm-svn: 34531
2007-02-23 19:54:30 +00:00
Jim Laskey d879dfbf1c Revert changes for a simpler solution.
llvm-svn: 34495
2007-02-22 16:21:18 +00:00
Jim Laskey e4ccf22c34 Itanium ABI exception handling support.
llvm-svn: 34480
2007-02-21 22:49:50 +00:00
Dan Gohman 8c8597c4d9 Fix typos in comments.
llvm-svn: 34456
2007-02-20 20:52:03 +00:00
Chris Lattner c35fe713ff remove reoptimizer-specific passes
llvm-svn: 34439
2007-02-20 05:31:49 +00:00
Chris Lattner b5f6d0c15a eliminate use of deprecated apis
llvm-svn: 34417
2007-02-19 07:34:47 +00:00
Chris Lattner 9f4707eb04 fix comment
llvm-svn: 34395
2007-02-18 22:10:58 +00:00
Chris Lattner a6f54c0e2c simplify pass, delete dead gvar protos as well.
llvm-svn: 34394
2007-02-18 22:10:34 +00:00
Chris Lattner 99c6cf60f1 convert more vectors to smallvectors, 2.8% speedup
llvm-svn: 34333
2007-02-15 22:52:10 +00:00
Chris Lattner af6094fe3f change some vectors to smallvectors. This speeds up instcombine on 447.dealII
by 5%.

llvm-svn: 34332
2007-02-15 22:48:32 +00:00
Chris Lattner 7907e5fe07 switch an std::set to a SmallPtr set, this speeds up instcombine by 9.5%
on 447.dealII

llvm-svn: 34323
2007-02-15 19:41:52 +00:00
Reid Spencer 09575bac2e For PR1195:
Change use of "packed" term to "vector" in comments, strings, variable
names, etc.

llvm-svn: 34300
2007-02-15 03:39:18 +00:00
Reid Spencer 537ee02f89 Change an assert that mentions Packed Type -> Vector Type.
llvm-svn: 34298
2007-02-15 03:11:20 +00:00
Reid Spencer d84d35ba70 For PR1195:
Rename PackedType -> VectorType, ConstantPacked -> ConstantVector, and
PackedTyID -> VectorTyID. No functional changes.

llvm-svn: 34293
2007-02-15 02:26:10 +00:00
Chris Lattner 945e437c65 Generalize TargetData strings, to support more interesting forms of data.
Patch by Scott Michel.

llvm-svn: 34266
2007-02-14 05:52:17 +00:00
Chris Lattner ade1c2bb51 eliminate a bunch of vector-related heap traffic
llvm-svn: 34222
2007-02-13 05:58:53 +00:00
Chris Lattner a06a8fd2d7 Eliminate use of ctors that take vectors.
llvm-svn: 34219
2007-02-13 02:10:56 +00:00
Chris Lattner a731513406 stop using methods that take vectors.
llvm-svn: 34205
2007-02-12 22:56:41 +00:00
Chris Lattner 32ab643df7 Switch ValueSymbolTable to use StringMap<Value*> instead of std::map<std::string, Value*>
as its main datastructure.  There are many improvements yet to be made, but
this speeds up opt --std-compile-opts on 447.dealII by 7.3%.

llvm-svn: 34193
2007-02-12 05:18:08 +00:00
Chris Lattner 8dd4cae4f8 simplify code by using Value::takeName
llvm-svn: 34177
2007-02-11 01:37:51 +00:00
Chris Lattner 6e0123b17f Simplify code by using Value::takeName
llvm-svn: 34176
2007-02-11 01:23:03 +00:00
Chris Lattner 8d4c36bb40 simplify name juggling through the use of Value::takeName.
llvm-svn: 34175
2007-02-11 01:08:35 +00:00
Chris Lattner c473d8e431 Privatize StructLayout::MemberOffsets, adding an accessor
llvm-svn: 34156
2007-02-10 19:55:17 +00:00
Chris Lattner bf6286ba04 Fix Transforms/DeadArgElim/2007-02-07-FuncRename.ll, fallout from PR411.
This happened because deadargelim now causes VMCore to auto-rename every
function that it hacks arguments out of.  Because it hacks arguments out of
functions in a non-deterministic order, this caused the resultant numbering
to be nondet.  The fix is to just be careful to not rename functions!

llvm-svn: 34005
2007-02-07 19:31:33 +00:00
Chris Lattner 88051b0fad shrink vmcore by moving symbol table stripping support out of VMCore into
the one IPO pass that uses it.

llvm-svn: 33990
2007-02-07 06:22:45 +00:00
Chris Lattner 430c9217f0 redesign the primary datastructure used by mem2reg to eliminate an
std::map of std::vector's (ouch!).  This speeds up mem2reg by 10% on 176.gcc.

llvm-svn: 33974
2007-02-07 01:15:04 +00:00
Chris Lattner c85e79f3e0 With the last change, we no longer need both directions of mapping from
BBNumbers.  Instead of using a bi-directional mapping, just use a single
densemap.  This speeds up mem2reg on 176.gcc by 8%, from  1.3489 to
1.2485s.

llvm-svn: 33940
2007-02-05 23:37:20 +00:00
Reid Spencer 557ab15e71 Apply the VISIBILITY_HIDDEN field to the remaining anonymous classes in
the Transforms library. This reduces debug library size by 132 KB, debug
binary size by 376 KB, and reduces link time for llvm tools slightly.

llvm-svn: 33939
2007-02-05 23:32:05 +00:00
Chris Lattner 52da61fb5c Simplify use of DFBlocks, this makes no noticeable performance difference,
but paves the way to eliminate BBNumbers.

llvm-svn: 33938
2007-02-05 23:31:26 +00:00
Reid Spencer 193abd95c9 This file should have been removed when -raise was removed. It isn't
used any more.

llvm-svn: 33937
2007-02-05 23:27:02 +00:00
Chris Lattner bf67b1229b Switch InsertedPHINodes back to SmallPtrSet now that the SmallPtrSet::erase
bug is fixed.

llvm-svn: 33932
2007-02-05 23:11:37 +00:00
Chris Lattner 606dde0093 switch a SmallPtrSet back to an std::set for now, this caused problems.
llvm-svn: 33930
2007-02-05 22:28:52 +00:00
Chris Lattner 1ed84bbd2d switch an std::set over to a SmallPtrSet, speeding up mem2reg 6% on 176.gcc.
llvm-svn: 33929
2007-02-05 22:15:21 +00:00
Chris Lattner 70fbb9de4c switch an std::set over to SmallPtrSet, speeding up mem2reg 3.4% on 176.gcc.
llvm-svn: 33928
2007-02-05 22:13:11 +00:00
Chris Lattner 8fbc888d91 eliminate some malloc traffic, this speeds up mem2reg by 3.4%.
llvm-svn: 33927
2007-02-05 21:58:48 +00:00
Reid Spencer ca3bf1ad85 Add missing and needed #include.
llvm-svn: 33926
2007-02-05 21:47:39 +00:00
Reid Spencer 35a0718d82 Make the class VISIBILITY_HIDDEN.
Reduce lexical size of the anonymous namespace.

llvm-svn: 33925
2007-02-05 21:45:12 +00:00
Reid Spencer 1241d6d5ab For PR411:
Adjust to changes in Module interface:
getMainFunction() -> getFunction("main")
getNamedFunction(X) -> getFunction(X)

llvm-svn: 33922
2007-02-05 21:19:13 +00:00
Reid Spencer 3aaaa0b2bd For PR411:
This patch replaces the SymbolTable class with ValueSymbolTable which does
not support type planes. This means that all symbol names in LLVM must now
be unique. The patch addresses the necessary changes to deal with this and
removes code no longer needed as a result. This completes the bulk of the
changes for this PR. Some cleanup patches will follow.

llvm-svn: 33918
2007-02-05 20:47:22 +00:00
Reid Spencer e84cf92141 For PR411:
This pass is no longer needed.

llvm-svn: 33917
2007-02-05 20:41:05 +00:00
Reid Spencer ba09a3e5f0 Create a pass to strip dead function declarations (prototypes). This is
for use by llvm-extract and bugpoint.

llvm-svn: 33916
2007-02-05 20:24:25 +00:00
Chris Lattner 83ac5ae9f3 Fix miscompilations of consumer-typeset, telecomm-gsm, and 176.gcc.
llvm-svn: 33902
2007-02-05 05:57:49 +00:00
Reid Spencer a1d35926b7 For PR1177:
Revert last patch which caused iteration invalidation.

llvm-svn: 33901
2007-02-05 05:23:32 +00:00
Chris Lattner 0a28e90f2c fix a miscompilation of 176.gcc
llvm-svn: 33900
2007-02-05 04:09:35 +00:00
Owen Anderson f6fa108993 Use DenseMap for pointer->pointer maps.
llvm-svn: 33897
2007-02-05 02:39:47 +00:00
Chris Lattner 3e009e8b8f rewrite shift/shift folding, now that types are not signed.
llvm-svn: 33892
2007-02-05 00:57:54 +00:00
Nick Lewycky 15245953a5 Fix indenting, remove tabs.
Learn from sext and zext. The destination value falls within the range of the
source type.

Generalize properties regarding constant ints.

Get smarter about marking blocks as unreachable. If 1 >= 2 in order for this
block to execute, then it isn't reachable.

llvm-svn: 33889
2007-02-04 23:43:05 +00:00
Reid Spencer 3f4e6e84dc For PR1163:
Make the Module's dependent library list use a std::vector instead of SetVector.
Adjust #includes in .cpp files because SetVector.h is no longer included.

llvm-svn: 33855
2007-02-04 00:40:42 +00:00
Chris Lattner 6c344e56b1 remove some dead code
llvm-svn: 33845
2007-02-03 23:28:07 +00:00
Reid Spencer 8de97bba5a For PR1072:
Removing -raise has negligible positive or negative side effects so we are
opting to remove it. See the PR for comparison details.

llvm-svn: 33844
2007-02-03 23:15:56 +00:00
Chris Lattner 1bfc7ab6a7 Switch inliner over to use DenseMap instead of std::map for ValueMap. This
speeds up the inliner 16%.

llvm-svn: 33801
2007-02-03 00:08:31 +00:00
Chris Lattner fc8190dbb7 Switch this back to using an std::map. DenseMap entries are getting invalidated
llvm-svn: 33799
2007-02-02 22:36:16 +00:00
Chris Lattner 37d400a83d Remove more malloc thrashing, this speeds up IPSCCP on kimwitu another 6.7%.
llvm-svn: 33796
2007-02-02 21:15:06 +00:00
Chris Lattner 3e667f3e61 Convert an std::set to SmallSet, this speeds up IPSCCP 17% on kimwitu.
llvm-svn: 33794
2007-02-02 20:57:39 +00:00
Chris Lattner 0e7ec675da eliminate a malloc/free for (almost) every GEP processed. This speeds up
IPSCCP 3.3% on kimwitu.

llvm-svn: 33793
2007-02-02 20:51:48 +00:00
Chris Lattner 067d607e0e switch hash_map's over to DenseMap in SCCP. This speeds up SCCP by 30% in
a release-assert build on kimwitu++.

llvm-svn: 33792
2007-02-02 20:38:30 +00:00
Reid Spencer 2f34b98cbf Remove dead code and fix indentation per Chris' review comments.
llvm-svn: 33785
2007-02-02 14:41:37 +00:00
Reid Spencer 0d5f9237b6 Use short form of binary operator create functions.
llvm-svn: 33783
2007-02-02 14:08:20 +00:00
Chris Lattner d5fea61d98 bugfix for reid's shift patch.
llvm-svn: 33779
2007-02-02 05:29:55 +00:00
Reid Spencer 2341c22ec7 Changes to support making the shift instructions be true BinaryOperators.
This feature is needed in order to support shifts of more than 255 bits
on large integer types.  This changes the syntax for llvm assembly to
make shl, ashr and lshr instructions look like a binary operator:
   shl i32 %X, 1
instead of
   shl i32 %X, i8 1
Additionally, this should help a few passes perform additional optimizations.

llvm-svn: 33776
2007-02-02 02:16:23 +00:00
Chris Lattner c904205d28 Fix Transforms/InstCombine/2007-02-01-LoadSinkAlloca.ll, a serious code
pessimization where instcombine can sink a load (good for code size) that
prevents an alloca from being promoted by mem2reg (bad for everything).

llvm-svn: 33771
2007-02-01 22:30:07 +00:00
Reid Spencer 26c642de74 Ensure that ConvertOperandToType generates a result conversion by
initializing the Res variable to 0 and asserting it is not zero after the
result should have been created.

llvm-svn: 33761
2007-02-01 19:14:51 +00:00
Chris Lattner ce494229a1 Fix bugs in the inliner having to do with single-entry phi nodes and valuemap
updating.  These were exposed by Devang's recent passmgr changes (with
non-default pass orderings) because now the inliner can be interleaved with
the LCSSA pass.

llvm-svn: 33760
2007-02-01 18:48:38 +00:00
Chris Lattner 416a8939c3 remove temporary vectors.
llvm-svn: 33715
2007-01-31 20:08:52 +00:00
Chris Lattner 7a63e7a7ad eliminate temporary vectors
llvm-svn: 33713
2007-01-31 20:07:32 +00:00
Chris Lattner 927653f27f eliminate temporary vectors
llvm-svn: 33712
2007-01-31 19:59:55 +00:00
Chris Lattner 4fc18a4cb8 Revert another incorrectly applied chunk, which fixes InstCombine/vec_insert_to_shuffle.ll
llvm-svn: 33705
2007-01-31 18:09:17 +00:00
Chris Lattner f96f4a874c eliminate temporary vectors
llvm-svn: 33693
2007-01-31 04:40:53 +00:00
Chris Lattner aa17576933 Move symbolic constant folding code to libanalysis.
llvm-svn: 33688
2007-01-31 00:53:10 +00:00
Chris Lattner 024f4ab383 Adjust #includes to match movement of constant folding code from transformutils to libanalysis.
llvm-svn: 33680
2007-01-30 23:46:24 +00:00
Chris Lattner 2ae054adb0 move a bunch of constant folding code from Transforms/Utils/Local.cpp into
libanalysis/ConstantFolding.cpp.

llvm-svn: 33679
2007-01-30 23:45:45 +00:00
Chris Lattner 14789a92e1 remove now-dead code.
llvm-svn: 33678
2007-01-30 23:29:47 +00:00
Chris Lattner f94bed3f13 the inliner pass now passes targetdata down through the inliner api's
llvm-svn: 33677
2007-01-30 23:28:39 +00:00
Chris Lattner ad84a730ba The inliner/cloner can now optionally take TargetData info, which can be
used by constant folding.

llvm-svn: 33676
2007-01-30 23:22:39 +00:00
Chris Lattner e3eda25641 pass TD to constant folding apis
llvm-svn: 33674
2007-01-30 23:16:15 +00:00
Chris Lattner 0d74d3c09b use smallvector instead of vector to make constant folding a bit more efficient
llvm-svn: 33672
2007-01-30 23:15:19 +00:00
Chris Lattner 6fc4b46d43 adjust to api change
llvm-svn: 33671
2007-01-30 23:14:52 +00:00
Chris Lattner 2c4610e4ca Change constant folding APIs to take an optional TargetData, and change
ConstantFoldInstOperands/ConstantFoldCall to take a pointer to an array
of operands + size, instead of an std::vector.

In some cases, switch to using a SmallVector instead of a vector.
This allows us to get rid of some special case gross code that was there
to avoid the cost of constructing a vector.

llvm-svn: 33670
2007-01-30 23:13:49 +00:00
Chris Lattner 2b15f2ba9d remove some bits that are not yet meant to land.
llvm-svn: 33666
2007-01-30 22:50:32 +00:00
Chris Lattner 4284f6463a Symbolically evaluate constant expressions like &A[123] - &A[4].f.
This occurs in C++ code like:

#include <iostream>
#include <iterator>
int a[] = { 1, 2, 3, 4, 5 };
int main() {
  using namespace std;
  copy(a, a + sizeof(a)/sizeof(a[0]), ostream_iterator<int>(cout, "\n"));
  return 0;
}

Before we would decide the loop trip count is:
sdiv (i32 sub (i32 ptrtoint (i32* getelementptr ([5 x i32]* @a, i32 0, i32 5) to i32), i32 ptrtoint ([5 x i32]* @a to i32)), i32 4)

Now we decide it is "5".  Amazing.

This code will need to be refactored, but I'm doing that as a separate
commit.

llvm-svn: 33665
2007-01-30 22:32:46 +00:00
Reid Spencer 5301e7c605 For PR1136: Rename GlobalVariable::isExternal as isDeclaration to avoid
confusion with external linkage types.

llvm-svn: 33663
2007-01-30 20:08:39 +00:00
Nick Lewycky 56639800c9 Simplify names of lattice values. SGTUNE becomes SGT, for example.
Fix initializeConstant, now initializeInt. Fixes major performance
bottleneck.

X == Y || X->DominatedBy(Y) is redundant. Remove the X == Y part.

Fix crasher in makeEqual where getOrInsertNode would add a new constant,
producing an NE relationship between the two members we're trying to make
equal. This now allows us to mark more BBs as unreachable.

llvm-svn: 33612
2007-01-29 02:56:54 +00:00
Anton Korobeynikov 037c867b54 Propagate changes from my local tree. This patch includes:
1. New parameter attribute called 'inreg'. It has meaning "place this
parameter in registers, if possible". This is some generalization of
gcc's regparm(n) attribute. It's currently used only in X86-32 backend.
2. Completely rewritten CC handling/lowering code inside X86 backend.
Merged stdcall + c CCs and fastcall + fast CC.
3. Dropped CSRET CC. We cannot add struct return variant for each
target-specific CC (e.g. stdcall + csretcc and so on).
4. Instead of CSRET CC introduced 'sret' parameter attribute. Setting it
on first attribute has meaning 'This is hidden pointer to structure
return. Handle it gently'.
5. Fixed small bug in llvm-extract + add new feature to
FunctionExtraction pass, which relinks all internal-linkaged callees
from deleted function to external linkage. This will allow further
linking everything together.

NOTEs: 1. Documentation will be updated soon.
       2. llvm-upgrade should be improved to translate csret => sret.
          Before this, there will be some unexpected test fails.
llvm-svn: 33597
2007-01-28 13:31:35 +00:00
Chris Lattner c8fb6de78c Fix test/Transforms/InstCombine/2007-01-27-AndICmp.ll, a miscompilation of
Mozilla that Anton tracked down.

llvm-svn: 33591
2007-01-27 23:08:34 +00:00
Jim Laskey c56315c2b5 Change the MachineDebugInfo to MachineModuleInfo to better reflect usage
for debugging and exception handling.

llvm-svn: 33550
2007-01-26 21:22:28 +00:00
Reid Spencer 3ac38e99b9 For PR761:
The Module::setEndianness and Module::setPointerSize methods have been
removed. Instead you can get/set the DataLayout. Adjust these accordingly.

llvm-svn: 33530
2007-01-26 08:11:39 +00:00
Devang Patel 13058a5ae9 Inherit CallGraphSCCPass directly from Pass.
llvm-svn: 33514
2007-01-26 00:47:38 +00:00
Devang Patel 5292e65791 Inherit BasicBlockPass directly from Pass.
llvm-svn: 33511
2007-01-25 23:23:25 +00:00
Chris Lattner 79f08506f1 Make llvm-extract preserve the callingconv of prototypes in the extracted
code.

llvm-svn: 33500
2007-01-25 17:38:26 +00:00
Reid Spencer 31a4ef4dc1 Cleanup checks in the load and store of casted pointer transforms. Two
changes: (1) don't special case for i1 any more, (2) use the new
TargetData::getTypeSizeInBits method to ensure source and dest are the
same bit width.

llvm-svn: 33427
2007-01-22 05:51:25 +00:00
Reid Spencer 2eadb5310d For PR970:
Clean up handling of isFloatingPoint() and dealing with PackedType.
Patch by Gordon Henriksen!

llvm-svn: 33415
2007-01-21 00:29:26 +00:00
Reid Spencer 9a4bed06dd Revise the store V, (cast P) -> store (cast V) -> P transform.
We only want to do this if the src and destination types have the same
bit width. This patch uses TargetData::getTypeSizeInBits() instead of
making a special case for integer types and avoiding the transform if
they don't match.

llvm-svn: 33414
2007-01-20 23:35:48 +00:00
Chris Lattner 50ee0e40e5 Teach TargetData to handle 'preferred' alignment for each target, and use
these alignment amounts to align scalars when we can.  Patch by Scott Michel!

llvm-svn: 33409
2007-01-20 22:35:55 +00:00
Owen Anderson dfd79ad319 Correct a comment.
llvm-svn: 33397
2007-01-20 10:07:23 +00:00
Reid Spencer e928a15c9e For this transform: store V, (cast P) -> store (cast V), P
don't allow the transform if V and the pointer's element type are different
width integer types.

llvm-svn: 33371
2007-01-19 21:20:31 +00:00
Reid Spencer a94d394ad2 For PR1043:
This is the final patch for this PR. It implements some minor cleanup
in the use of IntegerType, to wit:
1. Type::getIntegerTypeMask -> IntegerType::getBitMask
2. Type::Int*Ty changed to IntegerType* from Type*
3. ConstantInt::getType() returns IntegerType* now, not Type*

This also fixes PR1120.

Patch by Sheng Zhou.

llvm-svn: 33370
2007-01-19 21:13:56 +00:00
Chris Lattner 120ab038eb Fix InstCombine/2007-01-18-VectorInfLoop.ll, a case where instcombine
infinitely loops.

llvm-svn: 33343
2007-01-18 22:16:33 +00:00
Reid Spencer c050af9126 Clean up some code around the store V, (cast P) -> store (cast V), P
transform. Change some variable names so it is clear what is source and
what is dest of the cast. Also, add an assert to ensure that the integer
to integer case is asserting if the bitwidths are different. This prevents
illegal casts from being formed and catches bitwidth bugs sooner.

llvm-svn: 33337
2007-01-18 18:54:33 +00:00
Reid Spencer a8a1547370 For PR1094:
Adjust the use of SetVector for changes in SetVector's interface.
Patch by Gordon Henriksen.

llvm-svn: 33280
2007-01-17 02:23:37 +00:00
Chris Lattner 479a9fc492 Fix a regression in my isIntegral patch that broke 471.omnetpp. This is
because TargetData::getTypeSize() returns the same for i1 and i8.  This fix
is not right for the full generality of bitwise types, but it fixes the
regression.

llvm-svn: 33237
2007-01-15 17:55:20 +00:00
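The collision this commit describes can be illustrated outside LLVM. A minimal sketch, assuming byte sizes come from rounding a bit width up to whole bytes; `bytesForBits` is a hypothetical helper, not the TargetData API:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of whole bytes needed to hold `bits` bits.
constexpr std::uint64_t bytesForBits(std::uint64_t bits) {
  return (bits + 7) / 8;
}

// A 1-bit and an 8-bit integer round to the same byte count, so a
// byte-based size comparison cannot tell them apart.
static_assert(bytesForBits(1) == bytesForBits(8),
              "i1 and i8 both round up to one byte");
static_assert(bytesForBits(8) != bytesForBits(16),
              "i8 and i16 still differ when measured in bytes");
```

Comparing the bit widths directly (1 versus 8) keeps the two types distinct, which is the motivation given for the bit-based size queries elsewhere in this log.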
Nick Lewycky 6ce36cff3a Don't print address of ETNode. Print the DFSNumIn which uniquely identifies
the basic block and is stable across runs in gdb or valgrind.

Make Node::update handle edges which dominate and are tighter than
existing edges.

Replace makeEqual's "squeeze theorem" code. Fixes miscompilation.

Gate the calls to defToOps and opsToDef. Before this, we were getting IG
edges about values which weren't even defined in the dominated area. This
reduces the size of the IG by about half.

llvm-svn: 33236
2007-01-15 14:30:07 +00:00
Chris Lattner c8dcede292 Implement InstCombine/phi.ll:test7, deletion of trivial value loops for
induction variables.

llvm-svn: 33234
2007-01-15 07:30:06 +00:00
Chris Lattner 27df1db485 simplify some code now that types are signless
llvm-svn: 33232
2007-01-15 07:02:54 +00:00
Chris Lattner a4beeef76c delete stores to allocas with one use. This is a trivial form of DSE which
often kicks in for ?: expressions.

llvm-svn: 33231
2007-01-15 06:51:56 +00:00
Chris Lattner 03c4953cdd rename Type::isIntegral to Type::isInteger, eliminating the old Type::isInteger.
rename Type::getIntegralTypeMask to Type::getIntegerTypeMask.

This makes naming much more consistent.  For example, there are now no longer any
instances of IntegerType that are not considered isInteger! :)

llvm-svn: 33225
2007-01-15 02:27:26 +00:00
Chris Lattner 1942249c5b Eliminate calls to isInteger, generalizing code and tightening checks as needed.
llvm-svn: 33218
2007-01-15 01:55:30 +00:00
Chris Lattner f739d01059 Fix Analysis/Dominators/2006-10-02-BreakCritEdges.ll
llvm-svn: 33210
2007-01-15 00:15:09 +00:00
Chris Lattner 6ee923f3bb instcombine has always been miscompiling fcmp x, x, disregarding possible
NANs.  This fixes PR1111 and Transforms/InstCombine/2007-01-14-FcmpSelf.ll

llvm-svn: 33208
2007-01-14 19:42:17 +00:00
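The NaN hazard behind this fix is visible in plain C++, independent of LLVM. A minimal sketch, assuming IEEE 754 semantics without fast-math flags; the helper names are illustrative only:

```cpp
#include <cassert>
#include <limits>

// Self-equality is false when x is NaN, so "x == x" cannot be folded
// to a constant true for floating-point values.
inline bool equalsSelf(double x) { return x == x; }

// Self-inequality is true only when x is NaN, so "x != x" cannot be
// folded to a constant false either.
inline bool differsFromSelf(double x) { return x != x; }
```

For example, `equalsSelf(1.0)` holds, while `equalsSelf(std::numeric_limits<double>::quiet_NaN())` does not.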
Chris Lattner 9818a6fd76 Fix PR1110 and Analysis/Dominators/2007-01-14-BreakCritEdges.ll by being
more careful about unreachable code when updating dominator info.

llvm-svn: 33204
2007-01-14 18:33:35 +00:00
Chris Lattner 387bf3f700 Fix Transforms/InstCombine/2007-01-13-ExtCompareMiscompile.ll, which is part
of PR1107

llvm-svn: 33185
2007-01-13 23:11:38 +00:00
Reid Spencer 47bb5c996e Fix indentation to prior level for easier diffs.
llvm-svn: 33184
2007-01-13 05:10:53 +00:00
Nick Lewycky 4294446fcb "Default context" blocks can occur after a non-default one. This meant
that properties were being applied where they didn't belong. Fixes crash
in new MiBench testcase.

Also mark debugging code as such in #ifdef.

llvm-svn: 33177
2007-01-13 02:05:28 +00:00
Chris Lattner ff7434a526 Fix a minor bug handling constant exprs, introduced by a recent patch.
llvm-svn: 33175
2007-01-13 00:42:58 +00:00
Chris Lattner ca82a908e3 fix a bug in a recent patch
llvm-svn: 33164
2007-01-13 00:02:49 +00:00
Chris Lattner f5e5236b57 simplify some code
llvm-svn: 33150
2007-01-12 22:51:20 +00:00
Chris Lattner 3b6058c278 Remove over-general comparisons
llvm-svn: 33147
2007-01-12 22:49:11 +00:00
Chris Lattner e3721e3002 eliminate redundant check
llvm-svn: 33132
2007-01-12 18:35:11 +00:00
Chris Lattner 15649084e9 Branch conditions must be i1
llvm-svn: 33129
2007-01-12 18:30:11 +00:00
Reid Spencer 7a9c62baa6 For PR1064:
Implement the arbitrary bit-width integer feature. The feature allows
integers of any bitwidth (up to 64) to be defined instead of just 1, 8,
16, 32, and 64 bit integers.

This change does several things:
1. Introduces a new Derived Type, IntegerType, to represent the number of
   bits in an integer. The Type class's SubclassData field is used to
   store the number of bits. This allows 2^23 bits in an integer type.
2. Removes the five integer Type::TypeID values for the 1, 8, 16, 32 and
   64-bit integers. These are replaced with just IntegerType which is not
   a primitive any more.
3. Adjust the rest of LLVM to account for this change.

Note that while this incremental change lays the foundation for arbitrary
bit-width integers, LLVM has not yet been converted to actually deal with
them in any significant way. Most optimization passes, for example, will
still only deal with the byte-width integer types.  Future increments
will rectify this situation.

llvm-svn: 33113
2007-01-12 07:05:14 +00:00
Reid Spencer cddc9dfe97 Implement review feedback for the ConstantBool->ConstantInt merge. Chris
recommended that getBoolValue be replaced with getZExtValue and that
get(bool) be replaced by get(const Type*, uint64_t). This implements
those changes.

llvm-svn: 33110
2007-01-12 04:24:46 +00:00
Nick Lewycky ee32ee0250 If we know that it's a constant being casted, propagate through the cast
instruction. Doesn't work the other way though (can't recover bits that
have been truncated).

llvm-svn: 33104
2007-01-12 01:23:53 +00:00
Nick Lewycky 4a74a75bbb Clean up logic after ConstantBool removal.
llvm-svn: 33096
2007-01-12 00:02:12 +00:00
Reid Spencer 542964f55b Rename BoolTy as Int1Ty. Patch by Sheng Zhou.
llvm-svn: 33076
2007-01-11 18:21:29 +00:00
Zhou Sheng bd23db9968 Remove unnecessary boolean type check.
llvm-svn: 33075
2007-01-11 14:38:17 +00:00
Zhou Sheng 75b871fb1e For PR1043:
Merge ConstantIntegral and ConstantBool into ConstantInt.
Remove ConstantIntegral and ConstantBool from LLVM.

llvm-svn: 33073
2007-01-11 12:24:14 +00:00
Zhou Sheng 691b263e07 Fixed indentation.
llvm-svn: 33072
2007-01-11 10:33:26 +00:00
Nick Lewycky 5d6ede524a Quiet compiler warning. The only reason the function is marked virtual
is so that it can be called from inside a debugger.

llvm-svn: 33067
2007-01-11 02:38:21 +00:00
Nick Lewycky 2fc338f923 New predicate simplifier!
Please do not enable, there is still some known miscompile problem.

llvm-svn: 33066
2007-01-11 02:32:38 +00:00
Chris Lattner 8571caa99b Fix a bug in heap-sra that caused compilation failure of office-ispell.
llvm-svn: 33043
2007-01-09 23:29:37 +00:00
Jeff Cohen 223004cd12 Unbreak VC++ build.
llvm-svn: 33021
2007-01-08 20:17:17 +00:00
Reid Spencer 8f166b0ef3 Comparison of primitive type sizes should now be done in bits, not bytes.
This patch converts getPrimitiveSize to getPrimitiveSizeInBits where it is
appropriate to do so (comparison of integer primitive types).

llvm-svn: 33012
2007-01-08 16:32:00 +00:00
Reid Spencer bf96e02a54 For PR1097:
Enable complex addressing modes on 64-bit platforms involving two induction
variables by keeping a size and scale in 64-bits not 32.
Patch by Dan Gohman.

llvm-svn: 33011
2007-01-08 16:17:51 +00:00
Reid Spencer 4f98e62831 Types should be const.
llvm-svn: 33001
2007-01-07 21:45:41 +00:00
Chris Lattner 950d0e9926 this pass is unused
llvm-svn: 32998
2007-01-07 18:12:43 +00:00
Chris Lattner 34acba48cc Change the interface to Module::getOrInsertFunction to be easier to use, to resolve PR1088, and to help PR411.
This simplifies many clients also

llvm-svn: 32989
2007-01-07 08:12:01 +00:00
Chris Lattner d97f1936bb prepare for adjustment to getOrInsertFunction method
llvm-svn: 32985
2007-01-07 07:54:34 +00:00
Chris Lattner cc4715e06e relax some types
llvm-svn: 32982
2007-01-07 07:22:20 +00:00
Chris Lattner 9641ab26ec relax types
llvm-svn: 32981
2007-01-07 06:59:47 +00:00
Chris Lattner fbc524fe87 relax some types
llvm-svn: 32980
2007-01-07 06:58:05 +00:00
Chris Lattner 0816559b13 add -debug output for -indvars.
llvm-svn: 32971
2007-01-07 01:14:12 +00:00
Chris Lattner 7051d758de Fix regressions in InstCombine/call-cast-target.ll and InstCombine/2003-11-13-ConstExprCastCall.ll
llvm-svn: 32959
2007-01-06 19:53:32 +00:00
Reid Spencer 32af9e8cc5 For PR411:
Take an incremental step towards type plane elimination. This change
separates types from values in the symbol tables by finally making use
of the TypeSymbolTable class. This yields more natural interfaces for
dealing with types and unclutters the SymbolTable class.

llvm-svn: 32956
2007-01-06 07:24:44 +00:00
Chris Lattner c343a99786 this final call to canLosslesslyBitCastTo is dead, because ValueRequiresCast
is only called on integers.

llvm-svn: 32949
2007-01-06 02:11:56 +00:00
Chris Lattner 400f959a0c simplify some more code now that there are not multiple different integer
types of the same size

llvm-svn: 32948
2007-01-06 02:09:32 +00:00
Chris Lattner 64d87b0215 eliminate some uses of canLosslesslyBitCastTo, this actually makes the code stronger, by nuking
relational pointer comparisons with casts.

llvm-svn: 32947
2007-01-06 01:45:59 +00:00
Chris Lattner 3fe98ae10a no need to worry about int vs uint any more.
llvm-svn: 32946
2007-01-06 01:37:35 +00:00
Chris Lattner d7b6ea166d Implement InstCombine/vec_shuffle.ll:%test7, simplifying shuffles with
undef operands.

llvm-svn: 32899
2007-01-05 07:36:08 +00:00
Chris Lattner 17c7c030c2 fold things like a^b != c^a -> b != c. This implements InstCombine/xor.ll:test27
llvm-svn: 32893
2007-01-05 03:04:57 +00:00
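The fold in this commit relies on XOR with a common operand being a bijection: `a ^ b` and `c ^ a` differ exactly when `b` and `c` differ. A minimal sketch in C++ on 32-bit unsigned values; the function names are illustrative, not taken from InstCombine:

```cpp
#include <cassert>
#include <cstdint>

// Original form: both sides share the operand `a`.
constexpr bool xorCompare(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
  return (a ^ b) != (c ^ a);
}

// Folded form: the shared operand has been cancelled.
constexpr bool foldedCompare(std::uint32_t b, std::uint32_t c) {
  return b != c;
}

static_assert(xorCompare(0xF0F0u, 5u, 5u) == foldedCompare(5u, 5u),
              "equal b and c compare equal after XOR with a");
static_assert(xorCompare(0xF0F0u, 5u, 9u) == foldedCompare(5u, 9u),
              "distinct b and c stay distinct after XOR with a");
```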
Chris Lattner 23eb8ec78b Compile X + ~X to -1. This implements Instcombine/add.ll:test34
llvm-svn: 32890
2007-01-05 02:17:46 +00:00
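The identity in this commit holds because `~X` flips every bit of `X`, so the sum has every bit set and no carries occur. A minimal sketch in C++ on 32-bit unsigned values, where all-ones is the bit pattern of -1 in two's complement; `addNot` is an illustrative name:

```cpp
#include <cassert>
#include <cstdint>

// X + ~X: each bit position contributes exactly one set bit, so the
// result is all-ones regardless of X.
constexpr std::uint32_t addNot(std::uint32_t x) {
  return static_cast<std::uint32_t>(x + ~x);
}

static_assert(addNot(0u) == 0xFFFFFFFFu, "all-ones for zero");
static_assert(addNot(0x12345678u) == 0xFFFFFFFFu, "all-ones for an arbitrary value");
```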
Reid Spencer 6ff3e73db6 Death to useless bitcast instructions!
llvm-svn: 32866
2007-01-04 05:23:51 +00:00
Chris Lattner 806adafd95 Enable a couple xforms for packed vectors (undef | v) -> -1 for packed.
llvm-svn: 32858
2007-01-04 02:12:40 +00:00
Jim Laskey c4ba9c161b Vectors are not supported by ConstantInt::getAllOnesValue.
llvm-svn: 32827
2007-01-03 00:11:03 +00:00
Reid Spencer e8a74ee5ea Fix a typo.
llvm-svn: 32803
2006-12-31 22:26:06 +00:00
Reid Spencer c635f47d9a For PR950:
This patch replaces signed integer types with signless ones:
1. [US]Byte -> Int8
2. [U]Short -> Int16
3. [U]Int   -> Int32
4. [U]Long  -> Int64.
5. Removal of isSigned, isUnsigned, getSignedVersion, getUnsignedVersion
   and other methods related to signedness. In a few places this warranted
   identifying the signedness information from other sources.

llvm-svn: 32785
2006-12-31 05:48:39 +00:00
Reid Spencer 193df25eb9 For PR1066:
Fix this by ensuring that a bitcast is inserted to do sign switching. This
is only temporarily needed as the merging of signed and unsigned is next
on the SignlessTypes plate.

llvm-svn: 32757
2006-12-24 00:40:59 +00:00
Reid Spencer 910f23f7d7 Shut up some compilers that can't accurately analyze variable usage
and emit "may be used uninitialized" warnings.

llvm-svn: 32756
2006-12-23 19:17:57 +00:00
Reid Spencer 43c77d53ff For PR1065:
Don't allow CmpInst instances to be processed in FoldSelectOpOp because
you can't easily swap their operands.

llvm-svn: 32753
2006-12-23 18:58:04 +00:00
Reid Spencer 266e42b312 For PR950:
This patch removes the SetCC instructions and replaces them with the ICmp
and FCmp instructions. The SetCondInst instruction has been removed and
been replaced with ICmpInst and FCmpInst.

llvm-svn: 32751
2006-12-23 06:05:41 +00:00
Chris Lattner f171af97d5 add a simple fast-path for dead allocas
llvm-svn: 32750
2006-12-22 23:14:42 +00:00
Reid Spencer a276d0972c Remove isSigned calls via foreknowledge of main's argument types.
llvm-svn: 32730
2006-12-21 07:49:49 +00:00
Reid Spencer 4720d4d9ef Get rid of a useless if statement whose then and else blocks were identical.
llvm-svn: 32729
2006-12-21 07:15:54 +00:00
Chris Lattner 1847f6ddbd handle undef values much more carefully: generalize the resolveundefbranches
code to handle instructions as well, so that we properly fold things like
X & undef -> 0.
This fixes Transforms/SCCP/2006-12-19-UndefBug.ll

llvm-svn: 32715
2006-12-20 06:21:33 +00:00
Chris Lattner 575d3218ab switch statistics over to not use static ctors.
llvm-svn: 32709
2006-12-19 23:16:47 +00:00
Chris Lattner 1fa216f572 eliminate static ctor from example.
llvm-svn: 32696
2006-12-19 22:24:09 +00:00
Chris Lattner 40b29cac01 remove dead statistic
llvm-svn: 32695
2006-12-19 22:23:21 +00:00
Chris Lattner 45f966d80f switch more statistics over to STATISTIC, eliminating static ctors. Also,
delete some dead ones.

llvm-svn: 32694
2006-12-19 22:17:40 +00:00
Chris Lattner 1631bcb1d4 Eliminate static ctors due to Statistic objects
llvm-svn: 32693
2006-12-19 22:09:18 +00:00
Chris Lattner 0e5255bdc6 Convert more Statistic's over to STATISTIC
llvm-svn: 32692
2006-12-19 21:49:03 +00:00
Chris Lattner 79a42ac941 Switch over Transforms/Scalar to use the STATISTIC macro. For each statistic
converted, we lose a static initializer.  This also allows GCC to emit warnings
about unused statistics.

llvm-svn: 32690
2006-12-19 21:40:18 +00:00
Reid Spencer 668d90f289 Convert the last uses of CastInst::createInferredCast to a normal cast
creation. These changes are still temporary but at least this pushes
knowledge of signedness out closer to where it can be determined properly
and allows signedness to be removed from VMCore.

llvm-svn: 32654
2006-12-18 08:47:13 +00:00
Reid Spencer b83593e3ea Convert the last use of two-argument ConstantExpr::getCast into another
form so we can remove that method from ConstantExpr.

llvm-svn: 32652
2006-12-18 08:16:27 +00:00
Bill Wendling a77f14265b Added an automatic cast to "std::ostream*" etc. from OStream. We then can
rework the hacks that had us passing OStream in. We pass in std::ostream*
instead, check for null, and then dispatch to the correct print() method.

llvm-svn: 32636
2006-12-17 05:15:13 +00:00
Chris Lattner fd5f03ec3f when inserting a dummy argument to work-around the CBE not supporting
zero arg vararg functions, pass undef instead of 'int 0', which is cheaper.

llvm-svn: 32634
2006-12-16 21:21:53 +00:00
Chris Lattner 8f7b775bf4 re-enable a temporarily-reverted patch
llvm-svn: 32595
2006-12-15 07:32:38 +00:00
Reid Spencer 74a528b427 Fix a bug in EvaluateInDifferentType. The type of operand should not be
used to determine whether a ZExt or SExt cast is performed. Instead, pass
an "isSigned" bool to the function and determine its value from the opcode
of the cast involved.
Also, clean up some cruft from previous patches.

llvm-svn: 32548
2006-12-13 18:21:21 +00:00
Reid Spencer 2a499b0b6c Implement review feedback. Most of this has to do with removing unnecessary
cast instructions. A few are bug fixes.

llvm-svn: 32544
2006-12-13 17:19:09 +00:00
Reid Spencer 612683b0d7 For mul transforms, when checking for a cast from bool as either operand,
make sure to also check that it is a zext from bool, not any other cast
operation type.

llvm-svn: 32539
2006-12-13 08:33:33 +00:00
Reid Spencer 799b5bfc71 Fix and/or/xor (cast A), (cast B) --> cast (and/or/xor A, B)
The cast patch introduced the possibility that the wrong cast opcode
could be used and that this transform could trigger on different kinds
of cast operations. This patch rectifies that.

llvm-svn: 32538
2006-12-13 08:27:15 +00:00
Reid Spencer df1f19a8ef Change the interface to SCEVExpander::InsertCastOfTo to take a cast opcode
so the decision of which opcode to use is pushed upward to the caller.
Adjust the callers to pass the expected opcode.

llvm-svn: 32535
2006-12-13 08:06:42 +00:00
Reid Spencer a730cf80d7 Fix some casts. isdigit(c) returns 0 or 1, not 0 or -1
llvm-svn: 32534
2006-12-13 08:04:32 +00:00
Chris Lattner 7c1dff99dc revert my recent int<->fp and vector union promotion changes, they expose
obscure bugs affecting the X86 code generator.  I will reenable this
when fixed.

llvm-svn: 32524
2006-12-13 02:26:45 +00:00
Reid Spencer bfe26ffcfc Replace CastInst::createInferredCast calls with more accurate cast
creation calls.

llvm-svn: 32521
2006-12-13 00:50:17 +00:00
Reid Spencer bb65ebf9a1 Replace inferred getCast(V,Ty) calls with more strict variants.
Rename getZeroExtend and getSignExtend to getZExt and getSExt to match
the casting mnemonics in the rest of LLVM.

llvm-svn: 32514
2006-12-12 23:36:14 +00:00
Chris Lattner 2dc148e89d this can be trunc or bitcast, per line 3092.
llvm-svn: 32487
2006-12-12 19:11:20 +00:00
Chris Lattner ade1f6894d Fix regression on 400.perlbench last night.
llvm-svn: 32486
2006-12-12 18:41:03 +00:00
Reid Spencer 13bc5d7b57 Fix numerous inferred casts.
llvm-svn: 32479
2006-12-12 09:18:51 +00:00
Reid Spencer 41cb269a2b Fix the casting for the computation of the Malloc size.
llvm-svn: 32477
2006-12-12 09:17:08 +00:00
Reid Spencer b341b0861d Change inferred getCast into specific getCast. Passes all tests.
llvm-svn: 32469
2006-12-12 05:05:00 +00:00
Chris Lattner 6e5fe376ec Patch for PR1045 and Transforms/ScalarRepl/2006-12-11-SROA-Crash.ll
llvm-svn: 32468
2006-12-12 04:24:41 +00:00
Chris Lattner e810140c4b trunc to integer, not to FP.
llvm-svn: 32426
2006-12-11 01:17:00 +00:00
Chris Lattner 23f4b68f7e implement promotion of unions containing two packed types of the same width.
This implements Transforms/ScalarRepl/union-packed.ll

llvm-svn: 32422
2006-12-11 00:35:08 +00:00
Chris Lattner 216c3028e6 * Eliminate calls to CastInst::createInferredCast.
* Add support for promoting unions with fp values in them.  This produces
   our new int<->fp bitcast instructions, implementing
   Transforms/ScalarRepl/union-fp-int.ll

As an example, this allows us to compile this:

union intfloat { int i; float f; };
float invsqrt(const float arg_x) {
    union intfloat x = { .f = arg_x };
    const float xhalf = arg_x * 0.5f;
    x.i = 0x5f3759df - (x.i >> 1);
    return x.f * (1.5f - xhalf * x.f * x.f);
}

into:

_invsqrt:
        movss 4(%esp), %xmm0
        movd %xmm0, %eax
        sarl %eax
        movl $1597463007, %ecx
        subl %eax, %ecx
        movd %ecx, %xmm1
        mulss LCPI1_0, %xmm0
        mulss %xmm1, %xmm0
        movss LCPI1_1, %xmm2
        mulss %xmm1, %xmm0
        subss %xmm0, %xmm2
        movl 8(%esp), %eax
        mulss %xmm2, %xmm1
        movss %xmm1, (%eax)
        ret

instead of:

_invsqrt:
        subl $4, %esp
        movss 8(%esp), %xmm0
        movss %xmm0, (%esp)
        movl (%esp), %eax
        movl $1597463007, %ecx
        sarl %eax
        subl %eax, %ecx
        movl %ecx, (%esp)
        mulss LCPI1_0, %xmm0
        movss (%esp), %xmm1
        mulss %xmm1, %xmm0
        mulss %xmm1, %xmm0
        movss LCPI1_1, %xmm2
        subss %xmm0, %xmm2
        mulss %xmm2, %xmm1
        movl 12(%esp), %eax
        movss %xmm1, (%eax)
        addl $4, %esp
        ret

llvm-svn: 32418
2006-12-10 23:56:50 +00:00
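The union in 216c3028e6 is used purely to reinterpret the bits of a float as an integer, which is what an int<->fp bitcast expresses. A small sketch of that equivalence, assuming `float` is IEEE 754 binary32; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

union intfloat { int32_t i; float f; };

/* Reinterpret the float's bits through the union, as invsqrt does. */
int32_t bits_via_union(float x) {
    union intfloat u;
    u.f = x;
    return u.i;
}

/* The same reinterpretation written as an explicit byte copy. */
int32_t bits_via_memcpy(float x) {
    int32_t i;
    memcpy(&i, &x, sizeof i);
    return i;
}
```

Both functions return the same bit pattern, so a compiler that promotes the union to a register can replace the store and reload through the stack with a single register-to-register move, as in the shorter listing above.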
Reid Spencer efe5c862f1 Incorporate any changes in the successor blocks into the result of
MarkAliveBlocks.

llvm-svn: 32375
2006-12-08 21:52:01 +00:00
Bill Wendling 9bfb1e1f29 What should be the last unnecessary <iostream>s in the library.
llvm-svn: 32333
2006-12-07 22:21:48 +00:00
Bill Wendling 22e978a736 Removing even more <iostream> includes.
llvm-svn: 32320
2006-12-07 20:04:42 +00:00
Bill Wendling f3baad3ee1 Changed llvm_ostream et al. to OStream. llvm_cerr, llvm_cout, and llvm_null are
now cerr, cout, and NullStream, respectively.

llvm-svn: 32298
2006-12-07 01:30:32 +00:00
Reid Spencer 4ae56f3086 Update ConstantIntegral Max/Min tests for new interface.
llvm-svn: 32288
2006-12-06 20:39:57 +00:00
Chris Lattner f06bb658a8 add missing #include
llvm-svn: 32280
2006-12-06 18:14:47 +00:00
Chris Lattner 700b873130 Detemplatize the Statistic class. The only type it is instantiated with
is 'unsigned'.

llvm-svn: 32279
2006-12-06 17:46:33 +00:00
Chris Lattner edcc8c2f8b Remove the 'printname' argument to WriteAsOperand. It is always true, and
passing false would make the asmprinter fail anyway.

llvm-svn: 32264
2006-12-06 06:16:21 +00:00
Chris Lattner ec58903623 counter should be unsigned.
llvm-svn: 32252
2006-12-06 01:50:04 +00:00
Chris Lattner c209b584eb add an instcombine xform. This speeds up 462.libquantum from 9.78s to
7.48s.  This regression is due to unforeseen consequences of the cast patch.

llvm-svn: 32209
2006-12-05 01:26:29 +00:00
Devang Patel 21efc73161 SCCP does not handle Packed Type properly. Disable Packed Type handling
for now.

llvm-svn: 32208
2006-12-04 23:54:59 +00:00
Reid Spencer 14fbdd5523 Update call to CastInst::getCastOpcode for its new signature.
llvm-svn: 32166
2006-12-04 02:48:01 +00:00
Jeff Cohen cc08c83186 Unbreak VC++ build.
llvm-svn: 32113
2006-12-02 02:22:01 +00:00
Chris Lattner 7a002fec1f disable transformations that are invalid for fp vectors. This fixes
Transforms/InstCombine/2006-12-01-BadFPVectorXform.ll

llvm-svn: 32112
2006-12-02 00:13:08 +00:00
Reid Spencer ad05ee9f39 Remove 4 FIXMEs to hack around cast-to-bool problems which no longer exist.
llvm-svn: 32051
2006-11-30 23:13:36 +00:00
Chris Lattner c8978c5272 make it clear that this is always a zext
llvm-svn: 32044
2006-11-30 17:35:08 +00:00
Chris Lattner 3ede00b376 One more bugfix, 3 cases of making casts explicit.
llvm-svn: 32043
2006-11-30 17:32:29 +00:00
Chris Lattner 0390b9e6bb Fix a bug in globalopt due to the recent cast patch.
llvm-svn: 32042
2006-11-30 17:26:08 +00:00
Chris Lattner 960acb008b implement cast.ll:test35. With this, we recognize:
unsigned short swp(unsigned short a) {
       return ((a & 0xff00) >> 8 | (a & 0x00ff) << 8);
}

as an idiom for bswap.

llvm-svn: 32011
2006-11-29 07:18:39 +00:00
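The idiom recognized by 960acb008b can be checked against a byte-by-byte swap. This is an illustrative sketch only; `bswap16_ref` and `swp_matches_bswap` are hypothetical helpers, and fixed-width types stand in for `unsigned short`.

```c
#include <stdint.h>

/* The idiom from the commit message. */
uint16_t swp(uint16_t a) {
    return (uint16_t)(((a & 0xff00) >> 8) | ((a & 0x00ff) << 8));
}

/* Reference byte swap assembled from the two bytes. */
uint16_t bswap16_ref(uint16_t a) {
    uint8_t lo = (uint8_t)(a & 0xff);
    uint8_t hi = (uint8_t)(a >> 8);
    return (uint16_t)((lo << 8) | hi);
}

/* Returns 1 if the idiom agrees with the reference on every 16-bit input. */
int swp_matches_bswap(void) {
    for (uint32_t v = 0; v <= 0xffff; ++v) {
        if (swp((uint16_t)v) != bswap16_ref((uint16_t)v))
            return 0;
    }
    return 1;
}
```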
Chris Lattner d747f015ff Teach instcombine to turn trunc(srl x, c) -> srl (trunc(x), c) when safe.
This implements InstCombine/cast.ll:test34.  It fires hundreds of times on
176.gcc.

llvm-svn: 32009
2006-11-29 07:04:07 +00:00
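A sketch of when the rewrite in d747f015ff preserves the result, modeled for a 32-bit to 16-bit truncation. The condition checked here (bits 16 through 16+c-1 of the input are zero) is one sufficient condition assumed for illustration; the commit message does not spell out the exact test instcombine uses. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* trunc(srl x, c): shift in the wide type, then truncate. */
uint16_t trunc_of_shift(uint32_t x, unsigned c) {
    assert(c < 16);
    return (uint16_t)(x >> c);
}

/* srl(trunc(x), c): truncate first, then shift in the narrow type. */
uint16_t shift_of_trunc(uint32_t x, unsigned c) {
    assert(c < 16);
    return (uint16_t)((uint16_t)x >> c);
}

/* The two forms agree when the bits that would be shifted down
   from above bit 15 are all zero. */
int high_bits_clear(uint32_t x, unsigned c) {
    assert(c < 16);
    uint32_t mask = ((1u << c) - 1u) << 16;
    return (x & mask) == 0;
}
```

For `x = 0x0001ABCD` and `c = 4` the two forms differ (0x1ABC versus 0x0ABC), which is why the transform is guarded.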
Chris Lattner a7942b7bbd Implement Regression/Transforms/InstCombine/bswap-fold.ll,
folding   seteq (bswap(x)), c -> seteq(x,bswap(c))

llvm-svn: 32006
2006-11-29 05:02:16 +00:00
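The fold in a7942b7bbd relies on byte swap being its own inverse, so swapping the constant once at compile time replaces a swap of `x` at run time. A minimal sketch with hypothetical helper names:

```c
#include <stdint.h>

/* Portable 32-bit byte swap. */
uint32_t bswap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Before: swap x at run time, then compare with the constant c. */
int cmp_before(uint32_t x, uint32_t c) {
    return bswap32(x) == c;
}

/* After: compare x directly with the pre-swapped constant. */
int cmp_after(uint32_t x, uint32_t c) {
    return x == bswap32(c);
}
```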
Reid Spencer a736fdf216 Join a split line.
llvm-svn: 31996
2006-11-29 01:11:01 +00:00
Reid Spencer 116ad83aa0 Undo the last patch until 253.perlbmk passes with these changes.
llvm-svn: 31977
2006-11-28 20:23:51 +00:00
Reid Spencer 59fe2d89ae Remove 4 FIXME's from the CAST patch now that the back end is correctly
producing code for "trunc to bool". This passes all tests on Linux.

llvm-svn: 31963
2006-11-28 07:23:01 +00:00
Chris Lattner 8e9a7b73d9 Fix PR1014 and InstCombine/2006-11-27-XorBug.ll.
llvm-svn: 31941
2006-11-27 19:55:07 +00:00
Reid Spencer 6c38f0bb07 For PR950:
The long awaited CAST patch. This introduces 12 new instructions into LLVM
to replace the cast instruction. Corresponding changes throughout LLVM are
provided. This passes llvm-test, llvm/test, and SPEC CPUINT2000 with the
exception of 175.vpr which fails only on a slight floating point output
difference.

llvm-svn: 31931
2006-11-27 01:05:10 +00:00
Bill Wendling 4ae401074c Remove #include <iostream> and use llvm_* streams instead.
llvm-svn: 31925
2006-11-26 10:17:54 +00:00
Bill Wendling 8f13b5c43e Replace #include <iostream> with llvm_* streams.
llvm-svn: 31924
2006-11-26 10:02:32 +00:00
Bill Wendling 5dbf43c983 Removed #include <iostream> and replaced with llvm_* streams.
llvm-svn: 31923
2006-11-26 09:46:52 +00:00
Bill Wendling a7459ca813 Removed #include <iostream> and used the llvm_cerr/DOUT streams instead.
llvm-svn: 31922
2006-11-26 09:17:06 +00:00
Nick Lewycky 09b7e4d3ab Update to new predicate simplifier VRP design. Fixes PR966 and PR967.
Remove predicate simplifier from default gcc3 pipeline. New design is too
slow to enable by default.
Add new testcases for problems encountered in development.

llvm-svn: 31895
2006-11-22 23:49:16 +00:00
Chris Lattner ec45a4c88c This xform is handled by FoldOpIntoPhi in visitCastInst in a more elegant way.
llvm-svn: 31889
2006-11-21 17:05:13 +00:00
Chris Lattner 95adf8f1da Do not convert massive blocks of phi nodes into select statements. Instead
only do these transformations if there are a small number of phi's.
This speeds up Ptrdist/ks from 2.35s to 2.19s on my mac pro.

llvm-svn: 31853
2006-11-18 19:19:36 +00:00
Chris Lattner 21eba2da26 If an indvar with a variable stride is used by the exit condition, go ahead
and handle it like constant stride vars.  This fixes some bad codegen in
variable stride cases.  For example, it compiles this:

void foo(int k, int i) {
  for (k=i+i; k <= 8192; k+=i)
    flags2[k] = 0;
}

to:

LBB1_1: #bb.preheader
        movl %eax, %ecx
        addl %ecx, %ecx
        movl L_flags2$non_lazy_ptr, %edx
LBB1_2: #bb
        movb $0, (%edx,%ecx)
        addl %eax, %ecx
        cmpl $8192, %ecx
        jle LBB1_2      #bb
LBB1_5: #return
        ret

or (if the array is local and we are in dynamic-nonpic or static mode):

LBB3_2: #bb
        movb $0, _flags2(%ecx)
        addl %eax, %ecx
        cmpl $8192, %ecx
        jle LBB3_2      #bb

and:

        lis r2, ha16(L_flags2$non_lazy_ptr)
        lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
        slwi r3, r4, 1
LBB1_2: ;bb
        li r5, 0
        add r6, r4, r3
        stbx r5, r2, r3
        cmpwi cr0, r6, 8192
        bgt cr0, LBB1_5 ;return

instead of:

        leal (%eax,%eax,2), %ecx
        movl %eax, %edx
        addl %edx, %edx
        addl L_flags2$non_lazy_ptr, %edx
        xorl %esi, %esi
LBB1_2: #bb
        movb $0, (%edx,%esi)
        movl %eax, %edi
        addl %esi, %edi
        addl %ecx, %esi
        cmpl $8192, %esi
        jg LBB1_5       #return

and:

        lis r2, ha16(L_flags2$non_lazy_ptr)
        lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
        mulli r3, r4, 3
        slwi r5, r4, 1
        li r6, 0
        add r2, r2, r5
LBB1_2: ;bb
        li r5, 0
        add r7, r3, r6
        stbx r5, r2, r6
        add r6, r4, r6
        cmpwi cr0, r7, 8192
        ble cr0, LBB1_2 ;bb

This speeds up Benchmarks/Shootout/sieve from 8.533s to 6.464s and
implements LoopStrengthReduce/var_stride_used_by_compare.ll

llvm-svn: 31809
2006-11-17 06:17:33 +00:00
Chris Lattner e3a63d136d Fix a gcc 4.2 warning.
llvm-svn: 31751
2006-11-15 04:53:24 +00:00
Chris Lattner f05d69ae72 implement InstCombine/shift-simplify.ll by transforming:
(X >> Z) op (Y >> Z)  -> (X op Y) >> Z

for all shifts and all ops={and/or/xor}.

llvm-svn: 31729
2006-11-14 07:46:50 +00:00
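The identity behind f05d69ae72 holds because and/or/xor act on each bit independently, so shifting both operands by the same amount commutes with the operation. A sketch for logical right shift on 32-bit values; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { OP_AND, OP_OR, OP_XOR } BitOp;

uint32_t apply_op(BitOp op, uint32_t a, uint32_t b) {
    switch (op) {
    case OP_AND: return a & b;
    case OP_OR:  return a | b;
    default:     return a ^ b;
    }
}

/* (X >> Z) op (Y >> Z): two shifts, one bitwise op. */
uint32_t shift_then_op(BitOp op, uint32_t x, uint32_t y, unsigned z) {
    assert(z < 32);
    return apply_op(op, x >> z, y >> z);
}

/* (X op Y) >> Z: one bitwise op, one shift. */
uint32_t op_then_shift(BitOp op, uint32_t x, uint32_t y, unsigned z) {
    assert(z < 32);
    return apply_op(op, x, y) >> z;
}
```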
Chris Lattner d12a4bf799 implement InstCombine/and-compare.ll:test1. This compiles:
typedef struct { unsigned prefix : 4; unsigned code : 4; unsigned unsigned_p : 4; } tree_common;
int foo(tree_common *a, tree_common *b) { return a->code == b->code; }

into:

_foo:
        movl 4(%esp), %eax
        movl 8(%esp), %ecx
        movl (%eax), %eax
        xorl (%ecx), %eax
        # TRUNCATE movb %al, %al
        shrb $4, %al
        testb %al, %al
        sete %al
        movzbl %al, %eax
        ret

instead of:

_foo:
        movl 8(%esp), %eax
        movb (%eax), %al
        shrb $4, %al
        movl 4(%esp), %ecx
        movb (%ecx), %cl
        shrb $4, %cl
        cmpb %al, %cl
        sete %al
        movzbl %al, %eax
        ret

saving one cycle by eliminating a shift.

llvm-svn: 31727
2006-11-14 06:06:06 +00:00
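The shorter listing in d12a4bf799 xors the two words first and then tests the field, instead of extracting the field from each word. A sketch of that equivalence, assuming `code` occupies bits 4 to 7 of the first word (consistent with the `shrb $4` in the listings); the function names are hypothetical.

```c
#include <stdint.h>

/* Extract the field from each word, then compare. */
int code_eq_extract(uint32_t a, uint32_t b) {
    return ((a >> 4) & 0xFu) == ((b >> 4) & 0xFu);
}

/* Xor the words, then test the field: it is zero exactly when
   the two fields are equal. */
int code_eq_xor(uint32_t a, uint32_t b) {
    return (((a ^ b) >> 4) & 0xFu) == 0;
}
```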
Chris Lattner d4dee405cb Fix InstCombine/2006-11-10-ashr-miscompile.ll a miscompilation introduced
by the shr -> [al]shr patch.  This was reduced from 176.gcc.

llvm-svn: 31653
2006-11-10 23:38:52 +00:00
Chris Lattner 82928ca290 second patch to fix PR992/993.
llvm-svn: 31610
2006-11-09 23:36:08 +00:00
Chris Lattner 924f4fee8b Minimal patch to fix PR992/PR993
llvm-svn: 31608
2006-11-09 23:17:45 +00:00
Chris Lattner 6e2c15c158 Teach ShrinkDemandedConstant how to handle X+C. This implements:
add.ll:test33, add.ll:test34, shift-sra.ll:test2

llvm-svn: 31586
2006-11-09 05:12:27 +00:00
Chris Lattner 4f218d56f5 reenable factoring of GEP expressions, being more precise about the
case where it is bad to do.

llvm-svn: 31563
2006-11-08 19:42:28 +00:00
Chris Lattner cd62f11227 make this code more efficient by not creating a phi node we are just going to
delete in the first place.  This also makes it simpler.

llvm-svn: 31562
2006-11-08 19:29:23 +00:00
Jim Laskey 61feeb90f9 Remove redundant <cmath>.
llvm-svn: 31561
2006-11-08 19:16:44 +00:00
Chris Lattner a3acfca920 disable this factoring optzn for GEPs for now, this severely pessimizes some
loops.

llvm-svn: 31560
2006-11-08 18:49:31 +00:00
Reid Spencer fdff938a7e For PR950:
This patch converts the old SHR instruction into two instructions,
AShr (Arithmetic) and LShr (Logical). The Shr instructions now are not
dependent on the sign of their operands.

llvm-svn: 31542
2006-11-08 06:47:33 +00:00
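To illustrate the split in fdff938a7e: once the shift kind is carried by the opcode, the same 32-bit pattern can be shifted either way regardless of how its type is declared. The sketch below models both shifts on an unsigned value so that it does not depend on C's implementation-defined right shift of negative integers; `lshr32` and `ashr32` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Logical shift right: vacated high bits are filled with 0. */
uint32_t lshr32(uint32_t v, unsigned n) {
    assert(n < 32);
    return v >> n;
}

/* Arithmetic shift right: vacated high bits copy the sign bit. */
uint32_t ashr32(uint32_t v, unsigned n) {
    assert(n < 32);
    uint32_t shifted = v >> n;
    if ((v & 0x80000000u) && n > 0)
        shifted |= ~(0xffffffffu >> n);
    return shifted;
}
```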
Chris Lattner 4967f6ddea scalarrepl should not split the two elements of the vsiidx array:
int func(vFloat v0, vFloat v1) {
        int ii;
        vSInt32 vsiidx[2];
        vsiidx[0] = _mm_cvttps_epi32(v0);
        vsiidx[1] = _mm_cvttps_epi32(v1);
        ii = ((int *) vsiidx)[4];
        return ii;
}

This fixes Transforms/ScalarRepl/2006-11-07-InvalidArrayPromote.ll

llvm-svn: 31524
2006-11-07 22:42:47 +00:00
Jeff Cohen 7d6f3db3e2 Unbreak VC++ build.
llvm-svn: 31464
2006-11-05 19:31:28 +00:00
Nick Lewycky 67bad5adbc Remove commented line from earlier debugging.
llvm-svn: 31460
2006-11-05 14:19:40 +00:00
Andrew Lenharth 0ebb0b03e6 The wrong parameter was being tested to determine i32 vs i64
llvm-svn: 31431
2006-11-03 22:45:50 +00:00
Chris Lattner 62e2cad6b8 remove dead code
llvm-svn: 31398
2006-11-03 01:34:58 +00:00
Reid Spencer de46e48420 For PR786:
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.

llvm-svn: 31380
2006-11-02 20:25:50 +00:00
Reid Spencer 7eb55b395f For PR950:
Replace the REM instruction with UREM, SREM and FREM.

llvm-svn: 31369
2006-11-02 01:53:59 +00:00
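A sketch of why 7eb55b395f needs separate opcodes: the same bit pattern gives different remainders depending on whether it is read as signed or unsigned. The function names are hypothetical, and the floating-point case (FREM) is left out to keep the example free of the math library.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned remainder: operands are read as non-negative values. */
uint32_t urem32(uint32_t a, uint32_t b) {
    assert(b != 0);
    return a % b;
}

/* Signed remainder: the result takes the sign of the dividend (C99 rules). */
int32_t srem32(int32_t a, int32_t b) {
    assert(b != 0);
    assert(!(a == INT32_MIN && b == -1)); /* overflow case is undefined in C */
    return a % b;
}
```

The pattern 0xFFFFFFF9 is -7 when read as signed and 4294967289 when read as unsigned, so its remainder modulo 3 is -1 in the first case and 0 in the second.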
Devang Patel 2cb4f83b38 There can be more than one PHINode at the start of the block.
llvm-svn: 31362
2006-11-01 23:04:45 +00:00
Devang Patel 44519a8feb Handle PHINode with only one incoming value.
This fixes http://llvm.org/bugs/show_bug.cgi?id=979

llvm-svn: 31358
2006-11-01 22:26:43 +00:00
Chris Lattner 5a0bd61c64 Fix GlobalOpt/2006-11-01-ShrinkGlobalPhiCrash.ll and McGill/chomp
llvm-svn: 31352
2006-11-01 18:03:33 +00:00
Chris Lattner eebea43b48 Factor gep instructions through phi nodes.
llvm-svn: 31346
2006-11-01 07:43:41 +00:00
Chris Lattner 14f82c7dcd Turn a phi of many loads into a phi of the address and a single load of the
result.  This can significantly shrink code and exposes identities more
aggressively.

llvm-svn: 31344
2006-11-01 07:13:54 +00:00
Chris Lattner dc826fc068 Fix a bug in the previous patch
llvm-svn: 31342
2006-11-01 04:55:47 +00:00
Chris Lattner cadac0c5c3 Fold things like "phi [add (a,b), add(c,d)]" into two phi's and one add.
This triggers thousands of times on multisource.

llvm-svn: 31341
2006-11-01 04:51:18 +00:00
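A C-level analogy for the fold in cadac0c5c3, with the conditional expressions standing in for phi nodes. This is a sketch only; the function names are hypothetical, and unsigned arithmetic is used so that wraparound is well defined.

```c
#include <stdint.h>

/* Before: one add on each incoming path, merged afterwards. */
uint32_t add_on_each_path(int cond, uint32_t a, uint32_t b,
                          uint32_t c, uint32_t d) {
    uint32_t r;
    if (cond)
        r = a + b;
    else
        r = c + d;
    return r;
}

/* After: merge the operands (two "phis"), then add once. */
uint32_t add_after_merge(int cond, uint32_t a, uint32_t b,
                         uint32_t c, uint32_t d) {
    uint32_t lhs = cond ? a : c;
    uint32_t rhs = cond ? b : d;
    return lhs + rhs;
}
```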
Chris Lattner 984d6e1669 generalize the fix for PR977 to also fix
Transforms/LCSSA/2006-10-31-UnreachableBlock-2.ll

llvm-svn: 31317
2006-10-31 18:56:48 +00:00
Chris Lattner eb68f080ef Fix PR977 and Transforms/LCSSA/2006-10-31-UnreachableBlock.ll
llvm-svn: 31315
2006-10-31 17:52:18 +00:00
Chris Lattner fc519cd2d1 Fix SimplifyCFG/2006-10-29-InvokeCrash.ll, a crash compiling QT.
llvm-svn: 31284
2006-10-29 21:21:20 +00:00
Chris Lattner 3e763f5708 add option to isCriticalEdge
llvm-svn: 31258
2006-10-28 06:58:17 +00:00
Chris Lattner a6eb7e0803 break edges more intelligently
llvm-svn: 31257
2006-10-28 06:45:33 +00:00
Chris Lattner 80ea207bfa Expose a smarter way to break critical edges.
llvm-svn: 31256
2006-10-28 06:44:56 +00:00
Chris Lattner 400ac04e64 SplitCriticalEdge checks to see if an edge is critical, don't check twice
llvm-svn: 31255
2006-10-28 06:38:14 +00:00
Chris Lattner 5191c65485 prepare for a change I'm about to make
llvm-svn: 31248
2006-10-28 00:59:20 +00:00
Reid Spencer 00c482b7a2 Simplify code a bit by changing instances of:
InsertNewInstBefore(new CastInst(Val, ValTy, Val->GetName()), I)
into:
   InsertCastBefore(Val, ValTy, I)

llvm-svn: 31204
2006-10-26 19:19:06 +00:00
Reid Spencer 7e80b0b31e For PR950:
Make necessary changes to support DIV -> [SUF]Div. This changes llvm to
have three division instructions: signed, unsigned, floating point. The
bytecode and assembler are backwards compatible, however.

llvm-svn: 31195
2006-10-26 06:15:43 +00:00
Nick Lewycky 5b979ae531 Fix 2006-10-25-AddSetCC. A relational operator (like setlt) can never
produce an EQ property.

llvm-svn: 31193
2006-10-26 02:35:18 +00:00
Nick Lewycky 9d17c82a26 Resurrect r1.25.
Fix and comment the "or", "and" and "xor" transformations.

llvm-svn: 31189
2006-10-25 23:48:24 +00:00
Chris Lattner 53f53db919 hide symbols properly
llvm-svn: 31184
2006-10-25 21:14:31 +00:00
Chris Lattner ebb1ad4382 Fix Transforms/ScalarRepl/2006-10-23-PointerUnionCrash.ll
llvm-svn: 31151
2006-10-24 06:26:32 +00:00
Chris Lattner dc7b9beb20 Revert back to r1.21, which was the last revision of predsimplify that
passes llvm-gcc bootstrap.

llvm-svn: 31146
2006-10-24 00:36:21 +00:00
Chris Lattner fe7b6ef346 Handle fallout from the recent branch-on-undef changes. This fixes
Prolangs-C/agrep and SCCP/2006-10-23-IPSCCP-Crash.ll

llvm-svn: 31132
2006-10-23 18:57:02 +00:00
Nick Lewycky 53b4158448 Remove the Backwards operation. Resolving now works at the time when a
property is added by running through the list of uses of the value and
adding resolved properties to the property set.

llvm-svn: 31126
2006-10-23 01:56:02 +00:00
Nick Lewycky 6f5c30fcec Fix similar missing optimization opportunity in XOR.
llvm-svn: 31123
2006-10-22 22:22:58 +00:00
Nick Lewycky af2b0571d0 Whoops! Add missing NULL check.
llvm-svn: 31121
2006-10-22 21:38:24 +00:00
Nick Lewycky 2c734f3fc1 Handle "if ((x|y) != 0)" for ints like we do for bools. Fixes missed
optimization opportunity pointed out by Chris Lattner.

llvm-svn: 31118
2006-10-22 21:36:41 +00:00
Nick Lewycky f345008339 AllocaInst can't return a null pointer. Fixes missed optimization
opportunity pointed out by Andrew Lewycky.

llvm-svn: 31115
2006-10-22 19:53:27 +00:00
Chris Lattner 250eff20da Add a workaround for PR962, disabling the more aggressive form of this
transformation.  This speeds up a C++ app 2.25x.

llvm-svn: 31113
2006-10-22 18:42:26 +00:00
Chris Lattner af17096dcf 3 Changes:
1. Better document what is going on here.
2. Only hack on one branch per iteration, making the results less conservative.
3. Handle the problematic case by marking edges executable instead of by
   playing with value lattice states.  This is far less pessimistic, and fixes
   SCCP/ipsccp-gvar.ll.

llvm-svn: 31106
2006-10-22 05:59:17 +00:00
Chris Lattner af1222c1a7 llvm-extract should remove module-level asm
llvm-svn: 31086
2006-10-20 21:35:41 +00:00
Chris Lattner 319c86fd38 Fix an ugly problem in SCCP. This fixes Benchmarks/Misc-C++/mandel-text.cpp
llvm-svn: 31073
2006-10-20 20:19:08 +00:00
Chris Lattner 5dee3b2526 Fix miscompilation of MallocBench/espresso which code review pointed out
but apparently didn't make it into the final patch.

llvm-svn: 31070
2006-10-20 18:20:21 +00:00
Reid Spencer e0fc4dfc22 For PR950:
This patch implements the first increment for the Signless Types feature.
All changes pertain to removing the ConstantSInt and ConstantUInt classes
in favor of just using ConstantInt.

llvm-svn: 31063
2006-10-20 07:07:24 +00:00
Devang Patel 5d417e35bc While creating mask, use 1ULL instead of 1.
llvm-svn: 31062
2006-10-20 01:16:56 +00:00
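The commit message for 5d417e35bc does not show the mask in question, so the following is only a generic sketch of the hazard it names. In C, a plain `1` has type `int`, and shifting it by 32 or more bits is undefined where `int` is 32 bits wide; `1ULL` makes the shift happen in a 64-bit type. The helper name `low_mask` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `bits` bits set, valid for 0 < bits < 64.
   The 1ULL literal keeps the shift in a 64-bit unsigned type. */
uint64_t low_mask(unsigned bits) {
    assert(bits > 0 && bits < 64);
    return (1ULL << bits) - 1;
}
```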
Chris Lattner b8b11599dd Fix SimplifyCFG/2006-10-19-UncondDiv.ll by disabling a bad xform.
llvm-svn: 31061
2006-10-20 00:42:07 +00:00
Devang Patel 5d6df959e3 It is OK to remove extra cast if operation is EQ/NE even though source
and destination sign may not match but other conditions are met.

llvm-svn: 31056
2006-10-19 20:59:13 +00:00
Devang Patel 88afd00d1d Typo Typo.
llvm-svn: 31055
2006-10-19 19:21:36 +00:00
Devang Patel 472530d9fc Typo.
llvm-svn: 31054
2006-10-19 19:05:38 +00:00
Devang Patel b42aef4925 Fix bug in PR454 resolution. Added new test case.
This fixes llvmAsmParser.cpp miscompile by llvm on PowerPC Darwin.

llvm-svn: 31053
2006-10-19 18:54:08 +00:00
Reid Spencer 3c514959dd Undo Chris' last patch, it caused a regression.
llvm-svn: 30991
2006-10-16 23:08:08 +00:00
Chris Lattner 9a1c7dd27a fix a buggy check that accidentally disabled this xform
llvm-svn: 30967
2006-10-15 22:42:15 +00:00
Nick Lewycky 77e030bca9 Replace custom dispatch code with two uses of InstVisitor. Improves
compile-time performance.

llvm-svn: 30896
2006-10-12 02:02:44 +00:00
Chris Lattner 41b442242d Implement SROA of unions with mixed pointers/integers in them. This implements
PR892 and Transforms/ScalarRepl/union-pointer.ll:test2

llvm-svn: 30825
2006-10-08 23:53:04 +00:00
Chris Lattner 05f8272afa Implement Transforms/ScalarRepl/union-pointer.ll:test
llvm-svn: 30823
2006-10-08 23:28:04 +00:00
Chris Lattner 2deeaeaca7 add a new SimplifyDemandedVectorElts method, which works similarly to
SimplifyDemandedBits.  The idea is that some operations can be simplified if
not all of the computed elements are needed.  Some targets (like x86) have a
large number of intrinsics that operate on a single element, but pass other
elts through unmodified.  If those other elements are not needed, the
intrinsics can be simplified to scalar operations, and insertelement ops can
be removed.

This turns (f.e.):

ushort %Convert_sse(float %f) {
        %tmp = insertelement <4 x float> undef, float %f, uint 0                ; <<4 x float>> [#uses=1]
        %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1             ; <<4 x float>> [#uses=1]
        %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2           ; <<4 x float>> [#uses=1]
        %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3           ; <<4 x float>> [#uses=1]
        %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer )          ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

into:

ushort %Convert_sse(float %f) {
entry:
        %tmp28 = sub float %f, 1.000000e+00             ; <float> [#uses=1]
        %tmp37 = mul float %tmp28, 5.000000e-01         ; <float> [#uses=1]
        %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0         ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > )           ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > )            ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

which improves codegen from:

_Convert_sse:
        movss LCPI1_0, %xmm0
        movss 4(%esp), %xmm1
        subss %xmm0, %xmm1
        movss LCPI1_1, %xmm0
        mulss %xmm0, %xmm1
        movss LCPI1_2, %xmm0
        minss %xmm0, %xmm1
        xorps %xmm0, %xmm0
        maxss %xmm0, %xmm1
        cvttss2si %xmm1, %eax
        andl $65535, %eax
        ret

to:

_Convert_sse:
        movss 4(%esp), %xmm0
        subss LCPI1_0, %xmm0
        mulss LCPI1_1, %xmm0
        movss LCPI1_2, %xmm1
        minss %xmm1, %xmm0
        xorps %xmm1, %xmm1
        maxss %xmm1, %xmm0
        cvttss2si %xmm0, %eax
        andl $65535, %eax
        ret


This is just a first step, it can be extended in many ways.  Testcase here:
Transforms/InstCombine/vec_demanded_elts.ll

llvm-svn: 30752
2006-10-05 06:55:50 +00:00
Chris Lattner 52886e72d7 This case isn't implemented yet. It seems unlikely to be needed, but if it
ever is, we want to get an assert instead of silent bad codegen.

llvm-svn: 30716
2006-10-04 04:58:58 +00:00
Nick Lewycky 58a910dff5 Simplify logic further.
Ensure that we copy KnownProperties before calling visitBasicBlock, else
we may leak properties into blocks where they don't belong.

llvm-svn: 30705
2006-10-03 17:36:01 +00:00
Nick Lewycky 1d00f3e144 Simplify, now that predsimplify depends on break-crit-edges.
Fix SwitchInst where dest-block is the same as one of the cases.

llvm-svn: 30700
2006-10-03 15:19:11 +00:00
Nick Lewycky 755f801adc Move break-crit-edges before the predicate simplifier. Allows us to
optimize in more cases.

llvm-svn: 30699
2006-10-03 14:52:23 +00:00
Evan Cheng ff510a58c2 Revert previous patch. Still breaking things.
llvm-svn: 30698
2006-10-03 07:26:07 +00:00
Chris Lattner 8aca0ee8c3 Fix PR932 and Analysis/Dominators/2006-10-02-BreakCritEdges.ll:
The critical edge block dominates the dest block if the destblock dominates
all edges other than the one incoming from the critical edge.

llvm-svn: 30696
2006-10-03 07:02:02 +00:00
Chris Lattner 7d19067c42 Fix a bug from r1.391 of this file, where we checked the size instead of
the alignment when promoting allocations.  This implements
InstCombine/cast.ll:test32

llvm-svn: 30682
2006-10-01 19:40:58 +00:00
Chris Lattner 4797c891c0 Fix debug output
llvm-svn: 30680
2006-09-30 23:32:50 +00:00
Chris Lattner 24d3d4280a Implement SRA of heap allocations.
llvm-svn: 30679
2006-09-30 23:32:09 +00:00
Chris Lattner 80a01ef6f0 Add some ifdef'd out debug info
llvm-svn: 30676
2006-09-30 19:40:30 +00:00
Chris Lattner 6ab03f6a08 Eliminate ConstantBool::True and ConstantBool::False. Instead, provide
ConstantBool::getTrue() and ConstantBool::getFalse().

llvm-svn: 30665
2006-09-28 23:35:22 +00:00
Owen Anderson 7cb6809c25 Another attempt at making ArgPromotion smarter. This patch no longer breaks Burg.
llvm-svn: 30657
2006-09-28 23:02:22 +00:00
Chris Lattner 525804f31e simplify code
llvm-svn: 30656
2006-09-28 22:58:25 +00:00
Chris Lattner e03ca2ca4a set DEBUG_TYPE right
llvm-svn: 30623
2006-09-27 04:58:23 +00:00
Nick Lewycky 059c79264f Style changes only. Remove dead code, fix a comment.
llvm-svn: 30588
2006-09-23 15:13:08 +00:00
Chris Lattner 6bd6da4097 Be far more careful when splitting a loop header, either to form a preheader
or when splitting loops with a common header into multiple loops.  In particular
the old code would always insert the preheader before the old loop header.  This
is disastrous in cases where the loop hasn't been rotated.  For example, it can
produce code like:

        .. outside the loop...
        jmp LBB1_2      #bb13.outer
LBB1_1: #bb1
        movsd 8(%esp,%esi,8), %xmm1
        mulsd (%edi), %xmm1
        addsd %xmm0, %xmm1
        addl $24, %edi
        incl %esi
        jmp LBB1_3      #bb13
LBB1_2: #bb13.outer
        leal (%edx,%eax,8), %edi
        pxor %xmm1, %xmm1
        xorl %esi, %esi
LBB1_3: #bb13
        movapd %xmm1, %xmm0
        cmpl $4, %esi
        jl LBB1_1       #bb1

Note that the loop body is actually LBB1_1 + LBB1_3, which means that the
loop now contains an uncond branch WITHIN it to jump around the inserted
loop header (LBB1_2).  Doh.

This patch changes the preheader insertion code to insert it in the right
spot, producing this code:

        ... outside the loop, fall into the header ...
LBB1_1: #bb13.outer
        leal (%edx,%eax,8), %esi
        pxor %xmm0, %xmm0
        xorl %edi, %edi
        jmp LBB1_3      #bb13
LBB1_2: #bb1
        movsd 8(%esp,%edi,8), %xmm0
        mulsd (%esi), %xmm0
        addsd %xmm1, %xmm0
        addl $24, %esi
        incl %edi
LBB1_3: #bb13
        movapd %xmm0, %xmm1
        cmpl $4, %edi
        jl LBB1_2       #bb1

Totally crazy, no branch in the loop! :)

llvm-svn: 30587
2006-09-23 08:19:21 +00:00
Chris Lattner 608cd05e3f Teach UpdateDomInfoForRevectoredPreds to handle revectored preds that are not
reachable, making it general purpose enough for use by InsertPreheaderForLoop.
Eliminate custom dominfo updating code in InsertPreheaderForLoop, using
UpdateDomInfoForRevectoredPreds instead.

llvm-svn: 30586
2006-09-23 07:40:52 +00:00
Chris Lattner 51c95cdd82 Fix Transforms/IndVarsSimplify/2006-09-20-LFTR-Crash.ll
llvm-svn: 30555
2006-09-21 05:12:20 +00:00
Nick Lewycky fde9c308b2 Don't rewrite ConstantExpr::get.
llvm-svn: 30552
2006-09-21 01:05:35 +00:00
Nick Lewycky d74c55f483 Once we're down to "setcc type constant1, constant2", at least come up
with the right answer.

llvm-svn: 30550
2006-09-20 23:02:24 +00:00
Nick Lewycky cfff1c3f86 Use a total ordering to compare instructions.
Fixes infinite loop in resolve().

llvm-svn: 30540
2006-09-20 17:04:01 +00:00
Andrew Lenharth 44cb67af5c simplify
llvm-svn: 30535
2006-09-20 15:37:57 +00:00
Chris Lattner 380c7e9a59 We went through all that trouble to compute whether it was safe to transform
this comparison, but never checked it.  Whoops, no wonder we miscompiled
177.mesa!

llvm-svn: 30511
2006-09-20 04:44:59 +00:00
Evan Cheng cd3f6ff0e5 Back out Chris' last set of changes. This breaks 177.mesa and povray somehow.
llvm-svn: 30505
2006-09-20 01:39:40 +00:00
Evan Cheng 453280b94d 80 col.
llvm-svn: 30504
2006-09-20 01:10:02 +00:00
Andrew Lenharth 4f339bebb0 If we have an add, do it in the pointer realm, not the int realm. This is critical in the linux kernel for pointer analysis correctness
llvm-svn: 30496
2006-09-19 18:24:51 +00:00
Chris Lattner 12f52faf93 implement select.ll:test19-22
llvm-svn: 30482
2006-09-19 06:18:21 +00:00
Nick Lewycky b9c5483a93 Walk down the dominator tree instead of the control flow graph. That means
that we can't modify the CFG any more, at least not until it's possible
to update the dominator tree (PR217).

llvm-svn: 30469
2006-09-18 21:09:35 +00:00
Chris Lattner de07792595 Fix an infinite loop building the CFE
llvm-svn: 30465
2006-09-18 18:27:05 +00:00
Chris Lattner 67a35bbce7 Implement a trivial optzn: if vastart is never called in a function that takes
... args, remove the '...'.

This is Transforms/DeadArgElim/dead_vaargs.ll

llvm-svn: 30459
2006-09-18 07:02:31 +00:00
Chris Lattner 4922a0e53f Implement InstCombine/cast.ll:test31. This speeds up 462.libquantum by 26%.
llvm-svn: 30456
2006-09-18 05:27:43 +00:00
Chris Lattner 420c4bcc8d Implement Transforms/InstCombine/shift-sra.ll:test0
llvm-svn: 30450
2006-09-18 04:31:40 +00:00
Chris Lattner b3f24c91b0 Rewrite shift/and/compare sequences to promote better licm of the RHS.
Use isLogicalShift/isArithmeticShift to simplify code.

llvm-svn: 30448
2006-09-18 04:22:48 +00:00
Chris Lattner 850465d53f Fix Transforms/InstCombine/2006-09-15-CastToBool.ll and PR913
llvm-svn: 30405
2006-09-16 03:14:10 +00:00
Chris Lattner 9482cc5b16 revert previous two patches. They cause miscompilation of MultiSource/Applications/Burg
llvm-svn: 30397
2006-09-15 17:24:45 +00:00
Owen Anderson edadd3faee Revert my previous work on ArgumentPromotion. Further investigation has revealed these
changes to be incorrect.  They just weren't showing up in any of our current testcases.

llvm-svn: 30385
2006-09-15 05:22:51 +00:00
Anton Korobeynikov d61d39ec53 Adding dllimport, dllexport and external weak linkage types.
DLL* linkages got full (I hope) codegeneration support in C & both x86
assembler backends.
External weak linkage added for future use, we don't provide any
codegeneration, etc. support for it.

llvm-svn: 30374
2006-09-14 18:23:27 +00:00
Chris Lattner 237ccf2a51 Second half of the fix for Transforms/Inline/inline_cleanup.ll
This folds unconditional branches that are often produced by code
specialization.

llvm-svn: 30307
2006-09-13 21:27:00 +00:00
Nick Lewycky 12efffc96b Add some more consistency checks.
llvm-svn: 30305
2006-09-13 19:32:53 +00:00
Nick Lewycky 51ce8d6b46 Fix unionSets so that it can merge correctly.
llvm-svn: 30304
2006-09-13 19:24:01 +00:00
Chris Lattner 6ef6d06d21 Implement the first half of Transforms/Inline/inline_cleanup.ll
llvm-svn: 30303
2006-09-13 19:23:57 +00:00
Nick Lewycky 3a4dc7b489 Erase dead instructions.
llvm-svn: 30298
2006-09-13 18:55:37 +00:00
Devang Patel fab4972a6e Initialize DontInternalize.
llvm-svn: 30281
2006-09-13 01:02:26 +00:00
Chris Lattner 1d7ec20a4d A sinkable instruction may exist with uses, if those uses are in dead blocks.
Handle this.  This fixes PR908 and Transforms/LICM/2006-09-12-DeadUserOfSunkInstr.ll

llvm-svn: 30275
2006-09-12 19:17:09 +00:00
Chris Lattner d28627009a Fix PR905 and InstCombine/2006-09-11-EmptyStructCrash.ll
llvm-svn: 30266
2006-09-11 21:43:16 +00:00
Nick Lewycky e94f42a740 Skip the linear search if the answer is already known.
llvm-svn: 30251
2006-09-11 17:23:34 +00:00
Chris Lattner d1f8e07808 Allow tail duplication in more cases, relaxing the previous restriction a
bit.  This fixes Regression/Transforms/TailDup/MergeTest.ll

llvm-svn: 30237
2006-09-10 18:17:58 +00:00
Nick Lewycky 9a22d7b60f Replace EquivalenceClasses with a custom-built data structure. Many common
operations (like findProperties) should be faster, at the expense of
unionSets being slower in cases that are rare in practise.

Don't erase a dead Instruction. This fixes a memory corruption issue.

llvm-svn: 30235
2006-09-10 02:27:07 +00:00
Chris Lattner 0468987592 Implement Transforms/InstCombine/hoist_instr.ll
llvm-svn: 30234
2006-09-09 22:02:56 +00:00
Chris Lattner 27ff96d87a Make inlining costs more accurate.
llvm-svn: 30231
2006-09-09 20:40:44 +00:00
Chris Lattner d79dc79831 Turn div X, (Cond ? Y : 0) -> div X, Y
This implements select.ll::test18.

llvm-svn: 30230
2006-09-09 20:26:32 +00:00
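The reasoning behind this fold can be sketched in C++. If Cond were false, the divisor would be 0, and integer division by zero is undefined, so the optimizer may assume Cond is true and use Y directly. The function names below are hypothetical and only illustrate the equivalence; this is not the InstCombine code.

```cpp
// Hypothetical illustration; not code from InstCombine.

// Before the fold: the divisor is a select that yields 0 when cond is false.
int div_with_select(int x, bool cond, int y) {
    return x / (cond ? y : 0);
}

// After the fold: the zero arm is ignored, because dividing by it would be
// undefined anyway.
int div_folded(int x, int y) {
    return x / y;
}
```

The two functions agree on every input for which the original is well defined, that is, whenever cond is true and y is nonzero.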
Chris Lattner c465046e65 Throttle back tail duplication to avoid creating really ugly sequences of code.
For Transforms/TailDup/if-tail-dup.ll, f.e., it produces:

_foo:
        movl 8(%esp), %eax
        movl 4(%esp), %ecx
        testl $1, %ecx
        je LBB1_2       #cond_next
LBB1_1: #cond_true
        movl $1, (%eax)
LBB1_2: #cond_next
        testl $2, %ecx
        je LBB1_4       #cond_next10
LBB1_3: #cond_true6
        movl $1, 4(%eax)
LBB1_4: #cond_next10
        testl $4, %ecx
        je LBB1_6       #cond_next18
LBB1_5: #cond_true14
        movl $1, 8(%eax)
LBB1_6: #cond_next18
        testl $8, %ecx
        je LBB1_8       #return
LBB1_7: #cond_true22
        movl $1, 12(%eax)
        ret
LBB1_8: #return
        ret

instead of:

_foo:
        movl 4(%esp), %eax
        testl $2, %eax
        sete %cl
        movl 8(%esp), %edx
        testl $1, %eax
        je LBB1_2       #cond_next
LBB1_1: #cond_true
        movl $1, (%edx)
        testb %cl, %cl
        jne LBB1_4      #cond_next10
        jmp LBB1_3      #cond_true6
LBB1_2: #cond_next
        testb %cl, %cl
        jne LBB1_4      #cond_next10
LBB1_3: #cond_true6
        movl $1, 4(%edx)
        testl $4, %eax
        je LBB1_6       #cond_next18
        jmp LBB1_5      #cond_true14
LBB1_4: #cond_next10
        testl $4, %eax
        je LBB1_6       #cond_next18
LBB1_5: #cond_true14
        movl $1, 8(%edx)
        testl $8, %eax
        je LBB1_8       #return
        jmp LBB1_7      #cond_true22
LBB1_6: #cond_next18
        testl $8, %eax
        je LBB1_8       #return
LBB1_7: #cond_true22
        movl $1, 12(%edx)
        ret
LBB1_8: #return
        ret

llvm-svn: 30158
2006-09-07 21:30:15 +00:00
Chris Lattner 845b223da4 Fix Duraid's changes to work when TLI is null. This fixes the failing
lowerinvoke regtests.

llvm-svn: 30115
2006-09-05 17:48:07 +00:00
Duraid Madina cf6749e4c0 add setJumpBufSize() and setJumpBufAlignment() to target-lowering.
Call these from your backend to enjoy setjmp/longjmp goodness, see
lib/Target/IA64/IA64ISelLowering.cpp for an example

llvm-svn: 30095
2006-09-04 06:21:35 +00:00
Owen Anderson 19b80e76df Make ArgumentPromotion handle recursive functions that pass pointers in their recursive calls.
llvm-svn: 30057
2006-09-02 21:19:44 +00:00
Nick Lewycky 8e5599354a Improve handling of SelectInst.
Reorder operations to remove duplicated work.
Fix to leave floating-point types out of the optimization.
Add tests to predsimplify.ll for SwitchInst and SelectInst handling.

llvm-svn: 30055
2006-09-02 19:40:38 +00:00
Nick Lewycky f6f529d008 Don't confuse canonicalize and lookup. Fixes predsimplify.reg4.ll. Also
corrects missing optimization opportunity removing cases from a switch.

llvm-svn: 30009
2006-09-01 03:26:35 +00:00
Nick Lewycky 08674ab707 Properties where both Values weren't in the union (as being equal to
another Value) weren't being found by findProperties.

This fixes predsimplify.ll test6, a missed optimization opportunity.

llvm-svn: 29991
2006-08-31 00:39:16 +00:00
Nick Lewycky 5f8f9af65c Move to using the EquivalenceClass ADT. Removes SynSets.
If a branch's condition has become a ConstantBool, simplify it immediately.
Removing the edge saves work and exposes more optimization opportunities
in the pass.
Add support for SelectInst.

llvm-svn: 29970
2006-08-30 02:46:48 +00:00
Devang Patel f489d0f85c Do not rely on std::sort and std::erase to get list of unique
exit blocks. The output is dependent on the addresses of basic blocks.

Add and use Loop::getUniqueExitBlocks.

llvm-svn: 29966
2006-08-29 22:29:16 +00:00
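The problem described above is that sorting a list of pointers orders it by address, and addresses can differ from run to run, so the order of the deduplicated list is not stable. A minimal sketch of an order-preserving alternative follows. The helper name is hypothetical, and this is not the implementation of Loop::getUniqueExitBlocks; it only shows the idea of keeping the first occurrence of each element.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: keep the first occurrence of each element and preserve
// the order in which elements were first seen. Quadratic, which is acceptable
// for short lists such as the exit blocks of a loop.
template <typename T>
std::vector<T> uniqueInFirstSeenOrder(const std::vector<T> &items) {
    std::vector<T> result;
    for (const T &item : items) {
        if (std::find(result.begin(), result.end(), item) == result.end())
            result.push_back(item);
    }
    return result;
}
```

Because the result depends only on the order of the input, two runs over the same CFG produce the same list regardless of where the blocks happen to be allocated.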
Owen Anderson a8a2e5c666 Clean up a bit.
llvm-svn: 29950
2006-08-29 06:10:56 +00:00
Nick Lewycky b2e8ae1700 Add PredicateSimplifier pass. Collapses equal variables into one form
and simplifies expressions. This implements the optimization described
in PR807.

llvm-svn: 29947
2006-08-28 22:44:55 +00:00
Owen Anderson 62c84fe371 Make LoopUnroll fold excessive BasicBlocks. This results in a significant speedup of
gccas on 252.eon

llvm-svn: 29936
2006-08-28 02:09:46 +00:00
Chris Lattner 97c9f20c52 simplify AnalysisGroup registration, eliminating one typeid call.
llvm-svn: 29932
2006-08-28 00:42:29 +00:00
Chris Lattner c2d3d3112e eliminate RegisterOpt. It does the same thing as RegisterPass.
llvm-svn: 29925
2006-08-27 22:42:52 +00:00
Chris Lattner 3d27be1333 s|llvm/Support/Visibility.h|llvm/Support/Compiler.h|
llvm-svn: 29911
2006-08-27 12:54:02 +00:00
Owen Anderson 403b95af47 Fix a crash related to updating Phi nodes in the original header block. This was
causing a crash in 175.vpr

llvm-svn: 29887
2006-08-25 22:13:55 +00:00
Owen Anderson 8e4b029573 Add an assertion to check that we're really preserving LCSSA.
llvm-svn: 29886
2006-08-25 22:12:36 +00:00
Owen Anderson 8cca95cf5d Reapply the indvars patch, since nothing blew up last night.
llvm-svn: 29874
2006-08-25 17:41:25 +00:00
Owen Anderson 94446a4267 Revert my previous patch. Since there are some major changes that went in today,
I'm going to wait to put this in HEAD until tomorrow, so as not to clutter the nightly
tester.

llvm-svn: 29868
2006-08-25 03:45:57 +00:00
Owen Anderson 15a6423431 Specify that indvars actually preserve LCSSA. This has been done for a while, but I
forgot to put in the analysis usage.

llvm-svn: 29867
2006-08-25 03:32:13 +00:00
Owen Anderson e001d811ba Implement unrolling of multiblock loops. This significantly improves the
utility of the LoopUnroll pass.

Also, add a testcase for multiblock-loop unrolling.

llvm-svn: 29859
2006-08-24 21:28:19 +00:00
Reid Spencer 5495fe8dd6 Fix a grammaro in a comment.
llvm-svn: 29765
2006-08-18 09:01:07 +00:00
Chris Lattner 6441cf93c9 Handle single-entry PHI nodes correctly. This fixes PR877 and
Transforms/CondProp/2006-08-14-SingleEntryPhiCrash.ll

llvm-svn: 29673
2006-08-14 21:38:05 +00:00
Chris Lattner f18b396cc2 Don't attempt to split subloops out of a loop with a huge number of backedges.
Not only will this take huge amounts of compile time, the resultant loop nests
won't be useful for optimization.  This reduces loopsimplify time on
Transforms/LoopSimplify/2006-08-11-LoopSimplifyLongTime.ll from ~32s to ~0.4s
with a debug build of llvm on a 2.7GHz G5.

llvm-svn: 29647
2006-08-12 05:25:00 +00:00
Chris Lattner 85d9944f9a Reimplement the loopsimplify code which deletes edges from unreachable
blocks that target loop blocks.

Before, the code was run once per loop, and depended on the number of
predecessors each block in the loop had.  Unfortunately, scanning preds can
be really slow when huge numbers of phis exist or when phis with huge numbers
of inputs exist.

Now, the code is run once per function and scans successors instead of preds,
which is far faster.  In addition, the new code is simpler and is goto free,
woo.

This change speeds up a nasty testcase Duraid provided me from taking hours to
taking ~72s with a debug build.  The functionality this implements is already
tested in the testsuite as Transforms/CodeExtractor/2004-03-13-LoopExtractorCrash.ll.

llvm-svn: 29644
2006-08-12 04:51:20 +00:00
Reid Spencer 2b6d18a64f Make this example pass use some things from lib/Support (EscapeString,
SlowOperatingInfo, Statistics). Besides providing an example of how to
use these facilities, it also serves to debug problems with runtime linking
when dlopening a loadable module. These three support facilities exercise
different combinations of Text/Weak Weak/Text and Text/Text linking
between the executable and the module.

llvm-svn: 29552
2006-08-07 23:17:24 +00:00
Reid Spencer e6458c3fb2 For PR780:
1. Change the usage of LOADABLE_MODULE so that it implies all the things
   necessary to make a loadable module. This reduces the user's burden to
   get a loadable module correctly built.
2. Document the usage of LOADABLE_MODULE in the MakefileGuide
3. Adjust the makefile for lib/Transforms/Hello to use the new specification
   for building loadable modules
4. Adjust the sample project to not attempt to build a shared library for
   its little library. This was just wasteful and not instructive at all.

llvm-svn: 29551
2006-08-07 23:12:15 +00:00
Chris Lattner c9009d917d Fix PR867 (and maybe 868) and testcase:
Transforms/SimplifyCFG/2006-08-03-Crash.ll

llvm-svn: 29515
2006-08-03 21:40:24 +00:00
Chris Lattner 3ff620178b Changes:
1. Update an obsolete comment.
2. Make the sorting by base an explicit (though still N^2) step, so
   that the code is more clear on what it is doing.
3. Partition uses so that uses inside the loop are handled before uses
   outside the loop.

Note that none of these changes currently changes the code inserted by LSR,
but they are a stepping stone to getting there.

This code is the result of some crazy pair programming with Nate. :)

llvm-svn: 29493
2006-08-03 06:34:50 +00:00
Chris Lattner 38b6e8382a Add special check to avoid isLoop call. Simple, but doesn't seem to speed
up lcssa much in practice.

llvm-svn: 29465
2006-08-02 00:16:47 +00:00
Chris Lattner 5a2bc786be Replace the SSA update code in LCSSA with a bottom-up approach instead of a top
down approach, inspired by discussions with Tanya.

This approach is significantly faster, because it does not need dominator
frontiers and it does not insert extraneous unused PHI nodes.  For example, on
252.eon, in a release-asserts build, this speeds up LCSSA (which is the slowest
pass in gccas) from 9.14s to 0.74s on my G5.  This code is also slightly smaller
and significantly simpler than the old code.

Amusingly, in a normal Release build (which includes the
"assert(L->isLCSSAForm());" assertion), asserting that the result of LCSSA
is in LCSSA form is actually slower than the LCSSA transformation pass
itself on 252.eon.  I will see if Loop::isLCSSAForm can be sped up next.

llvm-svn: 29463
2006-08-02 00:06:09 +00:00
Chris Lattner 85ea83e821 Add some advice
llvm-svn: 29324
2006-07-27 04:24:14 +00:00
Chris Lattner 1b928478aa Minor comment tweaks
llvm-svn: 29226
2006-07-20 19:06:16 +00:00
Devang Patel edd2f9952e Make it fit into 80 cols.
llvm-svn: 29223
2006-07-20 18:03:39 +00:00
Devang Patel 839d9260f0 Add new constructor to accept vector of exported names while creating
InternalizePass.

llvm-svn: 29222
2006-07-20 17:48:05 +00:00
Owen Anderson 8ef4c92ef8 Add an assertion.
llvm-svn: 29199
2006-07-19 05:48:45 +00:00
Owen Anderson aba8c199dd Make LoopUnroll not die on LCSSA Phis. This makes lencod work again.
llvm-svn: 29198
2006-07-19 05:45:14 +00:00
Owen Anderson 00b974cdbc Fix an error that hadn't yet caused any problems, but I'm sure it would have
somewhere down the road.

llvm-svn: 29197
2006-07-19 03:51:48 +00:00
Chris Lattner fea3974133 silence warnings in a release build
llvm-svn: 29189
2006-07-18 21:48:57 +00:00
Evan Cheng e9c68f52e1 Only reuse a previous IV if it would not require a type conversion.
llvm-svn: 29186
2006-07-18 19:07:58 +00:00
Chris Lattner 19247f36ea eliminate some ugly code, using ConstantExpr::getWithOperands instead.
llvm-svn: 29149
2006-07-14 22:21:31 +00:00
Owen Anderson bea70ee1de Hopefully the final attempt at making IndVars preserve LCSSA.
This should fix PR 831.

llvm-svn: 29141
2006-07-14 18:49:15 +00:00
Chris Lattner 9b6c02ebe4 Revert this patch temporarily until PR831 is fixed.
llvm-svn: 29134
2006-07-13 19:05:20 +00:00
Chris Lattner b3c64f7ab3 Handle instructions that are in the map but map to a null pointer.
This unbreaks smg2000.

llvm-svn: 29127
2006-07-12 21:37:11 +00:00
Owen Anderson dea9202e3b IndVars now (correctly) preserves LCSSA form.
llvm-svn: 29126
2006-07-12 21:29:14 +00:00
Chris Lattner 6148456ec2 In addition to deleting calls, the inliner can constant fold them as well.
Handle this case, which doesn't require a new callgraph edge.  This fixes
a crash compiling MallocBench/gs.

llvm-svn: 29121
2006-07-12 18:37:18 +00:00
Chris Lattner 5de3b8b262 Change the callgraph representation to store the callsite along with the
target CG node.  This allows the inliner to properly update the callgraph
when using the pruning inliner.  The pruning inliner may not copy over all
call sites from a callee to a caller, so the edges corresponding to those
call sites should not be copied over either.

This fixes PR827 and Transforms/Inline/2006-07-12-InlinePruneCGUpdate.ll

llvm-svn: 29120
2006-07-12 18:29:36 +00:00
Chris Lattner 091b6ea847 Silence a warning produced in assertions-disabled mode
llvm-svn: 29108
2006-07-11 18:31:26 +00:00
Owen Anderson 15b1f7d2cd Revert my indvars changes because they were breaking things. Unfortunately this
didn't start showing up until after the recent instcombine fixes.

llvm-svn: 29102
2006-07-11 07:25:33 +00:00
Owen Anderson bbf8990ef7 Add a comment, and fix a typo that broke the build.
llvm-svn: 29094
2006-07-10 22:15:25 +00:00
Owen Anderson ae8aa646f1 Don't indent the entire function.
llvm-svn: 29093
2006-07-10 22:03:18 +00:00
Chris Lattner b7845d69db Recognize 16-bit bswaps by relaxing overconstrained pattern.
This implements Transforms/InstCombine/bswap.ll:test[34].

llvm-svn: 29087
2006-07-10 20:25:24 +00:00
Owen Anderson a6968f83b2 Make instcombine not remove Phi nodes when LCSSA is live.
llvm-svn: 29083
2006-07-10 19:03:49 +00:00
Owen Anderson fe6e97d275 Fix typo in the comment.
llvm-svn: 29078
2006-07-09 21:35:40 +00:00
Owen Anderson aecaabb6e1 Add a fix for an issue where LCSSA would fail to insert undef's in some corner
cases.  Ideally, this issue will go away in the future as LCSSA gets smarter
about which Phi nodes it inserts.

llvm-svn: 29076
2006-07-09 08:14:06 +00:00
Chris Lattner fd2e13b107 Fix PR820 and Transforms/GlobalOpt/2006-07-07-InlineAsmCrash.ll
llvm-svn: 29071
2006-07-07 21:37:01 +00:00
Chris Lattner 996795b0dd Use hidden visibility to make symbols in an anonymous namespace get
dropped.  This shrinks libllvmgcc.dylib another 67K

llvm-svn: 28975
2006-06-28 23:17:24 +00:00
Chris Lattner 4a4c7fe7fa Shrink libllvmgcc.dylib by another 23K
llvm-svn: 28972
2006-06-28 22:08:15 +00:00
Owen Anderson 18e816f356 Switch to a very conservative heuristic for determining when loop-unswitching
will be profitable.  This is mainly to remove some cases where excessive
unswitching would result in long compile times and/or huge generated code.

Once someone comes up with a better heuristic that avoids these cases, this
should be switched out.

llvm-svn: 28962
2006-06-28 17:47:50 +00:00
Chris Lattner 3fda386965 Fix Transforms/InstCombine/2006-06-28-infloop.ll
llvm-svn: 28961
2006-06-28 17:34:50 +00:00
Chris Lattner 0a2e11260e Don't unswitch really large loops even if they are mostly filled with empty
blocks.

llvm-svn: 28959
2006-06-28 16:38:55 +00:00
Andrew Lenharth ebfa24ee9a Catch more function pointer casting problems
Remove the Function pointer cast in these calls, converting it to
a cast of argument.
%tmp60 = tail call int cast (int (ulong)* %str to int (int)*)( int 10 )
%tmp60 = tail call int cast (int (ulong)* %str to int (int)*)( uint %tmp51 )

llvm-svn: 28953
2006-06-28 01:01:52 +00:00
Owen Anderson bb3ae5eb8f Fix for 2006-06-27-DeadSwitchCase.ll
Be more careful when updating Phi nodes after eliminating dead switch cases.  Fix
proposed by Chris.

llvm-svn: 28947
2006-06-27 22:26:09 +00:00
Chris Lattner c4998a0138 Fix Transforms/DeadArgElim/2006-06-27-struct-ret.ll. -deadargelim should not
remove the struct return argument of a csret function, even if it is obviously
dead.

llvm-svn: 28943
2006-06-27 21:05:04 +00:00
Owen Anderson b659bb4196 De-pessimize the handling of LCSSA Phi nodes in IndVarSimplify. Hopefully this
will make Shootout-C/nestedloop faster.

llvm-svn: 28924
2006-06-27 02:17:08 +00:00
Chris Lattner 49771a0462 random code cleanups, no functionality change
llvm-svn: 28914
2006-06-26 19:10:05 +00:00
Owen Anderson f52351e50f Make LoopUnswitch able to unswitch loops with live-out values by taking advantage
of LCSSA.  This results in several times the number of unswitchings occurring on
tests such as timberwolfmc, unix-tbl, and ldecod.

llvm-svn: 28912
2006-06-26 07:44:36 +00:00
Chris Lattner 053fb9319d Fix IndVarsSimplify/2006-06-16-Indvar-LCSSA-Crash.ll, a case where a
"LCSSA" phi node causes indvars to break dominance properties.  This fix
causes indvars to avoid inserting aggressive code in this case; instead,
indvars should be fixed to be more aggressive in the face of lcssa phi's.

llvm-svn: 28850
2006-06-17 01:02:31 +00:00
Evan Cheng 8a417a2fde Add missing casts. This fixed some regressions.
llvm-svn: 28834
2006-06-16 18:37:15 +00:00
Evan Cheng 1fc4025a9c More libcall transformations:
printf("%s\n", str) -> puts(str)
printf("%c", c) -> putchar(c)
Also fixed fprintf(file, "%c", c) -> fputc(c, file)

llvm-svn: 28815
2006-06-16 08:36:35 +00:00
Evan Cheng f2ea587aa2 Simplify fprintf(file, "%s", str) to fputs(str, file).
llvm-svn: 28814
2006-06-16 04:52:30 +00:00
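The fprintf simplifications in the two commits above rely on the format string doing nothing beyond copying its argument, so a cheaper stdio call writes the same bytes. The sketch below checks that equivalence by writing to a temporary file. The capture helper and the via* function names are hypothetical and exist only for this illustration; they are not part of the optimizer.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: runs `emit` on a temporary file and returns the bytes
// it wrote (or an empty string if no temporary file could be created).
template <typename Emit>
std::string capture(Emit emit) {
    std::FILE *f = std::tmpfile();
    if (!f) return std::string();
    emit(f);
    std::rewind(f);
    std::string out;
    for (int ch = std::fgetc(f); ch != EOF; ch = std::fgetc(f))
        out.push_back(static_cast<char>(ch));
    std::fclose(f);
    return out;
}

// fprintf(file, "%s", str) and its simplified form fputs(str, file).
std::string viaFprintfS(const char *str) {
    return capture([str](std::FILE *f) { std::fprintf(f, "%s", str); });
}
std::string viaFputs(const char *str) {
    return capture([str](std::FILE *f) { std::fputs(str, f); });
}

// fprintf(file, "%c", c) and its simplified form fputc(c, file).
std::string viaFprintfC(char c) {
    return capture([c](std::FILE *f) { std::fprintf(f, "%c", c); });
}
std::string viaFputc(char c) {
    return capture([c](std::FILE *f) { std::fputc(c, f); });
}
```

The printf("%s\n", str) -> puts(str) case works the same way, because puts appends the trailing newline itself.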
Chris Lattner c482a9e31a Implement Transforms/InstCombine/bswap.ll, turning common shift/and/or bswap
idioms into bswap intrinsics.

llvm-svn: 28803
2006-06-15 19:07:26 +00:00
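For reference, this is the kind of source-level pattern the commit above refers to: a byte swap written by hand with shifts, masks, and ors. The function name is hypothetical, and the exact set of patterns the matcher accepts is not shown in this log, so treat this as one plausible instance rather than the definitive form.

```cpp
#include <cstdint>

// A common hand-written 32-bit byte swap built from shifts, masks, and ors.
// Each term moves one byte of x to its mirrored position.
std::uint32_t bswap32Idiom(std::uint32_t x) {
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}
```

Recognizing the whole expression as a single byte swap lets a backend emit one instruction where the target has one, instead of the individual shifts and masks.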
Chris Lattner 0c4f5a655a Fix Transforms/LoopUnswitch/2006-06-13-SingleEntryPHI.ll, a loop unswitch
bug exposed by the recent lcssa work.

llvm-svn: 28779
2006-06-14 04:46:17 +00:00
Chris Lattner e3abb14503 Use the PotDoms map to memoize 'dominating value' lookup. With this patch,
LCSSA is still the slowest pass when gccas'ing 252.eon, but now it only takes
39s instead of 289s. :)

llvm-svn: 28776
2006-06-14 01:13:57 +00:00
Owen Anderson e714a5c549 Fix another instance where PHI nodes need special treatment.
llvm-svn: 28774
2006-06-13 20:50:09 +00:00
Owen Anderson 3f8ff0449a Fix a bug that was causing major slowdowns in povray. This was due to LCSSA
not handling PHI nodes correctly when determining if a value was live-out.

This patch reduces the number of detected live-out variables in the testcase
from 6565 to 485.

llvm-svn: 28771
2006-06-13 19:37:18 +00:00
Owen Anderson fd0a3d6e5c Reapply my 6/9 changes. The bug Evan saw no longer occurs.
llvm-svn: 28759
2006-06-12 21:49:21 +00:00
Chris Lattner b5c9d7a0af Fix an infinite loop on Transforms/SimplifyCFG/2006-06-12-InfLoop.ll
llvm-svn: 28758
2006-06-12 20:18:01 +00:00
Owen Anderson 0ac336965e Fix for 2006-06-26-MultipleExitsSingleBlock.
If a single exit block has multiple predecessors within the loop, it will
appear in the exit blocks list more than once.  LCSSA needs to take that into
account so that it doesn't double process that exit block.

llvm-svn: 28750
2006-06-12 07:10:16 +00:00
Owen Anderson b538f14d2a Re-commit the safe parts of my 6/9 patch. Still working on fixing the unsafe parts.
llvm-svn: 28748
2006-06-11 19:22:28 +00:00
Evan Cheng 1b6e310e6f Back out Owen's 6/9 changes. They broke MultiSource/Benchmarks/Prolangs-C/bison (and perhaps others).
llvm-svn: 28747
2006-06-11 09:32:57 +00:00
Owen Anderson b1dc1d44f8 Add LCSSA as a requirement for LoopUnswitch, and assert that LoopUnswitch preserves
LCSSA.

llvm-svn: 28739
2006-06-09 18:40:32 +00:00
Owen Anderson 505adff3f0 Make Loop able to verify that it is in LCSSA-form, and have the LCSSA pass assert
on this.

llvm-svn: 28738
2006-06-09 18:33:30 +00:00
Evan Cheng 398f70292c RewriteExpr, either the new PHI node of induction variable or the
post-increment value, should be first cast to the appropriate type (to the
type of the common expr). Otherwise, the rewrite of a use based on (common +
iv) may end up with an incorrect type.

llvm-svn: 28735
2006-06-09 00:12:42 +00:00
Owen Anderson 5d029264ec Update some comments, and expose LCSSAID in preparation for having other passes
require LCSSA.

llvm-svn: 28734
2006-06-08 20:02:53 +00:00
Reid Spencer d4b795902c Fix a spello in a comment.
llvm-svn: 28714
2006-06-07 21:24:10 +00:00
Chris Lattner 95cebb082f Fix a bug in a recent patch. This fixes UnitTests/Vector/Altivec/casts.c on
PPC/altivec

llvm-svn: 28698
2006-06-06 22:26:02 +00:00
Owen Anderson ac601b4c4b Fix some formatting, and use inLoop() when appropriate.
llvm-svn: 28694
2006-06-06 04:36:36 +00:00
Owen Anderson 9e81c1bb03 Stop a memory leak, and update some comments.
llvm-svn: 28693
2006-06-06 04:28:30 +00:00
Owen Anderson 766f90b08e Some more clean-up, and squash an IDF-Phi related bug.
llvm-svn: 28680
2006-06-04 00:55:19 +00:00
Owen Anderson eb33815f1b Various clean-ups suggested by Chris.
llvm-svn: 28678
2006-06-04 00:02:23 +00:00
Owen Anderson d00eacc4f9 Fix a bug in Phi-node insertion. Also, update some comments to reflect what's
actually going on.

llvm-svn: 28677
2006-06-03 23:22:50 +00:00
Chris Lattner 540886f0ae Remove unneeded hook. Patch by Anton K. Thanks!
llvm-svn: 28664
2006-06-02 19:11:46 +00:00
Chris Lattner 02e0b4ddb7 Force anything that #includes llvm/Transforms/Utils/UnifyFunctionExitNodes.h
to link in the implementation.  Thanks to Anton Korobeynikov for figuring out
what was going on here.

llvm-svn: 28660
2006-06-02 18:40:06 +00:00
Chris Lattner cdf2b1fc30 Remove dead #include
llvm-svn: 28642
2006-06-01 20:02:28 +00:00
Chris Lattner cc340c02a4 Make the "pruning cloner" smarter. As it propagates constants through the
code (while cloning) it often gets the branch/switch instructions.  Since it
knows that edges of the CFG are dead, it need not clone (or even look) at
the obviously dead blocks.  This should speed up the inliner substantially on
code where there are lots of inlinable calls to functions with constant
arguments.  On C++ code in particular, this kicks in.

llvm-svn: 28641
2006-06-01 19:19:23 +00:00
Chris Lattner f905a7b994 Silence a -pedantic warning.
llvm-svn: 28632
2006-06-01 17:16:21 +00:00
Owen Anderson 619e4ba57f Remove a FIXME that was fixed with my last patch.
llvm-svn: 28619
2006-06-01 06:07:40 +00:00
Owen Anderson cd76fa04a1 More cleanups. Also, add a special case for updating PHI nodes, and
reimplement getValueDominatingFunction to walk the DominanceTree rather than
just searching blindly.

llvm-svn: 28618
2006-06-01 06:05:47 +00:00
Chris Lattner 1df0e98ac2 Swap the order of operands created here. For +&|^, the order doesn't matter,
but for sub, it really does!  This fixes a miscompilation of fibheap_cut in
llvmgcc4.

llvm-svn: 28600
2006-05-31 21:14:00 +00:00
Owen Anderson dad8c57340 Extract a huge loop into a helper method. Fix a few iterator-invalidation bugs.
llvm-svn: 28599
2006-05-31 20:55:06 +00:00
Owen Anderson 8a8f278f15 Add Use replacement. Assuming there is nothing horribly wrong with this, LCSSA
is now theoretically feature-complete.  It has not, however, been thoroughly
tested, and is still considered experimental.

llvm-svn: 28529
2006-05-29 01:00:00 +00:00
Owen Anderson 152d063ccb Major think-o. Iterate over all live out-of-loop values, and perform the
other calculations on each individually, rather than trying to delay it and do
them all at the end.

llvm-svn: 28527
2006-05-28 19:33:28 +00:00
Owen Anderson 1310e42803 Make LCSSA insert proper Phi nodes throughout the rest of the CFG by computing
the iterated Dominance Frontier of the loop-closure Phi's.  This is the
second phase of the LCSSA pass.  The third phase (coming soon) will be to
update all uses of loop variables to use the loop-closure Phi's instead.

llvm-svn: 28524
2006-05-27 18:47:11 +00:00
Chris Lattner 67c424e010 Fix some regressions from the inliner patch I committed last night. This fixes
ldecod, lencod, and SPASS.

llvm-svn: 28523
2006-05-27 17:28:13 +00:00
Chris Lattner be853d77e9 Switch the inliner over to using CloneAndPruneFunctionInto. This effectively
makes it so that it constant folds instructions on the fly.  This is good
for several reasons:

0. Many instructions are constant foldable after inlining, particularly if
   inlining a call with constant arguments.
1. Without this, the inliner has to allocate memory for all of the instructions
   that can be constant folded, then a subsequent pass has to delete them.  This
   gets the job done without this extra work.
2. This makes the inliner *pass* a bit more aggressive: in particular, it
   partially solves a phase order issue where the inliner would inline lots
   of code that folds away to nothing, but think that the resultant function
   is big because of this code that will be gone.  Now the code never exists.

This is the first part of a 2-step process.  The second part will be smart
enough to see when this implicit constant folding propagates a constant into
a branch or switch instruction, making CFG edges dead.

This implements Transforms/Inline/inline_constprop.ll

llvm-svn: 28521
2006-05-27 01:28:04 +00:00
Chris Lattner 3df13f4f22 Implement a new method, CloneAndPruneFunctionInto, as documented.
llvm-svn: 28519
2006-05-27 01:22:24 +00:00
Chris Lattner bc3c879fcf Refactor some code to expose an interface to constant fold an instruction given its opcode, type, and operands.
llvm-svn: 28517
2006-05-27 01:18:04 +00:00
Owen Anderson b4e16996f1 A few small clean-ups, and the addition of an LCSSA statistic.
llvm-svn: 28512
2006-05-27 00:31:37 +00:00
Owen Anderson 6e047ab8fc Fix a copy-and-paste-o that would break some compilers.
llvm-svn: 28507
2006-05-26 21:19:17 +00:00
Owen Anderson f3dd3e2bfd Clean up and refactor LCSSA a bunch. It should also run faster now, though
there's still a lot of work to be done on it.

llvm-svn: 28506
2006-05-26 21:11:53 +00:00
Chris Lattner dab43b2b0e Implement Transforms/InstCombine/store.ll:test2.
llvm-svn: 28503
2006-05-26 19:19:20 +00:00
Owen Anderson 8eca8910b6 Skeletal LCSSA pass. This is currently non-functional. Expect functionality
and documentation updates soon.

llvm-svn: 28495
2006-05-26 13:58:26 +00:00
Chris Lattner 0e47716e69 Transform things like (splat(splat)) -> splat
llvm-svn: 28490
2006-05-26 00:29:06 +00:00
Chris Lattner 12249be286 Introduce a helper function that simplifies interpretation of shuffle masks.
No functionality change.

llvm-svn: 28489
2006-05-25 23:48:38 +00:00
Chris Lattner 99155be33f Turn (cast (shuffle (cast)) -> shuffle (cast) if it reduces the # casts in
the program.  This exposes more opportunities for the instcombiner, and implements
vec_shuffle.ll:test6

llvm-svn: 28487
2006-05-25 23:24:33 +00:00
Chris Lattner 83f6578b0c extract element from a shuffle vector can be trivially turned into an
extractelement from the SV's source.  This implements vec_shuffle.ll:test[45]

llvm-svn: 28485
2006-05-25 22:53:38 +00:00
Chris Lattner 0853700582 Revert a patch that is unsafe, due to out of range array accesses in inner
array scopes possibly accessing valid memory in outer subscripts.

llvm-svn: 28478
2006-05-25 21:25:12 +00:00
Chris Lattner a643d528bd Patch for a new instcombine xform, patch contributed by Nick Lewycky!
This implements Transforms/InstCombine/2006-05-10-InvalidIndexUndef.ll

llvm-svn: 28450
2006-05-24 17:34:30 +00:00
Chris Lattner aa2372562e Patches to make the LLVM sources more -pedantic clean. Patch provided
by Anton Korobeynikov!  This is a step towards closing PR786.

llvm-svn: 28447
2006-05-24 17:04:05 +00:00
Chris Lattner d0622b6894 Silence a bogus gcc warning
llvm-svn: 28422
2006-05-20 23:14:03 +00:00
Reid Spencer 2452c94df4 Fix a doxygen problem and break lines at 80 columns
llvm-svn: 28395
2006-05-19 19:09:46 +00:00
Chris Lattner e4cb4768fa Declare that lowerinvoke doesn't interact with other lowering passes.
Patch written by Domagoj Babic!

llvm-svn: 28367
2006-05-17 21:05:27 +00:00
Chris Lattner 2e266807c3 Add a CloneModule call that exposes the mapping of values from the old module
to the new module.  Patch provided by Nick Lewycky!

llvm-svn: 28349
2006-05-17 18:05:35 +00:00
Chris Lattner 35515557c7 remove some dead code identified by coverity
llvm-svn: 28289
2006-05-14 18:45:44 +00:00
Chris Lattner 3237da073e remove dead variables
llvm-svn: 28286
2006-05-14 18:33:57 +00:00
Evan Cheng 18d0438148 Backing out last check-in for now. It's causing an infinite loop in gccas on lencode.
llvm-svn: 28284
2006-05-14 06:46:03 +00:00
Chris Lattner 3987a8532d Add/Sub/Mul are safe to promote here as well. Incrementing a single-bit
bitfield now gives this code:

_plus:
        lwz r2, 0(r3)
        rlwimi r2, r2, 0, 1, 31
        xoris r2, r2, 32768
        stw r2, 0(r3)
        blr

instead of this:

_plus:
        lwz r2, 0(r3)
        srwi r4, r2, 31
        slwi r4, r4, 31
        addis r4, r4, -32768
        rlwimi r2, r4, 0, 0, 0
        stw r2, 0(r3)
        blr

this can obviously still be improved.

llvm-svn: 28275
2006-05-13 02:16:08 +00:00
Chris Lattner 1ebbe6a22e Implement simple promotion for cast elimination in instcombine. This is
currently very limited, but can be extended in the future.  For example,
we now compile:

uint %test30(uint %c1) {
        %c2 = cast uint %c1 to ubyte
        %c3 = xor ubyte %c2, 1
        %c4 = cast ubyte %c3 to uint
        ret uint %c4
}

to:

_xor:
        movzbl 4(%esp), %eax
        xorl $1, %eax
        ret

instead of:

_xor:
        movb $1, %al
        xorb 4(%esp), %al
        movzbl %al, %eax
        ret

More impressively, we now compile:

struct B { unsigned bit : 1; };
void xor(struct B *b) { b->bit = b->bit ^ 1; }

To (X86/PPC):

_xor:
        movl 4(%esp), %eax
        xorl $-2147483648, (%eax)
        ret
_xor:
        lwz r2, 0(r3)
        xoris r2, r2, 32768
        stw r2, 0(r3)
        blr

instead of (X86/PPC):

_xor:
        movl 4(%esp), %eax
        movl (%eax), %ecx
        movl %ecx, %edx
        shrl $31, %edx
        # TRUNCATE movb %dl, %dl
        xorb $1, %dl
        movzbl %dl, %edx
        andl $2147483647, %ecx
        shll $31, %edx
        orl %ecx, %edx
        movl %edx, (%eax)
        ret

_xor:
        lwz r2, 0(r3)
        srwi r4, r2, 31
        xori r4, r4, 1
        rlwimi r2, r4, 31, 0, 0
        stw r2, 0(r3)
        blr

This implements InstCombine/cast.ll:test30.

llvm-svn: 28273
2006-05-13 02:06:03 +00:00
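The test30 case in the commit above can be mirrored in C++ to show why the promotion is safe: truncating to 8 bits, xoring with 1, and zero-extending gives the same value as masking to 8 bits and xoring at full width, which matches the movzbl/xorl sequence shown. The function names are hypothetical and this is only a sketch of the equivalence, not the instcombine implementation.

```cpp
#include <cstdint>

// Mirrors test30: truncate to 8 bits, xor with 1, then zero-extend back.
std::uint32_t narrowXorWiden(std::uint32_t c1) {
    std::uint8_t c2 = static_cast<std::uint8_t>(c1);
    std::uint8_t c3 = static_cast<std::uint8_t>(c2 ^ 1);
    return static_cast<std::uint32_t>(c3);
}

// The promoted form: zero-extend the low byte once, then xor at 32 bits.
std::uint32_t promotedXor(std::uint32_t c1) {
    return (c1 & 0xFFu) ^ 1u;
}
```

The xor constant only touches bit 0, so it cannot set any bit above the low byte, and a single mask is enough to keep the result identical.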
Chris Lattner cd60d38b30 Remove some dead variables.
Fix a nasty bug in the memcmp optimizer where we used the wrong variable!

llvm-svn: 28269
2006-05-12 23:35:26 +00:00
Chris Lattner 94acc47654 Remove dead stuff
llvm-svn: 28268
2006-05-12 23:32:01 +00:00
Chris Lattner 1443bc52be Refactor some code, making it simpler.
When doing the initial pass of constant folding, if we get a constantexpr,
simplify the constant expr like we would do if the constant is folded in the
normal loop.

This fixes the missed-optimization regression in
Transforms/InstCombine/getelementptr.ll last night.

llvm-svn: 28224
2006-05-11 17:11:52 +00:00
Chris Lattner a36ee4ea34 Two changes:
1. Implement InstCombine/deadcode.ll by not adding instructions in unreachable
   blocks (due to constants in conditional branches/switches) to the worklist.
   This causes them to be deleted before instcombine starts up, leading to
   better optimization.

2. In the prepass over instructions, do trivial constprop/dce as we go.  This
   has the effect of improving the effectiveness of #1.  In addition, it
   *significantly* speeds up instcombine on test cases with large amounts of
   constant folding code (for example, that produced by code specialization
   or partial evaluation).  In one example, it speeds up instcombine from
   0.0589s to 0.0224s with a release build (a 2.6x speedup).

llvm-svn: 28215
2006-05-10 19:00:36 +00:00
Chris Lattner 4fe87d67c4 Patch to make some xforms preserve each other. Patch contributed by
Domagoj Babic!

llvm-svn: 28181
2006-05-09 04:13:41 +00:00
Chris Lattner 1d441adfbf Move some code around.
Make the "fold (and (cast A), (cast B)) -> (cast (and A, B))" transformation
only apply when both casts really will cause code to be generated.  If one or
both doesn't, then this xform doesn't remove a cast.

This fixes Transforms/InstCombine/2006-05-06-Infloop.ll

llvm-svn: 28141
2006-05-06 09:00:16 +00:00
Chris Lattner e745c7de0e Fix an infinite loop compiling oggenc last night.
llvm-svn: 28128
2006-05-05 20:51:30 +00:00
Chris Lattner 3af1053488 Implement InstCombine/cast.ll:test29
llvm-svn: 28126
2006-05-05 06:39:07 +00:00
Chris Lattner fb29692055 Fix Transforms/InstCombine/2006-05-04-DemandedBitCrash.ll
llvm-svn: 28101
2006-05-04 17:33:35 +00:00
Chris Lattner 2d3a02725d Add pass ID's for various passes, so they can be AddRequiredID. Patch by
Domagoj Babic!

llvm-svn: 28048
2006-05-02 04:24:36 +00:00
Chris Lattner 655d08fda8 Fix InstCombine/2006-04-28-ShiftShiftLongLong.ll
llvm-svn: 28019
2006-04-28 22:21:41 +00:00
Chris Lattner e63d808b6e Fix Transforms/Reassociate/2006-04-27-ReassociateVector.ll
llvm-svn: 28007
2006-04-28 04:14:49 +00:00
Chris Lattner b6cb64b7e6 Add support for inserting undef into a vector. This implements
Transforms/InstCombine/vec_insert_to_shuffle.ll

llvm-svn: 27997
2006-04-27 21:14:21 +00:00
Chris Lattner f98b4aa2e7 Fix some nondeterministic behavior in the mem2reg pass that (in addition to
nondeterminism being bad) could cause some trivial missed optimizations (dead
phi nodes being left around for later passes to clean up).

With this, llvm-gcc4 now bootstraps and correctly compares.  I don't know
why I never tried to do it before... :)

llvm-svn: 27984
2006-04-27 01:14:43 +00:00
Chris Lattner dae49df407 Fix Transforms/ScalarRepl/2006-04-20-PromoteCrash.ll
llvm-svn: 27912
2006-04-20 20:48:50 +00:00
Andrew Lenharth f89e630b2f Make code match cvs commit message :)
llvm-svn: 27881
2006-04-20 15:41:37 +00:00
Andrew Lenharth 61eae29ad6 If we can convert the return pointer type into an integer that IntPtrType
can be converted to losslessly, we can continue the conversion to a direct call.

llvm-svn: 27880
2006-04-20 14:56:47 +00:00
Chris Lattner 36dd7c98d1 Turn x86 unaligned load/store intrinsics into aligned load/store instructions
if the pointer is known aligned.

llvm-svn: 27781
2006-04-17 22:26:56 +00:00
Chris Lattner 9095186deb Fix a bug in the 'shuffle(undef,x,mask) -> shuffle(x, undef,mask')' xform
Make the insert/extract elt -> shuffle code more aggressive.

This fixes CodeGen/PowerPC/vec_shuffle.ll

llvm-svn: 27728
2006-04-16 00:51:47 +00:00
Chris Lattner 34cebe785d Canonicalize shuffle(undef,x,mask) -> shuffle(x, undef,mask').
llvm-svn: 27727
2006-04-16 00:03:56 +00:00
Chris Lattner 39fac448d6 significant cleanups to code that uses insert/extractelt heavily. This builds
maximal shuffles out of them where possible.

llvm-svn: 27717
2006-04-15 01:39:45 +00:00
Chris Lattner 3323ce165d Teach scalarrepl to promote unions of vectors and floats, producing
insert/extractelement operations.  This implements
Transforms/ScalarRepl/vector_promote.ll

llvm-svn: 27710
2006-04-14 21:42:41 +00:00
Andrew Lenharth 92cf71f6d7 linear -> constant time
llvm-svn: 27652
2006-04-13 13:43:31 +00:00
Reid Spencer 13a1a7a4a6 Get rid of a signed/unsigned compare warning.
llvm-svn: 27625
2006-04-12 19:28:15 +00:00
Chris Lattner b19a5c661b Turn casts into getelementptr's when possible. This enables SROA to be more
aggressive in some cases where LLVMGCC 4 is inserting casts for no reason.

This implements InstCombine/cast.ll:test27/28.

llvm-svn: 27620
2006-04-12 18:09:35 +00:00
Chris Lattner 2d37f920ad Implement vec_shuffle.ll:test3
llvm-svn: 27573
2006-04-10 23:06:36 +00:00
Chris Lattner fbb77a408b Implement InstCombine/vec_shuffle.ll:test[12]
llvm-svn: 27571
2006-04-10 22:45:52 +00:00
Andrew Lenharth a9cdcca3c3 Add a simple pass to make sure that all (non-library) calls to malloc and free
are visible to analysis as intrinsics.  That is, make sure someone doesn't pass
free around by address in some struct (as happens in say 176.gcc).

This doesn't get rid of any indirect calls, just ensure calls to free and malloc
are always direct.

llvm-svn: 27560
2006-04-10 19:26:09 +00:00
Chris Lattner 17bd60588c Add support for shufflevector
llvm-svn: 27513
2006-04-08 01:19:12 +00:00
Chris Lattner 8ec0205de4 Fix inlining of insert/extract element constantexprs
llvm-svn: 27478
2006-04-07 04:41:03 +00:00
Chris Lattner e79d249c29 Lower vperm(x,y, mask) -> shuffle(x,y,mask) if mask is constant. This allows
us to compile oh-so-realistic stuff like this:

 vec_vperm(A, B, (vector unsigned char){14});

to:
        vspltb v0, v0, 14

instead of:

        vspltisb v0, 14
        vperm v0, v2, v1, v0

llvm-svn: 27452
2006-04-06 19:19:17 +00:00
Chris Lattner caba72b6ff vector casts of casts are eliminable. Transform this:
        %tmp = cast <4 x uint> %tmp to <4 x int>                ; <<4 x int>> [#uses=1]
        %tmp = cast <4 x int> %tmp to <4 x float>               ; <<4 x float>> [#uses=1]

into:

        %tmp = cast <4 x uint> %tmp to <4 x float>              ; <<4 x float>> [#uses=1]

llvm-svn: 27355
2006-04-02 05:43:13 +00:00
Chris Lattner ebca476b27 Allow transforming this:
        %tmp = cast <4 x uint>* %testData to <4 x int>*         ; <<4 x int>*> [#uses=1]
        %tmp = load <4 x int>* %tmp             ; <<4 x int>> [#uses=1]

to this:

        %tmp = load <4 x uint>* %testData               ; <<4 x uint>> [#uses=1]
        %tmp = cast <4 x uint> %tmp to <4 x int>                ; <<4 x int>> [#uses=1]

llvm-svn: 27353
2006-04-02 05:37:12 +00:00
Chris Lattner f42d0aeda1 Turn altivec lvx/stvx intrinsics into loads and stores. This allows the
elimination of one load from this:

int AreSecondAndThirdElementsBothNegative( vector float *in ) {
#define QNaN 0x7FC00000
const vector unsigned int testData = (vector unsigned int)( QNaN, 0, 0, QNaN );
vector float test = vec_ld( 0, (float*) &testData );
return ! vec_any_ge( test, *in );
}

Now generating:

_AreSecondAndThirdElementsBothNegative:
        mfspr r2, 256
        oris r4, r2, 49152
        mtspr 256, r4
        li r4, lo16(LCPI1_0)
        lis r5, ha16(LCPI1_0)
        addi r6, r1, -16
        lvx v0, r5, r4
        stvx v0, 0, r6
        lvx v1, 0, r3
        vcmpgefp. v0, v0, v1
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        xori r3, r3, 1
        cntlzw r3, r3
        srwi r3, r3, 5
        mtspr 256, r2
        blr

llvm-svn: 27352
2006-04-02 05:30:25 +00:00
Chris Lattner 70ec96fa32 Adjust to change in Intrinsics.gen interface.
llvm-svn: 27344
2006-04-02 03:35:01 +00:00
Chris Lattner 1b2436a624 add valuemapper support for inline asm
llvm-svn: 27332
2006-04-01 23:17:11 +00:00
Chris Lattner 6cf4914fd4 Fix InstCombine/2006-04-01-InfLoop.ll
llvm-svn: 27330
2006-04-01 22:05:01 +00:00
Chris Lattner dcd0792622 Fold A^(B&A) -> (B&A)^A
Fold (B&A)^A == ~B & A

This implements InstCombine/xor.ll:test2[56]

llvm-svn: 27328
2006-04-01 08:03:55 +00:00
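Note on the entry above: the identity (B&A)^A == ~B & A can be checked exhaustively at a small bit width. The sketch below does so for 8-bit values; the width and function names are assumptions made for illustration.

```python
def xor_form(a, b):
    """(B & A) ^ A"""
    return (b & a) ^ a

def andnot_form(a, b, bits=8):
    """~B & A, with the complement limited to `bits` bits."""
    mask = (1 << bits) - 1
    return (~b & mask) & a
```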
Chris Lattner 8d1d8d364c If we can look through vector operations to find the scalar version of an
extract_element'd value, do so.

llvm-svn: 27323
2006-03-31 23:01:56 +00:00
Chris Lattner 92346c315e extractelement(undef,x) -> undef
llvm-svn: 27300
2006-03-31 18:25:14 +00:00
Chris Lattner 612fa8e6f3 Fix Transforms/InstCombine/2006-03-30-ExtractElement.ll
llvm-svn: 27261
2006-03-30 22:02:40 +00:00
Chris Lattner 42e0ba09aa teach the inliner to work with packed constants
llvm-svn: 27161
2006-03-27 05:50:18 +00:00
Chris Lattner d70d9f5b24 Don't crash on packed logical ops
llvm-svn: 27125
2006-03-25 21:58:26 +00:00
Chris Lattner f365f5f0c1 Fix spello
llvm-svn: 27052
2006-03-24 07:14:34 +00:00
Chris Lattner 5821a6a17a add the actual cost to the debug info
llvm-svn: 27051
2006-03-24 07:14:00 +00:00
Jim Laskey 8f64426f5c Strip changes to llvm.dbg intrinsics.
llvm-svn: 26993
2006-03-23 18:11:33 +00:00
Jim Laskey 83f99115db Can't combine anymore - we don't have a chain through llvm.dbg intrinsics.
llvm-svn: 26992
2006-03-23 18:10:42 +00:00
Chris Lattner 7d80b4f366 silence a bogus gcc warning
llvm-svn: 26953
2006-03-22 17:27:24 +00:00
Chris Lattner d783c76c18 Teach cee to propagate through switch statements. This implements
Transforms/CorrelatedExprs/switch.ll

Patch contributed by Eric Kidd!

llvm-svn: 26872
2006-03-19 19:37:24 +00:00
Evan Cheng c28282bd87 - Fixed a bogus if condition.
- Added more debugging info.
- Allow reuse of IV of negative stride. e.g. -4 stride == 2 * iv of -2 stride.

llvm-svn: 26841
2006-03-18 08:03:12 +00:00
Evan Cheng f09f0ebd48 Sort StrideOrder so we can process the smallest strides first. This allows
for more IV reuses.

llvm-svn: 26837
2006-03-18 00:44:49 +00:00
Evan Cheng 4520698820 Allow users of iv / stride to be rewritten with an expression that is a multiple
of a smaller stride even if they have a common loop invariant expression part.

llvm-svn: 26828
2006-03-17 19:52:23 +00:00
Evan Cheng 3df447d354 For each loop, keep track of all the IV expressions inserted indexed by
stride. For a set of uses of the IV of a stride which is a multiple
of another stride, do not insert a new IV expression. Rather, reuse the
previous IV and rewrite the uses as uses of IV expression multiplied by
the factor.

e.g.
x = 0 ...; x ++
y = 0 ...; y += 4
then use of y can be rewritten as use of 4*x for x86.

llvm-svn: 26803
2006-03-16 21:53:05 +00:00
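Note on the entry above: the reuse works because an induction variable with stride 4 takes exactly 4 times the values of a stride-1 variable that starts at the same point. A small sketch, assuming both variables start at 0 (the helper names are hypothetical):

```python
def iv_values(start, stride, trips):
    """Values taken by an induction variable over `trips` iterations."""
    return [start + i * stride for i in range(trips)]

def reuse_iv(base_values, factor):
    """Rewrite uses of a larger-stride IV as factor * (smaller-stride IV)."""
    return [factor * v for v in base_values]
```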
Chris Lattner 6d6084fd04 Teach the strip pass to strip type names in addition to value names. This
is fallout from the type/value split in the symtab long long ago :)

llvm-svn: 26785
2006-03-15 19:22:41 +00:00
Chris Lattner c5f866bb4a Implement a FIXME, recursively reassociating
A*A*B + A*A*C   -->   A*(A*B+A*C)   -->   A*(A*(B+C))

This implements Reassociate/mul-factor3.ll

llvm-svn: 26757
2006-03-14 16:04:29 +00:00
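Note on the entry above: the two-step factoring preserves the value even under 32-bit wraparound, since multiplication distributes over addition modulo 2^32. A quick check, assuming 32-bit integers (the function names are illustrative):

```python
MASK32 = 0xFFFFFFFF

def unfactored(a, b, c):
    """A*A*B + A*A*C, wrapped to 32 bits."""
    return (a * a * b + a * a * c) & MASK32

def factored(a, b, c):
    """A*(A*(B+C)), wrapped to 32 bits."""
    return (a * (a * (b + c))) & MASK32
```

The factored form uses two multiplies and one add instead of four multiplies and one add.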
Chris Lattner 2fc319d444 extract some code into a method, no functionality change
llvm-svn: 26755
2006-03-14 07:11:11 +00:00
Chris Lattner d6bde46d85 Promote shifts by a constant to multiplies so that we can reassociate
(x<<1)+(y<<1) -> (X+Y)<<1.  This implements
Transforms/Reassociate/shift-factor.ll

llvm-svn: 26753
2006-03-14 06:55:18 +00:00
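Note on the entry above: a left shift by a constant n is a multiply by 2^n, which is what lets the reassociation machinery factor it. A sketch under the assumption of 32-bit wraparound, with hypothetical helper names:

```python
MASK32 = 0xFFFFFFFF

def shift_as_multiply(x, n):
    """x << n expressed as x * 2**n, wrapped to 32 bits."""
    return (x * (1 << n)) & MASK32

def shifts_then_add(x, y, n=1):
    """(x << n) + (y << n)"""
    return ((x << n) + (y << n)) & MASK32

def add_then_shift(x, y, n=1):
    """(x + y) << n"""
    return ((x + y) << n) & MASK32
```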
Evan Cheng c567c4efbb Added target lowering hooks which LSR consults to make more intelligent
transformation decisions.

llvm-svn: 26738
2006-03-13 23:14:23 +00:00
Jim Laskey acb6e34277 Handle the removal of the debug chain.
llvm-svn: 26729
2006-03-13 13:07:37 +00:00
Chris Lattner 60f6833376 use autogenerated side-effect information
llvm-svn: 26673
2006-03-09 22:38:10 +00:00
Chris Lattner 6b7847a5bc fix a pasto
llvm-svn: 26627
2006-03-09 06:09:41 +00:00
Chris Lattner fc34f8bb48 Fix a miscompilation of 188.ammp with the new CFE. 188.ammp is accessing
arrays out of range in a horrible way, but we shouldn't break it anyway.
Details in the comments.

llvm-svn: 26606
2006-03-08 01:05:29 +00:00
Jim Laskey 69effa2325 Switch to using a numeric id for anchors.
llvm-svn: 26598
2006-03-07 20:53:47 +00:00
Chris Lattner 7b87fd53f9 Fix ConstantMerge/2006-03-07-DontMergeDiffSections.ll, a problem Jim
hypotheticalized about, where we would incorrectly merge two globals in
different sections.

llvm-svn: 26597
2006-03-07 17:56:59 +00:00
Chris Lattner 53ef5a032c Teach the alignment handling code to look through constant expr casts and GEPs
llvm-svn: 26580
2006-03-07 01:28:57 +00:00
Chris Lattner 82f2ef20b6 Teach instcombine to increase the alignment of memset/memcpy/memmove when
the pointer is known to come from either a global variable, alloca or
malloc.  This allows us to compile this:

  P = malloc(28);
  memset(P, 0, 28);

into explicit stores on PPC instead of a memset call.

llvm-svn: 26577
2006-03-06 20:18:44 +00:00
Chris Lattner 6bc98653c2 Make vector narrowing more effective, implementing
Transforms/InstCombine/vec_narrow.ll.  This adds support for narrowing
extract_element(insertelement) also.

llvm-svn: 26538
2006-03-05 00:22:33 +00:00
Chris Lattner 4c065091d8 Add factoring of multiplications, e.g. turning A*A+A*B into A*(A+B).
Testcase here: Transforms/Reassociate/mulfactor.ll

llvm-svn: 26524
2006-03-04 09:31:13 +00:00
Chris Lattner 32c01df299 Canonicalize (X+C1)*C2 -> X*C2+C1*C2
This implements Transforms/InstCombine/add.ll:test31

llvm-svn: 26519
2006-03-04 06:04:02 +00:00
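Note on the entry above: since C1 and C2 are constants, the product C1*C2 can be folded at compile time, leaving one multiply and one add by constants. A sketch assuming 32-bit arithmetic (`canonicalize` and `evaluate` are illustrative names, not LLVM functions):

```python
def canonicalize(c1, c2, bits=32):
    """Return (scale, offset) with (x + c1) * c2 == x * scale + offset (mod 2**bits)."""
    mask = (1 << bits) - 1
    return c2 & mask, (c1 * c2) & mask

def evaluate(x, scale, offset, bits=32):
    """Compute x * scale + offset with wraparound."""
    mask = (1 << bits) - 1
    return (x * scale + offset) & mask
```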
Chris Lattner 681ef2f083 Change this to work with renamed intrinsics.
llvm-svn: 26484
2006-03-03 01:34:17 +00:00
Chris Lattner ea7986aeca Make this work with renamed intrinsics.
llvm-svn: 26482
2006-03-03 01:30:23 +00:00
Chris Lattner 85dda9a2bd Generalize the REM folding code to handle another case Nick Lewycky
pointed out: realize the AND can provide factors and look through Casts.

llvm-svn: 26469
2006-03-02 06:50:58 +00:00
Chris Lattner c5b6c9a12a Fix a regression in a patch from a couple of days ago. This fixes
Transforms/InstCombine/2006-02-28-Crash.ll

llvm-svn: 26427
2006-02-28 19:47:20 +00:00
Chris Lattner b70f141893 Implement rem.ll:test[7-9] and PR712
llvm-svn: 26415
2006-02-28 05:49:21 +00:00
Chris Lattner 2a7c7b8bab Simplify some code now that the RHS of a rem can't be 0
llvm-svn: 26413
2006-02-28 05:40:55 +00:00
Chris Lattner 0de4a8d7b7 Rearrange some code, fold "rem X, 0", implementing rem.ll:test6
llvm-svn: 26411
2006-02-28 05:30:45 +00:00
Chris Lattner c7bfed0f7b Merge two almost-identical pieces of code.
Make this code more powerful by using ComputeMaskedBits instead of looking
for an AND operand.  This lets us fold this:

int %test23(int %a) {
        %tmp.1 = and int %a, 1
        %tmp.2 = seteq int %tmp.1, 0
        %tmp.3 = cast bool %tmp.2 to int  ;; xor tmp1, 1
        ret int %tmp.3
}

into: xor (and a, 1), 1
llvm-svn: 26396
2006-02-27 02:38:23 +00:00
Chris Lattner f5c8a0b83f Fold (A^B) == A -> B == 0
and  (A-B) == A  ->  B == 0

llvm-svn: 26394
2006-02-27 01:44:11 +00:00
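Note on the entry above: both folds hold because xor and subtraction are invertible in their second operand, so the result equals A only when B is zero. An exhaustive check at an assumed width of 8 bits:

```python
def xor_equals_a(a, b):
    """(A ^ B) == A"""
    return (a ^ b) == a

def sub_equals_a(a, b, bits=8):
    """(A - B) == A, with wraparound at `bits` bits."""
    mask = (1 << bits) - 1
    return ((a - b) & mask) == a
```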
Chris Lattner f78df7c14d Fold (X|C1)^C2 -> X^(C1|C2) when possible. This implements
InstCombine/or.ll:test23.

llvm-svn: 26385
2006-02-26 19:57:54 +00:00
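Note on the entry above: the message does not spell out what "when possible" means. One sufficient condition, assumed here, is that X has no bits in common with C1, so X|C1 equals X^C1 and the two constants merge into C1^C2 (which is C1|C2 whenever C1 and C2 are disjoint). The function names below are illustrative.

```python
def or_then_xor(x, c1, c2):
    """(X | C1) ^ C2"""
    return (x | c1) ^ c2

def single_xor(x, c1, c2):
    """X ^ (C1 ^ C2); valid only under the assumption X & C1 == 0."""
    return x ^ (c1 ^ c2)
```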
Chris Lattner b580d26e7d Fix a problem that Nate noticed that boils down to an over conservative check
in the code that does "select C, (X+Y), (X-Y) --> (X+(select C, Y, (-Y)))".
We now compile this loop:

LBB1_1: ; no_exit
        add r6, r2, r3
        subf r3, r2, r3
        cmpwi cr0, r2, 0
        addi r7, r5, 4
        lwz r2, 0(r5)
        addi r4, r4, 1
        blt cr0, LBB1_4 ; no_exit
LBB1_3: ; no_exit
        mr r3, r6
LBB1_4: ; no_exit
        cmpwi cr0, r4, 16
        mr r5, r7
        bne cr0, LBB1_1 ; no_exit

into this instead:

LBB1_1: ; no_exit
        srawi r6, r2, 31
        add r2, r2, r6
        xor r6, r2, r6
        addi r7, r5, 4
        lwz r2, 0(r5)
        addi r4, r4, 1
        add r3, r3, r6
        cmpwi cr0, r4, 16
        mr r5, r7
        bne cr0, LBB1_1 ; no_exit

llvm-svn: 26356
2006-02-24 18:05:58 +00:00
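Note on the entry above: the select transformation pushes the choice down onto the second operand, leaving a single add. A sketch of the two forms (function names are hypothetical):

```python
def select_of_add_sub(c, x, y):
    """select C, (X + Y), (X - Y)"""
    return x + y if c else x - y

def add_of_select(c, x, y):
    """X + (select C, Y, -Y)"""
    return x + (y if c else -y)
```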
Chris Lattner e5521db5bc Fix Regression/Transforms/LoopUnswitch/2006-02-22-UnswitchCrash.ll, which
caused SPASS to fail building last night.

We can't trivially unswitch a loop if the exit block has phi nodes in it,
because we don't know which predecessor to use.

llvm-svn: 26320
2006-02-22 23:55:00 +00:00
Chris Lattner 8a5a324dac Add some comments, simplify some code, and fix a bug that caused rewriting
to rewrite with the wrong value.

llvm-svn: 26311
2006-02-22 06:37:14 +00:00
Chris Lattner c2e3a7a4ce improved support for branch folding, still not enabled.
llvm-svn: 26289
2006-02-18 07:57:38 +00:00
Jeff Cohen 0add83e969 Fix bugs identified by VC++.
llvm-svn: 26287
2006-02-18 03:20:33 +00:00
Chris Lattner 19fa8ac938 Implement deletion of dead blocks, currently disabled.
llvm-svn: 26285
2006-02-18 02:42:34 +00:00
Chris Lattner cb853de534 A previous patch completely disabled trivial unswitching; this fixes it.
Thanks to nate for pointing this out :)

llvm-svn: 26280
2006-02-18 01:32:04 +00:00
Chris Lattner 29f771ba21 initial trivial support for folding branches that have now-constant destinations.
llvm-svn: 26279
2006-02-18 01:27:45 +00:00
Chris Lattner 8e44ff50b0 When unswitching a loop, make sure to update loop info with exit blocks in
the right loop.

llvm-svn: 26277
2006-02-18 00:55:32 +00:00
Chris Lattner d95665188b Fix Transforms/SimplifyCFG/2006-02-17-InfiniteUnroll.ll
llvm-svn: 26275
2006-02-18 00:33:17 +00:00
Chris Lattner baddba41c7 Fix loops where the header has an exit, fixing a loop-unswitch crash on crafty
llvm-svn: 26258
2006-02-17 06:39:56 +00:00
Chris Lattner 6fd136239b start of some new simplification code, not thoroughly tested, use at your own
risk :)

llvm-svn: 26248
2006-02-17 00:31:07 +00:00
Nate Begeman 8a77efe4f7 Rework the SelectionDAG-based implementations of SimplifyDemandedBits
and ComputeMaskedBits to match the new improved versions in instcombine.
Tested against all of multisource/benchmarks on ppc.

llvm-svn: 26238
2006-02-16 21:11:51 +00:00
Chris Lattner fa335f6083 Change SplitBlock to increment a BasicBlock::iterator, not an Instruction*. Apparently they do different things :)
This fixes a testcase that nate reduced from spass.

Also included are a couple minor code changes that don't affect the generated
code at all.

llvm-svn: 26235
2006-02-16 19:36:22 +00:00
Jeff Cohen 55f63f1b53 Fix VC++ warning.
llvm-svn: 26228
2006-02-16 04:07:37 +00:00
Chris Lattner ff42e81028 fix a bug where we unswitched the wrong way
llvm-svn: 26225
2006-02-16 01:24:41 +00:00
Chris Lattner fdff0bb43e Implement trivial unswitching for switch stmts. This allows us to trivially
unswitch this loop on 2 before sweating to unswitch on 1/3.

void test4(int N, int i, int C, int*P, int*Q) {
  int j;
  for (j = 0; j < N; ++j) {
    switch (C) {                // general unswitching.
    default: P[i+j] = 0; break;
    case 1: Q[i+j] = 0; break;
    case 3: P[i+j] = Q[i+j]; break;
    case 2: break;              //  TRIVIAL UNSWITCH on C==2
    }
  }
}

llvm-svn: 26223
2006-02-15 22:52:05 +00:00
Chris Lattner e5cb76d744 make "trivial" unswitching significantly more general. It can now handle
this for example:

  for (j = 0; j < N; ++j) {     // trivial unswitch
    if (C)
      P[i+j] = 0;
  }

turning it into the obvious code without bothering to duplicate an empty loop.

llvm-svn: 26220
2006-02-15 22:03:36 +00:00
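Note on the entry above: the "obvious code" hoists the loop-invariant test out of the loop, and because the false side does nothing, no second copy of the loop is needed. A Python model of the before and after shapes, with lists standing in for the arrays (the function names are illustrative):

```python
def loop_original(p, cond):
    out = list(p)
    for j in range(len(out)):
        if cond:          # loop-invariant test evaluated on every iteration
            out[j] = 0
    return out

def loop_unswitched(p, cond):
    out = list(p)
    if cond:              # test hoisted out of the loop
        for j in range(len(out)):
            out[j] = 0
    # cond false: the body would do nothing, so no empty loop is duplicated
    return out
```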
Andrew Lenharth 47da60130a fix a bunch of alpha regressions. see bug 709
llvm-svn: 26218
2006-02-15 21:13:37 +00:00
Chris Lattner 65152d80ec Checking the wrong value. This caused us to emit silly code like
Y = seteq bool X, true
instead of just using X :)

llvm-svn: 26215
2006-02-15 19:05:52 +00:00
Chris Lattner 01db04efb0 more refactoring, no functionality change.
llvm-svn: 26194
2006-02-15 01:44:42 +00:00
Chris Lattner b0cbe7106e pull some code out into a function
llvm-svn: 26191
2006-02-15 00:07:43 +00:00
Chris Lattner 9c5693fb2a Canonicalize inner loops before outer loops. Inner loop canonicalization
can provide work for the outer loop to canonicalize.

This fixes a case that breaks unswitching.

llvm-svn: 26189
2006-02-14 23:06:02 +00:00
Chris Lattner cffbbee8d1 When splitting exit edges to canonicalize loops, make sure to put the new
block in the appropriate loop nest.

Third time is the charm, right?

llvm-svn: 26187
2006-02-14 22:34:08 +00:00
Chris Lattner 0b8ec1a132 Use statistics to keep track of what flavors of loops we are unswitching
llvm-svn: 26157
2006-02-14 01:01:41 +00:00
Chris Lattner 8b10ab3002 Implement Instcombine/and.ll:test34
llvm-svn: 26155
2006-02-13 23:07:23 +00:00
Chris Lattner 7d8522884b If any of the sign extended bits are demanded, the input sign bit is demanded
for a sign extension.

This fixes InstCombine/2006-02-13-DemandedMiscompile.ll and Ptrdist/bc.

llvm-svn: 26152
2006-02-13 22:41:07 +00:00
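Note on the entry above: every bit produced by a sign extension above the source width is a copy of the source sign bit, so demanding any of them means demanding that sign bit. A sketch of the rule, not the actual instcombine code (the function name and signature are assumptions):

```python
def sext_demanded_input_bits(demanded, src_bits):
    """Map bits demanded of a sign-extended result to bits demanded of the input."""
    src_mask = (1 << src_bits) - 1
    needed = demanded & src_mask          # low bits pass straight through
    if demanded & ~src_mask:              # any extended bit is demanded
        needed |= 1 << (src_bits - 1)     # so the input sign bit is demanded
    return needed
```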
Chris Lattner 68e7475777 Be careful not to request or look at bits shifted in from outside the size
of the input.  This fixes the mediabench/gsm/toast failure last night.

llvm-svn: 26138
2006-02-13 06:09:08 +00:00
Chris Lattner f5b4ef7f58 remove some more dead special case code
llvm-svn: 26135
2006-02-12 08:07:37 +00:00
Chris Lattner 5b2edb1fca Eliminate special case hacks that are superseded by general purpose hacks
llvm-svn: 26134
2006-02-12 08:02:11 +00:00
Chris Lattner ee0f280743 Three changes:
1. Teach GetConstantInType to handle boolean constants.
2. Teach instcombine to fold (compare X, CST) when X has known 0/1 bits.
   Testcase here: set.ll:test22
3. Improve the "(X >> c1) & C2 == 0" folding code to allow a noop cast
   between the shift and and.  More aggressive bitfolding for other reasons
   was turning signed shr's into unsigned shr's, leaving the noop cast in
   the way.

llvm-svn: 26131
2006-02-12 02:07:56 +00:00
Chris Lattner 02f53ad3a2 Revert my last patch. It too breaks stuff
llvm-svn: 26128
2006-02-12 01:59:10 +00:00
Chris Lattner 35248e06bc Fix for my previously reverted patch
llvm-svn: 26126
2006-02-11 21:24:54 +00:00
Chris Lattner 0157e7f55b Port the recent innovations in ComputeMaskedBits to SimplifyDemandedBits.
This allows us to simplify on conditions where bits are not known, but they
are not demanded either!  This also fixes a couple of bugs in
ComputeMaskedBits that were exposed during this work.

In the future, swaths of instcombine should be removed, as this code
subsumes a bunch of ad-hockery.

llvm-svn: 26122
2006-02-11 09:31:47 +00:00
Chris Lattner b24ce3a2a8 revert my previous change, it exposed other problems.
llvm-svn: 26121
2006-02-11 08:47:47 +00:00
Chris Lattner 05bf90dddf Make this check stricter. Disallow loop exit blocks from being shared by
loops and their subloops.

llvm-svn: 26118
2006-02-11 02:13:17 +00:00
Chris Lattner a6ae101afa remove dead expr
llvm-svn: 26116
2006-02-11 01:43:37 +00:00
Chris Lattner fbadd7e1ee implement unswitching of loops with switch stmts and selects in them
llvm-svn: 26114
2006-02-11 00:43:37 +00:00
Chris Lattner f1b151684d Update PHI nodes in successors of exit blocks.
llvm-svn: 26113
2006-02-10 23:26:14 +00:00
Chris Lattner fe4151efe7 Reform the unswitching code in terms of edge splitting, not block splitting.
llvm-svn: 26112
2006-02-10 23:16:39 +00:00
Chris Lattner ec6b40a093 Fix a case where UnswitchTrivialCondition broke critical edges with
phi's in the successors

llvm-svn: 26108
2006-02-10 19:08:15 +00:00
Chris Lattner 6e263155a6 add some notes, move some code around. Implement unswitching of loops
with branches on partially invariant computations.

llvm-svn: 26104
2006-02-10 02:30:37 +00:00
Chris Lattner 4935417a84 Move code around to be more logical, no functionality change.
llvm-svn: 26103
2006-02-10 02:01:22 +00:00
Chris Lattner 3fc3148b85 When unswitching a trivial loop, do admit we are doing it! :)
llvm-svn: 26102
2006-02-10 01:36:35 +00:00
Chris Lattner ed7a67b0de Implement unconditional unswitching of 'trivial' loops, those loops that contain
branches in their entry block that control whether or not the loop is a noop.

llvm-svn: 26101
2006-02-10 01:24:09 +00:00
Chris Lattner 4f0e66df6a Simplify control flow a bit, note that unswitch preserves canonical loop form
llvm-svn: 26098
2006-02-09 22:15:42 +00:00
Chris Lattner 8976219850 Make the threshold a parameter
llvm-svn: 26093
2006-02-09 20:15:48 +00:00
Chris Lattner 2826e0511b Simplify the loop-unswitch pass, by not even trying to unswitch loops with
uses of loop values outside the loop.  We need loop-closed SSA form to do
this right, or to use SSA rewriting if we really care.

llvm-svn: 26089
2006-02-09 19:14:52 +00:00
Chris Lattner 24cd2fa269 Fix 80-column violations
llvm-svn: 26088
2006-02-09 07:41:14 +00:00
Chris Lattner 4534dd59a3 Enhance MVIZ in three ways:
1. Teach it new tricks: in particular how to propagate through signed shr and sexts.
2. Teach it to return a bitset of known-1 and known-0 bits, instead of just zero.
3. Teach instcombine (AND X, C) to fold when we know all C bits of X.

This implements Regression/Transforms/InstCombine/bittest.ll, and allows
future things to be simplified.

llvm-svn: 26087
2006-02-09 07:38:58 +00:00
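Note on the entry above: with separate known-0 and known-1 sets, (AND X, C) becomes a constant as soon as every bit selected by C is known one way or the other. A minimal sketch of that test, assuming the two sets are plain integer bitmasks (`fold_and` is a hypothetical helper, not the instcombine routine):

```python
def fold_and(known_zero, known_one, c):
    """Return the constant value of (x & c) if all bits of c are known, else None."""
    if (known_zero | known_one) & c == c:
        return known_one & c
    return None
```

For example, if the high nibble of an 8-bit X is known zero, `X & 0xC0` folds to 0, while `X & 0x18` cannot fold because bit 3 is unknown.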
Chris Lattner ab2dc4d70d Simplify some code, reducing calls to MaskedValueIsZero. Implement a minor
optimization where we reduce the number of bits in AND masks when possible.

llvm-svn: 26056
2006-02-08 07:34:50 +00:00
Chris Lattner 5997cf9381 Use EraseInstFromFunction in a few cases to put the uses of the removed
instruction onto the worklist (in case they are now dead).

Add a really trivial local DSE implementation to help out bitfield code.
We now fold this:

struct S {
    unsigned char a : 1, b : 1, c : 1, d : 2, e : 3;
    S();
};

S::S() : a(0), b(0), c(1), d(0), e(6) {}

to this:

void %_ZN1SC1Ev(%struct.S* %this) {
entry:
        %tmp.1 = getelementptr %struct.S* %this, int 0, uint 0
        store ubyte 38, ubyte* %tmp.1
        ret void
}

much earlier (in gccas instead of only in gccld after DSE runs).

llvm-svn: 26050
2006-02-08 03:25:32 +00:00
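Note on the entry above: the constant 38 follows from the initializers if the bitfields are allocated starting at the most significant bit of the byte, which is assumed here (it matches a big-endian target such as PowerPC, but the message does not state the target). A sketch with a hypothetical packing helper:

```python
def pack_msb_first(fields, total_bits=8):
    """Pack (value, width) pairs, allocating from the most significant bit down."""
    word = 0
    used = 0
    for value, width in fields:
        used += width
        word |= (value & ((1 << width) - 1)) << (total_bits - used)
    return word

# a(0):1, b(0):1, c(1):1, d(0):2, e(6):3
S_INIT = pack_msb_first([(0, 1), (0, 1), (1, 1), (0, 2), (6, 3)])
```

Here `c` lands on bit 5 (value 32) and `e` on bits 2..0 (value 6), giving 38.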
Chris Lattner 06a0ed1ee0 Implement some more interesting select sccp cases. This implements:
test/Regression/Transforms/SCCP/select.ll

llvm-svn: 26049
2006-02-08 02:38:11 +00:00
Chris Lattner ddba3289b5 Fix a problem in my patch yesterday, causing a miscompilation of 176.gcc
llvm-svn: 26045
2006-02-08 01:20:23 +00:00
Chris Lattner 44314827d6 Fix Transforms/InstCombine/2006-02-07-SextZextCrash.ll
llvm-svn: 26040
2006-02-07 19:07:40 +00:00
Chris Lattner 92a6865321 Generalize MaskedValueIsZero into a ComputeMaskedNonZeroBits function, which
is just as efficient as MVIZ and is also more general.

Fix a few minor bugs introduced in recent patches

llvm-svn: 26036
2006-02-07 08:05:22 +00:00
Chris Lattner c3ebf40031 Make MaskedValueIsZero take a uint64_t instead of a ConstantIntegral as a
mask.  This allows the code to be simpler and more efficient.

Also, generalize some of the cases in MVIZ a bit, making it slightly more aggressive.

llvm-svn: 26035
2006-02-07 07:27:52 +00:00
Chris Lattner 77defbae0a Use Type::getIntegralTypeMask() to simplify some code
llvm-svn: 26034
2006-02-07 07:00:41 +00:00
Chris Lattner 2590e511d8 Implement the beginnings of a facility for simplifying expressions based on
'demanded bits', inspired by Nate's work in the dag combiner.  This isn't
complete, but needs two unrelated instcombiner changes to continue.

llvm-svn: 26033
2006-02-07 06:56:34 +00:00
Chris Lattner 2e90b732fa Turn A % (C << N), where C is 2^k, into A & ((C << N)-1) [urem only].
Turn A / (C1 << N), where C1 is "1<<C2" into A >> (N+C2) [udiv only].

Tested with: rem.ll:test5, div.ll:test10

llvm-svn: 26003
2006-02-05 07:54:04 +00:00
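Note on the entry above: both rewrites use the fact that a power of two shifted left is still a power of two, so the unsigned remainder becomes a mask and the unsigned divide becomes a right shift. A sketch for non-negative operands (function and parameter names are illustrative):

```python
def urem_pow2(a, k, n):
    """A % (C << N) with C == 2**k, computed as A & ((C << N) - 1)."""
    divisor = (1 << k) << n
    return a & (divisor - 1)

def udiv_pow2(a, c2, n):
    """A / (C1 << N) with C1 == 1 << C2, computed as A >> (N + C2)."""
    return a >> (n + c2)
```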
Chris Lattner d30c4991a1 Use SCEVExpander::InsertCastOfTo instead of our own code. This reduces
#LLVM LOC, and auto-cse's cast instructions.

llvm-svn: 25974
2006-02-04 09:52:43 +00:00
Chris Lattner 2959f0003e Fix two significant bugs in LSR:
1. When rewriting code in outer loops, sometimes we would insert code into
   inner loops that is invariant in that loop.
2. Notice that 4*(2+x) is 8+4*x and use that to simplify expressions.

This is a performance neutral change.

llvm-svn: 25964
2006-02-04 07:36:50 +00:00
Jeff Cohen 15a8c15a1f Improve compatibility with VC2005, patch by Morten Ofstad!
llvm-svn: 25661
2006-01-26 20:41:32 +00:00
Chris Lattner 120f31b1fd teach the cloner to handle inline asms
llvm-svn: 25633
2006-01-26 01:55:22 +00:00
Chris Lattner c0f633a598 Fix Regression/Transforms/ScalarRepl/2006-01-24-IllegalUnionPromoteCrash.ll
llvm-svn: 25587
2006-01-24 19:36:27 +00:00
Chris Lattner 00fcdfef0d rename method
llvm-svn: 25572
2006-01-24 04:16:34 +00:00
Chris Lattner 37992b34c2 When cloning a module, clone the inline asm.
llvm-svn: 25559
2006-01-23 23:06:28 +00:00
Chris Lattner 5774040c09 add a bunch more optimizations for unary double math functions
llvm-svn: 25530
2006-01-23 06:24:46 +00:00
Chris Lattner 57a2863cbb Refactor/genericize this, no functionality change
llvm-svn: 25525
2006-01-23 05:57:36 +00:00
Chris Lattner c597b8a55e Make iostream #inclusion explicit
llvm-svn: 25514
2006-01-22 23:32:06 +00:00
Chris Lattner 33081b4648 Make this more efficient in the following ways:
1. Do not statically construct a map when the program starts up, this
   is expensive and cannot be optimized.  Instead, create a list.
2. Do not insert entries for all functions in the module into a hashmap
   that lives the full life of the compiler.

llvm-svn: 25512
2006-01-22 23:10:26 +00:00
Chris Lattner 469640e506 Add explicit #includes of <iostream>
llvm-svn: 25509
2006-01-22 22:53:01 +00:00
Chris Lattner 0d4ebfc15b Several non-functionality changing changes:
1. Use the varargs version of getOrInsertFunction to simplify code.
2. remove #include
3. Reduce the number of #ifdef's.
4. remove extraneous vertical whitespace.

llvm-svn: 25508
2006-01-22 22:35:08 +00:00
Robert Bocchino 027c18da98 ConstantFoldLoadThroughGEPConstantExpr wasn't handling pointers to
packed types correctly.

llvm-svn: 25470
2006-01-19 23:53:23 +00:00
Reid Spencer ade182125f For PR696:
Don't do floor->floorf conversion if floorf is not available. This checks
the compiler's host, not its target, which is incorrect for cross-compilers.
Not sure that's important as we don't build many cross-compilers.

llvm-svn: 25456
2006-01-19 08:36:56 +00:00
Chris Lattner e154abf9b3 Implement casts.ll:test26: a cast from float -> double -> integer, doesn't
need the float->double part.

llvm-svn: 25452
2006-01-19 07:40:22 +00:00
Chris Lattner 7be2203c9f If not internalizing, don't mark llvm.global[cd]tors const, as a fix for a
hypothetical future boog.

llvm-svn: 25430
2006-01-19 00:46:54 +00:00
Chris Lattner d693b7943a Don't internalize llvm.global[cd]tor unless there are uses of it. This
unbreaks front-ends that don't use __main (like the new CFE).

llvm-svn: 25429
2006-01-19 00:40:39 +00:00
Chris Lattner b98282d2d6 Make sure that cloning a module clones its target triple and dependent
library list as well.  This should help bugpoint.

llvm-svn: 25424
2006-01-18 21:32:45 +00:00
Robert Bocchino e6336a9b69 Constant folding support for the insertelement operation.
llvm-svn: 25407
2006-01-17 20:07:07 +00:00
Robert Bocchino 6dce25019d Lowerpacked and SCCP support for the insertelement operation.
llvm-svn: 25406
2006-01-17 20:06:55 +00:00
Chris Lattner 801f47512d Clean up the FFS optimization code, and make it correctly create the appropriate
unsigned llvm.cttz.* intrinsic, fixing the 2005-05-11-Popcount-ffs-fls regression
last night.

llvm-svn: 25398
2006-01-17 18:27:17 +00:00
Reid Spencer b4f9a6f110 For PR411:
This patch is an incremental step towards supporting a flat symbol table.
It de-overloads the intrinsic functions by providing type-specific intrinsics
and arranging for automatically upgrading from the old overloaded name to
the new non-overloaded name. Specifically:
  llvm.isunordered -> llvm.isunordered.f32, llvm.isunordered.f64
  llvm.sqrt -> llvm.sqrt.f32, llvm.sqrt.f64
  llvm.ctpop -> llvm.ctpop.i8, llvm.ctpop.i16, llvm.ctpop.i32, llvm.ctpop.i64
  llvm.ctlz -> llvm.ctlz.i8, llvm.ctlz.i16, llvm.ctlz.i32, llvm.ctlz.i64
  llvm.cttz -> llvm.cttz.i8, llvm.cttz.i16, llvm.cttz.i32, llvm.cttz.i64
New code should not use the overloaded intrinsic names. Warnings will be
emitted if they are used.

llvm-svn: 25366
2006-01-16 21:12:35 +00:00
Chris Lattner 307b7ea15f fix a crash due to missing parens
llvm-svn: 25363
2006-01-16 19:47:21 +00:00
Chris Lattner 0de2c7d3d8 This pass has never worked correctly. Remove.
llvm-svn: 25349
2006-01-16 01:06:00 +00:00
Chris Lattner f6d6823f09 Let the inliner update the callgraph to reflect the changes it makes, instead
of doing it ourselves.  This fixes Transforms/Inline/2006-01-14-CallGraphUpdate.ll

llvm-svn: 25321
2006-01-14 20:09:18 +00:00
Chris Lattner 0841fb1d4c Teach the inliner to update the CallGraph itself, and have it add edges to
llvm.stacksave/restore when it inserts calls to them.

llvm-svn: 25320
2006-01-14 20:07:50 +00:00
Chris Lattner ef530c24c1 FunctionPass's cannot do IPO things.
llvm-svn: 25315
2006-01-14 19:30:35 +00:00
Nate Begeman 82049eba2c Add bswap intrinsics as documented in the Language Reference
llvm-svn: 25309
2006-01-14 01:25:24 +00:00
Robert Bocchino a83529678e Added instcombine support for extractelement.
llvm-svn: 25299
2006-01-13 22:48:06 +00:00
Chris Lattner 5fba6e6696 it is ok to dce stacksave.
llvm-svn: 25295
2006-01-13 21:31:54 +00:00
Chris Lattner 503221f5c5 Do a simple instcombine xform to delete llvm.stackrestore cases.
llvm-svn: 25294
2006-01-13 21:28:09 +00:00
Chris Lattner c66b223b28 Simplify this a tiny bit by using the new IntrinsicInst functionality.
llvm-svn: 25292
2006-01-13 20:11:04 +00:00
Chris Lattner 45406c0c53 Permit inlining functions that contain dynamic allocations now that
InlineFunction handles this case safely.  This implements
Transforms/Inline/dynamic_alloca_test.ll.

llvm-svn: 25288
2006-01-13 19:35:43 +00:00
Chris Lattner 2be0607a8d If inlining a call to a function that contains dynamic allocas, wrap the
resultant code with llvm.stacksave/llvm.stackrestore intrinsics.

llvm-svn: 25286
2006-01-13 19:34:14 +00:00
Chris Lattner e24f79a032 Use ClonedCodeInfo to avoid another walk over the inlined code, this
time in common C cases.

llvm-svn: 25285
2006-01-13 19:18:11 +00:00
Chris Lattner 19e6a08d78 Use the ClonedCodeInfo object to avoid scans of the inlined code when
it doesn't contain any calls.  This is a fairly common case for C++ code,
so it will probably speed up the inliner marginally in these cases.

llvm-svn: 25284
2006-01-13 19:15:15 +00:00
Chris Lattner 908d79556d Refactor a bunch of invoke handling stuff out into a new function
"HandleInlinedInvoke".  No functionality change.

llvm-svn: 25283
2006-01-13 19:05:59 +00:00
Chris Lattner edad1288fd Allow the code cloning interfaces to capture some important info about the
code being cloned if the client wants.

llvm-svn: 25281
2006-01-13 18:39:17 +00:00
Chris Lattner 257492c0ab Fix a bug I noticed by inspection: if the first instruction in the inlined
function was not an alloca, we wouldn't check the entry block for any allocas,
leading to increased stack space in some cases.  In practice, allocas are almost
always at the top of the block, so this was never noticed.

llvm-svn: 25280
2006-01-13 18:16:48 +00:00
Chris Lattner 49c4d536bd Fix 80 column violations
llvm-svn: 25279
2006-01-13 18:06:56 +00:00
Chris Lattner 0770d8e326 Preserve and update ETForest. Patch by Daniel Berlin
llvm-svn: 25203
2006-01-11 05:11:13 +00:00
Chris Lattner cb36710ff9 Switch these to using ETForest instead of DominatorSet to compute itself.
Patch written by Daniel Berlin!

llvm-svn: 25202
2006-01-11 05:10:20 +00:00
Chris Lattner 48e4a2ebd8 Switch this to using ETForest instead of DominatorSet to compute itself.
Patch written by Daniel Berlin!

llvm-svn: 25201
2006-01-11 05:09:40 +00:00
Robert Bocchino 230044839d Added support for the extractelement operation.
llvm-svn: 25181
2006-01-10 19:05:34 +00:00
Robert Bocchino bd518d153b Added lower packed support for the extractelement operation.
llvm-svn: 25180
2006-01-10 19:05:05 +00:00
Chris Lattner cda4aa6eb4 Teach loopsimplify to update et-forest. Patch contributed by Daniel Berlin!
llvm-svn: 25153
2006-01-09 08:03:08 +00:00
Chris Lattner 9cbfbc21bb fix some 176.gcc miscompilation from my previous patch.
llvm-svn: 25137
2006-01-07 01:32:28 +00:00
Chris Lattner 330628a6d8 silence some bogus gcc warnings on fenris
llvm-svn: 25130
2006-01-06 17:59:59 +00:00
Chris Lattner eb372a0276 Enhance the shift-shift folding code to allow a no-op cast to occur in between
the shifts.

This allows us to fold this (which is the 'integer add a constant' sequence
from cozmic's scheme compiler):

int %x(uint %anf-temporary776) {
        %anf-temporary777 = shr uint %anf-temporary776, ubyte 1
        %anf-temporary800 = cast uint %anf-temporary777 to int
        %anf-temporary804 = shl int %anf-temporary800, ubyte 1
        %anf-temporary805 = add int %anf-temporary804, -2
        %anf-temporary806 = or int %anf-temporary805, 1
        ret int %anf-temporary806
}

into this:

int %x(uint %anf-temporary776) {
        %anf-temporary776 = cast uint %anf-temporary776 to int
        %anf-temporary776.mask1 = add int %anf-temporary776, -2
        %anf-temporary805 = or int %anf-temporary776.mask1, 1
        ret int %anf-temporary805
}

note that instcombine already knew how to eliminate the AND that the two
shifts fold into.  This is tested by InstCombine/shift.ll:test26

-Chris

llvm-svn: 25128
2006-01-06 07:52:12 +00:00
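The fold described in r25128 above can be sanity-checked numerically. The sketch below is only a model, not LLVM code: it assumes 32-bit wraparound arithmetic, and the function names `before_fold` and `after_fold` are hypothetical labels for the two instruction sequences quoted in the commit message.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit wraparound arithmetic


def before_fold(x: int) -> int:
    """Original sequence: shr 1, no-op cast, shl 1, add -2, or 1."""
    t = (x & MASK32) >> 1      # logical shift right on the unsigned value
    t = (t << 1) & MASK32      # shift left after the no-op cast
    t = (t - 2) & MASK32       # add int -2
    return t | 1


def after_fold(x: int) -> int:
    """Folded sequence: cast, add -2, or 1."""
    t = ((x & MASK32) - 2) & MASK32
    return t | 1


if __name__ == "__main__":
    for x in (0, 1, 2, 3, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
        print(hex(x), hex(before_fold(x)), hex(after_fold(x)))
```

The two shifts only clear bit 0, the add of -2 never touches bit 0, and the final `or` sets it again, which is why the pair of shifts can be dropped here.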
Chris Lattner b330939d90 Simplify the code a bit more
llvm-svn: 25126
2006-01-06 07:22:22 +00:00
Chris Lattner 145539343f Extract a bunch of code out of visitShiftInst into FoldShiftByConstant. No
functionality changes.

llvm-svn: 25125
2006-01-06 07:12:35 +00:00
Chris Lattner 8cdc773748 Pull inline methods out of the pass class definition to make it easier to
read the code.

Do not internalize debugger anchors.

llvm-svn: 25067
2006-01-03 19:13:17 +00:00
Duraid Madina 7a3ad6cae2 getting there...
llvm-svn: 25021
2005-12-26 13:48:44 +00:00
Chris Lattner 8c9e14620f Fix Transforms/ScalarRepl/2005-12-14-UnionPromoteCrash.ll, a crash on undefined
behavior in 126.gcc on big-endian systems.

llvm-svn: 24708
2005-12-14 17:23:59 +00:00
Reid Spencer 175613adf6 Improve ResolveFunctions to:
a) use better local variable names (OldMT -> OldFT) where "M" is used to
   mean "Function" (perhaps it was previously "Method"?)
b) print out the module identifier in a warning message so that it is
   possible to track down in which module the error occurred.

llvm-svn: 24698
2005-12-13 19:56:51 +00:00
Chris Lattner 3b0a62d8a5 Implement a little hack for parity with GCC on crafty. This speeds up
186.crafty by about 16% (from 15.109s to 13.045s) on my system.

This turns allocas with unions/casts into scalars.  For example crafty has
something like this:

    union doub {
      unsigned short i[4];
      long long d;
    };
    int f(long long a) {
      return ((union doub){.d=a}).i[1];
    }

Instead of generating loads and stores to an alloca, we now promote the
whole thing to a scalar long value.

This implements: Transforms/ScalarRepl/AggregatePromote.ll

llvm-svn: 24667
2005-12-12 07:19:13 +00:00
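As a rough illustration of why the union access in r24667 above can live in a single scalar, the sketch below models the store and reload with Python's `struct` module. It assumes a little-endian layout (a big-endian target would select a different shift), and `via_memory` and `via_scalar` are hypothetical names, not LLVM functions.

```python
import struct


def via_memory(a: int) -> int:
    """Store a 64-bit value, then reload element 1 of a 4 x u16 view."""
    raw = struct.pack("<q", a)            # little-endian signed 64-bit store
    return struct.unpack("<4H", raw)[1]   # read i[1] from the same bytes


def via_scalar(a: int) -> int:
    """Same result computed on the scalar alone: shift, then mask."""
    return (a >> 16) & 0xFFFF


if __name__ == "__main__":
    value = 0x0123456789ABCDEF
    print(hex(via_memory(value)), hex(via_scalar(value)))
```

Because the element read is just a shift and a mask of the 64-bit value, no stack slot is needed once the aggregate is promoted.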
Chris Lattner 077200737c getRawValue zero extends for unsigned values, use getsextvalue so that we
know that small negative values fit into the immediate field of addressing
modes.

llvm-svn: 24608
2005-12-05 18:23:57 +00:00
Chris Lattner 165998207e Wrap a long line, never internalize llvm.used.
llvm-svn: 24602
2005-12-05 05:07:38 +00:00
Chris Lattner 2820b8c855 Fix SimplifyCFG/2005-12-03-IncorrectPHIFold.ll
llvm-svn: 24581
2005-12-03 18:25:58 +00:00
Chris Lattner dc4ffef633 Fix a bug where we didn't realize that vaarg reads memory. This fixes
Transforms/DeadStoreElimination/2005-11-30-vaarg.ll

llvm-svn: 24545
2005-11-30 19:38:22 +00:00
Andrew Lenharth d251192910 a few more comments on the interfaces and functions
llvm-svn: 24500
2005-11-28 18:10:59 +00:00
Andrew Lenharth 517caef495 Added documented rsprofiler interface. Also remove new profiler passes, the
old ones have been updated to implement the interface.

llvm-svn: 24499
2005-11-28 18:00:38 +00:00
Jeff Cohen 7ff44ec372 Fix VC++ warning.
llvm-svn: 24496
2005-11-28 06:45:57 +00:00
Andrew Lenharth 93e59f6032 Random sampling (aka Arnold and Ryder) profiling. This is still preliminary, but it works on spec on x86 and alpha. The idea is to allow profiling passes to remember what profiling they inserted, then a random sampling framework is inserted which consists of duplicated basic blocks (without profiling), such that at each backedge in the program and entry into every function, the framework chooses whether to use the instrumented code or the instrumentation free code. The goal of such a framework is to make it reasonably cheap to do random sampling of very expensive profiling products (such as load-value profiling).
The code is organized into 3 parts (2 passes)
1) a linked set of profiling passes, which implement an analysis group (linked, like alias analysis are).  These insert profiling into the program, and remember what they inserted, so that at a later time they can be queried about any instruction.

2) a pass that handles inserting the random sampling framework.  This also has options to control how random samples are chosen.  Currently implemented are global counters, register allocated global counters, and read cycle counter (see? there was a reason for it).

The profiling passes are almost identical to the existing ones (block, function, and null profiling is supported right now), and they are valid passes without the sampling framework (hence the existing passes can be unified with the new ones, not done yet).

Some things are a bit ugly still, but that should be fixed up soon enough.

Other todo? making the counter values not "magic 2^16 -1" values, but dynamically choosable.

llvm-svn: 24493
2005-11-28 00:58:09 +00:00
Andrew Lenharth 5fc3794e71 since reg2mem requires it, might as well mention that it preserves it
llvm-svn: 24491
2005-11-25 16:04:54 +00:00
Andrew Lenharth 061029dee2 Reg2Mem is something a pass may depend on, so allow that
llvm-svn: 24488
2005-11-22 22:14:23 +00:00
Andrew Lenharth 71b09bbb07 turns out, demotion and invokes and critical edges don't mix
llvm-svn: 24487
2005-11-22 21:45:19 +00:00
Chris Lattner 9c37f23645 Fix a crash building 176.gcc due to my recent patch, which only fixed
half the problem.

llvm-svn: 24414
2005-11-18 18:30:47 +00:00
Chris Lattner 3e9e8bd25c Implement a refinement to the mem2reg algorithm for cases where an alloca
has a single def.  In this case, look for uses that are dominated by the def
and attempt to rewrite them to directly use the stored value.

This speeds up mem2reg on these values and reduces the number of phi nodes
inserted.  This should address PR665.

llvm-svn: 24411
2005-11-18 07:31:42 +00:00
Chris Lattner 31dc3827d3 This needs proper dominance
llvm-svn: 24410
2005-11-18 07:29:44 +00:00
Chris Lattner bca0be812d This was checking the wrong GEP expression. Fixing this fixes a gccas crash
compiling mysql reported by Ted Kremenek.

llvm-svn: 24402
2005-11-17 19:35:42 +00:00
Andrew Lenharth d9c13b1336 the pain isn't gone unless the phinodes are spilled too
llvm-svn: 24288
2005-11-10 19:39:09 +00:00
Andrew Lenharth 8e66c0c8a9 this works with backedges to the existing entry block a lot better
llvm-svn: 24270
2005-11-10 17:35:34 +00:00
Andrew Lenharth 4130a4f061 The pass everyone has been waiting for!
Reg2Mem

for fun you can opt -reg2mem -mem2reg

llvm-svn: 24267
2005-11-10 01:58:38 +00:00
Nate Begeman 848622f87f Add support for alignment of allocation instructions.
Add support for specifying alignment and size of setjmp jmpbufs.

No targets currently do anything with this information, nor is it preserved
in the bytecode representation.  That's coming up next.

llvm-svn: 24196
2005-11-05 09:21:28 +00:00
Chris Lattner 16b29e9562 Implement Transforms/TailCallElim/return-undef.ll, a trivial case
that has been sitting in my inbox since May 18. :)

llvm-svn: 24194
2005-11-05 08:21:11 +00:00
Chris Lattner dd0c174082 Turn sdiv into udiv if both operands have a clear sign bit. This occurs
a few times in crafty:

OLD:    %tmp.36 = div int %tmp.35, 8            ; <int> [#uses=1]
NEW:    %tmp.36 = div uint %tmp.35, 8           ; <uint> [#uses=0]
OLD:    %tmp.19 = div int %tmp.18, 8            ; <int> [#uses=1]
NEW:    %tmp.19 = div uint %tmp.18, 8           ; <uint> [#uses=0]
OLD:    %tmp.117 = div int %tmp.116, 8          ; <int> [#uses=1]
NEW:    %tmp.117 = div uint %tmp.116, 8         ; <uint> [#uses=0]
OLD:    %tmp.92 = div int %tmp.91, 8            ; <int> [#uses=1]
NEW:    %tmp.92 = div uint %tmp.91, 8           ; <uint> [#uses=0]

Which all turn into shrs.

llvm-svn: 24190
2005-11-05 07:40:31 +00:00
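The sdiv-to-udiv rewrite in r24190 above relies on signed and unsigned division agreeing whenever both operands have a clear sign bit. The following is a minimal model of that claim, assuming 32-bit values and C-style truncating division; `to_signed32`, `sdiv32` and `udiv32` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(bits: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed value."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def sdiv32(a_bits: int, b_bits: int) -> int:
    """Signed division truncating toward zero, result as a 32-bit pattern."""
    a, b = to_signed32(a_bits), to_signed32(b_bits)
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q & MASK32


def udiv32(a_bits: int, b_bits: int) -> int:
    """Unsigned division on the raw 32-bit patterns."""
    return (a_bits & MASK32) // (b_bits & MASK32)


if __name__ == "__main__":
    for a in (0, 7, 8, 1000, 0x7FFFFFFF, -16 & MASK32):
        print(hex(a), hex(sdiv32(a, 8)), hex(udiv32(a, 8)), hex(a >> 3))
```

For non-negative dividends the three columns agree, which is what lets the unsigned divide by 8 become a shift; the last row (a negative dividend) shows why the sign-bit check is required.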
Chris Lattner e9ff0eaf5b Turn srem -> urem when neither input has its sign bit set. This triggers
8 times in vortex, allowing the srems to be turned into shrs:

OLD:    %tmp.104 = rem int %tmp.5.i37, 16               ; <int> [#uses=1]
NEW:    %tmp.104 = rem uint %tmp.5.i37, 16              ; <uint> [#uses=0]
OLD:    %tmp.98 = rem int %tmp.5.i24, 16                ; <int> [#uses=1]
NEW:    %tmp.98 = rem uint %tmp.5.i24, 16               ; <uint> [#uses=0]
OLD:    %tmp.91 = rem int %tmp.5.i19, 8         ; <int> [#uses=1]
NEW:    %tmp.91 = rem uint %tmp.5.i19, 8                ; <uint> [#uses=0]
OLD:    %tmp.88 = rem int %tmp.5.i14, 8         ; <int> [#uses=1]
NEW:    %tmp.88 = rem uint %tmp.5.i14, 8                ; <uint> [#uses=0]
OLD:    %tmp.85 = rem int %tmp.5.i9, 1024               ; <int> [#uses=2]
NEW:    %tmp.85 = rem uint %tmp.5.i9, 1024              ; <uint> [#uses=0]
OLD:    %tmp.82 = rem int %tmp.5.i, 512         ; <int> [#uses=2]
NEW:    %tmp.82 = rem uint %tmp.5.i1, 512               ; <uint> [#uses=0]
OLD:    %tmp.48.i = rem int %tmp.5.i.i161, 4            ; <int> [#uses=1]
NEW:    %tmp.48.i = rem uint %tmp.5.i.i161, 4           ; <uint> [#uses=0]
OLD:    %tmp.20.i2 = rem int %tmp.5.i.i, 4              ; <int> [#uses=1]
NEW:    %tmp.20.i2 = rem uint %tmp.5.i.i, 4             ; <uint> [#uses=0]

it also occurs 9 times in gcc, but with odd constant divisors (1009 and 61)
so the payoff isn't as great.

llvm-svn: 24189
2005-11-05 07:28:37 +00:00
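A small check of the remainder case from r24189 above: once the operands are known to be non-negative, a remainder by a power of two reduces to a bit mask, whereas odd divisors such as 1009 or 61 get no such shortcut. This is a sketch under the assumption of C-style remainder semantics; `srem` and `urem_pow2` are hypothetical names.

```python
def srem(a: int, b: int) -> int:
    """C-style signed remainder: the result takes the sign of the dividend."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def urem_pow2(a: int, b: int) -> int:
    """Unsigned remainder by a power of two, computed as a bit mask."""
    if b <= 0 or b & (b - 1):
        raise ValueError("b must be a positive power of two")
    return a & (b - 1)


if __name__ == "__main__":
    for a in (0, 5, 17, 1023, 1024, 4097):
        print(a, srem(a, 16), urem_pow2(a, 16))
```

With a negative dividend the two disagree (the signed remainder of -5 by 4 is -1, while masking the 32-bit pattern of -5 gives 3), so the rewrite is only valid when the sign bits are known to be clear.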
Andrew Lenharth 662295587d make this 64 bit clean, fixed test30 of /Regression/Transforms/InstCombine/add.ll
llvm-svn: 24158
2005-11-02 18:35:40 +00:00
Chris Lattner 09efd4e5b6 Limit the search depth of MaskedValueIsZero to 6 instructions, to avoid
bad cases.  This fixes Markus's second testcase in PR639, and should
seal it for good.

llvm-svn: 24123
2005-10-31 18:35:52 +00:00
Chris Lattner 27d351f159 This pass is now obsolete since all targets have moved to the SelectionDAG
infrastructure and the simple isels have been removed.

llvm-svn: 24090
2005-10-29 05:33:46 +00:00
Chris Lattner 752717d4ec Remove dead #include
llvm-svn: 24083
2005-10-29 04:41:30 +00:00
Chris Lattner ceb9d5adaa Now that instcombine does this xform, remove it from the -raise pass
llvm-svn: 24082
2005-10-29 04:40:23 +00:00
Chris Lattner 8f663e8bbc Pull some code out into a function, give it the ability to see through +.
This allows us to turn code like malloc(4*x+4) -> malloc int, (x+1)

llvm-svn: 24081
2005-10-29 04:36:15 +00:00
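The malloc promotion in r24081 above depends on the byte count being a simple linear expression whose scale and offset are both multiples of the element size. The sketch below shows only that arithmetic; it is not the LLVM implementation, and `rescale_allocation` with its `(scale, offset)` representation is a hypothetical simplification.

```python
from typing import Optional, Tuple


def rescale_allocation(scale: int, offset: int,
                       elem_size: int) -> Optional[Tuple[int, int]]:
    """Rewrite a byte count scale*x + offset as an element count.

    Returns (new_scale, new_offset) so that the allocation holds
    new_scale*x + new_offset elements of elem_size bytes each, or None
    when the byte count is not an exact multiple of the element size.
    """
    if scale % elem_size or offset % elem_size:
        return None
    return scale // elem_size, offset // elem_size


if __name__ == "__main__":
    # malloc(4*x + 4) viewed as an array of 4-byte ints
    print(rescale_allocation(4, 4, 4))
    # malloc(8*x) viewed as an array of 4-byte ints
    print(rescale_allocation(8, 0, 4))
```

Handling the constant offset is what "seeing through +" adds: without it, only the pure `scale*x` form could be converted.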
Chris Lattner 8270c33606 Remove a special case, allowing the general case to handle it. No functionality
change.

llvm-svn: 24076
2005-10-29 03:19:53 +00:00
Chris Lattner b9d3ca5c3c Fix a bit of backwards logic that broke exptree and smg2000
llvm-svn: 24056
2005-10-28 16:27:35 +00:00
Chris Lattner c4f67e67d2 Do not sink any instruction with side effects, including vaarg. This fixes
PR640

llvm-svn: 24046
2005-10-27 17:13:11 +00:00
Chris Lattner 479911f971 Fix #include order
llvm-svn: 24044
2005-10-27 16:34:00 +00:00
John Criswell fe5f33b120 Move some constant folding code shared by Analysis and Transform passes
into the LLVMAnalysis library.
This allows LLVMTranform and LLVMTransformUtils to be archives and linked
with LLVMAnalysis.a, which provides any missing definitions.

llvm-svn: 24036
2005-10-27 15:54:34 +00:00
Chris Lattner c6372cca78 Fix typo
llvm-svn: 24033
2005-10-27 06:26:26 +00:00
Chris Lattner 0fe7551bc0 Teach instcombine to promote stuff like (cast (malloc sbyte, 8*X) to int*)
into: malloc int, (2*X)

llvm-svn: 24032
2005-10-27 06:24:46 +00:00
Chris Lattner b3ecf96900 Promote cases like cast (malloc sbyte, 100) to int* into
(malloc [25 x int]) directly without having to convert to
(malloc [100 x sbyte]) first.

llvm-svn: 24031
2005-10-27 06:12:00 +00:00
Chris Lattner bb17180a23 Minor change to this file to support obscure cases with constant array amounts
llvm-svn: 24030
2005-10-27 05:53:56 +00:00
John Criswell 94b7bea733 1. Remove libraries no longer created from the list of libraries linked into the
SparcV9 JIT.
2. Make LLVMTransformUtils a relinked object file and always link it before
   LLVMAnalysis.a.  These two libraries have circular dependencies on each
   other which creates problems when building the SparcV9 JIT.  This change
   fixes the dependency problems on all platforms with a minimum of fuss.

llvm-svn: 24023
2005-10-26 20:35:13 +00:00
Chris Lattner 38a1b00a0f fold nested and's early to avoid inefficiencies in MaskedValueIsZero. This
fixes a very slow compile in PR639.

llvm-svn: 24011
2005-10-26 17:18:16 +00:00
Jeff Cohen 2b8cbf319c Update Visual Studio projects to reflect moved file.
llvm-svn: 23998
2005-10-26 05:36:51 +00:00
Alkis Evlogimenos cb67b650b5 Stop using deprecated types
llvm-svn: 23973
2005-10-25 11:18:06 +00:00
Chris Lattner 46705b2f2d Handle allocations that, even after removing dead uses, still have more than
one use (but one is a cast).  This handles the very common case of:

 X = alloc [n x byte]
 Y = cast X to somethingbetter
 seteq X, null

In order to avoid infinite looping when there are multiple casts, we only
allow this if the xform is strictly increasing the alignment of the
allocation.

llvm-svn: 23961
2005-10-24 06:35:18 +00:00
Chris Lattner 355ecc09f8 Fix a bug where we would 'promote' an allocation from one type to another
where the second has less alignment required.  If we had explicit alignment
support in the IR, we could handle this case, but we can't until we do.

llvm-svn: 23960
2005-10-24 06:26:18 +00:00
Chris Lattner ac87beb03a Before promoting a malloc type, remove dead uses. This makes instcombine
more effective at promoting these allocations, catching them earlier in the
compile process.

llvm-svn: 23959
2005-10-24 06:22:12 +00:00
Chris Lattner 216be91817 Pull some code out into a function, no functionality change
llvm-svn: 23958
2005-10-24 06:03:58 +00:00
Chris Lattner b37336978f Remove some beta code that no longer has an owner.
llvm-svn: 23944
2005-10-24 02:32:41 +00:00
Chris Lattner f9998d9704 Do not build the ProfilePaths directory anymore
llvm-svn: 23943
2005-10-24 02:31:49 +00:00
Chris Lattner bde3845548 DONT_BUILD_RELINKED is gone and implied by BUILD_ARCHIVE now
llvm-svn: 23940
2005-10-24 02:26:13 +00:00
Chris Lattner 8c087e962c Only build .a file versions of these libraries, instead of .a and .o versions.
This should speed up build times.

llvm-svn: 23933
2005-10-24 01:59:48 +00:00
Chris Lattner bd77fac034 Make sure that anything using the ADCE pass pulls in the UnifyFunctionExitNodes
code

llvm-svn: 23931
2005-10-24 01:40:23 +00:00
Jeff Cohen 11e26b52b2 When a function takes a variable number of pointer arguments, with a zero
pointer marking the end of the list, the zero *must* be cast to the pointer
type.  An un-cast zero is a 32-bit int, and at least on x86_64, gcc will
not extend the zero to 64 bits, thus allowing the upper 32 bits to be
random junk.

The new END_WITH_NULL macro may be used to annotate such a function
so that GCC (version 4 or newer) will detect the use of un-casted zero
at compile time.

llvm-svn: 23888
2005-10-23 04:37:20 +00:00
Chris Lattner 5df0e36e98 My previous patch was too conservative. Reject FP and void types, but do
allow pointer types.

llvm-svn: 23859
2005-10-21 05:45:41 +00:00
Chris Lattner 0c0b38bb4c Do NOT touch FP ops with LSR. This fixes a testcase Nate sent me from an
inner loop like this:

LBB_RateConvertMono8AltiVec_2:  ; no_exit
        lis r2, ha16(.CPI_RateConvertMono8AltiVec_0)
        lfs f3, lo16(.CPI_RateConvertMono8AltiVec_0)(r2)
        fmr f3, f3
        fadd f0, f2, f0
        fadd f3, f0, f3
        fcmpu cr0, f3, f1
        bge cr0, LBB_RateConvertMono8AltiVec_2  ; no_exit

to an inner loop like this:

LBB_RateConvertMono8AltiVec_1:  ; no_exit
        fsub f2, f2, f1
        fcmpu cr0, f2, f1
        fmr f0, f2
        bge cr0, LBB_RateConvertMono8AltiVec_1  ; no_exit

Doh! good catch!

llvm-svn: 23838
2005-10-20 04:47:10 +00:00
Chris Lattner 45517baf9f Add an option to this pass. If it is set, we are allowed to internalize
all but main.  If it's not set, we can still internalize, but only if an
explicit symbol list is provided.

llvm-svn: 23783
2005-10-18 06:29:22 +00:00
Chris Lattner da1b152c43 Make this work for FP constantexprs
llvm-svn: 23773
2005-10-17 20:18:38 +00:00
Chris Lattner 7fde91e365 Oops, X+0.0 isn't foldable, but X+-0.0 is.
llvm-svn: 23772
2005-10-17 17:56:38 +00:00
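The distinction drawn in r23772 above follows from IEEE 754 signed zeros under the default round-to-nearest mode: adding +0.0 to -0.0 yields +0.0, so `X + 0.0` is not always `X`, while adding -0.0 leaves every non-NaN value unchanged. A minimal demonstration, with `same_float` and `adding_is_identity` as hypothetical helper names:

```python
import math


def same_float(a: float, b: float) -> bool:
    """Equal value and equal sign, so 0.0 and -0.0 are told apart."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


def adding_is_identity(x: float, zero: float) -> bool:
    """True when x + zero gives back exactly x, including the sign of zero."""
    return same_float(x + zero, x)


if __name__ == "__main__":
    for x in (1.5, 0.0, -0.0, float("inf"), float("-inf")):
        print(x, adding_is_identity(x, 0.0), adding_is_identity(x, -0.0))
```

The only row where the two columns differ is `x = -0.0`, which is exactly the case that makes the `X + 0.0` fold unsafe.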
Chris Lattner 32979336a7 relax this a bit, as we only support the default rounding mode
llvm-svn: 23771
2005-10-17 17:49:32 +00:00
Chris Lattner 192cd18f53 Fix (hopefully the last) issue where LSR is nondeterministic. When pulling
out CSE's of base expressions it could build a result whose order was
nondet.

llvm-svn: 23698
2005-10-11 18:41:04 +00:00
Chris Lattner 5c9d63da31 Fix another problem where LSR was being nondeterministic. Also remove elements
from the end of a vector instead of the beginning

llvm-svn: 23697
2005-10-11 18:30:57 +00:00
Chris Lattner b7a3894e7c Fix another lsr-is-nondeterministic case
llvm-svn: 23695
2005-10-11 18:17:57 +00:00
Chris Lattner 03b9eb506c Make MaskedValueIsZero a bit more aggressive
llvm-svn: 23677
2005-10-09 22:08:50 +00:00
Chris Lattner 62010c450f Fix funky xcode indentation
llvm-svn: 23674
2005-10-09 06:36:35 +00:00
Chris Lattner eb4be8b942 Hrm, you didn't see this.
llvm-svn: 23673
2005-10-09 06:24:02 +00:00
Chris Lattner 4ea0a3eaac Fix a source of non-determinism in the backend: the order of processing
IV strides depended on the pointer order of the strides in memory.
Non-determinism is bad.

llvm-svn: 23672
2005-10-09 06:20:55 +00:00
Jeff Cohen 572910c9a2 Remove useless variable.
llvm-svn: 23656
2005-10-07 05:28:29 +00:00
Chris Lattner 20b0754c41 Fix DemoteRegToStack on an invoke. This fixes PR634.
llvm-svn: 23618
2005-10-04 00:44:01 +00:00
Chris Lattner 4c3b2b536c Clean up the code a bit. Use isInstructionTriviallyDead to be more aggressive
and more correct than use_empty().  This fixes PR635 and
SimplifyCFG/2005-10-02-InvokeSimplify.ll

llvm-svn: 23616
2005-10-03 23:43:43 +00:00
Chris Lattner f07a587c79 Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In
particular, it should realize that phi's use their values in the pred block
not the phi block itself.  This change turns our em3d loop from this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_6    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; endif.loopexit.loopexit_crit_edge
        addi r3, r2, 1
        blr
LBB_test_6:     ; loopexit
        or r3, r2, r2
        blr

into:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r6, r6
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        or r2, r6, r6
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r2, r2
        blr


Unfortunately, this is actually worse code, because the register coalescer
is getting confused somehow.  If it were doing its job right, it could turn the
code into this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r6, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r6, r6
        blr

... which I'll work on next. :)

llvm-svn: 23604
2005-10-03 02:50:05 +00:00
Chris Lattner e4ed42a426 Refactor some code into a function
llvm-svn: 23603
2005-10-03 01:04:44 +00:00
Chris Lattner 360928dbed This break is bogus and I have no idea why it was there. Basically it prevents
memoizing code when IV's are used by phinodes outside of loops.  In a simple
example, we were getting this code before (note that r6 and r7 are isomorphic
IV's):

        li r6, 0
        or r7, r6, r6
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r7, r7
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r2, r7, 1
        addi r7, r7, 1
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

Now we get:

        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

this was noticed in em3d.

llvm-svn: 23602
2005-10-03 00:37:33 +00:00
Chris Lattner 8fcce170cf when checking if we should move a split edge block outside of a loop,
check the presplit pred, not the post-split pred.  This was causing us
to make the wrong decision in some cases, leaving the critical edge block
in the loop.

llvm-svn: 23601
2005-10-03 00:31:52 +00:00
Jeff Cohen f8a5e5ae6e Fix VC++ warnings.
llvm-svn: 23579
2005-10-01 03:57:14 +00:00
Chris Lattner a554c9470b Insert stores after phi nodes in the normal dest. This fixes
LowerInvoke/2005-08-03-InvokeWithPHI.ll

llvm-svn: 23525
2005-09-29 17:44:20 +00:00
Chris Lattner 87ef943a4c Fold isascii into a simple comparison. This speeds up 197.parser by 7.4%,
bringing the LLC time down to the CBE time.

llvm-svn: 23521
2005-09-29 06:17:27 +00:00
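The commit above (r23521) does not spell out the comparison it emits. One plausible form, assumed here rather than taken from the source, is an unsigned compare against 128: a value is ASCII exactly when no bits above the low seven are set. In this sketch `isascii_reference` and `isascii_folded` are hypothetical names, and the argument is modelled as a 32-bit int.

```python
def isascii_reference(c: int) -> bool:
    """Reference behaviour: true when no bits above the low seven are set."""
    return (c & ~0x7F) == 0


def isascii_folded(c: int) -> bool:
    """Assumed folded form: a single unsigned compare on the 32-bit pattern."""
    return (c & 0xFFFFFFFF) < 128


if __name__ == "__main__":
    for c in (-1, 0, 65, 127, 128, 255):
        print(c, isascii_reference(c), isascii_folded(c))
```

Treating the value as unsigned makes negative inputs compare as very large numbers, so a single comparison covers both the "too large" and the "negative" cases.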
Chris Lattner 5f6035feb0 remove a bunch of unneeded stuff, or self evident comments
llvm-svn: 23519
2005-09-29 06:16:11 +00:00
Chris Lattner c244e7c178 Implement a couple of memcmp folds from the todo list
llvm-svn: 23517
2005-09-29 04:54:20 +00:00
Chris Lattner ea7214b23d Constant fold llvm.sqrt
llvm-svn: 23487
2005-09-28 01:34:32 +00:00
Chris Lattner 3b63bb375c add a note about a way to improve this code further, that I won't be getting
to right now.

llvm-svn: 23485
2005-09-27 22:44:59 +00:00
Chris Lattner eb953f0ef8 Fix a regression in my previous patch, fixing GlobalOpt/2005-09-27-Crash.ll
and PR632.

llvm-svn: 23484
2005-09-27 22:28:11 +00:00
Chris Lattner e285f5ed8f Avoid spilling stack slots... to stack slots.
llvm-svn: 23478
2005-09-27 21:33:12 +00:00
Chris Lattner 87eb249300 Completely rewrite 'correct' eh support. This changes how setjmp insertion
is performed so it is only at most once per function that contains an invoke
instead of once per invoke in the function.  This patch has the following perks:

1. It fixes PR631, which complains about slowness.
2. It fixes PR240, which complains about non-volatile vars being live across
   setjmp/longjmps.
3. It improves (but does not fix) the jmpbuf alignment issue on itanium by not
   forcing the jmpbufs to always be 8-bytes off the alignment of the structure.
4. It speeds up 253.perlbmk from 338s to 13.70s (a 25x improvement!), making us
   now about 4% faster than GCC.

Further improvements are also possible.

llvm-svn: 23477
2005-09-27 21:18:17 +00:00
Chris Lattner 92233d2175 Make the pass name simpler
llvm-svn: 23476
2005-09-27 21:10:32 +00:00
Chris Lattner 16cd356fb2 allow demotion to volatile values, add support for invoke
llvm-svn: 23473
2005-09-27 19:39:00 +00:00
Chris Lattner 3d27e7f27f Add support for external calls that we know how to constant fold. This implements
ctor-list-opt.ll:CTOR8

llvm-svn: 23465
2005-09-27 05:02:43 +00:00
Chris Lattner 29b2780c8a Fix a bug where we would evaluate stores into linkonce objects which could be
potentially replaced at link-time.

llvm-svn: 23463
2005-09-27 04:50:03 +00:00
Chris Lattner 65a3a0918f Implement support for static constructors with calls in them. This is useful
because gccas runs globalopt before inlining.

This implements ctor-list-opt.ll:CTOR7

llvm-svn: 23462
2005-09-27 04:45:34 +00:00
Chris Lattner da1889b778 Refactor this code a bit, no functionality changes.
llvm-svn: 23460
2005-09-27 04:27:01 +00:00
Chris Lattner f2f89af69a Remove some dead code. ctor evaluation subsumes empty ctor elim
llvm-svn: 23453
2005-09-26 20:38:20 +00:00
Chris Lattner 6bf2cd5735 Add support for alloca, implementing ctor-list-opt.ll:CTOR6
llvm-svn: 23452
2005-09-26 17:07:09 +00:00
Chris Lattner 46d9ff081d Add a debug printout, fix a crash on kc++
llvm-svn: 23450
2005-09-26 07:34:35 +00:00
Chris Lattner 46af55e0e4 Implement loads/stores through GEP's of globals. This implements
ctor-list-opt.ll:CTOR5.

llvm-svn: 23449
2005-09-26 06:52:44 +00:00
Chris Lattner 61ff32cd70 Replace TraverseGEPInitializer with ConstantFoldLoadThroughGEPConstantExpr
llvm-svn: 23447
2005-09-26 05:34:07 +00:00
Chris Lattner 02ae21e1e0 Eliminate GetGEPGlobalInitializer in favor of the more powerful
ConstantFoldLoadThroughGEPConstantExpr function in the utils lib.

llvm-svn: 23446
2005-09-26 05:28:52 +00:00
Chris Lattner 0b011ec8e2 Factor the GetGEPGlobalInitializer out of this pass and into Transforms/Utils
as ConstantFoldLoadThroughGEPConstantExpr.

llvm-svn: 23445
2005-09-26 05:28:06 +00:00
Chris Lattner c13c7b9376 Move the ConstantFoldLoadThroughGEPConstantExpr function out of the InstCombine
pass.

llvm-svn: 23444
2005-09-26 05:27:10 +00:00
Chris Lattner b009663e27 add a comment
llvm-svn: 23442
2005-09-26 05:16:34 +00:00
Chris Lattner 4b05c322d5 Add support for getelementptr, load, and correctly reject volatile stores.
llvm-svn: 23441
2005-09-26 05:15:37 +00:00
Chris Lattner 3e9ea5ffec Add support for br/brcond/switch and phi
llvm-svn: 23439
2005-09-26 04:57:38 +00:00
Chris Lattner 99e23fa74c Add a simple interpreter to this code, allowing us to statically evaluate
global ctors that are simple enough.  This implements ctor-list-opt.ll:CTOR2.

llvm-svn: 23437
2005-09-26 04:44:35 +00:00
Chris Lattner 696beefabb factor some code into a InstallGlobalCtors method, add comments. No functionality change.
llvm-svn: 23435
2005-09-26 02:31:18 +00:00
Chris Lattner 838bdc1836 Make the global opt optimizer work on modules with a null terminator, by
accepting the null even with a non-65535 init prio

llvm-svn: 23434
2005-09-26 02:19:27 +00:00
Chris Lattner 41b6a5a693 Factor this code out into a few methods.
Implement the start of global ctor optimization.  It is currently smart
enough to remove the global ctor for cases like this:

struct foo {
  foo() {}
} x;

... saving a bit of startup time for the program.

llvm-svn: 23433
2005-09-26 01:43:45 +00:00
Chris Lattner f487768062 Fix some logic I broke that caused a regression on
SimplifyLibCalls/2005-05-20-sprintf-crash.ll

llvm-svn: 23430
2005-09-25 07:06:48 +00:00
Chris Lattner 0b3557f54a Move MaskedValueIsZero up.
Match a bunch of idioms for sign extensions, implementing InstCombine/signext.ll

llvm-svn: 23428
2005-09-24 23:43:33 +00:00
Chris Lattner 175463a165 Simplify this code a bit by relying on recursive simplification. Support
sprintf("%s", P)'s that have uses.

s/hasNUses(0)/use_empty()/

llvm-svn: 23425
2005-09-24 22:17:06 +00:00
Chris Lattner 499e33646e remove some debugging code
llvm-svn: 23411
2005-09-23 18:49:09 +00:00
Chris Lattner c59a371d45 Fold two consecutive branches that share a common destination between them.
This implements SimplifyCFG/branch-fold.ll, and is useful on ?:/min/max heavy
code

llvm-svn: 23410
2005-09-23 18:47:20 +00:00
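Note on r23410: at the C level, the fold can be pictured roughly as below. This is a sketch, not code from the commit. The function names are made up for illustration, and merging the two tests with a bitwise OR is an assumption about how the shared destination is reached with a single branch.

```c
#include <assert.h>  /* only needed for the checks in the usage note */

/* Before: two consecutive branches that both target the label "hit". */
int classify_two_branches(int a, int b) {
    if (a < 0)
        goto hit;
    if (b < 0)
        goto hit;
    return 0;
hit:
    return 1;
}

/* After: the two conditions are combined, leaving one branch to "hit". */
int classify_one_branch(int a, int b) {
    if ((a < 0) | (b < 0))
        goto hit;
    return 0;
hit:
    return 1;
}
```

Both functions return the same value for every input, which is why this shape shows up in ?:/min/max heavy code: each extra comparison otherwise costs its own branch.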
Chris Lattner 3a978bf66d simplify some logic further
llvm-svn: 23408
2005-09-23 07:23:18 +00:00
Chris Lattner cc14ebc17b pull a bunch of logic out of SimplifyCFG into a helper fn
llvm-svn: 23407
2005-09-23 06:39:30 +00:00
Chris Lattner 6c70106053 Start threading across blocks with code in them, so long as the code does
not define a value that is used outside of its block.  This catches many
more simplifications, e.g. 854 in 176.gcc, 137 in vpr, etc.

This implements branch-phi-thread.ll:test3.ll

llvm-svn: 23397
2005-09-20 01:48:40 +00:00
Chris Lattner f0bd8d0107 Implement merging of blocks with the same condition if the block has multiple
predecessors.  This implements branch-phi-thread.ll::test1

llvm-svn: 23395
2005-09-20 00:43:16 +00:00
Chris Lattner 049cb4482f Reject a case we don't handle yet
llvm-svn: 23393
2005-09-19 23:57:04 +00:00
Chris Lattner a160924d57 remove debugging code :-/
llvm-svn: 23392
2005-09-19 23:50:15 +00:00
Chris Lattner 748f903046 Implement SimplifyCFG/branch-phi-thread.ll, the most trivial case of threading
control across branches with determined outcomes.  More generality to follow.
This triggers a couple thousand times in specint.

llvm-svn: 23391
2005-09-19 23:49:37 +00:00
Chris Lattner b4b2530a1a Refactor this code a bit and make it more general. This now compiles:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) { b.j += x; }

To:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        slwi r3, r3, 6
        add r3, r4, r3
        rlwimi r3, r4, 0, 26, 14
        stw r3, 0(r2)
        blr


instead of:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 26, 21, 31
        add r3, r5, r3
        rlwimi r4, r3, 6, 15, 25
        stw r4, 0(r2)
        blr

by eliminating an 'and'.

I'm pretty sure this is as small as we can go :)

llvm-svn: 23386
2005-09-18 07:22:02 +00:00
Chris Lattner 797dee7705 Compile
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) {
  b.j += x;
}

to:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        and %ECX, 131008
        mov %EDX, DWORD PTR [%ESP + 4]
        shl %EDX, 6
        add %EDX, %ECX
        and %EDX, 131008
        and %EAX, -131009
        or %EDX, %EAX
        mov DWORD PTR [b], %EDX
        ret

instead of:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        shr %ECX, 6
        and %ECX, 2047
        add %ECX, DWORD PTR [%ESP + 4]
        shl %ECX, 6
        and %ECX, 131008
        and %EAX, -131009
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23385
2005-09-18 06:30:59 +00:00
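Note on r23385: the two x86 sequences above can be modelled on a plain 32-bit word. The sketch below assumes the layout implied by the constants in the message (j occupies bits 6..16, so its mask is 131008); the function names are hypothetical and the real transform operates on LLVM IR, not on C.

```c
#include <assert.h>  /* only needed for the checks in the usage note */
#include <stdint.h>

#define J_MASK 131008u  /* bits 6..16, the mask used in the listing above */

/* Old sequence: extract j, add x, shift back into place, re-insert. */
uint32_t plus2_old(uint32_t word, uint32_t x) {
    uint32_t j = (word >> 6) & 2047u;
    uint32_t new_j = ((j + x) << 6) & J_MASK;
    return new_j | (word & ~J_MASK);
}

/* New sequence: leave j in place and add the pre-shifted x to it. */
uint32_t plus2_new(uint32_t word, uint32_t x) {
    uint32_t new_j = ((word & J_MASK) + (x << 6)) & J_MASK;
    return new_j | (word & ~J_MASK);
}
```

The two agree because (j + x) << 6 equals (j << 6) + (x << 6) modulo 2^32, and j << 6 is just word & J_MASK. That saves the shift-right and one mask.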
Chris Lattner 01f56c68e9 Generalize this transform, using MaskedValueIsZero, allowing us to compile:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) { b.k += x; }

To:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        add DWORD PTR [b], %EAX
        ret

instead of:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        mov %ECX, DWORD PTR [b]
        add %EAX, %ECX
        and %EAX, -131072
        and %ECX, 131071
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23384
2005-09-18 06:02:59 +00:00
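Note on r23384: for the topmost field the masking disappears entirely. A minimal sketch of why, assuming k sits in bits 17..31 as the constants -131072 and 131071 in the listing suggest; the function names are illustrative only.

```c
#include <assert.h>  /* only needed for the checks in the usage note */
#include <stdint.h>

/* Old sequence: add, then splice the untouched low 17 bits back in. */
uint32_t plus3_old(uint32_t word, uint32_t x) {
    uint32_t sum = (x << 17) + word;
    return (sum & 0xFFFE0000u) | (word & 0x0001FFFFu);
}

/* New sequence: a single add of the pre-shifted value. */
uint32_t plus3_new(uint32_t word, uint32_t x) {
    return word + (x << 17);
}
```

Since x << 17 has its low 17 bits known to be zero, the add cannot change the low 17 bits of the word or carry out of them, so the and/and/or splice is a no-op. That known-zero reasoning is the kind of fact MaskedValueIsZero is named for.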
Chris Lattner 4ebc8ab4e0 fix typo
llvm-svn: 23383
2005-09-18 05:25:20 +00:00
Chris Lattner e5b23a6d67 Remove unintentionally committed code
llvm-svn: 23382
2005-09-18 05:12:51 +00:00
Chris Lattner 27cb9dbd35 implement shift.ll:test25. This compiles:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) {
  b.k += x;
}

to:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r3, 0(r2)
        rlwinm r4, r3, 0, 0, 14
        add r4, r4, r3
        rlwimi r4, r3, 0, 15, 31
        stw r4, 0(r2)
        blr

instead of:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        srwi r5, r4, 17
        add r3, r5, r3
        slwi r3, r3, 17
        rlwimi r3, r4, 0, 15, 31
        stw r3, 0(r2)
        blr

llvm-svn: 23381
2005-09-18 05:12:10 +00:00
Chris Lattner af517574ce Implement add.ll:test29. Codegening:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus1 (unsigned int x) {
  b.i += x;
}

as:
_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        add r3, r4, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

instead of:

_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 0, 26, 31
        add r3, r5, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

llvm-svn: 23379
2005-09-18 04:24:45 +00:00
Chris Lattner 027eaf01cf remove debug output
llvm-svn: 23377
2005-09-18 03:50:25 +00:00
Chris Lattner 1521298993 Implement or.ll:test21. This teaches instcombine to be able to turn this:
struct {
   unsigned int bit0:1;
   unsigned int ubyte:31;
} sdata;

void foo() {
  sdata.ubyte++;
}

into this:

foo:
        add DWORD PTR [sdata], 2
        ret

instead of this:

foo:
        mov %EAX, DWORD PTR [sdata]
        mov %ECX, %EAX
        add %ECX, 2
        and %ECX, -2
        and %EAX, 1
        or %EAX, %ECX
        mov DWORD PTR [sdata], %EAX
        ret

llvm-svn: 23376
2005-09-18 03:42:07 +00:00
Chris Lattner a393e4d4b3 Fix the regression last night compiling povray
llvm-svn: 23348
2005-09-14 17:32:56 +00:00
Chris Lattner 2a8932960d Add a simple xform to simplify array accesses with casts in the way.
This is useful for 178.galgel where resolution of dope vectors (by the
optimizer) causes the scales to become apparent.

llvm-svn: 23328
2005-09-13 18:36:04 +00:00
Chris Lattner fd018c8dfe Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI.
This fixes up a dot-product loop in galgel, speeding it up from 18.47s to
16.13s.

llvm-svn: 23327
2005-09-13 02:09:55 +00:00
Chris Lattner 567b81f0d2 Add a helper function, allowing us to simplify some code a bit, changing
indentation, no functionality change

llvm-svn: 23325
2005-09-13 00:40:14 +00:00
Chris Lattner 219175c84d Implement a simple xform to turn code like this:
if () { store A -> P; } else { store B -> P; }

into a PHI node with one store, in the most trivial case.  This implements
load.ll:test10.

llvm-svn: 23324
2005-09-12 23:23:25 +00:00
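Note on r23324: in C terms the rewrite looks roughly like the pair below. This is only an illustration with made-up names; the local variable stands in for the PHI node, which has no direct C equivalent.

```c
#include <assert.h>  /* only needed for the checks in the usage note */

/* Before: each arm of the branch stores to the same pointer. */
void store_in_both_arms(int cond, int a, int b, int *p) {
    if (cond) {
        *p = a;
    } else {
        *p = b;
    }
}

/* After: the arms only select a value; one store follows the merge point. */
void store_once(int cond, int a, int b, int *p) {
    int merged = cond ? a : b;  /* plays the role of the PHI node */
    *p = merged;
}
```

With a single store left, later passes see one memory write instead of two.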
Chris Lattner e0bfdf1485 Another load-peephole optimization: do gcse when two loads are next to
each other.  This implements InstCombine/load.ll:test9

llvm-svn: 23322
2005-09-12 22:21:03 +00:00
Chris Lattner b990f7d8ed Implement a trivial form of store->load forwarding where the store and the
load are exactly consecutive.  This is picked up by other passes, but this
triggers thousands of times in fortran programs that use static locals
(and is thus a compile-time speedup).

llvm-svn: 23320
2005-09-12 22:00:15 +00:00
Chris Lattner 8048b85e8f Fix a regression from last night, which caused this pass to create invalid
code for IV uses outside of loops that are not dominated by the latch block.
We should only convert these uses to use the post-inc value if they ARE
dominated by the latch block.

Also use a new LoopInfo method to simplify some code.

This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll

llvm-svn: 23318
2005-09-12 17:11:27 +00:00
Chris Lattner a67648396a Uses of IV's outside of the loop should use the post-incremented version
of the IV, not the preincremented version.  This helps many loops (e.g. in sixtrack)
which used to generate code like this (this is the code from the
dont-hoist-simple-loop-constants.ll testcase):

_test:
        li r2, 0                 **** IV starts at 0
LBB_test_1:     ; no_exit.2
        or r5, r2, r2            **** Copy for loop exit
        li r2, 0
        stw r2, 0(r3)
        addi r3, r3, 4
        addi r2, r5, 1
        addi r6, r5, 2           **** IV+2
        cmpwi cr0, r6, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r5, 2       ****  IV+2
        stw r2, 0(r4)
        blr

And now we generate code like this:

_test:
        li r2, 1               *** IV starts at 1
LBB_test_1:     ; no_exit.2
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmpwi cr0, r2, 701     *** IV.postinc + 0
        blt cr0, LBB_test_1
LBB_test_2:     ; loopexit.2.loopexit
        stw r2, 0(r4)          *** IV.postinc + 0
        blr

llvm-svn: 23313
2005-09-12 06:04:47 +00:00
Chris Lattner 530fe6ab30 implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll.
We used to emit this code for it:

_test:
        li r2, 1     ;; Value tying up a register for the whole loop
        li r5, 0
LBB_test_1:     ; no_exit.2
        or r6, r5, r5
        li r5, 0
        stw r5, 0(r3)
        addi r5, r6, 1
        addi r3, r3, 4
        add r7, r2, r5  ;; should be addi r7, r5, 1
        cmpwi cr0, r7, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r6, 2
        stw r2, 0(r4)
        blr

now we emit this:

_test:
        li r2, 0
LBB_test_1:     ; no_exit.2
        or r5, r2, r2
        li r2, 0
        stw r2, 0(r3)
        addi r3, r3, 4
        addi r2, r5, 1
        addi r6, r5, 2   ;; whoa, fold those adds!
        cmpwi cr0, r6, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r5, 2
        stw r2, 0(r4)
        blr

more improvement coming.

llvm-svn: 23306
2005-09-10 01:18:45 +00:00
Chris Lattner b5e381a8cf Fix a problem that Dan Berlin noticed, where reassociation would not succeed
in building maximal expressions before simplifying them.  In particular, in
cases like this:

X-(A+B+X)

the code would consider A+B+X to be a maximal expression (not understanding
that the single use '-' would be turned into a + later), simplify it (a noop)
then later get simplified again.

Each of these simplify steps is where the cost of reassociation comes from,
so this patch should speed up the already fast pass a bit.

Thanks to Dan for noticing this!

llvm-svn: 23214
2005-09-02 07:07:58 +00:00
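Note on r23214: the example expression shows why the subtraction has to be folded in before simplifying. A small sketch using unsigned arithmetic (so wraparound is well defined); the function names are invented for illustration and nothing here is taken from the pass itself.

```c
#include <assert.h>  /* only needed for the checks in the usage note */
#include <stdint.h>

/* The expression as written in the commit message: X-(A+B+X). */
uint32_t as_written(uint32_t x, uint32_t a, uint32_t b) {
    return x - (a + b + x);
}

/* Once the '-' is treated as adding negated terms, the maximal expression is
 * X + -A + -B + -X, and the X terms cancel. */
uint32_t reassociated(uint32_t x, uint32_t a, uint32_t b) {
    (void)x;  /* x no longer contributes to the result */
    return 0u - a - b;
}
```

Treating A+B+X as maximal on its own hides the cancellation, so the simplifier runs once for nothing and then again on the larger expression.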
Chris Lattner 9fe263aa75 Avoid creating garbage instructions, just move the old add instruction
to where we need it when converting -(A+B+C) -> -A + -B + -C.

llvm-svn: 23213
2005-09-02 06:38:04 +00:00
Chris Lattner d1325da091 add some assertions and fix problems where reassociate could access the
Ops vector out of range

llvm-svn: 23211
2005-09-02 05:23:22 +00:00
Chris Lattner 8ca5b2a6d2 Fix Regression/Transforms/Reassociate/2005-08-24-Crash.ll
llvm-svn: 23019
2005-08-24 17:55:32 +00:00
Chris Lattner 4201cd1bbc Transform floor((double)FLT) -> (double)floorf(FLT), implementing
Regression/Transforms/SimplifyLibCalls/floor.ll.  This triggers 19 times in
177.mesa.

llvm-svn: 23017
2005-08-24 17:22:17 +00:00
Chris Lattner ea7dfd53d6 Fix Transforms/LoopStrengthReduce/2005-08-17-OutOfLoopVariant.ll, a crash
on 177.mesa

llvm-svn: 22843
2005-08-17 21:22:41 +00:00
Chris Lattner 2bf7cb5213 Use a new helper to split critical edges, making the code simpler.
Do not claim to not change the CFG.  We do change the cfg to split critical
edges.  This isn't causing us a problem now, but could likely do so in the
future.

llvm-svn: 22824
2005-08-17 06:35:16 +00:00
Chris Lattner 5cf983ee0f Fix a bad case in gzip where we put lots of things in registers across the
loop, because an IV-dependent value was used outside of the loop and didn't
have immediate-folding capability

llvm-svn: 22798
2005-08-16 00:38:11 +00:00
Chris Lattner 47d3ec3525 Ooops, don't forget to clear this. The real inner loop is now:
.LBB_foo_3:     ; no_exit.1
        lfd f2, 0(r9)
        lfd f3, 8(r9)
        fmul f4, f1, f2
        fmadd f4, f0, f3, f4
        stfd f4, 8(r9)
        fmul f3, f1, f3
        fmsub f2, f0, f2, f3
        stfd f2, 0(r9)
        addi r9, r9, 16
        addi r8, r8, 1
        cmpw cr0, r8, r4
        ble .LBB_foo_3  ; no_exit.1

llvm-svn: 22782
2005-08-13 07:42:01 +00:00
Chris Lattner 5949d49032 Recursively scan scev expressions for common subexpressions. This allows us
to handle nested loops much better, for example, by being able to tell that
these two expressions:

{( 8 + ( 16 * ( 1 +  %Tmp11 +  %Tmp12)) +  %c_),+,( 16 *  %Tmp12)}<loopentry.1>

{(( 16 * ( 1 +  %Tmp11 +  %Tmp12)) +  %c_),+,( 16 *  %Tmp12)}<loopentry.1>

Have the following common part that can be shared:
{(( 16 * ( 1 +  %Tmp11 +  %Tmp12)) +  %c_),+,( 16 *  %Tmp12)}<loopentry.1>

This allows us to codegen an important inner loop in 168.wupwise as:

.LBB_foo_4:     ; no_exit.1
        lfd f2, 16(r9)
        fmul f3, f0, f2
        fmul f2, f1, f2
        fadd f4, f3, f2
        stfd f4, 8(r9)
        fsub f2, f3, f2
        stfd f2, 16(r9)
        addi r8, r8, 1
        addi r9, r9, 16
        cmpw cr0, r8, r4
        ble .LBB_foo_4  ; no_exit.1

instead of:

.LBB_foo_3:     ; no_exit.1
        lfdx f2, r6, r9
        add r10, r6, r9
        lfd f3, 8(r10)
        fmul f4, f1, f2
        fmadd f4, f0, f3, f4
        stfd f4, 8(r10)
        fmul f3, f1, f3
        fmsub f2, f0, f2, f3
        stfdx f2, r6, r9
        addi r9, r9, 16
        addi r8, r8, 1
        cmpw cr0, r8, r4
        ble .LBB_foo_3  ; no_exit.1

llvm-svn: 22781
2005-08-13 07:27:18 +00:00
Chris Lattner 89c1dfc733 Teach SplitCriticalEdge to update LoopInfo if it is alive. This fixes
a problem in LoopStrengthReduction, where it would split critical edges
and then confuse itself with outdated loop information.

llvm-svn: 22776
2005-08-13 01:38:43 +00:00
Chris Lattner 79396539d3 remove dead code. The exit block list is computed on demand, thus does not
need to be updated.  This code is a relic from when it did.

llvm-svn: 22775
2005-08-13 01:30:36 +00:00
Chris Lattner 8447b49526 When splitting critical edges, make sure not to leave the new block in the
middle of the loop.  This turns a critical loop in gzip into this:

.LBB_test_1:    ; loopentry
        or r27, r28, r28
        add r28, r3, r27
        lhz r28, 3(r28)
        add r26, r4, r27
        lhz r26, 3(r26)
        cmpw cr0, r28, r26
        bne .LBB_test_8 ; loopentry.loopexit_crit_edge
.LBB_test_2:    ; shortcirc_next.0
        add r28, r3, r27
        lhz r28, 5(r28)
        add r26, r4, r27
        lhz r26, 5(r26)
        cmpw cr0, r28, r26
        bne .LBB_test_7 ; shortcirc_next.0.loopexit_crit_edge
.LBB_test_3:    ; shortcirc_next.1
        add r28, r3, r27
        lhz r28, 7(r28)
        add r26, r4, r27
        lhz r26, 7(r26)
        cmpw cr0, r28, r26
        bne .LBB_test_6 ; shortcirc_next.1.loopexit_crit_edge
.LBB_test_4:    ; shortcirc_next.2
        add r28, r3, r27
        lhz r26, 9(r28)
        add r28, r4, r27
        lhz r25, 9(r28)
        addi r28, r27, 8
        cmpw cr7, r26, r25
        mfcr r26, 1
        rlwinm r26, r26, 31, 31, 31
        add r25, r8, r27
        cmpw cr7, r25, r7
        mfcr r25, 1
        rlwinm r25, r25, 29, 31, 31
        and. r26, r26, r25
        bne .LBB_test_1 ; loopentry

instead of this:

.LBB_test_1:    ; loopentry
        or r27, r28, r28
        add r28, r3, r27
        lhz r28, 3(r28)
        add r26, r4, r27
        lhz r26, 3(r26)
        cmpw cr0, r28, r26
        beq .LBB_test_3 ; shortcirc_next.0
.LBB_test_2:    ; loopentry.loopexit_crit_edge
        add r2, r30, r27
        add r8, r29, r27
        b .LBB_test_9   ; loopexit
.LBB_test_3:    ; shortcirc_next.0
        add r28, r3, r27
        lhz r28, 5(r28)
        add r26, r4, r27
        lhz r26, 5(r26)
        cmpw cr0, r28, r26
        beq .LBB_test_5 ; shortcirc_next.1
.LBB_test_4:    ; shortcirc_next.0.loopexit_crit_edge
        add r2, r11, r27
        add r8, r12, r27
        b .LBB_test_9   ; loopexit
.LBB_test_5:    ; shortcirc_next.1
        add r28, r3, r27
        lhz r28, 7(r28)
        add r26, r4, r27
        lhz r26, 7(r26)
        cmpw cr0, r28, r26
        beq .LBB_test_7 ; shortcirc_next.2
.LBB_test_6:    ; shortcirc_next.1.loopexit_crit_edge
        add r2, r9, r27
        add r8, r10, r27
        b .LBB_test_9   ; loopexit
.LBB_test_7:    ; shortcirc_next.2
        add r28, r3, r27
        lhz r26, 9(r28)
        add r28, r4, r27
        lhz r25, 9(r28)
        addi r28, r27, 8
        cmpw cr7, r26, r25
        mfcr r26, 1
        rlwinm r26, r26, 31, 31, 31
        add r25, r8, r27
        cmpw cr7, r25, r7
        mfcr r25, 1
        rlwinm r25, r25, 29, 31, 31
        and. r26, r26, r25
        bne .LBB_test_1 ; loopentry

Next up, improve the code for the loop.

llvm-svn: 22769
2005-08-12 22:22:17 +00:00
Chris Lattner 4fec86d348 Fix a FIXME: if we are inserting code for a PHI argument, split the critical
edge so that the code is not always executed for both operands.  This
prevents LSR from inserting code into loops whose exit blocks contain
PHI uses of IV expressions (which are outside of loops).  On gzip, for
example, we turn this ugly code:

.LBB_test_1:    ; loopentry
        add r27, r3, r28
        lhz r27, 3(r27)
        add r26, r4, r28
        lhz r26, 3(r26)
        add r25, r30, r28    ;; Only live if exiting the loop
        add r24, r29, r28    ;; Only live if exiting the loop
        cmpw cr0, r27, r26
        bne .LBB_test_5 ; loopexit

into this:

.LBB_test_1:    ; loopentry
        or r27, r28, r28
        add r28, r3, r27
        lhz r28, 3(r28)
        add r26, r4, r27
        lhz r26, 3(r26)
        cmpw cr0, r28, r26
        beq .LBB_test_3 ; shortcirc_next.0
.LBB_test_2:    ; loopentry.loopexit_crit_edge
        add r2, r30, r27
        add r8, r29, r27
        b .LBB_test_9   ; loopexit
.LBB_test_3:    ; shortcirc_next.0
        ...
        blt .LBB_test_1

Next step: get the block out of the loop so that the loop is all
fall-throughs again.

llvm-svn: 22766
2005-08-12 22:06:11 +00:00
Chris Lattner b7ebe65c56 Change break critical edges to not remove, then insert, PHI node entries.
Instead, just update the BB in-place.  This is both faster, and it prevents
split-critical-edges from shuffling the PHI argument list unnecessarily.

llvm-svn: 22765
2005-08-12 21:58:07 +00:00
Chris Lattner 62df798919 remove some trickiness that broke yacr2 and some other programs last night
llvm-svn: 22751
2005-08-10 17:15:20 +00:00
Chris Lattner f83ce5faee Make loop-simplify produce better loops by turning PHI nodes like X = phi [X, Y]
into just Y.  This often occurs when it separates loops that have collapsed loop
headers.  This implements LoopSimplify/phi-node-simplify.ll

llvm-svn: 22746
2005-08-10 02:07:32 +00:00
Chris Lattner 677d85784a Allow indvar simplify to canonicalize ANY affine IV, not just affine IVs with
constant stride.  This implements Transforms/IndVarsSimplify/variable-stride-ivs.ll

llvm-svn: 22744
2005-08-10 01:12:06 +00:00
Chris Lattner edff91a49a Teach LSR to strength reduce IVs that have a loop-invariant but non-constant stride.
For code like this:

void foo(float *a, float *b, int n, int stride_a, int stride_b) {
  int i;
  for (i=0; i<n; i++)
      a[i*stride_a] = b[i*stride_b];
}

we now emit:

.LBB_foo2_2:    ; no_exit
        lfs f0, 0(r4)
        stfs f0, 0(r3)
        addi r7, r7, 1
        add r4, r2, r4
        add r3, r6, r3
        cmpw cr0, r7, r5
        blt .LBB_foo2_2 ; no_exit

instead of:

.LBB_foo_2:     ; no_exit
        mullw r8, r2, r7     ;; multiply!
        slwi r8, r8, 2
        lfsx f0, r4, r8
        mullw r8, r2, r6     ;; multiply!
        slwi r8, r8, 2
        stfsx f0, r3, r8
        addi r2, r2, 1
        cmpw cr0, r2, r5
        blt .LBB_foo_2  ; no_exit

loops with variable strides occur pretty often.  For example, in SPECFP2K
there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp,
56 in 168.wupwise, 36 in 172.mgrid.

Now we can allow indvars to turn functions written like this:

void foo2(float *a, float *b, int n, int stride_a, int stride_b) {
  int i, ai = 0, bi = 0;
  for (i=0; i<n; i++)
    {
      a[ai] = b[bi];
      ai += stride_a;
      bi += stride_b;
    }
}

into code like the above for better analysis.  With this patch, they generate
identical code.

llvm-svn: 22740
2005-08-10 00:45:21 +00:00
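Note on r22740: the "we now emit" loop above has no multiply; it bumps two addresses by their strides on each iteration. The sketch below writes that shape back out in C next to the indexed original. copy_reduced is a hypothetical name and a hand-written approximation of the strength-reduced loop, not compiler output.

```c
#include <assert.h>  /* only needed for the checks in the usage note */

/* Indexed form, as in foo() from the commit message: one multiply per access. */
void copy_indexed(float *a, const float *b, int n, int stride_a, int stride_b) {
    for (int i = 0; i < n; i++)
        a[i * stride_a] = b[i * stride_b];
}

/* Strength-reduced form: each multiply becomes a pointer that advances by a
 * loop-invariant (but not compile-time constant) stride.
 * Assumes a and b hold at least n*stride_a and n*stride_b elements, so the
 * final pointer bump stays within one past the end of each array. */
void copy_reduced(float *a, const float *b, int n, int stride_a, int stride_b) {
    float *pa = a;
    const float *pb = b;
    for (int i = 0; i < n; i++) {
        *pa = *pb;
        pa += stride_a;
        pb += stride_b;
    }
}
```

The counter i survives only for the exit test, matching the addi/cmpw pair in the listing.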
Chris Lattner dde7dc525e Fix Regression/Transforms/LoopStrengthReduce/phi_node_update_multiple_preds.ll
by being more careful about updating PHI nodes

llvm-svn: 22739
2005-08-10 00:35:32 +00:00
Chris Lattner c6c4d99a21 Fix some 80 column violations.
Once we compute the evolution for a GEP, tell SE about it.  This allows users
of the GEP to know it, if the users are not direct.  This allows us to compile
this testcase:

void fbSolidFillmmx(int w, unsigned char *d) {
    while (w >= 64) {
        *(unsigned long long *) (d +  0) = 0;
        *(unsigned long long *) (d +  8) = 0;
        *(unsigned long long *) (d + 16) = 0;
        *(unsigned long long *) (d + 24) = 0;
        *(unsigned long long *) (d + 32) = 0;
        *(unsigned long long *) (d + 40) = 0;
        *(unsigned long long *) (d + 48) = 0;
        *(unsigned long long *) (d + 56) = 0;
        w -= 64;
        d += 64;
    }
}

into:

.LBB_fbSolidFillmmx_2:  ; no_exit
        li r2, 0
        stw r2, 0(r4)
        stw r2, 4(r4)
        stw r2, 8(r4)
        stw r2, 12(r4)
        stw r2, 16(r4)
        stw r2, 20(r4)
        stw r2, 24(r4)
        stw r2, 28(r4)
        stw r2, 32(r4)
        stw r2, 36(r4)
        stw r2, 40(r4)
        stw r2, 44(r4)
        stw r2, 48(r4)
        stw r2, 52(r4)
        stw r2, 56(r4)
        stw r2, 60(r4)
        addi r4, r4, 64
        addi r3, r3, -64
        cmpwi cr0, r3, 63
        bgt .LBB_fbSolidFillmmx_2       ; no_exit

instead of:

.LBB_fbSolidFillmmx_2:  ; no_exit
        li r11, 0
        stw r11, 0(r4)
        stw r11, 4(r4)
        stwx r11, r10, r4
        add r12, r10, r4
        stw r11, 4(r12)
        stwx r11, r9, r4
        add r12, r9, r4
        stw r11, 4(r12)
        stwx r11, r8, r4
        add r12, r8, r4
        stw r11, 4(r12)
        stwx r11, r7, r4
        add r12, r7, r4
        stw r11, 4(r12)
        stwx r11, r6, r4
        add r12, r6, r4
        stw r11, 4(r12)
        stwx r11, r5, r4
        add r12, r5, r4
        stw r11, 4(r12)
        stwx r11, r2, r4
        add r12, r2, r4
        stw r11, 4(r12)
        addi r4, r4, 64
        addi r3, r3, -64
        cmpwi cr0, r3, 63
        bgt .LBB_fbSolidFillmmx_2       ; no_exit

llvm-svn: 22737
2005-08-09 23:39:36 +00:00
Chris Lattner 02742710f3 SCEVAddExpr::get() of an empty list is invalid.
llvm-svn: 22724
2005-08-09 01:13:47 +00:00
Chris Lattner a091ff1764 Implement: LoopStrengthReduce/share_ivs.ll
Two changes:
  * Only insert one PHI node for each stride.  Other values are live in
    values.  This cannot introduce higher register pressure than the
    previous approach, and can take advantage of reg+reg addressing modes.
  * Factor common base values out of uses before moving values from the
    base to the immediate fields.  This improves codegen by starting the
    stride-specific PHI node out at a common place for each IV use.

As an example, we used to generate this for a loop in swim:

.LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2:        ; no_exit.7.i
        lfd f0, 0(r8)
        stfd f0, 0(r3)
        lfd f0, 0(r6)
        stfd f0, 0(r7)
        lfd f0, 0(r2)
        stfd f0, 0(r5)
        addi r9, r9, 1
        addi r2, r2, 8
        addi r5, r5, 8
        addi r6, r6, 8
        addi r7, r7, 8
        addi r8, r8, 8
        addi r3, r3, 8
        cmpw cr0, r9, r4
        bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1

now we emit:

.LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2:        ; no_exit.7.i
        lfdx f0, r8, r2
        stfdx f0, r9, r2
        lfdx f0, r5, r2
        stfdx f0, r7, r2
        lfdx f0, r3, r2
        stfdx f0, r6, r2
        addi r10, r10, 1
        addi r2, r2, 8
        cmpw cr0, r10, r4
        bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1

As another more dramatic example, we used to emit this:

.LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2:       ; no_exit.1.i19
        lfd f0, 8(r21)
        lfd f4, 8(r3)
        lfd f5, 8(r27)
        lfd f6, 8(r22)
        lfd f7, 8(r5)
        lfd f8, 8(r6)
        lfd f9, 8(r30)
        lfd f10, 8(r11)
        lfd f11, 8(r12)
        fsub f10, f10, f11
        fadd f5, f4, f5
        fmul f5, f5, f1
        fadd f6, f6, f7
        fadd f6, f6, f8
        fadd f6, f6, f9
        fmadd f0, f5, f6, f0
        fnmsub f0, f10, f2, f0
        stfd f0, 8(r4)
        lfd f0, 8(r25)
        lfd f5, 8(r26)
        lfd f6, 8(r23)
        lfd f9, 8(r28)
        lfd f10, 8(r10)
        lfd f12, 8(r9)
        lfd f13, 8(r29)
        fsub f11, f13, f11
        fadd f4, f4, f5
        fmul f4, f4, f1
        fadd f5, f6, f9
        fadd f5, f5, f10
        fadd f5, f5, f12
        fnmsub f0, f4, f5, f0
        fnmsub f0, f11, f3, f0
        stfd f0, 8(r24)
        lfd f0, 8(r8)
        fsub f4, f7, f8
        fsub f5, f12, f10
        fnmsub f0, f5, f2, f0
        fnmsub f0, f4, f3, f0
        stfd f0, 8(r2)
        addi r20, r20, 1
        addi r2, r2, 8
        addi r8, r8, 8
        addi r10, r10, 8
        addi r12, r12, 8
        addi r6, r6, 8
        addi r29, r29, 8
        addi r28, r28, 8
        addi r26, r26, 8
        addi r25, r25, 8
        addi r24, r24, 8
        addi r5, r5, 8
        addi r23, r23, 8
        addi r22, r22, 8
        addi r3, r3, 8
        addi r9, r9, 8
        addi r11, r11, 8
        addi r30, r30, 8
        addi r27, r27, 8
        addi r21, r21, 8
        addi r4, r4, 8
        cmpw cr0, r20, r7
        bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1

we now emit:

.LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2:       ; no_exit.1.i19
        lfdx f0, r21, r20
        lfdx f4, r3, r20
        lfdx f5, r27, r20
        lfdx f6, r22, r20
        lfdx f7, r5, r20
        lfdx f8, r6, r20
        lfdx f9, r30, r20
        lfdx f10, r11, r20
        lfdx f11, r12, r20
        fsub f10, f10, f11
        fadd f5, f4, f5
        fmul f5, f5, f1
        fadd f6, f6, f7
        fadd f6, f6, f8
        fadd f6, f6, f9
        fmadd f0, f5, f6, f0
        fnmsub f0, f10, f2, f0
        stfdx f0, r4, r20
        lfdx f0, r25, r20
        lfdx f5, r26, r20
        lfdx f6, r23, r20
        lfdx f9, r28, r20
        lfdx f10, r10, r20
        lfdx f12, r9, r20
        lfdx f13, r29, r20
        fsub f11, f13, f11
        fadd f4, f4, f5
        fmul f4, f4, f1
        fadd f5, f6, f9
        fadd f5, f5, f10
        fadd f5, f5, f12
        fnmsub f0, f4, f5, f0
        fnmsub f0, f11, f3, f0
        stfdx f0, r24, r20
        lfdx f0, r8, r20
        fsub f4, f7, f8
        fsub f5, f12, f10
        fnmsub f0, f5, f2, f0
        fnmsub f0, f4, f3, f0
        stfdx f0, r2, r20
        addi r19, r19, 1
        addi r20, r20, 8
        cmpw cr0, r19, r7
        bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1

llvm-svn: 22722
2005-08-09 00:18:09 +00:00
Chris Lattner 37c24cc98c Suck the base value out of the UsersToProcess vector into the BasedUser
class to simplify the code.  Fuse two loops.

llvm-svn: 22721
2005-08-08 22:56:21 +00:00
Chris Lattner 37ed895bf1 Split MoveLoopVariantsToImediateField out from MoveImmediateValues. The
first is a correctness thing, and the latter is an optimization thing.  This also
is needed to support a future change.

llvm-svn: 22720
2005-08-08 22:32:34 +00:00
Chris Lattner 9f269e40c9 Use the new 'moveBefore' method to simplify some code. Really, which is
easier to understand?  :)

llvm-svn: 22706
2005-08-08 19:11:57 +00:00
Chris Lattner 14203e85b2 Not all constants are legal immediates in load/store instructions.
llvm-svn: 22704
2005-08-08 06:25:50 +00:00
Chris Lattner c70bbc0c41 Implement LoopStrengthReduce/share_code_in_preheader.ll by having one
rewriter for all code inserted into the preheader, which is never flushed.

llvm-svn: 22702
2005-08-08 05:47:49 +00:00
Chris Lattner 9bfa6f8784 Implement a simple optimization for the termination condition of the loop.
The termination condition actually wants to use the post-incremented value
of the loop, not a new indvar with an unusual base.

On PPC, for example, this allows us to compile
LoopStrengthReduce/exit_compare_live_range.ll to:

_foo:
        li r2, 0
.LBB_foo_1:     ; no_exit
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        cmpw cr0, r2, r4
        bne .LBB_foo_1  ; no_exit
        blr

instead of:

_foo:
        li r2, 1                ;; IV starts at 1, not 0
.LBB_foo_1:     ; no_exit
        li r5, 0
        stw r5, 0(r3)
        addi r5, r2, 1
        cmpw cr0, r2, r4
        or r2, r5, r5           ;; Reg-reg copy, extra live range
        bne .LBB_foo_1  ; no_exit
        blr

This implements LoopStrengthReduce/exit_compare_live_range.ll

llvm-svn: 22699
2005-08-08 05:28:22 +00:00
Chris Lattner 579b20b747 All stats are "Number of ..."
llvm-svn: 22694
2005-08-07 20:02:04 +00:00
Chris Lattner 2c14cf7b74 Add some simple folds that occur in bitfield cases. Fix a minor bug in
isHighOnes, where it would consider 0 to have high ones.

llvm-svn: 22693
2005-08-07 07:03:10 +00:00
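Note on r22693: the isHighOnes bug is easy to reproduce with a typical bit trick. The code below is a guess at the shape of such a check on a 32-bit value; it is not the LLVM implementation, and both function names are hypothetical.

```c
#include <assert.h>  /* only needed for the checks in the usage note */
#include <stdint.h>

/* Tests whether v looks like 1...10...0 by checking that ~v looks like
 * 0...01...1. For v == 0, ~v is all ones and ~v + 1 wraps to 0, so the test
 * passes even though 0 has no high ones at all. */
int is_high_ones_buggy(uint32_t v) {
    uint32_t low = ~v;
    return (low & (low + 1u)) == 0u;
}

/* Same test with zero rejected explicitly. */
int is_high_ones(uint32_t v) {
    return v != 0u && is_high_ones_buggy(v);
}
```

Any caller that uses the result to build a shift or mask would then treat 0 as a valid high mask, which is presumably why the case had to be excluded.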
Chris Lattner 134ebd0801 Fix typo
llvm-svn: 22692
2005-08-07 07:00:52 +00:00
Chris Lattner f4dd8c445c * Use the new PHINode::hasConstantValue method to simplify some code
* Teach this code to move allocas out of the loop when tail call eliminating
  a call marked 'tail'.  This implements TailCallElim/move_alloca_for_tail_call.ll
* Do not perform this transformation if a call is marked 'tail' and if there
  are allocas that we cannot move out of the loop in #2.  Doing so would increase
  the stack usage of the function.  This fixes PR615 and implements
  TailCallElim/dont-tce-tail-marked-call.ll.

llvm-svn: 22690
2005-08-07 04:27:41 +00:00
Chris Lattner 11e7a5eda7 Make sure to clean CastedPointers after casts are potentially deleted.
This fixes LSR crashes on 301.apsi, 191.fma3d, and 189.lucas

llvm-svn: 22673
2005-08-05 01:30:11 +00:00
Chris Lattner 9f9c260b8c now that hasConstantValue defaults to only returning values that dominate
the PHI node, this ugly code can vanish.

llvm-svn: 22672
2005-08-05 01:04:30 +00:00
Chris Lattner 257efb2ad3 This code can handle non-dominating instructions
llvm-svn: 22667
2005-08-05 00:57:45 +00:00
Nate Begeman b392321cae Fix a fixme in CondPropagate.cpp by moving a PhiNode optimization into
BasicBlock's removePredecessor routine.  This requires shuffling around
the definition and implementation of hasConstantValue from Utils.h,cpp into
Instructions.h,cpp

llvm-svn: 22664
2005-08-04 23:24:19 +00:00
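As a rough illustration of what a hasConstantValue-style query answers, the sketch below checks whether every incoming value of a PHI is the same, ignoring references to the PHI itself. Values are modelled as plain integer ids, the function name is hypothetical, and the dominance check mentioned in later entries is left out. It needs C++17 for std::optional.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Returns the single value all incoming edges agree on, or nullopt when
// they disagree or when there is no non-self incoming value at all.
inline std::optional<int> commonIncomingValue(const std::vector<int> &incoming,
                                              int self) {
  std::optional<int> common;
  for (int v : incoming) {
    if (v == self) continue;  // a PHI that feeds itself adds no information
    if (common && *common != v) return std::nullopt;
    common = v;
  }
  return common;
}
```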
Chris Lattner 45f8b6e7aa Modify how immediates are removed from base expressions to deal with the fact
that the symbolic evaluator is not always able to use subtraction to remove
expressions.  This makes the code faster, and fixes the last crash on 178.galgel.
Finally, add a statistic to see how many phi nodes are inserted.

On 178.galgel, we get the following stats:

2562 loop-reduce  - Number of PHIs inserted
3927 loop-reduce  - Number of GEPs strength reduced

llvm-svn: 22662
2005-08-04 22:34:05 +00:00
Chris Lattner a6d7c355bc * Refactor some code into a new BasedUser::RewriteInstructionToUseNewBase
method.
* Fix a crash on 178.galgel, where we would insert expressions before PHI
  nodes instead of into the PHI node predecessor blocks.

llvm-svn: 22657
2005-08-04 20:03:32 +00:00
Chris Lattner 0f7c0fa2a7 Fix a case that caused this to crash on 178.galgel
llvm-svn: 22653
2005-08-04 19:26:19 +00:00
Chris Lattner acc42c4df1 Teach LSR about loop-variant expressions, such as loops like this:
for (i = 0; i < N; ++i)
    A[i][foo()] = 0;

Here we still want to strength reduce the A[i] part, even though foo() is
loop-variant.

This also simplifies some of the 'CanReduce' logic.

This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll

llvm-svn: 22652
2005-08-04 19:08:16 +00:00
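The effect described in the entry above can be pictured by doing the reduction by hand. In this sketch the row address A[i] becomes a pointer that is bumped once per iteration, while the foo() index stays a per-iteration computation. Cols, clearNaive and clearReduced are hypothetical names, and foo is passed in as a callable so the example is self-contained.

```cpp
#include <cassert>

constexpr int Cols = 8;  // hypothetical row length

// Source-level form: the address of A[i] is recomputed from i each time.
template <class F>
void clearNaive(int (*A)[Cols], int n, F foo) {
  for (int i = 0; i < n; ++i)
    A[i][foo()] = 0;
}

// Hand strength-reduced form: the A[i] part is a row pointer stepped by one
// row per iteration; only the loop-variant foo() index is applied afresh.
template <class F>
void clearReduced(int (*A)[Cols], int n, F foo) {
  int (*row)[Cols] = A;
  for (int i = 0; i < n; ++i, ++row)
    (*row)[foo()] = 0;
}
```

Given the same sequence of foo() results, both functions clear the same elements.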
Nate Begeman 456044b724 Remove some more dead code.
llvm-svn: 22650
2005-08-04 18:13:56 +00:00
Chris Lattner eaf24725b2 Refactor this code substantially with the following improvements:
1. We only analyze instructions once, guaranteed.
2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with
   something much simpler.

The next step is to handle expressions that are not all indvar+loop-invariant
values (e.g. handling indvar+loopvariant).

llvm-svn: 22649
2005-08-04 17:40:30 +00:00
Chris Lattner 6f286b760f refactor some code
llvm-svn: 22643
2005-08-04 01:19:13 +00:00
Chris Lattner 6510749050 invert two if's to make the logic simpler
llvm-svn: 22641
2005-08-04 00:40:47 +00:00
Chris Lattner a0102fbc4f When processing outer loops, if we find uses of an IV in inner loops, make
sure to handle the use; just don't recurse into it.

This permits us to generate this code for a simple nested loop case:

.LBB_foo_0:     ; entry
        stwu r1, -48(r1)
        stw r29, 44(r1)
        stw r30, 40(r1)
        mflr r11
        stw r11, 56(r1)
        lis r2, ha16(L_A$non_lazy_ptr)
        lwz r30, lo16(L_A$non_lazy_ptr)(r2)
        li r29, 1
.LBB_foo_1:     ; no_exit.0
        bl L_bar$stub
        li r2, 1
        or r3, r30, r30
.LBB_foo_2:     ; no_exit.1
        lfd f0, 8(r3)
        stfd f0, 0(r3)
        addi r4, r2, 1
        addi r3, r3, 8
        cmpwi cr0, r2, 100
        or r2, r4, r4
        bne .LBB_foo_2  ; no_exit.1
.LBB_foo_3:     ; loopexit.1
        addi r30, r30, 800
        addi r2, r29, 1
        cmpwi cr0, r29, 100
        or r29, r2, r2
        bne .LBB_foo_1  ; no_exit.0
.LBB_foo_4:     ; return
        lwz r11, 56(r1)
        mtlr r11
        lwz r30, 40(r1)
        lwz r29, 44(r1)
        lwz r1, 0(r1)
        blr

instead of this:

_foo:
.LBB_foo_0:     ; entry
        stwu r1, -48(r1)
        stw r28, 44(r1)                   ;; uses an extra register.
        stw r29, 40(r1)
        stw r30, 36(r1)
        mflr r11
        stw r11, 56(r1)
        li r30, 1
        li r29, 0
        or r28, r29, r29
.LBB_foo_1:     ; no_exit.0
        bl L_bar$stub
        mulli r2, r28, 800           ;; unstrength-reduced multiply
        lis r3, ha16(L_A$non_lazy_ptr)   ;; loop invariant address computation
        lwz r3, lo16(L_A$non_lazy_ptr)(r3)
        add r2, r2, r3
        mulli r4, r29, 800           ;; unstrength-reduced multiply
        addi r3, r3, 8
        add r3, r4, r3
        li r4, 1
.LBB_foo_2:     ; no_exit.1
        lfd f0, 0(r3)
        stfd f0, 0(r2)
        addi r5, r4, 1
        addi r2, r2, 8                 ;; multiple stride 8 IV's
        addi r3, r3, 8
        cmpwi cr0, r4, 100
        or r4, r5, r5
        bne .LBB_foo_2  ; no_exit.1
.LBB_foo_3:     ; loopexit.1
        addi r28, r28, 1               ;;; Many IV's with stride 1
        addi r29, r29, 1
        addi r2, r30, 1
        cmpwi cr0, r30, 100
        or r30, r2, r2
        bne .LBB_foo_1  ; no_exit.0
.LBB_foo_4:     ; return
        lwz r11, 56(r1)
        mtlr r11
        lwz r30, 36(r1)
        lwz r29, 40(r1)
        lwz r28, 44(r1)
        lwz r1, 0(r1)
        blr

llvm-svn: 22640
2005-08-04 00:14:11 +00:00
Chris Lattner fc62470466 Teach loop-reduce to see into nested loops, to pull out immediate values
pushed down by SCEV.

In a nested loop case, this allows us to emit this:

        lis r3, ha16(L_A$non_lazy_ptr)
        lwz r3, lo16(L_A$non_lazy_ptr)(r3)
        add r2, r2, r3
        li r3, 1
.LBB_foo_2:     ; no_exit.1
        lfd f0, 8(r2)        ;; Uses offset of 8 instead of 0
        stfd f0, 0(r2)
        addi r4, r3, 1
        addi r2, r2, 8
        cmpwi cr0, r3, 100
        or r3, r4, r4
        bne .LBB_foo_2  ; no_exit.1

instead of this:

        lis r3, ha16(L_A$non_lazy_ptr)
        lwz r3, lo16(L_A$non_lazy_ptr)(r3)
        add r2, r2, r3
        addi r3, r3, 8
        li r4, 1
.LBB_foo_2:     ; no_exit.1
        lfd f0, 0(r3)
        stfd f0, 0(r2)
        addi r5, r4, 1
        addi r2, r2, 8
        addi r3, r3, 8
        cmpwi cr0, r4, 100
        or r4, r5, r5
        bne .LBB_foo_2  ; no_exit.1

llvm-svn: 22639
2005-08-03 23:44:42 +00:00
Chris Lattner bb78c97e24 improve debug output
llvm-svn: 22638
2005-08-03 23:30:08 +00:00
Chris Lattner db23c74e5e Move from Stage 0 to Stage 1.
Only emit one PHI node for IV uses with identical bases and strides (after
moving foldable immediates to the load/store instruction).

This implements LoopStrengthReduce/dont_insert_redundant_ops.ll, allowing
us to generate this PPC code for test1:

        or r30, r3, r3
.LBB_test1_1:   ; Loop
        li r2, 0
        stw r2, 0(r30)
        stw r2, 4(r30)
        bl L_pred$stub
        addi r30, r30, 8
        cmplwi cr0, r3, 0
        bne .LBB_test1_1        ; Loop

instead of this code:

        or r30, r3, r3
        or r29, r3, r3
.LBB_test1_1:   ; Loop
        li r2, 0
        stw r2, 0(r29)
        stw r2, 4(r30)
        bl L_pred$stub
        addi r30, r30, 8        ;; Two iv's with step of 8
        addi r29, r29, 8
        cmplwi cr0, r3, 0
        bne .LBB_test1_1        ; Loop

llvm-svn: 22635
2005-08-03 22:51:21 +00:00
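One way to picture the grouping described in the entry above: once the foldable immediate is split off each use, uses that share a base and stride can share one induction variable (one PHI), with the immediates left on the individual loads and stores. The types and names below are hypothetical and use strings and integers in place of SCEV expressions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One use of an induction variable: address = base + stride * iter + imm.
struct IVUse {
  std::string base;
  long stride;
  long imm;  // offset that can be folded into the load/store itself
};

using IVKey = std::pair<std::string, long>;  // (base, stride)

// Groups uses by (base, stride). Each resulting group needs only one PHI;
// the vector holds the immediates that stay on the memory instructions.
inline std::map<IVKey, std::vector<long>>
groupByBaseAndStride(const std::vector<IVUse> &uses) {
  std::map<IVKey, std::vector<long>> groups;
  for (const IVUse &u : uses)
    groups[IVKey(u.base, u.stride)].push_back(u.imm);
  return groups;
}
```

For the test1 listing above, the stores at offsets 0 and 4 from the same pointer with step 8 fall into a single group, which matches the single addi r30, r30, 8 in the improved code.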
Chris Lattner 430d0022df Rename IVUse to IVUsersOfOneStride, use a struct instead of a pair to
unify some parallel vectors and get field names more descriptive than
"first" and "second".  This isn't Lisp after all :)

llvm-svn: 22633
2005-08-03 22:21:05 +00:00
Chris Lattner 84e9baa925 Fix a nasty dangling pointer issue. The ScalarEvolution pass would keep a
map from instruction* to SCEVHandles.  When we delete instructions, we have
to tell it about it.  We would run into nasty cases where new instructions
were reallocated at old instruction addresses and get the old map values.
Bad bad bad :(

llvm-svn: 22632
2005-08-03 21:36:09 +00:00
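The hazard in the entry above is general to any cache keyed by object address. The sketch below is a simplified model, not ScalarEvolution: Instr, ExprCache and forget are hypothetical names, and address reuse is simulated by changing the object in place.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct Instr {
  std::string text;
};

// Caches a derived result per instruction address.
class ExprCache {
  std::unordered_map<const Instr *, std::string> cache_;

public:
  // Returns the cached result, computing it on first request.
  std::string get(const Instr *I) {
    auto it = cache_.find(I);
    if (it == cache_.end())
      it = cache_.emplace(I, "expr(" + I->text + ")").first;
    return it->second;
  }

  // Must be called when an instruction is deleted; otherwise a new
  // instruction allocated at the same address inherits the old entry.
  void forget(const Instr *I) { cache_.erase(I); }
};
```

If forget is skipped, a lookup at a reused address silently returns the result computed for the old instruction.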
Chris Lattner 3de05cc930 The correct fix for PR612, which also fixes
Transforms/LowerInvoke/2005-08-03-InvokeWithPHIUse.ll

llvm-svn: 22628
2005-08-03 18:51:44 +00:00
Chris Lattner f8a81a9886 When inserting code, make sure not to insert it before PHI nodes. This
fixes PR612 and Transforms/LowerInvoke/2005-08-03-InvokeWithPHI.ll

llvm-svn: 22626
2005-08-03 18:34:29 +00:00
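The rule in the entry above (PHI nodes must stay grouped at the top of a block, so new code goes after them) can be sketched as a search for the first non-PHI position. The block is modelled as a vector of instruction kinds; Kind and firstNonPhiIndex are hypothetical names, not LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Kind { Phi, Other };

// Returns the index of the first non-PHI instruction, i.e. the earliest
// position where new code may be inserted. Equals block.size() when the
// block holds only PHIs (or is empty).
inline std::size_t firstNonPhiIndex(const std::vector<Kind> &block) {
  std::size_t i = 0;
  while (i < block.size() && block[i] == Kind::Phi)
    ++i;
  return i;
}
```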
Chris Lattner d683bdd0f8 Fix Transforms/SimplifyCFG/2005-08-03-PHIFactorCrash.ll, a problem that
occurred while bugpointing another testcase

llvm-svn: 22621
2005-08-03 17:59:45 +00:00
Chris Lattner 2dbf1960ff Finally, add the required constraint checks to fix Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll
the right way

llvm-svn: 22615
2005-08-03 00:59:12 +00:00
Chris Lattner 908036942c Simplify some code, add the correct pred checks
llvm-svn: 22613
2005-08-03 00:38:27 +00:00
Chris Lattner 982b75c061 Refactor code out of PropagatePredecessorsForPHIs, turning it into a pure function with no side-effects
llvm-svn: 22612
2005-08-03 00:29:26 +00:00
Chris Lattner 1f047fd513 use splice instead of remove/insert to avoid some symtab operations
llvm-svn: 22611
2005-08-03 00:23:42 +00:00
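The same idea is available on std::list, shown here as an analogy rather than as the LLVM instruction-list code: splice relinks the nodes from one list into another without destroying and recreating elements, so per-element bookkeeping (symbol table updates in the entry above) is avoided. The function name moveAll is hypothetical.

```cpp
#include <cassert>
#include <list>
#include <string>

// Moves every element of src to the end of dst by relinking list nodes.
// No element is copied or destroyed, and pointers to elements stay valid.
inline void moveAll(std::list<std::string> &dst, std::list<std::string> &src) {
  dst.splice(dst.end(), src);
}
```

A remove/insert loop would produce the same final order but would touch each element individually.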
Chris Lattner 76dc204488 move two functions up in the file, use SafeToMergeTerminators to eliminate
some duplicated code

llvm-svn: 22610
2005-08-03 00:19:45 +00:00
Chris Lattner 733d6704ce Rip some code out of the main SimplifyCFG function into a subfunction and
call it from the only place it is live.  No functionality changes.

llvm-svn: 22609
2005-08-03 00:11:16 +00:00
Chris Lattner ac594de8dc Disable this patch:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050801/027345.html

This breaks real programs and only fixes an obscure regression testcase.  A
real fix is in development.

llvm-svn: 22606
2005-08-02 23:31:38 +00:00