5144 Commits

Author SHA1 Message Date
Dale Johannesen 1f0e0e7c9c Fix the time regression I introduced in 464.h264ref with
my earlier patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Also, the mechanism for keeping SCEV's corresponding to GEP's
no longer works, as the GEP might change after its SCEV
is remembered, invalidating the SCEV, and we might get a bad
SCEV value when looking up the GEP again for a later loop.  
This also couldn't happen before, as we weren't recursing
into GEP's outside the loop.

Also, when we build an expression that involves a (possibly
non-affine) IV from a different loop as well as an IV from
the one we're interested in (containsAddRecFromDifferentLoop),
don't recurse into that.  We can't do much with it and will
get in trouble if we try to create new non-affine IVs or something.

More testcases are coming.

llvm-svn: 62212
2009-01-14 02:35:31 +00:00
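As a rough illustration of the addressing trade-off discussed in the message above (this is not the LSR code itself), the sketch below computes the addresses touched by Base[IV*2] in two ways: recomputing base + IV*scale on every iteration, and stepping one running address by a fixed stride. The function names, the element size, and the sample base address are all assumptions made up for the example.

```python
# Illustrative only: addresses are modelled as plain integers and
# every name here is hypothetical, not taken from LoopStrengthReduce.

def addresses_scaled(base, n, elem_size=4):
    """Recompute base + (iv * 2) * elem_size on every iteration."""
    return [base + (iv * 2) * elem_size for iv in range(n)]


def addresses_strength_reduced(base, n, elem_size=4):
    """Keep one running address and bump it by a constant stride."""
    stride = 2 * elem_size
    addr = base
    out = []
    for _ in range(n):
        out.append(addr)
        addr += stride
    return out
```

Both functions yield the same address sequence; which form is cheaper on a real target depends on the available addressing modes and on register pressure, which is the judgment the patch tries to improve.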
Chris Lattner 2538eb664c rewrite OptimizeAwayTrappingUsesOfLoads to 1) avoid a temporary
vector and extraneous loop over it, 2) not delete globals used by
phis/selects etc which could actually be useful.  This fixes PR3321.
Many thanks to Duncan for narrowing this down.

llvm-svn: 62201
2009-01-14 00:12:58 +00:00
Dale Johannesen 0aeabdff57 Fix testsuite regressions from recursive inlining.
llvm-svn: 62189
2009-01-13 22:43:37 +00:00
Dan Gohman 59af77376c Make instcombine ensure that all allocas are explicitly aligned to at
least their preferred alignment.

llvm-svn: 62176
2009-01-13 20:18:38 +00:00
Duncan Sands 944ccc5d6a Correct a comment.
llvm-svn: 62165
2009-01-13 13:48:44 +00:00
Dale Johannesen 433a9086c0 Enable recursive inlining. Reduce inlining threshold
back to 200; 400 seems to be too high, loses more than
it gains.

llvm-svn: 62107
2009-01-12 22:11:50 +00:00
Duncan Sands dc020f9c3c Rename getABITypeSize to getTypePaddedSize, as
suggested by Chris.

llvm-svn: 62099
2009-01-12 20:38:59 +00:00
Dale Johannesen f84685290a Increase default inlining aggressiveness in partial
compensation for turning off gcc's inliner.  This gets
us closer to the amount of inlining we were getting before.
It is not a win on everything, of course, but seems to
gain overall.

llvm-svn: 62058
2009-01-11 23:11:00 +00:00
Chris Lattner bd3c7c8b52 Duncan is nervous about undefinedness of % with negatives. I'm
not thrilled about 64-bit % in general, so rewrite to use * instead.

llvm-svn: 62047
2009-01-11 20:41:36 +00:00
Chris Lattner b19151686f do not generate GEPs into vectors where they don't already exist.
We should treat vectors as atomic types, not like arrays.

llvm-svn: 62046
2009-01-11 20:23:52 +00:00
Chris Lattner 171d2d474f Make a couple of cleanups to the instcombine bitcast/gep
canonicalization transform based on duncan's comments:

1) improve the comment about %.
2) within our index loop make sure the offset stays 
   within the *type size*, instead of within the *abi size*.
   This allows us to reason explicitly about landing in tail
   padding and means that issues like non-zero offsets into
   [0 x foo] types don't occur anymore.

llvm-svn: 62045
2009-01-11 20:15:20 +00:00
Chris Lattner 5f54d50917 fix typo Duncan noticed.
llvm-svn: 61997
2009-01-09 18:31:39 +00:00
Chris Lattner ae0e857b98 Fix PR3304
llvm-svn: 61995
2009-01-09 18:18:43 +00:00
Misha Brukman 5cbf223916 Removed trailing whitespace from Makefiles.
llvm-svn: 61991
2009-01-09 16:44:42 +00:00
Chris Lattner f50aa6ae5c Implement rdar://6480391, extending of equality icmp's to avoid a truncation.
I noticed this in the code compiled for a routine using std::map, which produced
this code:
	%25 = tail call i32 @memcmp(i8* %24, i8* %23, i32 6) nounwind readonly
	%.lobit.i = lshr i32 %25, 31		; <i32> [#uses=1]
	%tmp.i = trunc i32 %.lobit.i to i8		; <i8> [#uses=1]
	%toBool = icmp eq i8 %tmp.i, 0		; <i1> [#uses=1]
	br i1 %toBool, label %bb3, label %bb4
which compiled to:

	call	L_memcmp$stub
	shrl	$31, %eax
	testb	%al, %al
	jne	LBB1_11	## 

with this change, we compile it to:

	call	L_memcmp$stub
	testl	%eax, %eax
	js	LBB1_11

This triggers all the time in common code, with patterns like this:

	%169 = and i32 %ply, 1		; <i32> [#uses=1]
	%170 = trunc i32 %169 to i8		; <i8> [#uses=1]
	%toBool = icmp ne i8 %170, 0		; <i1> [#uses=1]

 	%7 = lshr i32 %6, 24		; <i32> [#uses=1]
	%9 = trunc i32 %7 to i8		; <i8> [#uses=1]
	%10 = icmp ne i8 %9, 0		; <i1> [#uses=1]

etc

llvm-svn: 61985
2009-01-09 07:47:06 +00:00
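The equivalences behind the three IR patterns quoted in the message above can be checked with ordinary 32-bit integer arithmetic. The following is a minimal sketch, assuming unsigned 32-bit values modelled as Python integers; the helper names (lshr32, trunc8 and the paired predicates) are hypothetical and are not part of instcombine.

```python
MASK32 = 0xFFFFFFFF


def lshr32(x, n):
    """Logical shift right on a 32-bit value."""
    return (x & MASK32) >> n


def trunc8(x):
    """Truncate to the low 8 bits, like 'trunc i32 ... to i8'."""
    return x & 0xFF


def sign_clear_via_trunc(x):
    # icmp eq i8 (trunc (lshr i32 x, 31)), 0
    return trunc8(lshr32(x, 31)) == 0


def sign_clear_direct(x):
    # Same question asked of the full i32: is the sign bit clear?
    return (x & 0x80000000) == 0


def low_bit_via_trunc(x):
    # icmp ne i8 (trunc (and i32 x, 1)), 0
    return trunc8(x & 1) != 0


def low_bit_direct(x):
    return (x & 1) != 0


def top_byte_via_trunc(x):
    # icmp ne i8 (trunc (lshr i32 x, 24)), 0
    return trunc8(lshr32(x, 24)) != 0


def top_byte_direct(x):
    return lshr32(x, 24) != 0


SAMPLES = [0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF,
           0x01000000, 0x00FFFFFF, 0x12345678]
```

In each pair the bits discarded by the truncation are already known to be zero, so comparing the wider value against zero gives the same answer.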
Chris Lattner 0f7cf1d7e1 Remove some old code that looks like a remnant from signed-types days.
llvm-svn: 61984
2009-01-09 07:10:58 +00:00
Chris Lattner 482eb70a10 Fix PR3298, a crash in Jump Threading. Apparently even
jump threading can have bugs, who knew? ;-)

llvm-svn: 61983
2009-01-09 06:08:12 +00:00
Chris Lattner fef138b140 Fix part 3/2 of PR3290, making instcombine zap (gep(bitcast)) when possible.
llvm-svn: 61980
2009-01-09 05:44:56 +00:00
Chris Lattner a784a2ce01 move some code, check to see if the input to the GEP is a bitcast
(which is constant time and cheap) before checking hasAllZeroIndices.

llvm-svn: 61976
2009-01-09 04:53:57 +00:00
Dale Johannesen 4755d9df78 Adjustments to last patch based on review.
llvm-svn: 61969
2009-01-09 01:30:11 +00:00
Dale Johannesen b48fc71fc6 Do not inline functions with (dynamic) alloca into
functions that don't already have a (dynamic) alloca.
Dynamic allocas cause inefficient codegen and we shouldn't
propagate this (behavior follows gcc).  Two existing tests
assumed such inlining would be done; they are hacked by
adding an alloca in the caller, preserving the point of
the tests.

llvm-svn: 61946
2009-01-08 21:45:23 +00:00
Chris Lattner c518dfd11b This implements the second half of the fix for PR3290, handling
loads from allocas that cover the entire aggregate.  This handles
some memcpy/byval cases that are produced by llvm-gcc.  This triggers
a few times in kc++ (with std::pair<std::_Rb_tree_const_iterator
<kc::impl_abstract_phylum*>,bool>) and once in 176.gcc (with %struct..0anon).

llvm-svn: 61915
2009-01-08 05:42:05 +00:00
Duncan Sands 0bcf085845 Whitespace - correct formatting.
llvm-svn: 61879
2009-01-07 20:01:06 +00:00
Duncan Sands 289f59f233 Remove alloca tracking from nocapture analysis. Not only
was it not very helpful, it was also wrong!  The problem
is shown in the testcase: the alloca might be passed to
a nocapture callee which dereferences it and returns the
original pointer.  But because it was a nocapture call we
think we don't need to track its uses, but we do.

llvm-svn: 61876
2009-01-07 19:39:06 +00:00
Duncan Sands 94bcbbab74 Reorder these.
llvm-svn: 61873
2009-01-07 19:17:02 +00:00
Duncan Sands 02599850b4 Use a switch rather than a sequence of "isa" tests.
llvm-svn: 61872
2009-01-07 19:10:21 +00:00
Duncan Sands 187c5716b6 The verifier checks that the aliasee is not null.
llvm-svn: 61870
2009-01-07 18:45:53 +00:00
Chris Lattner f2b8c82ad1 Implement the first half of PR3290: if there is a store of an
integer to a (transitive) bitcast of the alloca and if that integer
has the full size of the alloca, then it clobbers the whole thing.
Handle this by extracting pieces out of the stored integer and 
filing them away in the SROA'd elements.

This triggers fairly frequently because the CFE uses integers to
pass small structs by value and the inliner exposes these.  For 
example, in kimwitu++, I see a bunch of these with i64 stores to
"%struct.std::pair<std::_Rb_tree_const_iterator<kc::impl_abstract_phylum*>,bool>"

In 176.gcc I see a few i32 stores to "%struct..0anon".

In the testcase, this is a difference between compiling test1 to:

_test1:
	subl	$12, %esp
	movl	20(%esp), %eax
	movl	%eax, 4(%esp)
	movl	16(%esp), %eax
	movl	%eax, (%esp)
	movl	(%esp), %eax
	addl	4(%esp), %eax
	addl	$12, %esp
	ret

vs:

_test1:
	movl	8(%esp), %eax
	addl	4(%esp), %eax
	ret

The second half of this will be to handle loads of the same form.

llvm-svn: 61853
2009-01-07 08:11:13 +00:00
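To make "extracting pieces out of the stored integer" in the message above concrete, here is a small sketch. It assumes a little-endian target and an aggregate of two 32-bit fields covered by one 64-bit store; the function names are hypothetical and this is not the SROA implementation.

```python
import struct


def split_i64_le(value):
    """Split a 64-bit integer into the two 32-bit fields it would
    occupy in a little-endian {i32, i32}, using shifts and masks."""
    lo = value & 0xFFFFFFFF          # field 0: low half
    hi = (value >> 32) & 0xFFFFFFFF  # field 1: high half
    return lo, hi


def split_via_memory(value):
    """Reference model: store the i64 to memory, then load two i32s."""
    raw = struct.pack("<Q", value & 0xFFFFFFFFFFFFFFFF)
    return struct.unpack("<II", raw)
```

The shift-and-mask form needs no memory at all, which is why the rewritten test1 quoted above can drop its stack traffic.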
Chris Lattner 9a2de65fd6 Factor a bunch of code out into a helper method.
llvm-svn: 61852
2009-01-07 07:18:45 +00:00
Chris Lattner db561146aa use continue to simplify code and reduce nesting, no functionality
change.

llvm-svn: 61851
2009-01-07 06:39:58 +00:00
Chris Lattner 938b54f383 Get TargetData once up front and cache as an ivar instead of
requerying it all over the place.

llvm-svn: 61850
2009-01-07 06:34:28 +00:00
Chris Lattner a63dba9e6c Use the hasAllZeroIndices predicate to simplify some
code, no functionality change.

llvm-svn: 61849
2009-01-07 06:25:07 +00:00
Chris Lattner 2fdcc59bb6 Change m_ConstantInt and m_SelectCst to take their constant integers
as template arguments instead of as instance variables, exposing more
optimization opportunities to the compiler earlier.

llvm-svn: 61776
2009-01-05 23:53:12 +00:00
Duncan Sands 582c53d147 Teach the internalize pass to also internalize
global aliases.

llvm-svn: 61754
2009-01-05 21:24:45 +00:00
Evan Cheng 8804293fe9 Find loop back edges only after empty blocks are eliminated.
llvm-svn: 61752
2009-01-05 21:17:27 +00:00
Duncan Sands 52e5deece5 Not having an aliasee is a theoretical possibility.
llvm-svn: 61745
2009-01-05 20:47:56 +00:00
Duncan Sands 821d13cf78 Format more neatly.
llvm-svn: 61744
2009-01-05 20:39:50 +00:00
Duncan Sands d24b93f339 Remove trailing spaces.
llvm-svn: 61743
2009-01-05 20:38:27 +00:00
Duncan Sands f5dbbae4f4 Delete unused global aliases with internal linkage.
In fact this also deletes those with linkonce linkage,
however this is currently dead because for the moment
aliases aren't allowed to have this linkage type.

llvm-svn: 61742
2009-01-05 20:37:33 +00:00
Dan Gohman 906152a20f Tidy up #includes, deleting a bunch of unnecessary #includes.
llvm-svn: 61715
2009-01-05 17:59:02 +00:00
Nick Lewycky e4e5532e05 Move the libcall annotating part from doFinalization to doInitialization.
Finalization occurs after all the FunctionPasses in the group have run, which
is clearly not what we want.

This also means that we have to make sure that we apply the right param 
attributes when creating a new function.

Also, add a missed optimization: strdup and strndup. NoCapture and 
NoAlias return!

llvm-svn: 61658
2009-01-05 00:07:50 +00:00
Nick Lewycky 959af7ba30 Run a post-pass that marks known function declarations by name.
llvm-svn: 61632
2009-01-04 20:27:34 +00:00
Bill Wendling 0c04f9fdc3 Revert this transform. It was causing some dramatic slowdowns in a few tests. See PR3266.
llvm-svn: 61623
2009-01-04 06:19:11 +00:00
Nick Lewycky 1d805c62c4 Any void readonly functions are provably dead, don't waste time adding
nocapture attributes to them.

llvm-svn: 61610
2009-01-03 17:05:32 +00:00
Duncan Sands c7affb0a8f Load tracking means that the value analyzed may
not have pointer type.  In particular, it may
be the condition argument for a select or a GEP
index.  While I was unable to construct a testcase
for which some bits of the original pointer are
captured due to one of these, it's very very close
to being possible - so play safe and exclude these
possibilities.

llvm-svn: 61580
2009-01-02 15:16:38 +00:00
Duncan Sands b193a37cd3 When calculating 'nocapture' argument attributes, allow
the argument to be stored to an alloca by tracking uses
of the alloca.  This occurs 4 times (out of 7121, 0.05%)
in MultiSource/Applications, so may not be worth it.  On
the other hand, it is easy to do and fairly cheap.  The
functions it helps are: W_addcom and W_addlit in spiff;
process_args (argv) in d (make_dparser); ercPixConcealIMB
in JM/ldecod.

llvm-svn: 61570
2009-01-02 11:54:37 +00:00
Duncan Sands cefc8604aa Improve comments and reorganize a bit - no functionality
change.

llvm-svn: 61569
2009-01-02 11:46:24 +00:00
Nick Lewycky 7e82055e88 Make adding nocapture a bit stronger. FreeInst is nocapture. Also,
functions that don't write can't leak a pointer except through 
the return value, so a void readonly function is implicitly nocapture.

Test these, and add a test that verifies that f1 calling f2 with an 
otherwise dead pointer gets both of them marked nocapture.

llvm-svn: 61552
2009-01-02 03:46:56 +00:00
Duncan Sands 1f11d2bbc1 Mention that this pass does escape analysis in the
leading comments.

llvm-svn: 61548
2009-01-01 20:45:19 +00:00
Bill Wendling 0fcff2c203 Fix comment.
llvm-svn: 61538
2009-01-01 01:19:59 +00:00
Bill Wendling aedb54a947 Add transformation:
xor (or (icmp, icmp), true) -> and(icmp, icmp)

This is possible because of De Morgan's law.

llvm-svn: 61537
2009-01-01 01:18:23 +00:00
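The De Morgan identity used by the transformation above can be verified exhaustively over booleans. This is only a truth-table check with hypothetical function names; in the real transform the two operands of the resulting 'and' are the inverted comparisons.

```python
from itertools import product


def xor_or_true(a, b):
    """xor (or a, b), true  ==  not (a or b)."""
    return (a or b) ^ True


def and_of_inverted(a, b):
    """and (not a), (not b): each icmp replaced by its inverse."""
    return (not a) and (not b)


# Every combination of the two comparison results.
ALL_INPUTS = list(product([False, True], repeat=2))
```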
Duncan Sands 163848021b Look through phi nodes and select instructions when
calculating nocapture attributes.

llvm-svn: 61535
2008-12-31 20:21:34 +00:00
Duncan Sands df128eb477 Don't analyze arguments already marked 'nocapture'.
llvm-svn: 61532
2008-12-31 18:08:59 +00:00
Duncan Sands 44c8cd97a5 Rename AddReadAttrs to FunctionAttrs, and teach it how
to work out (in a very simplistic way) which function
arguments (pointer arguments only) are only dereferenced
and so do not escape.  Mark such arguments 'nocapture'.

llvm-svn: 61525
2008-12-31 16:14:43 +00:00
Duncan Sands f6069577fa Experiments show that looking through phi nodes
and select instructions doesn't buy anything here
except extra complexity: the only difference in
the entire testsuite was that a readonly function
became readnone in MiBench/consumer-typeset.  Add
a comment about this.

llvm-svn: 61478
2008-12-29 20:51:17 +00:00
Duncan Sands c125d6a3d3 Allow readnone functions to read (and write!) global
constants, since doing so is irrelevant for aliasing
purposes.  While this doesn't increase the total number
of functions marked readonly or readnone in MultiSource/
Applications (3089), it does result in 12 functions being
marked readnone rather than readonly.
Before:
  readnone: 820
  readonly: 2269
After:
  readnone: 832
  readonly: 2257

llvm-svn: 61469
2008-12-29 11:34:09 +00:00
Dale Johannesen 656237beca Revert 61362 and 61402 until SPEC breakage is fixed.
llvm-svn: 61403
2008-12-23 23:21:35 +00:00
Dale Johannesen f8b161bcd1 This fixes the bug in 175.vpr. It doesn't fix the
other SPEC breakage.  I'll be reverting all recent
changes shortly, this checkin is mostly so this
change doesn't get lost.

llvm-svn: 61402
2008-12-23 23:05:26 +00:00
Dale Johannesen 93b9aa8799 Fix the time regression I introduced in 464.h264ref with
my last patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Also, the mechanism for keeping SCEV's corresponding to GEP's
no longer works, as the GEP might change after its SCEV
is remembered, invalidating the SCEV, and we might get a bad
SCEV value when looking up the GEP again for a later loop.  
This also couldn't happen before, as we weren't recursing
into GEP's outside the loop.

I owe some testcases for this, want to get it in for nightly runs.

llvm-svn: 61362
2008-12-23 02:12:52 +00:00
Owen Anderson 164274eeb1 Don't forget to remove phi nodes from the value numbering table after we collapse them.
llvm-svn: 61358
2008-12-23 00:49:51 +00:00
Bill Wendling 456e885382 Comment clean-ups. No functionality change.
llvm-svn: 61354
2008-12-22 22:32:22 +00:00
Bill Wendling e7f08e7250 Check that the instruction isn't in the value numbering scope.
llvm-svn: 61353
2008-12-22 22:28:56 +00:00
Bill Wendling 86f01cb9f6 Simplification: Negate the operator== method instead of implementing a full operator!= method.
llvm-svn: 61352
2008-12-22 22:16:31 +00:00
Bill Wendling 3c793441cb Add verification that deleted instruction isn't hiding in the PHI map.
llvm-svn: 61350
2008-12-22 22:14:07 +00:00
Bill Wendling ebb6a543fa Verify removed in a few more places.
llvm-svn: 61349
2008-12-22 21:57:30 +00:00
Bill Wendling 6b18a3994b Add verification functions to GVN which check to see that an instruction was
truly deleted. These will be expanded with further checks of all of the data
structures.

llvm-svn: 61347
2008-12-22 21:36:08 +00:00
Nick Lewycky 10eb8e533f Turn strcmp into memcmp, such as strcmp(P, "x") --> memcmp(P, "x", 2).
llvm-svn: 61297
2008-12-21 00:19:21 +00:00
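A sketch of why the rewrite above is safe: once the second argument is a known constant, comparing its length plus one bytes (including the terminator) gives the same result as strcmp. The helpers below model C strings as Python bytes with an explicit NUL and return only the sign of the comparison; their names are assumptions for illustration.

```python
def c_strcmp(p, q):
    """Sign of strcmp on two NUL-terminated byte strings."""
    i = 0
    while True:
        a, b = p[i], q[i]
        if a != b:
            return -1 if a < b else 1
        if a == 0:
            return 0
        i += 1


def c_memcmp(p, q, n):
    """Sign of memcmp over the first n bytes, stopping at the first
    difference."""
    for i in range(n):
        if p[i] != q[i]:
            return -1 if p[i] < q[i] else 1
    return 0


LITERAL = b"x\0"  # the constant "x" plus its terminator: 2 bytes
CASES = [b"\0", b"x\0", b"xy\0", b"a\0", b"z\0", b"xx\0"]
```

Because the terminator of the constant is part of the compared range, a longer string such as "xy" still compares as greater rather than equal.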
Nick Lewycky 4bc10c9e77 Remove redundant test for vector-nature. Scan the vector first to see whether
our optz'n will apply to it, then build the replacement vector only if needed.

llvm-svn: 61279
2008-12-20 16:48:00 +00:00
Evan Cheng 3b3de7c228 - CodeGenPrepare does not split loop back edges but it only knows about back edges of single block loops. It now does a DFS walk to find loop back edges.
- Use SplitBlockPredecessors to factor out common predecessors of the critical edge destination. This is disabled for now due to some regressions.

llvm-svn: 61248
2008-12-19 18:03:11 +00:00
Bill Wendling 070de29fcf Didn't mean to commit this.
llvm-svn: 61222
2008-12-18 22:19:50 +00:00
Bill Wendling 4c13e77d49 Re-XFAIL this test until debug stuff settles down.
llvm-svn: 61219
2008-12-18 22:13:31 +00:00
Nick Lewycky c3a70ade66 Oops! Left out a line.
Simplifying the sdiv might allow further simplifications for our users.

llvm-svn: 61196
2008-12-18 06:42:28 +00:00
Nick Lewycky 0f0e63fe73 Make all the vector elements positive in an srem of constant vector.
llvm-svn: 61195
2008-12-18 06:31:11 +00:00
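The change above relies on the fact that a signed remainder takes its sign from the dividend, so negating the divisor does not change the result. Below is a minimal check, assuming truncating division as in C; it uses unbounded Python integers and therefore ignores the fixed-width corner case of the most negative value. The names srem and srem_vector are hypothetical.

```python
def srem(a, b):
    """Signed remainder with the quotient truncated toward zero."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return a - q * b


def srem_vector(xs, divisors):
    """Element-wise srem, standing in for an srem by a constant vector."""
    return [srem(x, d) for x, d in zip(xs, divisors)]
```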
Chris Lattner 4caf5eb70c Fix PR2929 by making bugpoint/code extract propagate the nothrow
bit from the original function to the cloned one.

llvm-svn: 61194
2008-12-18 05:52:56 +00:00
Dale Johannesen 3e5843b992 Revert previous patch, appears to break bootstrap.
llvm-svn: 61181
2008-12-18 01:23:41 +00:00
Dale Johannesen 12d031b716 Fix the time regression I introduced in 464.h264ref with
my last patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  (This patch does not handle 
all the cases where this can happen.)  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Everything above is exercised in
CodeGen/X86/lsr-negative-stride.ll (and ifcvt4 in ARM which is
the same IR).

llvm-svn: 61178
2008-12-18 00:57:22 +00:00
Chris Lattner b6372933b5 reapply this hunk from Bill's reversion in r61169, it is conservative
and safe and orthogonal from turning off load pre.

llvm-svn: 61177
2008-12-18 00:51:32 +00:00
Chris Lattner c1c6404bba make instnamer name unnamed blocks as well as instructions and args.
llvm-svn: 61175
2008-12-18 00:33:11 +00:00
Bill Wendling be4fb8a25f Temporarily revert r61027. It was causing a bootstrap failure in "release" mode
with everyone's favorite error messages:

Comparing stages 2 and 3
warning: ./cc1-checksum.o differs
warning: ./cc1plus-checksum.o differs
Bootstrap comparison failure!
./c-decl.o differs
./cp/decl.o differs
./df-core.o differs
./gcc.o differs
./i386.o differs
./stor-layout.o differs
./tree-pretty-print.o differs
./tree.o differs
make[2]: *** [compare] Error 1
make[1]: *** [stage3-bubble] Error 2

See PR3227.

llvm-svn: 61169
2008-12-17 23:31:20 +00:00
Chris Lattner 0cdf52310a insert some sequence points and preincrement an iterator to avoid
iterator invalidation problems.

llvm-svn: 61124
2008-12-17 05:42:08 +00:00
Chris Lattner 222ef4c489 Enhance heap sra to be substantially more aggressive w.r.t PHI
nodes.  This allows it to do fairly general phi insertion if a 
load from a pointer global wants to be SRAd but the load is used
by (recursive) phi nodes.  This fixes a pessimization on ppc
introduced by Load PRE.

llvm-svn: 61123
2008-12-17 05:28:49 +00:00
Dale Johannesen 904ce8120d Clarify that the scale factor from CheckForIVReuse
can be negative.  Keep track of whether all uses of
an IV are outside the loop.  Some cosmetics; no
functional change.

llvm-svn: 61109
2008-12-16 22:16:28 +00:00
Chris Lattner 56b55387fc Fix another crash found by inspection. If we have a PHI node merging
the load multiple times, make sure to check the uses of the PHI to 
ensure they are transformable.

llvm-svn: 61102
2008-12-16 21:24:51 +00:00
Chris Lattner 06a456b3f4 fix a crash found by inspection.
llvm-svn: 61101
2008-12-16 21:04:51 +00:00
Eli Friedman cb61afb546 Add a helper to remove a branch and DCE the condition, and use it
consistently for deleting branches.  In addition to being slightly 
more readable, this makes SimplifyCFG a bit better 
about cleaning up after itself when it makes conditions unused.

llvm-svn: 61100
2008-12-16 20:54:32 +00:00
Chris Lattner 6ddde53783 switch some std::set/std::map to SmallPtrSet/DenseMap.
llvm-svn: 61081
2008-12-16 07:34:30 +00:00
Chris Lattner 49e3bdc165 enhance heap-sra to apply to fixed sized array allocations, not just
variable sized array allocations.

llvm-svn: 61051
2008-12-15 21:44:34 +00:00
Chris Lattner 1c731fa86f Use stripPointerCasts.
llvm-svn: 61047
2008-12-15 21:20:32 +00:00
Chris Lattner f0eb568021 minor tweaks for formatting, allow bitcast in ValueIsOnlyUsedLocallyOrStoredToOneGlobal.
llvm-svn: 61046
2008-12-15 21:08:54 +00:00
Chris Lattner c4274a71d5 refactor some code into a new TryToOptimizeStoreOfMallocToGlobal function.
Use GetElementPtrInst::hasAllZeroIndices where possible.

llvm-svn: 61045
2008-12-15 21:02:25 +00:00
Chris Lattner 0c68ae0603 Enable Load PRE. This teaches GVN to push partially redundant loads up the
CFG when there is exactly one predecessor where the load is not available.
This is designed to not increase code size but still eliminate partially
redundant loads.  This fires 1765 times on 403.gcc even though it doesn't
do critical edge splitting yet (the most common reason for it to fail).

llvm-svn: 61027
2008-12-15 05:28:29 +00:00
Owen Anderson 03aacbae90 Ifdef out some code that I didn't mean to enable by default yet.
llvm-svn: 61024
2008-12-15 03:52:17 +00:00
Chris Lattner 69131fd872 make GVN try to rename inputs to the resultant replaced values, which
cleans up the generated code a bit.  This should have the added benefit of
not randomly renaming functions/globals like my previous patch did. :)

llvm-svn: 61023
2008-12-15 03:46:38 +00:00
Owen Anderson bfe133e4ac Add support for slow-path GVN with full phi construction for scalars. This is disabled for now, as it actually pessimizes code in the absence
of phi translation for load elimination.  This slows down GVN a bit, by about 2% on 403.gcc.

llvm-svn: 61021
2008-12-15 02:03:00 +00:00
Chris Lattner f5eef9f6db eliminate warning when asserts disabled.
llvm-svn: 61012
2008-12-14 21:36:23 +00:00
Owen Anderson e34c2399de Generalize GVN's phi construction routine to work for things other than loads.
llvm-svn: 61009
2008-12-14 19:10:35 +00:00
Bill Wendling 293b9181e5 Temporarily revert r60973. It's inexplicably causing a failure when self-hosting LLVM:
llvm[2]: Linking Release executable opt (without symbols)
...
Undefined symbols:
  "llvm::APFloat::IEEEsingle", referenced from:
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
  "llvm::APFloat::IEEEdouble", referenced from:
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
ld: symbol(s) not found

This is in release mode. To replicate, compile llvm and llvm-gcc in optimized
mode. Then build llvm, in optimized mode, with the newly created compiler.

llvm-svn: 60977
2008-12-13 09:28:44 +00:00
Chris Lattner 1e29f7c97d make RLE preserve the name of the load that it replaces. This is just
a prettification of the IR.

llvm-svn: 60973
2008-12-13 07:22:47 +00:00
Misha Brukman 234b44add2 Fix spelling.
llvm-svn: 60971
2008-12-13 05:21:37 +00:00
Chris Lattner fa9f99aa12 Teach GVN to invalidate some memdep information when it does an RAUW
of a pointer.  This allows it to catch more equivalencies.  For example,
the type_lists_compatible_p function used to require two iterations of
the gvn pass (!) to delete its 18 redundant loads because the first pass
would CSE all the addressing computation cruft, which would unblock the
second memdep/gvn passes from recognizing them.  This change allows
memdep/gvn to catch all 18 when run just once on the function (as is 
typical :) instead of just 3.

On all of 403.gcc, this bumps up the # redundancies found from:

     63 gvn    - Number of instructions PRE'd
 153991 gvn    - Number of instructions deleted
  50069 gvn    - Number of loads deleted
to:
     63 gvn    - Number of instructions PRE'd
 154137 gvn    - Number of instructions deleted
  50185 gvn    - Number of loads deleted

+116 loads deleted isn't bad.

llvm-svn: 60799
2008-12-09 22:06:23 +00:00
Chris Lattner 254314e6bc rename getNonLocalDependency -> getNonLocalCallDependency, and remove
pointer stuff from it, simplifying the code a bit.

llvm-svn: 60783
2008-12-09 19:38:05 +00:00
Chris Lattner b6fc4b8d92 Switch GVN::processNonLocalLoad to using the new
MemDep::getNonLocalPointerDependency method.  There are
some open issues with this (missed optimizations) and
plenty of future work, but this does allow GVN to eliminate
*slightly* more loads (49246 vs 49033).

Switching over now allows simplification of the other code
path in memdep.

llvm-svn: 60780
2008-12-09 19:25:07 +00:00
Chris Lattner 0a5a8d54a9 random cleanups, no functionality change.
llvm-svn: 60779
2008-12-09 19:21:47 +00:00
Chris Lattner 56b20ffc5f Fix a really subtle off-by-one bug that Duncan noticed with valgrind
on test/CodeGen/Generic/2007-06-06-CriticalEdgeLandingPad.

llvm-svn: 60739
2008-12-09 04:47:21 +00:00
Chris Lattner e598370ae9 remove DebugIterations option. Despite the accusations,
jump threading has been shown to only expose problems not
have bugs itself.  I'm sure it's completely bug free! ;-)

llvm-svn: 60725
2008-12-08 22:44:07 +00:00
Devang Patel 2bb8a2f80f Fix spelling.
Thanks Duncan!

llvm-svn: 60702
2008-12-08 17:07:24 +00:00
Devang Patel 1c469d36b0 Undo previous patch.
llvm-svn: 60701
2008-12-08 17:02:37 +00:00
Chris Lattner f50d7f76c6 fix a bug I introduced in simplifycfg handling single entry phi
nodes. FoldSingleEntryPHINodes deletes the PHI, so there is no
need to delete it afterward.

llvm-svn: 60653
2008-12-07 07:22:45 +00:00
Chris Lattner 5df5b4cc2e don't bother touching volatile stores, they will just return clobber on
everything interesting anyway.

llvm-svn: 60640
2008-12-07 00:25:15 +00:00
Chris Lattner 57e91eaf61 Reimplement the inner loop of DSE. It now uniformly uses getDependence(),
doesn't do its own local caching, and is slightly more aggressive about
free/store dse (see testcase).  This eliminates the last external client 
of MemDep::getDependenceFrom().

llvm-svn: 60619
2008-12-06 00:53:22 +00:00
Dale Johannesen 9efd2ce55b Make LoopStrengthReduce smarter about hoisting things out of
loops when they can be subsumed into addressing modes.

Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)

llvm-svn: 60608
2008-12-05 21:47:27 +00:00
Chris Lattner 0e3d6337c6 Make a few major changes to memdep and its clients:
1. Merge the 'None' result into 'Normal', making loads
   and stores return their dependencies on allocations as Normal.
2. Split the 'Normal' result into 'Clobber' and 'Def' to
   distinguish between the cases when memdep knows the value is
   produced from when we just know it may be changed.
3. Move some of the logic for determining whether readonly calls
   are CSEs into memdep instead of it being in GVN.  This still
   leaves verification that the arguments are the same to GVN to
   let it know about value equivalences in different contexts.
4. Change memdep's call/call dependency analysis to use 
   getModRefInfo(CallSite,CallSite) instead of doing something 
   very weak.  This only really matters for things like DSA, but
   someday maybe we'll have some other decent context sensitive
   analyses :)
5. This reimplements the guts of memdep to handle the new results.
6. This simplifies GVN significantly:
   a) readonly call CSE is slightly simpler
   b) I eliminated the "getDependencyFrom" chaining for load 
      elimination and load CSE doesn't have to worry about 
      volatile (they are always clobbers) anymore.
   c) GVN no longer does any 'lastLoad' caching, leaving it to 
      memdep.
7. The logic in DSE is simplified a bit and sped up.  A potentially
   unsafe case was eliminated.

llvm-svn: 60607
2008-12-05 21:04:20 +00:00
Anton Korobeynikov 24600bf05a Revert invalid r60393. It causes llvm-gcc bootstrap failures in release builds.
See PR3160 for details

llvm-svn: 60604
2008-12-05 19:38:49 +00:00
Chris Lattner c100828026 Fix test/Transforms/GVN/pre-load.ll
llvm-svn: 60594
2008-12-05 17:04:12 +00:00
Chris Lattner d2a653af0c Make IsValueFullyAvailableInBlock safe.
llvm-svn: 60588
2008-12-05 07:49:08 +00:00
Devang Patel c56423b500 Rewrite code that 1) filters loops and 2) calculates new loop bounds.
This fixes many bugs. I will add more test cases in a separate check-in.

Some day, the code that manipulates CFG and updates dom. info could use refactoring help.

llvm-svn: 60554
2008-12-04 21:38:42 +00:00
Chris Lattner 8f723670ce Start simplifying a switch that has a successor that is a switch.
llvm-svn: 60534
2008-12-04 06:31:07 +00:00
Chris Lattner 75c2661d24 add a debugging option to help track down j-t problems.
llvm-svn: 60514
2008-12-04 00:07:59 +00:00
Dale Johannesen 4e9e6ea604 Remove an unused field.
llvm-svn: 60508
2008-12-03 22:43:56 +00:00
Dale Johannesen f7a588b909 Fix a misspelled function name.
llvm-svn: 60506
2008-12-03 20:56:12 +00:00
Chris Lattner dc3f6f2c12 Factor some code into a new FoldSingleEntryPHINodes method.
llvm-svn: 60501
2008-12-03 19:44:02 +00:00
Dale Johannesen d49ceff6ba Fix a really wrong comment.
llvm-svn: 60494
2008-12-03 19:25:46 +00:00
Chris Lattner 595c7279bd Teach jump threading some more simple tricks:
1) have it fold "br undef", which does occur with
   surprising frequency as jump threading iterates.
2) teach j-t to delete dead blocks.  This removes the successor
   edges, reducing the in-edges of other blocks, allowing 
   recursive simplification.
3) Fold things like:
     br COND, BBX, BBY
  BBX:
     br COND, BBZ, BBW

   which also happens because jump threading iterates.

llvm-svn: 60470
2008-12-03 07:48:08 +00:00
Chris Lattner 37e0136fef third time is the charm.
llvm-svn: 60469
2008-12-03 07:45:15 +00:00
Chris Lattner c04a1ffa9a fix assertion.
llvm-svn: 60468
2008-12-03 07:43:05 +00:00
Chris Lattner 7eb270ed03 Rename DeleteBlockIfDead to DeleteDeadBlock and make it
unconditionally delete the block.  All likely clients will
do the checking anyway.

llvm-svn: 60464
2008-12-03 06:40:52 +00:00
Chris Lattner bcc904a67c Factor some code out of SimplifyCFG, forming a new
DeleteBlockIfDead method.

llvm-svn: 60463
2008-12-03 06:37:44 +00:00
Dale Johannesen 4d2ecb8f68 Minor rewrite per review feedback.
llvm-svn: 60442
2008-12-02 21:17:11 +00:00
Dale Johannesen 70060013d2 Make the code do what the comment says it does.
llvm-svn: 60431
2008-12-02 18:40:09 +00:00
Chris Lattner 1db9bbe802 Implement PRE of loads in the GVN pass with a pretty cheap and
straight-forward implementation.  This does not require any extra
alias analysis queries beyond what we already do for non-local loads.

Some programs really really like load PRE.  For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.

The biggest limitation to the implementation is that it does not split
critical edges.  This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.

The implementation of this should incidentally speed up rejection of 
non-local loads because it avoids creating the repl densemap in cases 
when it won't be used for fully redundant loads.

This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact.  This is pretty close to ready though.

llvm-svn: 60408
2008-12-02 08:16:11 +00:00
Bill Wendling 87beb9b909 Remove some errors that crept in. No functionality change.
llvm-svn: 60403
2008-12-02 06:24:20 +00:00
Bill Wendling 790b4bf9a9 Merge two if-statements into one.
llvm-svn: 60402
2008-12-02 06:22:04 +00:00
Bill Wendling 5635295266 More stylistic changes. No functionality change.
llvm-svn: 60401
2008-12-02 06:18:11 +00:00
Bill Wendling 85de4b35ca - Remove the buggy -X/C -> X/-C transform. This isn't valid when X isn't a
constant. If X is a constant, then this is folded elsewhere.

- Added a note to Target/README.txt to indicate that we'd like to implement
  this when we're able.

llvm-svn: 60399
2008-12-02 05:12:47 +00:00
Bill Wendling 5369db5917 Improve comment.
llvm-svn: 60398
2008-12-02 05:09:00 +00:00
Bill Wendling 21716dff5e - Reduce nesting.
- No need to do a swap on a canonicalized pattern.

No functionality change.

llvm-svn: 60397
2008-12-02 05:06:43 +00:00
Chris Lattner ead1a61b47 some random comment improvements.
llvm-svn: 60395
2008-12-02 04:52:26 +00:00
Owen Anderson d930420ccf Fix an issue that Chris noticed, where local PRE was not properly instantiating
a new value numbering set after splitting a critical edge.  This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.

llvm-svn: 60393
2008-12-02 04:09:22 +00:00
Dale Johannesen 069a4eee55 Consider only references to an IV within the loop when
figuring out the base of the IV.  This produces better
code in the example.  (Addresses use (IV) instead of 
(BASE,IV) - a significant improvement on low-register
machines like x86).

llvm-svn: 60374
2008-12-01 22:00:01 +00:00
Bill Wendling 6f71bce4cf Don't rebuild RHSNeg. Just use the one that's already there.
llvm-svn: 60370
2008-12-01 21:06:30 +00:00
Bill Wendling 84f6f2539f Document what this check is doing. Also, no need to cast to ConstantInt.
llvm-svn: 60369
2008-12-01 21:03:43 +00:00
Bill Wendling e6c87a4952 Use a simple comparison. Overflow on integer negation can only occur when the
integer is "minint".

llvm-svn: 60366
2008-12-01 19:46:27 +00:00
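The claim in r60366 above (negation overflows only for "minint") can be checked exhaustively at a small bit width. This is a minimal sketch, not LLVM code: the helper names and the 8-bit width are assumptions chosen for illustration, and wraparound is emulated with Python integers.

```python
def neg_wrapped(x: int, bits: int) -> int:
    """Negate x as a `bits`-wide two's-complement integer (wraps on overflow)."""
    mask = (1 << bits) - 1
    r = (-x) & mask
    # Reinterpret the unsigned bit pattern as a signed value.
    return r - (1 << bits) if r >> (bits - 1) else r


def negation_overflows(x: int, bits: int) -> bool:
    """True when the wrapped negation differs from the mathematical one."""
    return neg_wrapped(x, bits) != -x


BITS = 8  # hypothetical width, small enough to test every value
MININT = -(1 << (BITS - 1))
overflowing = [
    x for x in range(MININT, 1 << (BITS - 1)) if negation_overflows(x, BITS)
]
```

With these definitions, `overflowing` contains only `MININT`, which is why a single comparison against that value is enough.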
Bill Wendling 47f733e4ea Generalize the FoldOrWithConstant method to fold for any two constants which
don't have overlapping bits.

llvm-svn: 60344
2008-12-01 08:32:40 +00:00
Bill Wendling 22e761b302 Reduce copy-and-paste code by splitting out the code into its own function.
llvm-svn: 60343
2008-12-01 08:23:25 +00:00
Bill Wendling 582fe6b0ca Use m_Specific() instead of double matching.
llvm-svn: 60341
2008-12-01 08:09:47 +00:00
Bill Wendling 4eecfb655b Move pattern check outside of the if-then statement. This prevents us from fiddling with constants unless we have to.
llvm-svn: 60340
2008-12-01 07:47:02 +00:00
Chris Lattner 6f5bf6a718 Rename some variables, only increment BI once at the start of the loop instead of throughout it.
llvm-svn: 60339
2008-12-01 07:35:54 +00:00
Chris Lattner f00aae4968 pull the predMap densemap out of the inner loop of performPRE, so
that it isn't reallocated all the time.  This is a tiny speedup for
GVN: 3.90->3.88s

llvm-svn: 60338
2008-12-01 07:29:03 +00:00
Chris Lattner 2b07d3ccde switch a couple more calls to use array_pod_sort.
llvm-svn: 60337
2008-12-01 06:52:57 +00:00
Chris Lattner 2c2dd15a85 Introduce a new array_pod_sort function and switch LSR to use it
instead of std::sort.  This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.

We should start using array_pod_sort where possible.

llvm-svn: 60335
2008-12-01 06:49:59 +00:00
Chris Lattner 2aebea5735 Eliminate use of setvector for the DeadInsts set, just use a smallvector.
This is a lot cheaper and conceptually simpler.

llvm-svn: 60332
2008-12-01 06:27:41 +00:00
Chris Lattner 4da78e3774 DeleteTriviallyDeadInstructions is always passed the
DeadInsts ivar, just use it directly.

llvm-svn: 60330
2008-12-01 06:14:28 +00:00
Chris Lattner a68a5a4784 simplify DeleteTriviallyDeadInstructions again, unlike my previous
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then 
notifying.

llvm-svn: 60329
2008-12-01 06:11:32 +00:00
Chris Lattner 9e6b243428 simplify these patterns using m_Specific. No need to grep for
xor in testcase (or is a substring).

llvm-svn: 60328
2008-12-01 05:16:26 +00:00
Chris Lattner 88a1f0213d Teach jump threading to clean up after itself, DCE and constfolding the
new instructions it simplifies.  Because we're threading jumps on edges
with constants coming in from PHI's, we inherently are exposing a lot more
constants to the new block.  Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.

llvm-svn: 60327
2008-12-01 04:48:07 +00:00
Chris Lattner 084b3a47d3 Change instcombine to use FoldPHIArgGEPIntoPHI to fold two operand PHIs
instead of using FoldPHIArgBinOpIntoPHI.  In addition to being more
obvious, this also fixes a problem where instcombine wouldn't merge two
phis that had different variable indices.  This prevented instcombine
from factoring big chunks of code in 403.gcc.  For example:

 insn_cuid.exit:                
-       %tmp336 = load i32** @uid_cuid, align 4      
-       %tmp337 = getelementptr %struct.rtx_def* %insn_addr.0.ph.i, i32 0, i32 3    
-       %tmp338 = bitcast [1 x %struct.rtunion]* %tmp337 to i32*               
-       %tmp339 = load i32* %tmp338, align 4           
-       %tmp340 = getelementptr i32* %tmp336, i32 %tmp339     
        br label %bb62
 
 bb61:       
-       %tmp341 = load i32** @uid_cuid, align 4     
-       %tmp342 = getelementptr %struct.rtx_def* %insn, i32 0, i32 3        
-       %tmp343 = bitcast [1 x %struct.rtunion]* %tmp342 to i32*           
-       %tmp344 = load i32* %tmp343, align 4        
-       %tmp345 = getelementptr i32* %tmp341, i32 %tmp344          
        br label %bb62
 
 bb62:      
-       %iftmp.62.0.in = phi i32* [ %tmp345, %bb61 ], [ %tmp340, %insn_cuid.exit ]         
+       %insn.pn2 = phi %struct.rtx_def* [ %insn, %bb61 ], [ %insn_addr.0.ph.i, %insn_cuid.exit ]         
+       %tmp344.pn.in.in = getelementptr %struct.rtx_def* %insn.pn2, i32 0, i32 3     
+       %tmp344.pn.in = bitcast [1 x %struct.rtunion]* %tmp344.pn.in.in to i32*  
+       %tmp341.pn = load i32** @uid_cuid     
+       %tmp344.pn = load i32* %tmp344.pn.in 
+       %iftmp.62.0.in = getelementptr i32* %tmp341.pn, i32 %tmp344.pn   
        %iftmp.62.0 = load i32* %iftmp.62.0.in     

llvm-svn: 60325
2008-12-01 03:42:51 +00:00
Chris Lattner 9d02a70a7d Teach inst combine to merge GEPs through PHIs. This is really
important because it is sinking the loads using the GEPs, but
not the GEPs themselves.  This triggers 647 times on 403.gcc
and makes the .s file much much nicer.  For example before:

        je      LBB1_87 ## bb78
LBB1_62:        ## bb77
        leal    84(%esi), %eax
LBB1_63:        ## bb79
        movl    (%eax), %eax
...
LBB1_87:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
        jmp     LBB1_62 ## bb77


after:

        jne     LBB1_63 ## bb79
LBB1_62:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
LBB1_63:        ## bb79
        movl    84(%esi), %eax

The input code was (and the GEPs are merged and
the PHI is now eliminated by instcombine):

        br i1 %tmp233, label %bb78, label %bb77
bb77:           
        %tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb78:           
        call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind
        %tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb79:           
        %iftmp.12.0.in = phi %struct.rtx_def** [ %tmp235, %bb78 ], [ %tmp234, %bb77 ]           
        %iftmp.12.0 = load %struct.rtx_def** %iftmp.12.0.in             

llvm-svn: 60322
2008-12-01 02:34:36 +00:00
Chris Lattner 9ce8995d24 Make GVN be more intelligent about redundant load
elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal.  This makes load elimination
more aggressive.  For example, on 403.gcc, we had:

<     68 gvn    - Number of instructions PRE'd
< 152718 gvn    - Number of instructions deleted
<  49699 gvn    - Number of loads deleted
<   6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses

now we have:

>     64 gvn    - Number of instructions PRE'd
> 153623 gvn    - Number of instructions deleted
>  49856 gvn    - Number of loads deleted
>   5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses

That's an extra 157 loads deleted and extra 905 other instructions nuked.

This slows down GVN very slightly, from 3.91 to 3.96s.

llvm-svn: 60314
2008-12-01 01:31:36 +00:00
Chris Lattner 7e61dafc95 Reimplement the non-local dependency data structure in terms of a sorted
vector instead of a densemap.  This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster.  This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc

This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case.  Here's the stats
for 403.gcc:

  6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses

yay for caching :)

llvm-svn: 60313
2008-12-01 01:15:42 +00:00
Bill Wendling 5b902c5b1e Implement ((A|B)&1)|(B&-2) -> (A&1) | B transformation. This also takes care of
permutations of this pattern.

llvm-svn: 60312
2008-12-01 01:07:11 +00:00
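The identity behind r60312 above can be sanity-checked by brute force. The sketch below is only an illustration in Python (the function names and the tested value range are assumptions); it relies on Python's bitwise operators treating negative integers as two's complement.

```python
from itertools import product


def before(a: int, b: int) -> int:
    # ((A|B)&1) | (B&-2): bit 0 from A|B, all remaining bits from B.
    return ((a | b) & 1) | (b & -2)


def after(a: int, b: int) -> int:
    # (A&1) | B: bit 0 of A merged into B.
    return (a & 1) | b


VALUES = range(-128, 128)  # arbitrary sample range for the check
mismatches = [
    (a, b) for a, b in product(VALUES, repeat=2) if before(a, b) != after(a, b)
]
```

The two forms agree because `(B&1) | (B&-2)` is just `B`, so only the `A&1` term is left over.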
Chris Lattner 8541edec44 Cache analyses in ivars and add some useful DEBUG output.
This speeds up GVN from 4.0386s to 3.9376s.

llvm-svn: 60310
2008-12-01 00:40:32 +00:00
Chris Lattner 80c7d81e81 improve indentation, do cheap checks before expensive ones,
remove some fixme's.  This speeds up GVN very slightly on 403.gcc 
(4.06->4.03s)

llvm-svn: 60309
2008-11-30 23:39:23 +00:00
Eli Friedman 11c15a5de7 Minor cleanup: use getTrue and getFalse where appropriate. No
functional change.

llvm-svn: 60307
2008-11-30 22:48:49 +00:00
Eli Friedman 55e4becba9 Some minor cleanups to instcombine; no functionality change.
Note that the FoldOpIntoPhi call is dead because it's impossible for the 
first operand of a subtraction to be both a ConstantInt and a PHINode.

llvm-svn: 60306
2008-11-30 21:09:11 +00:00
Bill Wendling de89bc275c Add instruction combining for ((A&~B)|(~A&B)) -> A^B and all permutations.
llvm-svn: 60291
2008-11-30 13:52:49 +00:00
Bill Wendling 9eef421e12 Implement (A&((~A)|B)) -> A&B transformation in the instruction combiner. This
takes care of all permutations of this pattern.

llvm-svn: 60290
2008-11-30 13:08:13 +00:00
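The two bitwise rewrites in r60291 and r60290 above are easy to confirm by enumeration. This is a small Python sketch with hypothetical helper names, not the instcombine implementation; it only checks that the source and target expressions compute the same value.

```python
from itertools import product


def xor_before(a: int, b: int) -> int:
    # (A & ~B) | (~A & B): bits set in exactly one of A and B.
    return (a & ~b) | (~a & b)


def and_before(a: int, b: int) -> int:
    # A & (~A | B): the ~A term can never contribute once ANDed with A.
    return a & (~a | b)


VALUES = range(-64, 64)  # arbitrary sample range for the check
xor_ok = all(xor_before(a, b) == a ^ b for a, b in product(VALUES, repeat=2))
and_ok = all(and_before(a, b) == a & b for a, b in product(VALUES, repeat=2))
```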
Bill Wendling 2fe3229824 Forgot one remaining call to getSExtValue().
llvm-svn: 60289
2008-11-30 12:41:09 +00:00
Bill Wendling 2d2e7861b5 getSExtValue() doesn't work for ConstantInts with bitwidth > 64 bits. Use all
APInt calls instead.

This fixes PR3144.

llvm-svn: 60288
2008-11-30 12:38:24 +00:00
Eli Friedman 09bc610945 Optimize memmove and memset into the LLVM builtins. Note that these
only show up in code from front-ends besides llvm-gcc, like clang.

llvm-svn: 60287
2008-11-30 08:32:11 +00:00
Bill Wendling 7abf352f44 Don't make TwoToExp signed by default.
llvm-svn: 60279
2008-11-30 05:29:33 +00:00
Bill Wendling af200e9237 From Hacker's Delight:
"For signed integers, the determination of overflow of x*y is not so simple. If
x and y have the same sign, then overflow occurs iff xy > 2**31 - 1. If they
have opposite signs, then overflow occurs iff xy < -2**31."

In this case, x == -1.

llvm-svn: 60278
2008-11-30 05:01:05 +00:00
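The Hacker's Delight rule quoted in r60278 above can be written out directly. The following is a hedged Python sketch (the function name is an assumption); it computes the exact product with unbounded integers and compares it against the 32-bit limits, then specializes to x == -1.

```python
INT_MIN = -(2**31)
INT_MAX = 2**31 - 1


def smul_overflows(x: int, y: int) -> bool:
    """Would the 32-bit signed product x*y overflow? (exact product is used)"""
    exact = x * y
    same_sign = (x < 0) == (y < 0)
    # Same sign: only the upper bound can be exceeded.
    # Opposite signs: only the lower bound can be exceeded.
    return exact > INT_MAX if same_sign else exact < INT_MIN


samples = [INT_MIN, INT_MIN + 1, -1, 0, 1, INT_MAX]
overflow_with_minus_one = [y for y in samples if smul_overflows(-1, y)]
```

For x == -1 the only sample that overflows is `INT_MIN`, since `-1 * INT_MIN` is `2**31`, one past `INT_MAX`.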
Bill Wendling 70635adea3 Instcombine was illegally transforming -X/C into X/-C when either X or C
overflowed on negation. This commit checks to make sure that neither C nor X
overflows. This requires that the RHS of X (a subtract instruction) be a
constant integer.

llvm-svn: 60275
2008-11-30 03:42:12 +00:00
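A concrete counterexample shows why the -X/C to X/-C rewrite in r60275 above needs the overflow check. This is a sketch under stated assumptions: 32-bit wraparound and round-toward-zero division are emulated in Python, and the helper names are hypothetical rather than taken from instcombine.

```python
BITS = 32
INT_MIN = -(1 << (BITS - 1))


def wrap(v: int) -> int:
    """Reduce v to a signed 32-bit two's-complement value."""
    v &= (1 << BITS) - 1
    return v - (1 << BITS) if v >> (BITS - 1) else v


def sdiv(a: int, b: int) -> int:
    """Signed division rounding toward zero; b must be nonzero.

    Wrapping the INT_MIN / -1 case is an assumption of this sketch only.
    """
    q = abs(a) // abs(b)
    return wrap(-q if (a < 0) != (b < 0) else q)


def neg_x_div_c(x: int, c: int) -> int:
    return sdiv(wrap(-x), c)  # -X / C


def x_div_neg_c(x: int, c: int) -> int:
    return sdiv(x, wrap(-c))  # X / -C


ordinary = (neg_x_div_c(10, 3), x_div_neg_c(10, 3))
overflowing = (neg_x_div_c(INT_MIN, 2), x_div_neg_c(INT_MIN, 2))
```

For ordinary inputs the two forms match, but with X equal to `INT_MIN` the negation wraps back to `INT_MIN` and the two results come out with opposite signs.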
Chris Lattner 3ff6d01586 Fix a fixme by making memdep's handling of allocations more logical.
If we see that a load depends on the allocation of its memory with no
intervening stores, we now return a 'None' dependency instead of "Normal".
This tweaks GVN to do its optimization with the new result.

llvm-svn: 60267
2008-11-30 01:39:32 +00:00
Chris Lattner 63bd586d35 Eliminate the dropInstruction method, which is not needed any more.
Fix a subtle iterator invalidation bug I introduced in the last commit.

llvm-svn: 60258
2008-11-29 23:30:39 +00:00
Chris Lattner 1c6b62eb4d Change MemDep::getNonLocalDependency to return its results as
a smallvector instead of a DenseMap.  This speeds up GVN by 5%
on 403.gcc.

llvm-svn: 60255
2008-11-29 21:33:22 +00:00
Chris Lattner f280b0c729 reimplement getNonLocalDependency with a simpler worklist
formulation that is faster and doesn't require nonLazyHelper.
Much less code.

llvm-svn: 60253
2008-11-29 21:22:42 +00:00
Chris Lattner 8c5ff516c6 Fix a thinko that manifested as a crash on clamav last night.
llvm-svn: 60251
2008-11-29 20:29:04 +00:00
Chris Lattner 51ba8d0630 Split getDependency into getDependency and getDependencyFrom, the
former does caching, the latter doesn't.  This dramatically simplifies
the logic in getDependency and getDependencyFrom.

llvm-svn: 60234
2008-11-29 03:47:00 +00:00
Bill Wendling 469e3aa696 Temporarily revert r60195. It's causing an optimized bootstrap of llvm-gcc to fail.
llvm-svn: 60233
2008-11-29 03:43:04 +00:00
Chris Lattner 7f9c8a0f05 Introduce and use a new MemDepResult class to hold the results of a memdep
query.  This makes it crystal clear what cases can escape from MemDep that
the clients have to handle.  This also gives the clients a nice simplified
interface to it that is easy to poke at.

This patch also makes DepResultTy and MemoryDependenceAnalysis::DepType
private, yay.

llvm-svn: 60231
2008-11-29 02:29:27 +00:00
Chris Lattner de04e1173a Reimplement the internal abstraction used by MemDep in terms
of a pointer/int pair instead of a manually bitmangled pointer.
This forces clients to think a little more about checking the 
appropriate pieces and will be useful for internal 
implementation improvements later.

I'm not particularly happy with this.  After going through this
I don't think that the clients of memdep should be exposed to
the internal type at all.  I'll fix this in a subsequent commit.

This has no functionality change.

llvm-svn: 60230
2008-11-29 01:43:36 +00:00
Chris Lattner f3f6a801cc don't revisit instructions off the beginning of the block.
llvm-svn: 60221
2008-11-28 22:50:08 +00:00
Chris Lattner f2a8ba4cf0 simplify some code, remove escaped newline.
llvm-svn: 60213
2008-11-28 21:29:52 +00:00
Chris Lattner 8a172daa55 don't call MergeBasicBlockIntoOnlyPred on a block whose only
predecessor is itself.  This doesn't make sense, and this is
a dead infinite loop anyway.

llvm-svn: 60210
2008-11-28 19:54:49 +00:00
Chris Lattner e9f6c355bf rewrite RecursivelyDeleteTriviallyDeadInstructions to use a more efficient
formulation that doesn't require set lookups or scanning a set.

llvm-svn: 60203
2008-11-28 01:20:46 +00:00
Chris Lattner d4b5ba615e remove some weirdness that came from the LSR code that has
nothing to do with dead instruction elimination.  No tests in
dejagnu depend on this, so I don't know what it was needed for.

llvm-svn: 60202
2008-11-28 00:58:15 +00:00
Chris Lattner 1adb6759ef rewrite a big chunk of how DSE does recursive dead operand
elimination to use more modern infrastructure.  Also do a bunch
of small cleanups.

llvm-svn: 60201
2008-11-28 00:27:14 +00:00
Chris Lattner 8e84c129ce delete ErasePossiblyDeadInstructionTree, replacing uses of it with
RecursivelyDeleteTriviallyDeadInstructions.

llvm-svn: 60196
2008-11-27 23:25:44 +00:00
Chris Lattner c077a2a535 Simplify LoopStrengthReduce::DeleteTriviallyDeadInstructions by
making it use RecursivelyDeleteTriviallyDeadInstructions to do
the heavy lifting.

llvm-svn: 60195
2008-11-27 23:23:35 +00:00
Chris Lattner a1bbdff933 enhance RecursivelyDeleteTriviallyDeadInstructions to make
PHIs dead if they are single-value.

llvm-svn: 60194
2008-11-27 23:18:11 +00:00
Chris Lattner 1cb4f72706 Enhance RecursivelyDeleteTriviallyDeadInstructions to optionally
return a list of deleted instructions.

llvm-svn: 60193
2008-11-27 23:14:34 +00:00
Chris Lattner 96e2dbe008 use continue to reduce indentation
llvm-svn: 60192
2008-11-27 23:00:20 +00:00
Chris Lattner c6c481cdfc remove doConstantPropagation and dceInstruction, they are just
wrappers around the interesting code and use an obscure iterator
abstraction that dates back many many years.

Move EraseDeadInstructions to Transforms/Utils and name it
RecursivelyDeleteTriviallyDeadInstructions.

llvm-svn: 60191
2008-11-27 22:57:53 +00:00
Chris Lattner 5ef9ebf787 simplify code.
llvm-svn: 60190
2008-11-27 22:56:14 +00:00
Chris Lattner c92fa42ddd simplify this logic.
llvm-svn: 60189
2008-11-27 22:46:09 +00:00
Nick Lewycky 4ab50b93c8 Chris prefers icmp/select over udiv!
llvm-svn: 60187
2008-11-27 22:41:10 +00:00
Nick Lewycky 69941fd0a0 Add a couple of missed optimizations on integer vectors. Multiply and divide
by 1, as well as multiply by -1.

llvm-svn: 60182
2008-11-27 20:21:08 +00:00
Chris Lattner 4059f43b74 defensive patch: if CGP is merging a block with the entry block, make sure
it ends up being the entry block.

llvm-svn: 60180
2008-11-27 19:29:14 +00:00
Chris Lattner 5dfbfcd80d Fix PR3138: if we merge the entry block into another block, make sure to
move the other block back up into the entry position!

llvm-svn: 60179
2008-11-27 19:25:19 +00:00
Chris Lattner e0d019def6 switch InstCombine::visitLoadInst to use
FindAvailableLoadedValue

llvm-svn: 60169
2008-11-27 08:56:30 +00:00
Chris Lattner c6ae56d23f enhance FindAvailableLoadedValue to make use of AliasAnalysis
if it has it.

llvm-svn: 60167
2008-11-27 08:18:12 +00:00
Chris Lattner 72f16e70f0 move FindAvailableLoadedValue from JumpThreading to Transforms/Utils.
llvm-svn: 60166
2008-11-27 08:10:05 +00:00
Chris Lattner d6204bed3d simplify this code a bit.
llvm-svn: 60164
2008-11-27 07:54:38 +00:00
Chris Lattner 206250284d Use the new MergeBasicBlockIntoOnlyPred function.
llvm-svn: 60163
2008-11-27 07:54:12 +00:00
Chris Lattner 99d6809ac1 move MergeBasicBlockIntoOnlyPred to Transforms/Utils.
llvm-svn: 60162
2008-11-27 07:43:12 +00:00
Chris Lattner 240051aace rename ThreadBlock to ProcessBlock, since it does other things than
just simple threading.

llvm-svn: 60157
2008-11-27 07:20:04 +00:00
Chris Lattner 98d89d1b1b Make jump threading substantially more powerful, in the following ways:
1. Make it fold blocks separated by an unconditional branch.  This enables
   jump threading to see a broader scope.
2. Make jump threading able to eliminate locally redundant loads when they
   feed the branch condition of a block.  This frequently occurs due to
   reg2mem running.
3. Make jump threading able to eliminate *partially redundant* loads when
   they feed the branch condition of a block.  This is common in code with
   lots of loads and stores like C++ code and 255.vortex.

This implements thread-loads.ll and rdar://6402033.

Per the fixme's, several pieces of this should be moved into Transforms/Utils.

llvm-svn: 60148
2008-11-27 05:07:53 +00:00
Chris Lattner 397a11ccd8 Turn on my codegen prepare heuristic by default. It doesn't affect
performance in most cases on the Grawp tester, but does speed some 
things up (like shootout/hash by 15%).  This also doesn't impact 
compile time in a noticeable way on the Grawp tester.

It also, of course, gets the testcase it was designed for right :)

llvm-svn: 60120
2008-11-26 22:16:44 +00:00
Chris Lattner fef04acc50 teach the new heuristic how to handle inline asm.
llvm-svn: 60088
2008-11-26 04:59:11 +00:00
Chris Lattner 6d71b7fb95 Improve ValueAlreadyLiveAtInst with a cheap and dirty, but effective
heuristic: the value is already live at the new memory operation if
it is used by some other instruction in the memop's block.  This is
cheap and simple to compute (moreso than full liveness).

This improves the new heuristic even more.  For example, it cuts two
out of three new instructions out of 255.vortex:DbmFileInGrpHdr, 
which is one of the functions that the heuristic regressed.  This
overall eliminates another 40 instructions from 403.gcc and visibly
reduces register pressure in 255.vortex (though this only actually
ends up saving the 2 instructions from the whole program).

llvm-svn: 60084
2008-11-26 03:20:37 +00:00
Chris Lattner e34fe2c52d Start reworking a subpiece of the profitability heuristic to be
phrased in terms of liveness instead of as a horrible hack.  :)

In practice, this doesn't change the generated code for either 
255.vortex or 403.gcc, but it could cause minor code changes in 
theory.  This is framework for coming changes.

llvm-svn: 60082
2008-11-26 03:02:41 +00:00
Chris Lattner 383a797f42 add a comment, make save/restore logic more obvious.
llvm-svn: 60076
2008-11-26 02:11:11 +00:00
Chris Lattner eb3e4fb6fb This adds in some code (currently disabled unless you pass
-enable-smarter-addr-folding to llc) that gives CGP a better
cost model for when to sink computations into addressing modes.
The basic observation is that sinking increases register 
pressure when part of the addr computation has to be available
for other reasons, such as having a use that is a non-memory
operation.  In cases where it works, it can substantially reduce
register pressure.

This code is currently an overall win on 403.gcc and 255.vortex
(the two things I've been looking at), but there are several 
things I want to do before enabling it by default:

1. This isn't doing any caching of results, so it is much slower 
   than it could be.  It currently slows down release-asserts llc 
   by 1.7% on 176.gcc: 27.12s -> 27.60s.
2. This doesn't think about inline asm memory operands yet.
3. The cost model botches the case when the needed value is live
   across the computation for other reasons.

I'll continue poking at this, and eventually turn it on as llcbeta.

llvm-svn: 60074
2008-11-26 02:00:14 +00:00
Evan Cheng 496b042e20 Revert r60042. IndVarSimplify should check if APFloat is PPCDoubleDouble first before trying to convert it to an integer.
llvm-svn: 60072
2008-11-26 01:11:57 +00:00
Chris Lattner a9ab165b08 Teach CodeGenPrepare to look through Bitcast instructions when attempting to
optimize addressing modes.  This allows us to optimize things like isel-sink2.ll
into:

	movl	4(%esp), %eax
	cmpb	$0, 4(%eax)
	jne	LBB1_2	## F
LBB1_1:	## TB
	movl	$4, %eax
	ret
LBB1_2:	## F
	movzbl	7(%eax), %eax
	ret

instead of:

_test:
	movl	4(%esp), %eax
	cmpb	$0, 4(%eax)
	leal	4(%eax), %eax
	jne	LBB1_2	## F
LBB1_1:	## TB
	movl	$4, %eax
	ret
LBB1_2:	## F
	movzbl	3(%eax), %eax
	ret

This shrinks (e.g.) 403.gcc from 1133510 to 1128345 lines of .s.

Note that the 2008-10-16-SpillerBug.ll testcase is dubious at best, I doubt
it is really testing what it thinks it is.

llvm-svn: 60068
2008-11-26 00:26:16 +00:00
Chris Lattner f3e95505c5 Teach MatchScaledValue to handle Scales by 1 with MatchAddr (which
can recursively match things) and scales by 0 by ignoring them.
This triggers once in 403.gcc, saving 1 (!!!!) instruction in the 
whole huge app.

llvm-svn: 60013
2008-11-25 07:25:26 +00:00
Chris Lattner 728f90220a significantly refactor all the addressing mode matching logic
into a new AddressingModeMatcher class.  This makes it easier
to reason about and reduces passing around of stuff, but has
no functionality change.

llvm-svn: 60012
2008-11-25 07:09:13 +00:00
Chris Lattner 58f49d2916 refactor all the constantexpr/instruction handling code out into a
new FindMaximalLegalAddressingModeForOperation helper method.

llvm-svn: 60011
2008-11-25 05:15:49 +00:00
Chris Lattner a3fbff15b9 another minor tweak
llvm-svn: 60010
2008-11-25 04:47:41 +00:00
Chris Lattner d616ef5683 minor cleanups no functionality change.
llvm-svn: 60009
2008-11-25 04:42:10 +00:00
Chris Lattner 6416a6b7a0 rearrange and tidy some code, no functionality change.
llvm-svn: 59990
2008-11-24 22:44:16 +00:00
Chris Lattner d917c8c8fe minor cleanups to debug code, no functionality change.
llvm-svn: 59989
2008-11-24 22:40:05 +00:00
Chris Lattner d78894197a reenable the right part of the code.
llvm-svn: 59985
2008-11-24 21:26:21 +00:00
Chris Lattner 992a541002 revert an accidental commit, this fixes the regression on test/CodeGen/X86/isel-sink.ll
llvm-svn: 59976
2008-11-24 19:40:34 +00:00
Chris Lattner 53d6a07869 Fix 3113: If we have a dead cyclic PHI, replace the whole thing
with an undef.

llvm-svn: 59972
2008-11-24 19:25:36 +00:00
Devang Patel 702f45df58 Fix build failure.
llvm-svn: 59844
2008-11-21 21:00:20 +00:00
Devang Patel cb181bb203 Silence unused variable warnings.
llvm-svn: 59841
2008-11-21 20:00:59 +00:00
Chris Lattner dd7083452f reapply Sanjiv's patch to genericize memcpy/memset/memmove to take an
arbitrary integer width for the count.

llvm-svn: 59823
2008-11-21 16:42:48 +00:00
Bill Wendling 4bce2bff88 Revert r59802. It was breaking the build of llvm-gcc:
g++ -m32 -c -g -DIN_GCC -W -Wall -Wwrite-strings -Wmissing-format-attribute -fno-common -mdynamic-no-pic -DHAVE_CONFIG_H -Wno-unused -DTARGET_NAME=\"i386-apple-darwin9.5.0\" -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include  -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include  -D_DEBUG  -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS   -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include  -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include ../../llvm-gcc.src/gcc/llvm-types.cpp -o llvm-types.o
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemCpy(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemMove(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemSet(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i64' is not a member of 'llvm::Intrinsic'
make[3]: *** [llvm-convert.o] Error 1
make[3]: *** Waiting for unfinished jobs....
rm fsf-funding.pod gcov.pod gfdl.pod cpp.pod gpl.pod gcc.pod
make[2]: *** [all-stage1-gcc] Error 2
make[1]: *** [stage1-bubble] Error 2
make: *** [all] Error 2

llvm-svn: 59809
2008-11-21 09:09:41 +00:00
Sanjiv Gupta 09a203765a Make mem[cpy,move,set] intrinsics overloaded.
llvm-svn: 59802
2008-11-21 07:49:09 +00:00
Nick Lewycky 07d726ec4d Optimize (x/y)*y into x-(x%y) in general. Div and rem are about the same, and
a subtract is cheaper than a multiply. This generalizes an existing transform.

llvm-svn: 59800
2008-11-21 07:33:58 +00:00
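Note on 07d726ec4d: the identity behind this transform can be checked numerically. The sketch below is in Python rather than LLVM IR. `sdiv` and `srem` are hypothetical helpers that imitate truncating signed division and remainder, and overflow is ignored. It illustrates the arithmetic only and is not the instcombine code.

```python
def sdiv(x: int, y: int) -> int:
    """Signed division that truncates toward zero (assumed sdiv-like behaviour)."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q


def srem(x: int, y: int) -> int:
    """Signed remainder whose result takes the sign of the dividend."""
    r = abs(x) % abs(y)
    return -r if x < 0 else r


def before(x: int, y: int) -> int:
    return sdiv(x, y) * y      # (x / y) * y


def after(x: int, y: int) -> int:
    return x - srem(x, y)      # x - (x % y)


# Compare both forms over a small range, skipping division by zero.
pairs = [(x, y) for x in range(-20, 21) for y in range(-7, 8) if y != 0]
all_equal = all(before(x, y) == after(x, y) for x, y in pairs)
```

Both forms need one division-like operation, so the rewritten form trades the multiply for a subtract.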
Devang Patel 45f1ae028e Fix unused variable warnings.
llvm-svn: 59778
2008-11-21 01:52:59 +00:00
Bill Wendling f5260d29c2 Fix error where it wasn't getting the correct caller function.
llvm-svn: 59758
2008-11-21 00:09:21 +00:00
Bill Wendling 26c6a3e736 If the function being inlined has a higher stack protection level than the
inlining function, then increase the stack protection level on the inlining
function.

llvm-svn: 59757
2008-11-21 00:06:32 +00:00
Devang Patel 38642e598e Don't forget arguments!
llvm-svn: 59745
2008-11-20 19:50:17 +00:00
Devang Patel c8b2fe1eed Do not forget llvm.dbg.declare's first argument while removing debugging information.
llvm-svn: 59688
2008-11-20 01:20:42 +00:00
Oscar Fuentes 4fb443f81b CMake: Removed source file.
llvm-svn: 59662
2008-11-19 19:32:19 +00:00
Devang Patel 79303b2572 Do not use separate utility to walk all instructions and remove dead dbg intrinsics. Let instcombiner do this job.
llvm-svn: 59659
2008-11-19 19:01:37 +00:00
Devang Patel 827bced2b1 Let instcombiner remove redundant dbg intrinsics.
llvm-svn: 59658
2008-11-19 18:59:41 +00:00
Devang Patel 7ed6c5317c If there are two consecutive llvm.dbg.stoppoint calls then
it is likely that the optimizer deleted code in between these
two intrinsics. Keep only the last llvm.dbg.stoppoint in this case.

llvm-svn: 59657
2008-11-19 18:56:50 +00:00
Devang Patel 25662f3e4a Remove unused variables.
llvm-svn: 59570
2008-11-19 00:22:02 +00:00
Devang Patel ebd2363339 Fix typo.
llvm-svn: 59569
2008-11-19 00:19:18 +00:00
Devang Patel b5e867acff Add new helper pass that strips all symbol names except debugging information.
This pass makes it easier to test whether debugging info influences optimization passes.

llvm-svn: 59552
2008-11-18 21:34:39 +00:00
Devang Patel 3b7a2be88e Remove even more llvm.dbg variables.
Remove all dead globals from llvm.metadata.
Ignore linkonce linkage for selected llvm.dbg values.

llvm-svn: 59547
2008-11-18 21:13:41 +00:00
Devang Patel a13f1f38fa Initialize MallocFunc and FreeFunc properly.
llvm-svn: 59538
2008-11-18 18:43:07 +00:00
Bill Wendling cf194e9a27 Cast to remove warning about comparing signed and unsigned.
llvm-svn: 59518
2008-11-18 10:57:27 +00:00
Devang Patel f1e9329209 Give SIToFPInst preference over UIToFPInst because it is faster on platforms that are widely used.
llvm-svn: 59476
2008-11-18 00:40:02 +00:00
Devang Patel 180afd2c55 While handling floating point IVs lift restrictions on initial value and increment value.
llvm-svn: 59471
2008-11-17 23:27:13 +00:00
Devang Patel aa3d68d301 Handle floating point ivs during doInitialization().
llvm-svn: 59466
2008-11-17 21:32:02 +00:00
Devang Patel b63c74730c Let AnalyzeAlloca() remove debug intrinsics.
llvm-svn: 59454
2008-11-17 18:37:53 +00:00
Torok Edwin 026259faeb If SI->size() is 0, we are not allowed to dereference ->begin().
This fixed PR3078.

llvm-svn: 59416
2008-11-16 17:21:25 +00:00
Chris Lattner 7917b43a28 eliminate some std::set's.
llvm-svn: 59409
2008-11-16 07:17:51 +00:00
Chris Lattner f8f6270f14 simplify loop
llvm-svn: 59406
2008-11-16 06:35:18 +00:00
Chris Lattner 44152742a0 simplify a bunch more instcombines to use m_Specific etc.
llvm-svn: 59403
2008-11-16 05:38:51 +00:00
Chris Lattner d397fef50d factor the code for simplifying (icmp)|(icmp) into its own function.
llvm-svn: 59402
2008-11-16 05:20:07 +00:00
Chris Lattner 909b969b18 do some computation with apints instead of ConstantInts.
llvm-svn: 59401
2008-11-16 05:14:43 +00:00
Chris Lattner feaea9bdf7 merge a check into a place where it is simpler.
llvm-svn: 59400
2008-11-16 05:10:52 +00:00
Chris Lattner 269cbd5770 factor a whole bunch of code out into a helper function.
llvm-svn: 59398
2008-11-16 05:06:21 +00:00
Chris Lattner b37b6e7e96 simplify the conditions on two gigantic if's, decreasing indentation
a bit.  Next step is to factor out into their own helper functions.

llvm-svn: 59397
2008-11-16 04:55:20 +00:00
Chris Lattner f1be285134 simplify some instcombine matches by using m_Specific
llvm-svn: 59395
2008-11-16 04:46:19 +00:00
Chris Lattner fae5e33111 Use new m_SelectCst template to eliminate macros.
llvm-svn: 59392
2008-11-16 04:33:38 +00:00
Chris Lattner 569d78cbb5 simplify code.
llvm-svn: 59390
2008-11-16 04:26:55 +00:00
Chris Lattner c3f3b059d0 Handle the case where there is no "not". It is possible it got
folded into the select.

llvm-svn: 59389
2008-11-16 04:25:26 +00:00
Chris Lattner 5f6d9a313b factor a bunch of copy/paste code out into a helper function.
Eliminate the cases checking for cond?0:-1, since that is already
handled by commutative checking.

llvm-svn: 59388
2008-11-16 04:24:12 +00:00
Chris Lattner 68d2da2a19 rearrange some code, no functionality change.
llvm-svn: 59381
2008-11-16 03:56:24 +00:00
Chris Lattner e02c7c7ad2 if we're going to use a macro, use it maximally. no functionality change.
llvm-svn: 59380
2008-11-16 03:54:57 +00:00
Devang Patel 8ada1d5de5 Refactor code.
Strip debug information before stripping symbol names. 

llvm-svn: 59328
2008-11-14 22:49:37 +00:00
Devang Patel 3dd51c5c62 Really remove all debug information.
llvm-svn: 59208
2008-11-13 01:28:40 +00:00
Oscar Fuentes 1b504d5372 CMake: Remove removed source file.
llvm-svn: 59098
2008-11-12 00:14:12 +00:00
Devang Patel 4f02a0b740 Remove
llvm-svn: 59093
2008-11-11 23:58:15 +00:00
Devang Patel bf0835706c Undo previous check-in.
llvm-svn: 59092
2008-11-11 23:57:33 +00:00
Oscar Fuentes 2353ef3e91 CMake: Updated list of source files for lib/Transforms/Utils.
llvm-svn: 59077
2008-11-11 19:51:36 +00:00
Devang Patel 6096f26bd4 Add utility pass to remove dbg info.
llvm-svn: 59068
2008-11-11 19:33:39 +00:00
Devang Patel 95b18126ee Use actual function name in comments.
llvm-svn: 59063
2008-11-11 19:16:41 +00:00
Cedric Venet 8cb2e28e43 Update CMakeLists.txt
llvm-svn: 59039
2008-11-11 09:55:48 +00:00
Devang Patel 53b39b5467 Clean up debug info associated with deleted instructions.
llvm-svn: 59012
2008-11-11 00:54:10 +00:00
Devang Patel dc6699e82f Add utility routines to remove dead debug info.
llvm-svn: 59011
2008-11-11 00:53:02 +00:00
Devang Patel d0ce981372 If the signs of the exit condition and the split condition do not match,
then do not split the loop index.

llvm-svn: 58995
2008-11-10 19:48:34 +00:00
Bill Wendling 7ef7314d1a Third time's a charm.
The previous patches didn't match correctly. Also, we need to make sure that
the conditional is the same before doing the transformation.

llvm-svn: 58978
2008-11-10 06:59:06 +00:00
Mon P Wang 25f0106fd9 Added support for the following definition of shufflevector
<result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> 

llvm-svn: 58964
2008-11-10 04:46:22 +00:00
Bill Wendling 4fb13c051d Correction for the last patch. Should match the conditional in the first part
of the select match, not the select instruction itself.

llvm-svn: 58947
2008-11-09 23:37:53 +00:00
Bill Wendling 1579287550 The method of doing the matching with a 'select' instruction was wrong. The
original code was matching like this:

	if (match(A, m_Not(m_Value(B))))

B was already matched as a 'select' instruction. However, this isn't matching
what we think it's matching. It would match B as a 'Value', so basically
anything would match to it. In this case, a Constant matched. B was replaced
with a constant representation. And then the wrong value would be used in the
SelectInst::Create statement, causing a crash.

After thinking on this for a moment, and after Nick L. told me how the pattern
matching stuff was supposed to work, the solution was to match NOT an m_Value,
but an m_Select.

llvm-svn: 58946
2008-11-09 23:17:42 +00:00
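Note on 1579287550: the failure mode described above (a catch-all matcher silently rebinding a variable) can be reproduced with a toy matcher. Everything below is hypothetical: `m_value`, `m_select` and `m_not` are simplified Python stand-ins named after the LLVM matchers, not their real signatures, and the node classes are invented for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Select:
    cond: object
    if_true: object
    if_false: object


@dataclass(frozen=True)
class Not:
    operand: object


def m_value(bindings, name):
    """Matches any node and binds it to `name`, overwriting the old binding."""
    def run(node):
        bindings[name] = node
        return True
    return run


def m_select(bindings, name):
    """Matches only Select nodes and binds the node to `name`."""
    def run(node):
        if isinstance(node, Select):
            bindings[name] = node
            return True
        return False
    return run


def m_not(inner):
    """Matches Not(x) when `inner` matches x."""
    def run(node):
        return isinstance(node, Not) and inner(node.operand)
    return run


sel = Select(Const(1), Const(2), Const(3))
a = Not(Const(0))                      # the operand is a constant, not a select

loose = {"B": sel}
loose_matched = m_not(m_value(loose, "B"))(a)     # succeeds and clobbers B

strict = {"B": sel}
strict_matched = m_not(m_select(strict, "B"))(a)  # fails, B is untouched
```

In the loose case `B` ends up holding the constant, which mirrors the crash scenario in the message; the stricter matcher rejects the input instead.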
Nuno Lopes 2e42927e7c fix leakage of ValueNumbering
llvm-svn: 58933
2008-11-09 12:45:23 +00:00
Bill Wendling 3f547be28f If the LHS of the FCMP is coming from a UIToFP instruction, then we don't want
to generate signed ICMP instructions to replace the FCMP. This would violate
the following:

define i1 @test1(i32 %val) {
  %1 = uitofp i32 %val to double
  %2 = fcmp ole double %1, 0.000000e+00
  ret i1 %2
}

would be transformed into:

define i1 @test1(i32 %val) {
  %1 = icmp slt i32 %val, 1
  ret i1 %1
}

which is obviously wrong. This patch modifies InstCombiner::FoldFCmp_IntToFP_Cst
to handle when the LHS comes from UIToFP.

llvm-svn: 58929
2008-11-09 04:26:50 +00:00
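Note on 3f547be28f: a small numeric check shows why the signed fold is wrong for `uitofp`. This is a Python model of 32-bit values, not LLVM code. The helper names are hypothetical, and the unsigned comparison shown as the alternative is an assumption about the shape of the fix, since the message only says that the UIToFP case is now handled.

```python
def uitofp(bits: int) -> float:
    """Interpret a 32-bit pattern as unsigned and convert it to floating point."""
    return float(bits & 0xFFFFFFFF)


def as_signed(bits: int) -> int:
    """Interpret the same 32-bit pattern as a two's-complement signed value."""
    bits &= 0xFFFFFFFF
    return bits - (1 << 32) if bits & 0x80000000 else bits


def original(bits: int) -> bool:
    return uitofp(bits) <= 0.0        # fcmp ole (uitofp %val), 0.0


def signed_fold(bits: int) -> bool:
    return as_signed(bits) < 1        # icmp slt %val, 1


def unsigned_fold(bits: int) -> bool:
    return (bits & 0xFFFFFFFF) < 1    # icmp ult %val, 1 (assumed alternative)


samples = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
signed_mismatches = [b for b in samples if original(b) != signed_fold(b)]
unsigned_mismatches = [b for b in samples if original(b) != unsigned_fold(b)]
```

The signed fold disagrees with the original exactly for inputs whose top bit is set, which are large positive numbers under `uitofp` but negative under a signed compare.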
Daniel Dunbar 2b9dce2669 Rework r58829, allowing removal of dbg info intrinsics during alloca
promotion.
 - Eliminate uses after free and simplify tests.

Devang: Please check that this is still doing what you intended.
llvm-svn: 58887
2008-11-08 04:12:17 +00:00
Bill Wendling b9656df4ac BCUI + 1 doesn't work. Use next instead.
llvm-svn: 58830
2008-11-07 01:59:41 +00:00
Devang Patel b8e0d59ceb Handle (delete) dbg intrinsics while promoting alloca.
llvm-svn: 58826
2008-11-07 01:30:07 +00:00
Mon P Wang 5ca2ec65bd Fixed scalarizing an extract subvector and prevented an infinite loop
when simplifying a vector.

llvm-svn: 58820
2008-11-06 22:52:21 +00:00
Devang Patel 5a5ab730e0 InstructionNamer preserves everything.
llvm-svn: 58787
2008-11-06 01:00:16 +00:00
Devang Patel f0ef35738c Do not allow InlineAlways pass to remove dead functions.
llvm-svn: 58744
2008-11-05 01:39:16 +00:00
Devang Patel 7a848b0ee3 Check Attribute::NoInline.
llvm-svn: 58742
2008-11-05 01:37:05 +00:00
Oscar Fuentes 076e048cf7 CMake: updated list of source files.
llvm-svn: 58736
2008-11-05 00:11:22 +00:00
Dan Gohman 8cdea717a3 Add a new pass to simplify specific half_powr function calls. This is
a specialized pass that is not likely to be generally useful.

llvm-svn: 58732
2008-11-04 23:41:45 +00:00
Dale Johannesen 0a7b4f5800 Allow SROA of vectors. Removing this caused a
huge performance regression in something we care
about.  This may not be final fix.

llvm-svn: 58718
2008-11-04 20:54:03 +00:00
Devang Patel f33f8a8606 Fix unused variable warnings.
llvm-svn: 58651
2008-11-03 23:14:09 +00:00
Devang Patel fe57d109b6 Ignore conditions that are outside the loop.
llvm-svn: 58631
2008-11-03 19:38:07 +00:00
Andrew Lenharth 348f3fa6a7 add a period at the end of the comment, ignoring the fact that the comment would be hard pressed to be considered a sentence, but if it makes Bill happy...
llvm-svn: 58630
2008-11-03 19:29:29 +00:00
Devang Patel c1631db93b Turn floating point IVs into integer IVs where possible.
This allows SCEV users to effectively calculate trip count.
LSR later transforms integer IVs back to floating point IVs
to avoid int-to-float casts inside the loop.

llvm-svn: 58625
2008-11-03 18:32:19 +00:00
Andrew Lenharth 45b86322f2 Ensure that we are checking only calls to the function we are interested in specializing
llvm-svn: 58615
2008-11-03 16:05:35 +00:00
Nick Lewycky d73806a9cc Replace explicit loop with utility function.
llvm-svn: 58593
2008-11-03 03:49:14 +00:00
Nick Lewycky 3c6d34a7f0 Changes from Duncan's review:
* merge two weak functions by making them both alias a third non-weak fn
* don't reimplement CallSite::hasArgument
* whitelist the safe linkage types

llvm-svn: 58568
2008-11-02 16:46:26 +00:00
Duncan Sands cede1e035c Get this building on 64 bit machines (error:
cast from ‘const llvm::PointerType*’ to ‘unsigned int’
loses precision).

llvm-svn: 58561
2008-11-02 09:00:33 +00:00
Oscar Fuentes 0433be6feb CMake: added a source file.
llvm-svn: 58559
2008-11-02 06:01:39 +00:00
Nick Lewycky d01d42e76c Add a new MergeFunctions pass. It finds identical functions and merges them.
This triggers only 60 times in llvm-test (look at .llvm.bc, not .linked.rbc)
and so it probably won't be turned on by default. Also, many of those are likely
to go away when PR2973 is fixed.

llvm-svn: 58557
2008-11-02 05:52:50 +00:00
Nick Lewycky 8d8acf327b Fix demanded bits analysis with srem by negative number. Based on a patch
by Richard Osborne.

llvm-svn: 58555
2008-11-02 02:41:50 +00:00
Dan Gohman 83eea0b17f Fix this recently moved code to use the correct type. CI is now a
ConstantInt, and SI is the original cast instruction. This fixes
PR2996.

llvm-svn: 58549
2008-11-02 00:17:33 +00:00
Daniel Dunbar a1c4fcfc29 Fix warning.
llvm-svn: 58486
2008-10-31 01:50:01 +00:00
Dan Gohman 13cbcf1c18 Canonicalize sext(i1) to i1?-1:0, and update various instcombine
optimizations accordingly.

llvm-svn: 58457
2008-10-30 20:40:10 +00:00
Daniel Dunbar 3933e66a89 Add InlineCost class to represent the estimated cost of inlining a
function.
 - This explicitly models the costs for functions which should
   "always" or "never" be inlined. This fixes bugs where such costs
   were not previously respected.

llvm-svn: 58450
2008-10-30 19:26:59 +00:00
Chris Lattner 0934c0f35b Fix PR2967 by not deleting volatile load/stores that occur before unreachable.
I don't really see this as being needed, but there is little harm from doing
it.

llvm-svn: 58385
2008-10-29 17:46:26 +00:00
Daniel Dunbar e7fbf9f425 Factor shouldInline method out of Inliner.
- No functionality change.

llvm-svn: 58355
2008-10-29 01:02:02 +00:00
Daniel Dunbar cc20455346 Assorted comment/naming fixes, 80-col violations, and reindentation.
- No functionality change.

llvm-svn: 58352
2008-10-28 23:24:26 +00:00
Dan Gohman 2c34c130bf (A & sext(C)) | (B & ~sext(C)) -> C ? A : B
llvm-svn: 58351
2008-10-28 22:38:57 +00:00
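Note on 2c34c130bf: the fold relies on `sext` of a one-bit condition being either all ones or all zeros, so the mask expression picks `A` or `B` wholesale. A minimal Python check on 32-bit values follows; the width and the helper names are assumptions made for the illustration.

```python
MASK32 = 0xFFFFFFFF


def sext_i1(c: bool) -> int:
    """Sign-extend a one-bit value to 32 bits: all ones for true, zero for false."""
    return MASK32 if c else 0


def masked(a: int, b: int, c: bool) -> int:
    m = sext_i1(c)
    return ((a & m) | (b & ~m)) & MASK32   # (A & sext(C)) | (B & ~sext(C))


def select(a: int, b: int, c: bool) -> int:
    return (a if c else b) & MASK32        # C ? A : B


cases = [(a, b, c)
         for a in (0, 1, 0xDEADBEEF, MASK32)
         for b in (0, 7, 0x12345678)
         for c in (False, True)]
all_match = all(masked(a, b, c) == select(a, b, c) for a, b, c in cases)
```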
Torok Edwin ca97b42ef7 export an ID for the instructionNamer, allowing analysis/transformation passes
that need it to require it by ID.

llvm-svn: 58238
2008-10-27 10:16:27 +00:00
Chris Lattner 59b5691388 Rewrite all the 'PromoteLocallyUsedAlloca[s]' logic. With the power of
LargeBlockInfo, we can now dramatically simplify their implementation
and speed them up at the same time.  Now the code has time proportional
to the number of uses of the alloca, not the size of the block.

This also eliminates code that tried to batch up different allocas which
are used in the same blocks, and eliminates the 'retry list' logic which
was baroque and now unnecessary.  In addition to being a speedup for crazy
cases, this is also a nice cleanup:

PromoteMemoryToRegister.cpp |  270 +++++++++++++++-----------------------------
 1 file changed, 96 insertions(+), 174 deletions(-)

llvm-svn: 58229
2008-10-27 07:05:53 +00:00
Chris Lattner f594ecc453 Add a new LargeBlockInfo helper, which is just a wrapper around
a trivial dense map.  Use this in RewriteSingleStoreAlloca to
avoid aggressively rescanning blocks over and over again.  This
fixes PR2925, speeding up mem2reg on the testcase in that bug
from 4.56s to 0.02s in a debug build on my machine.

llvm-svn: 58227
2008-10-27 06:05:26 +00:00
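Note on f594ecc453: the general idea of caching instruction positions in a map, so that a block is scanned once instead of on every query, can be sketched as follows. `BlockIndexCache` is a hypothetical Python illustration; the message does not show how LargeBlockInfo is actually implemented or what it indexes.

```python
class BlockIndexCache:
    """Remembers the position of each instruction inside its block."""

    def __init__(self):
        self._index = {}   # instruction -> position in its block
        self.scans = 0     # number of full block scans performed

    def index_of(self, block, inst):
        # Scan the block only when the instruction has not been seen yet,
        # and record every instruction found along the way.
        if inst not in self._index:
            self.scans += 1
            for pos, candidate in enumerate(block):
                self._index[candidate] = pos
        return self._index[inst]


block = ["alloca", "store", "load", "ret"]   # stand-ins for instructions
cache = BlockIndexCache()
store_before_load = cache.index_of(block, "store") < cache.index_of(block, "load")
ret_position = cache.index_of(block, "ret")
```

After the first query every later ordering question within the same block is a dictionary lookup, which is where a speedup of the kind reported above would come from.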
Nick Lewycky f6e4dca67e Add value range analyzing of Add and Sub.
Understand that mul %x, 1 = %x.

llvm-svn: 58069
2008-10-24 04:00:26 +00:00
Daniel Dunbar 7f39e2d85a Change create*Pass factory functions to return Pass* instead of
LoopPass*.
 - Although less precise, this means they can be used in clients
   without RTTI (who would otherwise need to include LoopPass.h, which
   eventually includes things using dynamic_cast). This was the
   simplest solution that presented itself, but I am happy to use a
   better one if available.

llvm-svn: 58010
2008-10-22 23:32:42 +00:00
Dan Gohman 72e66eedb8 Use Function::getEntryBlock() instead of Function::front(), for clarity.
llvm-svn: 57870
2008-10-21 03:10:28 +00:00
Dan Gohman fa29b67aee Fix a bug that prevented llvm-extract -delete from working.
llvm-svn: 57864
2008-10-21 01:08:07 +00:00
Dan Gohman 215742a966 Use 0 instead of false to return a null pointer.
llvm-svn: 57660
2008-10-17 00:56:52 +00:00
Dan Gohman bc0278400c Teach instcombine's visitLoad to scan back several instructions
to find opportunities for store-to-load forwarding or load CSE,
in the same way that visitStore scans back to do DSE. Also, define
a new helper function for testing whether the addresses of two
memory accesses are known to have the same value, and use it in
both visitStore and visitLoad.

These two changes allow instcombine to eliminate loads in code
produced by front-ends that frequently emit obviously redundant
addressing for memory references.

llvm-svn: 57608
2008-10-15 23:19:35 +00:00
Evan Cheng d885f6e139 Combine (fcmp cc0 x, y) | (fcmp cc1 x, y) into a single fcmp when possible.
llvm-svn: 57515
2008-10-14 18:44:08 +00:00
Evan Cheng ce70752b11 - Somehow I forgot about one / une.
- Renumber fcmp predicates to match their icmp counterparts.
- Try swapping operands to expose more optimization opportunities.

llvm-svn: 57513
2008-10-14 18:13:38 +00:00
Evan Cheng 67786cce66 Optimize anding of two fcmp into a single fcmp if the operands are the same, e.g.
     uno && ueq -> uno
     ord && olt -> olt
     ord && ueq -> oeq

llvm-svn: 57507
2008-10-14 17:15:11 +00:00
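Note on 67786cce66: one way to reason about these folds is to treat each fcmp predicate as the set of comparison outcomes for which it is true. AND-ing two compares of the same operands is then a set intersection and OR-ing is a union. The Python below is an illustrative model of that reasoning, not the instcombine implementation, and all of its names are hypothetical.

```python
# The four mutually exclusive outcomes of comparing two floats.
LT, EQ, GT, UN = "lt", "eq", "gt", "unordered"

PREDICATES = {
    "false": frozenset(),
    "oeq": frozenset({EQ}),
    "ogt": frozenset({GT}),
    "oge": frozenset({GT, EQ}),
    "olt": frozenset({LT}),
    "ole": frozenset({LT, EQ}),
    "one": frozenset({LT, GT}),
    "ord": frozenset({LT, EQ, GT}),
    "uno": frozenset({UN}),
    "ueq": frozenset({UN, EQ}),
    "ugt": frozenset({UN, GT}),
    "uge": frozenset({UN, GT, EQ}),
    "ult": frozenset({UN, LT}),
    "ule": frozenset({UN, LT, EQ}),
    "une": frozenset({UN, LT, GT}),
    "true": frozenset({LT, EQ, GT, UN}),
}
NAMES = {outcomes: name for name, outcomes in PREDICATES.items()}


def fold_and(p: str, q: str) -> str:
    """Predicate equivalent to (fcmp p x, y) & (fcmp q x, y)."""
    return NAMES[PREDICATES[p] & PREDICATES[q]]


def fold_or(p: str, q: str) -> str:
    """Predicate equivalent to (fcmp p x, y) | (fcmp q x, y)."""
    return NAMES[PREDICATES[p] | PREDICATES[q]]


and_ord_olt = fold_and("ord", "olt")
and_ord_ueq = fold_and("ord", "ueq")
or_olt_oeq = fold_or("olt", "oeq")
```

Because the sixteen predicates cover every subset of the four outcomes, any intersection or union is again a single predicate, so the fold always has a result when the operands are the same.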
Matthijs Kooijman f7d3cb5435 Make InstructionCombining::getBitCastOperand() recognize GEP instructions and
constant expression with all zero indices as being the same as a bitcast.

llvm-svn: 57442
2008-10-13 15:17:01 +00:00
Chris Lattner da435910e8 Fix PR2697 by rewriting the '(X / pos) op neg' logic. This also changes
a couple other cases for clarity, but shouldn't affect correctness.

Patch by Eli Friedman!

llvm-svn: 57387
2008-10-11 22:55:00 +00:00
Devang Patel 647a1e532b Check loop exit predicate properly while eliminating one iteration loop.
This patch fixes PR 2869

llvm-svn: 57369
2008-10-10 22:02:57 +00:00
Nuno Lopes e3127f3f80 fix memleak by cleaning the global sets on pass exit
llvm-svn: 57353
2008-10-10 16:25:50 +00:00
Dale Johannesen 4f0bd68cfe Add a "loses information" return value to APFloat::convert
and APFloat::convertToInteger.  Restore return value to
IEEE754.  Adjust all users accordingly.

llvm-svn: 57329
2008-10-09 23:00:39 +00:00
Nick Lewycky 03c5fa18f1 Don't drop alignment on globals when cloning.
llvm-svn: 57320
2008-10-09 06:27:14 +00:00
Nuno Lopes 06c67f88d7 don't specialize weak functions and the like
llvm-svn: 57305
2008-10-08 18:45:59 +00:00
Duncan Sands 26ff6f9c54 Add <cstdio> include where needed by gcc-4.4.
Patch by Samuel Tardieu.

llvm-svn: 57291
2008-10-08 07:23:46 +00:00
Chris Lattner 42d5785dbd Add parentheses to avoid warnings in GCC 4.4.0,
patch by Samuel Tardieu!

llvm-svn: 57288
2008-10-08 06:42:28 +00:00
Andrew Lenharth 5aa1cc4065 Correctly set attributes when removing args during cloning. Fixes PR2765
llvm-svn: 57254
2008-10-07 18:08:38 +00:00
Devang Patel 40aafce00d Fix typo, fix PR 2865.
llvm-svn: 57221
2008-10-06 23:22:54 +00:00
Matthijs Kooijman cbe5e16eb5 Allow scalarrepl to treat an all-zero GEP just as bitcast.
This includes not marking a GEP involving a vector as unsafe, but only when it
has all zero indices. This allows scalarrepl to work in a few more cases.

llvm-svn: 57177
2008-10-06 16:23:31 +00:00
Chris Lattner 917a6c1343 rewrite bswap matching to be more general, allowing arbitrary
shifting and masking inside a bswap expr.  This allows it to handle
the cases from PR2842, which involve the intermediate 'or' 
expressions being shifted, not just the input value.

llvm-svn: 57095
2008-10-05 02:13:19 +00:00
Chris Lattner ca91f265c4 fix a bug where the bswap matcher could match a case involving
ashr.  It should only apply to lshr.

llvm-svn: 57089
2008-10-05 00:50:57 +00:00
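Note on ca91f265c4: the reason an arithmetic shift cannot take part in a byte swap is that it copies the sign bit into the vacated high bits. The Python model below uses a hand-written 32-bit swap that leaves the top-byte shift unmasked; the exact expression the matcher saw is not given in the message, so this shape and the helper names are assumptions.

```python
MASK32 = 0xFFFFFFFF


def lshr(x: int, n: int) -> int:
    """Logical shift right: vacated high bits become zero."""
    return (x & MASK32) >> n


def ashr(x: int, n: int) -> int:
    """Arithmetic shift right: vacated high bits copy the sign bit."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32              # reinterpret the pattern as a negative number
    return (x >> n) & MASK32      # Python's >> on negative ints is arithmetic


def swap_with(shift_right, x: int) -> int:
    """Byte-swap by shifting and OR-ing, with the last term left unmasked."""
    return (((x << 24) & MASK32)
            | ((x << 8) & 0x00FF0000)
            | (shift_right(x, 8) & 0x0000FF00)
            | shift_right(x, 24))


def reference_bswap(x: int) -> int:
    return int.from_bytes(x.to_bytes(4, "big"), "little")


x = 0x80112233                    # top bit set, so ashr and lshr differ
good = swap_with(lshr, x)
bad = swap_with(ashr, x)
```

For inputs with a clear top bit the two shifts agree, which is why a mismatch of this kind can go unnoticed in casual testing.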
Duncan Sands 1d35e9aebe Ignore loads from and stores to local memory (i.e. allocas)
when deciding whether to mark a function readnone/readonly.
Since the pass is currently run before SROA, this may be
quite helpful.  Requested by Chris on IRC.

llvm-svn: 57050
2008-10-04 13:24:24 +00:00
Dan Gohman e21903987f Clean up some multiple-return-value code that is no longer
applicable.

llvm-svn: 57033
2008-10-03 22:21:24 +00:00
Devang Patel f963403b58 Nick Lewycky's patch.
While hoisting instruction, check PHI node.

llvm-svn: 57025
2008-10-03 18:57:37 +00:00
Duncan Sands 3a813a5d3f Teach internalize to preserve the callgraph.
Why?  Because it was there!

llvm-svn: 56996
2008-10-03 07:36:09 +00:00
Owen Anderson cb4f156b6b SplitBlock should only attempt to update LoopInfo if it is actually being used.
llvm-svn: 56994
2008-10-03 06:55:35 +00:00
Duncan Sands d65a4daeea Factorize code: remove variants of "strip off
pointer bitcasts and GEP's", and centralize the
logic in Value::getUnderlyingObject.  The
difference with stripPointerCasts is that
stripPointerCasts only strips GEPs if all
indices are zero, while getUnderlyingObject
strips GEPs no matter what the indices are.

llvm-svn: 56922
2008-10-01 15:25:41 +00:00
Nuno Lopes 96740aad86 revert the addition of Preserves(CallGraph), per Duncan's comments
llvm-svn: 56917
2008-10-01 09:13:40 +00:00
Dan Gohman 67d90de2b0 Call ScalarEvolution's deleteValueFromRecords before deleting an
instruction, not after. This fixes some uses of free'd memory.

llvm-svn: 56908
2008-10-01 02:02:03 +00:00
Nuno Lopes 5093ab4c76 add preservesCFG() + preserves(CallGraph)
llvm-svn: 56887
2008-09-30 22:04:30 +00:00
Nuno Lopes 2bd7b24f1a add AU.setPreservesCFG() since this pass only adds and removes function attributes
llvm-svn: 56868
2008-09-30 18:34:38 +00:00
Nick Lewycky e8ced3ec19 Fix misoptimization of: xor i1 (icmp eq (X, C1), icmp s[lg]t (X, C2))
llvm-svn: 56834
2008-09-30 06:08:34 +00:00