Commit Graph

5368 Commits

Author SHA1 Message Date
Bob Wilson c8056a952e Check for empty structs, and for consistency, zero-element arrays.
llvm-svn: 123383
2011-01-13 18:26:59 +00:00
Bob Wilson 08713d3c5f Extend SROA to handle arrays accessed as homogeneous structs and vice versa.
This is a minor extension of SROA to handle a special case that is
important for some ARM NEON operations.  Some of the NEON intrinsics
return multiple values, which are handled as struct types containing
multiple elements of the same vector type.  The corresponding return
types declared in the arm_neon.h header have equivalent arrays.  We
need SROA to recognize that it can split up those arrays and structs
into separate vectors, even though they are not always accessed with
the same type.  SROA already handles loads and stores of an entire
alloca by using insertvalue/extractvalue to access the individual
pieces, and that code works the same regardless of whether the type
is a struct or an array.  So, all that needs to be done is to check
for compatible arrays and homogeneous structs.

llvm-svn: 123381
2011-01-13 17:45:11 +00:00
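
A minimal C sketch of the layout equivalence the commit above relies on; the type names are invented for illustration and are not taken from arm_neon.h:

/* A struct of identical elements and an array with the same layout.  With
 * this change, SROA can split an alloca of either type into its element
 * values even when it is loaded or stored as the other type. */
typedef struct { float lane[4]; } vec4;        /* stand-in for a vector type */

typedef struct { vec4 val[2]; } vec4x2_struct; /* homogeneous struct view */
typedef vec4 vec4x2_array[2];                  /* equivalent array view */
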
Bob Wilson 12eec40c83 Make SROA more aggressive with allocas containing padding.
SROA only splits up structs and arrays one level at a time, so padding can
only cause trouble if it is located in between the struct or array elements.

llvm-svn: 123380
2011-01-13 17:45:08 +00:00
Devang Patel 30f3ebbc1f Use SmallVector instead of SmallPtrSet and avoid non-deterministic behavior.
llvm-svn: 123318
2011-01-12 19:12:45 +00:00
Chris Lattner dd5f60b7a7 revert 123144, reenabling the rest of memset formation.
llvm-svn: 123302
2011-01-12 03:25:15 +00:00
Chris Lattner 654098f411 revert r123146 which disabled code that wasn't the root cause
of the bootstrap miscompare issue.

llvm-svn: 123299
2011-01-12 01:52:23 +00:00
Chris Lattner fa7c29d255 revert r123149, reenabling an improvement to memcpyopt that wasn't
the source of the bootstrap problem.

llvm-svn: 123298
2011-01-12 01:43:46 +00:00
Jakob Stoklund Olesen 12cc296bd4 Remove the PR8954 workaround.
llvm-svn: 123288
2011-01-11 22:56:41 +00:00
Cameron Zwarich cb9c4f85ec Dial back the speculative fix for PR8954 a bit, so that we only recompute dominators
once at the beginning of GVN instead of once per iteration.

llvm-svn: 123278
2011-01-11 22:14:42 +00:00
Cameron Zwarich 51eb403907 Attempt to fix the bootstrap buildbot. Rafael says this works for him on x86-64 Linux.
llvm-svn: 123270
2011-01-11 20:23:34 +00:00
Chris Lattner 193ce7c4d1 update memdep when an instruction is deleted. This code isn't
actually reached in the testcase in PR8954, but it's safe and good
practice.

llvm-svn: 123224
2011-01-11 08:19:16 +00:00
Chris Lattner f6ae904e34 Fix FoldSingleEntryPHINodes to update memdep and AA when it deletes
phi nodes.  It is called from MergeBlockIntoPredecessor which is 
called from GVN, which claims to preserve these.

I'm skeptical that this is the actual problem behind PR8954, but
this is a stab in the right direction.

llvm-svn: 123222
2011-01-11 08:13:40 +00:00
Chris Lattner dfcfcb49fa random cleanups
llvm-svn: 123221
2011-01-11 08:00:40 +00:00
Chris Lattner 63fe78de68 remove a bogus assertion: the latch block of a loop is not
necessarily an uncond branch to the header.  This fixes 
PR8955 (the assertion tripping).

llvm-svn: 123219
2011-01-11 07:47:59 +00:00
Chris Lattner 88bc848ab6 another random stab in the dark trying to fix llvm-gcc-i386-linux-selfhost
llvm-svn: 123149
2011-01-10 02:34:11 +00:00
Chris Lattner 4662bd4b13 another (more) aggressive attempt to bring llvm-gcc-i386-linux-selfhost
back to life.

llvm-svn: 123146
2011-01-10 00:47:34 +00:00
Chris Lattner 1017fa6746 temporarily disable memset formation from memsets in an effort to restore buildbot stability.
llvm-svn: 123144
2011-01-09 23:52:48 +00:00
Chris Lattner caf5c0d037 fix a few old bugs (found by inspection) where we would zap instructions
without informing memdep.  This could cause nondeterministic weirdness 
based on where instructions happen to get allocated, and will hopefully
breathe some life into some broken testers.

llvm-svn: 123124
2011-01-09 19:26:10 +00:00
Cameron Zwarich a42e5915bf LoopInstSimplify preserves LoopSimplify.
llvm-svn: 123117
2011-01-09 12:35:16 +00:00
Chris Lattner a337f5ec5c reduce indentation. Print <nuw> and <nsw> when dumping SCEV AddRec's
that have the bit set.

llvm-svn: 123104
2011-01-09 02:16:18 +00:00
Chris Lattner 7d6433ae76 fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't
updating memdep when fusing stores together.  This fixes the crash optimizing
the bullet benchmark.

llvm-svn: 123091
2011-01-08 22:19:21 +00:00
Chris Lattner ff6ed2ac5f tryMergingIntoMemset can only handle constant length memsets.
llvm-svn: 123090
2011-01-08 22:11:56 +00:00
Chris Lattner 9a1d63ba9f Merge memsets followed by neighboring memsets and other stores into
larger memsets.  Among other things, this fixes rdar://8760394 and
allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
compiling it into a single 4096-byte memset:

_mad_synth_mute:                        ## @mad_synth_mute
## BB#0:                                ## %entry
	pushq	%rax
	movl	$4096, %esi             ## imm = 0x1000
	callq	___bzero
	popq	%rax
	ret

llvm-svn: 123089
2011-01-08 21:19:19 +00:00
Chris Lattner 5120ebf184 fix an issue in IsPointerOffset that prevented us from recognizing that
P and P+1 are relative to the same base pointer.

llvm-svn: 123087
2011-01-08 21:07:56 +00:00
Chris Lattner 4dc1fd938f enhance memcpyopt to merge a store and a subsequent
memset into a single larger memset.

llvm-svn: 123086
2011-01-08 20:54:51 +00:00
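
A hedged C illustration of the store-plus-memset pattern this change targets; the function name is invented:

#include <string.h>

/* The two leading byte stores and the adjacent memset zero one contiguous
 * 64-byte range, so memcpyopt can emit a single memset(buf, 0, 64). */
void clear_header(char *buf) {
  buf[0] = 0;
  buf[1] = 0;
  memset(buf + 2, 0, 62);
}
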
Chris Lattner c638147e9f constify TargetData references.
Split memset formation logic out into its own
"tryMergingIntoMemset" helper function.

llvm-svn: 123081
2011-01-08 20:24:01 +00:00
Chris Lattner 59c82f850d When loop rotation happens, it is *very* common for the duplicated condbr
to be foldable into an uncond branch.  When this happens, we can make a
much simpler CFG for the loop, which is important for nested loop cases
where we want the outer loop to be aggressively optimized.

Handle this case more aggressively.  For example, previously on
phi-duplicate.ll we would get this:


define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  %cmp1 = icmp slt i64 1, 1000
  br i1 %cmp1, label %bb.nph, label %for.end

bb.nph:                                           ; preds = %entry
  br label %for.body

for.body:                                         ; preds = %bb.nph, %for.cond
  %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.02
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.02, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.02, 1
  br label %for.cond

for.cond:                                         ; preds = %for.body
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

for.cond.for.end_crit_edge:                       ; preds = %for.cond
  br label %for.end

for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
  ret void
}

Now we get the much nicer:

define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.01
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.01, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.01, 1
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}

With all of these recent changes, we are now able to compile:

void foo(char *X) {
 for (int i = 0; i != 100; ++i) 
   for (int j = 0; j != 100; ++j)
     X[j+i*100] = 0;
}

into a single memset of 10000 bytes.  This series of changes
should also be helpful for other nested loop scenarios as well.

llvm-svn: 123079
2011-01-08 19:59:06 +00:00
Chris Lattner 30f318e5d1 split ssa updating code out to its own helper function. Don't bother
moving the OrigHeader block anymore: we just merge it away anyway so
its code layout doesn't matter.

llvm-svn: 123077
2011-01-08 19:26:33 +00:00
Chris Lattner 2615130e1d Implement a TODO: Enhance loopinfo to merge away the unconditional branch
that it was leaving in loops after rotation (between the original latch
block and the original header).

With this change, it is possible for rotated loops to have just a single
basic block, which is useful.

llvm-svn: 123075
2011-01-08 19:10:28 +00:00
Chris Lattner fee37c5fa3 inline preserveCanonicalLoopForm now that it is simple.
llvm-svn: 123073
2011-01-08 18:55:50 +00:00
Chris Lattner 063dca0f6a Three major changes:
1. Rip out LoopRotate's domfrontier updating code.  It isn't
   needed now that LICM doesn't use DF and it is super complex
   and gross.
2. Make DomTree updating code a lot simpler and faster.  The 
   old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use
   SplitCriticalEdge instead of doing an overcomplex 
   reimplementation of it.

No behavior change, except for the name of the inserted preheader.

llvm-svn: 123072
2011-01-08 18:52:51 +00:00
Chris Lattner 7fab23bc1d LoopRotate requires canonical loop form, so it always has preheaders
and latch blocks.  Reorder entry conditions to make the pass faster
and more logical.

llvm-svn: 123069
2011-01-08 18:06:22 +00:00
Chris Lattner d62691f4e8 use the LI ivar.
llvm-svn: 123068
2011-01-08 17:49:51 +00:00
Chris Lattner 385f2ec6d8 some cleanups: remove dead arguments and eliminate ivars
that are just passed to one function.

llvm-svn: 123067
2011-01-08 17:48:33 +00:00
Chris Lattner 25ba40a0cc fix an issue duncan pointed out, which could cause loop rotate
to violate LCSSA form

llvm-svn: 123066
2011-01-08 17:38:45 +00:00
Cameron Zwarich b4ab257bcc Fix coding style issues.
llvm-svn: 123065
2011-01-08 17:07:11 +00:00
Cameron Zwarich 84986b298a Make more passes preserve dominators (or state that they preserve dominators if
they already do). This removes two dominator recomputations prior to isel,
which is a 1% improvement in total llc time for 403.gcc.

The only potentially suspect thing is making GCStrategy recompute dominators if
it used a custom lowering strategy.

llvm-svn: 123064
2011-01-08 17:01:52 +00:00
Cameron Zwarich 80bd9af7c5 Contract subloop bodies. However, it is still important to visit the phis at the
top of subloop headers, as the phi uses logically occur outside of the subloop.

llvm-svn: 123062
2011-01-08 15:52:22 +00:00
Chris Lattner 8c5defd0b0 Have loop-rotate simplify instructions (yay instsimplify!) as it clones
them into the loop preheader, eliminating silly instructions like
"icmp i32 0, 100" in fixed tripcount loops.  This also better exposes the 
bigger problem with loop rotate that I'd like to fix: once this has been
folded, the duplicated conditional branch *often* turns into an uncond branch.

Not aggressively handling this is pessimizing later loop optimizations 
somethin' fierce by making "dominates all exit blocks" checks fail.

llvm-svn: 123060
2011-01-08 08:24:46 +00:00
Chris Lattner 43f8d16482 Revamp the ValueMapper interfaces in a couple ways:
1. Take a flags argument instead of a bool.  This makes
   it more clear to the reader what it is used for.
2. Add a flag that says that "remapping a value not in the
   map is ok".
3. Reimplement MapValue to share a bunch of code and be a lot
   more efficient.  For lookup failures, don't drop null values
   into the map.
4. Using the new flag a bunch of code can vaporize in LinkModules
   and LoopUnswitch, kill it.

No functionality change.

llvm-svn: 123058
2011-01-08 08:15:20 +00:00
Chris Lattner 2b3f20e6ec two minor changes: switch to the standard ValueToValueMapTy
map from ValueMapper.h (giving us access to its utilities)
and add a fastpath in the loop rotation code, avoiding expensive
ssa updater manipulation for values with nothing to update.

llvm-svn: 123057
2011-01-08 07:21:31 +00:00
Cameron Zwarich 9ec19ea06a Add the CallInst optimizations that don't involve expanding inline assembly to
OptimizeInst() so that they can be used on a worklist instruction.

llvm-svn: 122945
2011-01-06 02:56:42 +00:00
Cameron Zwarich d28c78eb4f Move the GEP handling in CodeGenPrepare to OptimizeInst().
llvm-svn: 122944
2011-01-06 02:44:52 +00:00
Cameron Zwarich 14ac865ca9 Split the optimizations in CodeGenPrepare that don't manipulate the iterators
into a separate function, so that it can be called from a loop using a worklist
rather than a loop traversing a whole basic block.

llvm-svn: 122943
2011-01-06 02:37:26 +00:00
Jakob Stoklund Olesen 70be93a200 Zap the last two -Wself-assign warnings in llvm.
Simplify RALinScan::DowngradeRegister with TRI::getOverlaps while we are there.

llvm-svn: 122940
2011-01-06 01:33:22 +00:00
Cameron Zwarich ce3b930a98 Stop reallocating SunkAddrs for each basic block. When we move to an instruction
worklist, the key will need to become std::pair<BasicBlock*, Value*>.

llvm-svn: 122932
2011-01-06 00:42:50 +00:00
Cameron Zwarich b62ccb241b Add some more statistics to CodeGenPrepare.
llvm-svn: 122891
2011-01-05 17:47:38 +00:00
Cameron Zwarich ced753fadf Add some stats to CodeGenPrepare to make it easier to speed it up without
regressing code quality.

llvm-svn: 122887
2011-01-05 17:27:27 +00:00
Cameron Zwarich 6a78995369 Use pop_back_val instead of back followed by pop_back.
llvm-svn: 122876
2011-01-05 16:08:47 +00:00
Cameron Zwarich 5a2bb998ac Use a worklist for later iterations just like ordinary instsimplify. The next
step is to only process instructions in subloops if they have been modified by
an earlier simplification.

llvm-svn: 122869
2011-01-05 05:47:47 +00:00
Cameron Zwarich 4c51d122d5 Change LoopInstSimplify back to a LoopPass. It revisits subloops rather than
skipping them, but it should probably use a worklist and only revisit those
instructions in subloops that have actually changed. It should probably also
use a worklist after the first iteration like instsimplify now does. Regardless,
it's only 0.3% of opt -O2 time on 403.gcc if it replaces the instcombine placed
in the middle of the loop passes.

llvm-svn: 122868
2011-01-05 05:15:53 +00:00
Owen Anderson 7b25ff04bd Don't bother value numbering instructions with void types in GVN. In theory this should allow us to insert
fewer things into the value numbering maps, but any speedup is beneath the noise threshold on my machine
on 403.gcc.

llvm-svn: 122844
2011-01-04 22:15:21 +00:00
Owen Anderson e39cb57b09 Complete the NumberTable --> LeaderTable rename.
llvm-svn: 122828
2011-01-04 19:29:46 +00:00
Owen Anderson d7d06d3aaf Fix typo in a comment.
llvm-svn: 122827
2011-01-04 19:25:18 +00:00
Owen Anderson 51489b3b28 Prune #include's.
llvm-svn: 122826
2011-01-04 19:24:57 +00:00
Owen Anderson c7c3bc63f7 Clarify terminology, settling on referring to what was the "number table" as the "leader table", and
rename methods to make it much more clear what they're doing.

llvm-svn: 122823
2011-01-04 19:13:25 +00:00
Owen Anderson 83546f2fe0 When removing a value from GVN's leaders list, don't drop the Next pointer in a corner case.
llvm-svn: 122822
2011-01-04 19:10:54 +00:00
Owen Anderson 41a1550ef5 Branch instructions don't produce values, so there's no need to generate a value number for them. This
avoids adding them to the various value numbering tables, resulting in a minor (~3%) speedup for GVN
on 403.gcc.

llvm-svn: 122819
2011-01-04 18:54:18 +00:00
Owen Anderson 22c53e277a Remove commented out code.
llvm-svn: 122817
2011-01-04 18:22:08 +00:00
Cameron Zwarich b2a41e9388 Switch to the new style of asterisk placement.
llvm-svn: 122815
2011-01-04 18:19:19 +00:00
Chris Lattner 8643810ede Teach loop-idiom to turn a loop containing a memset into a larger memset
when safe.

The testcase is basically this nested loop:
void foo(char *X) {
  for (int i = 0; i != 100; ++i) 
    for (int j = 0; j != 100; ++j)
      X[j+i*100] = 0;
}

which gets turned into a single memset now.  clang -O3 doesn't optimize
this yet though due to a phase ordering issue I haven't analyzed yet.

llvm-svn: 122806
2011-01-04 07:46:33 +00:00
Chris Lattner a62b01dc37 restructure this a bit. Initialize the WeakVH with "I", the
instruction *after* the store.  The store will always be deleted
if the transformation kicks in, so we'd do an N^2 scan of every
loop block.  Whoops.

llvm-svn: 122805
2011-01-04 07:27:30 +00:00
Cameron Zwarich f4e13699e7 Avoid finding loop back edges when we are not splitting critical edges in
CodeGenPrepare (which is the default behavior).

llvm-svn: 122801
2011-01-04 04:43:31 +00:00
Cameron Zwarich e924969380 Address most of Duncan's review comments. Also, make LoopInstSimplify a simple
FunctionPass. It probably doesn't have a reason to be a LoopPass, as it will
probably drop the simple fixed point and either use RPO iteration or Duncan's
approach in instsimplify of only revisiting instructions that have changed.

The next step is to preserve LoopSimplify. This looks like it won't be too hard,
although the pass manager doesn't actually seem to respect when non-loop passes
claim to preserve LCSSA or LoopSimplify. This will have to be fixed.

llvm-svn: 122791
2011-01-04 00:12:46 +00:00
Chris Lattner 0ba473c218 use the very-handy getTruncateOrZeroExtend helper function, and
stop setting NSW: signed overflow is possible.  Thanks to Dan
for pointing these out.

llvm-svn: 122790
2011-01-04 00:06:55 +00:00
Owen Anderson 0839d3930a Fix comment.
llvm-svn: 122788
2011-01-03 23:51:56 +00:00
Owen Anderson d62d37225a Use the new addEscapingValue callback to update GlobalsModRef when GVN adds PHIs of GEPs. For the moment,
have GlobalsModRef handle this conservatively by simply removing the value from its maps.

llvm-svn: 122787
2011-01-03 23:51:43 +00:00
Chris Lattner bde6ec1db6 Duncan deftly points out that readnone functions aren't
invalidated by stores, so they can be handled as 'simple'
operations.

llvm-svn: 122785
2011-01-03 23:38:13 +00:00
Owen Anderson 3a33d0cc4a Simplify GVN's value expression structure, allowing the elimination of a lot of
almost-but-not-quite-identical code.  No intended functionality change.

llvm-svn: 122760
2011-01-03 19:00:11 +00:00
Chris Lattner 16ca19ffc5 strength reduce my previous patch a bit. The only instructions
that are allowed to have metadata operands are intrinsic calls,
and the only ones that take metadata currently return void.
Just reject all void instructions, which should not be value
numbered anyway.  To future proof things, add an assert to the
getHashValue impl for calls to check that metadata operands 
aren't present.

llvm-svn: 122759
2011-01-03 18:43:03 +00:00
Chris Lattner 142f1cd251 fix PR8895: metadata operands don't have a strong use of their
nested values, so they can change and drop to null, which can
change the hash and cause havoc.

It turns out that it isn't a good idea to value number stuff
with metadata operands anyway, so... don't.

llvm-svn: 122758
2011-01-03 18:28:15 +00:00
Cameron Zwarich 43cecb1200 Switch a worklist in CodeGenPrepare to SmallVector and increase the inline
capacity on the Visited SmallPtrSet. On 403.gcc, this is about a 4.5% speedup of
CodeGenPrepare time (which itself is 10% of time spent in the backend).

This is progress towards PR8889.

llvm-svn: 122741
2011-01-03 06:33:01 +00:00
Chris Lattner 9e5e9ed79a earlycse can do trivial within-a-block dead store
elimination as well.  This deletes 60 stores in 176.gcc
that largely come from bitfield code.

llvm-svn: 122736
2011-01-03 04:17:24 +00:00
Chris Lattner 4b9a525742 switch the load table to use a recycling bump pointer allocator,
speeding earlycse up by 6%.

llvm-svn: 122733
2011-01-03 03:53:50 +00:00
Chris Lattner e0e32a9ef0 now that loads are in their own table, we can implement
store->load forwarding.  This allows EarlyCSE to zap 600 more
loads from 176.gcc.

llvm-svn: 122732
2011-01-03 03:46:34 +00:00
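
A minimal sketch of the store-to-load forwarding described above; the function name is invented:

/* The reload of *p sees the value just stored, so EarlyCSE can return v
 * directly and delete the second memory access. */
int forward(int *p, int v) {
  *p = v;
  return *p;   /* forwarded from the store above */
}
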
Chris Lattner 92bb0f9f9d split loads and calls into separate tables. Loads are now just indexed
by their pointer instead of using MemoryValue to wrap it.

llvm-svn: 122731
2011-01-03 03:41:27 +00:00
Chris Lattner 4cb365414f various cleanups, no functionality change.
llvm-svn: 122729
2011-01-03 03:28:23 +00:00
Chris Lattner b9a8efc960 Teach EarlyCSE to do trivial CSE of loads and read-only calls.
On 176.gcc, this catches 13090 loads and calls, and increases the
number of simple instructions CSE'd from 29658 to 36208.

llvm-svn: 122727
2011-01-03 03:18:43 +00:00
Chris Lattner 79d83067ee rename InstValue to SimpleValue, add some comments.
llvm-svn: 122725
2011-01-03 02:20:48 +00:00
Michael J. Spencer edb5bcdde5 CMake: Add missing source file.
llvm-svn: 122724
2011-01-03 02:13:05 +00:00
Chris Lattner d815f69b30 Allocate nodes for the scoped hash table from a recycling bump pointer
allocator.  This speeds up early cse by about 20%.

llvm-svn: 122723
2011-01-03 01:42:46 +00:00
Chris Lattner 02a9776b64 reduce redundancy in the hashing code and other misc cleanups.
llvm-svn: 122720
2011-01-03 01:10:08 +00:00
Cameron Zwarich cab9a0abab Add a new loop-instsimplify pass, with the intention of replacing the instance
of instcombine that is currently in the middle of the loop pass pipeline. This
commit only checks in the pass; it will hopefully be enabled by default later.

llvm-svn: 122719
2011-01-03 00:25:16 +00:00
Chris Lattner 0844c76f9a fix some pastos
llvm-svn: 122718
2011-01-02 23:29:58 +00:00
Chris Lattner 8fac5db251 add DEBUG and -stats output to earlycse.
Teach it to CSE the rest of the non-side-effecting instructions.

llvm-svn: 122716
2011-01-02 23:19:45 +00:00
Chris Lattner 18ae5436b1 Enhance earlycse to do CSE of casts, instsimplify and die.
Add a testcase.

llvm-svn: 122715
2011-01-02 23:04:14 +00:00
Chris Lattner bf0aa927cc split dom frontier handling stuff out to its own DominanceFrontier header,
so that Dominators.h is *just* domtree.  Also prune #includes a bit.

llvm-svn: 122714
2011-01-02 22:09:33 +00:00
Chris Lattner 704541bb23 sketch out a new early cse pass. No functionality yet.
llvm-svn: 122713
2011-01-02 21:47:05 +00:00
Chris Lattner 9c69406f2b fix a miscompilation of tramp3d-v4: when forming a memcpy, we have to make
sure that the loop we're promoting into a memcpy doesn't mutate the input
of the memcpy.  Before we were just checking that the dest of the memcpy
wasn't mod/ref'd by the loop.

llvm-svn: 122712
2011-01-02 21:14:18 +00:00
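
A hedged C example of the unsafe case the new check rejects; the names are illustrative:

/* The loop stores into the region it copies from, so every iteration after
 * the first reads a value the loop itself wrote.  Replacing the loop with
 * memcpy(dst, src, n) would copy the original bytes instead and change the
 * result. */
void copy_with_feedback(char *dst, char *src, unsigned n) {
  for (unsigned i = 0; i + 1 < n; ++i) {
    dst[i] = src[i];   /* read of the would-be memcpy source  */
    src[i + 1] = 0;    /* write into the memcpy source region */
  }
}
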
Chris Lattner 5702a43c09 If a loop iterates exactly once (has backedge count = 0) then don't
mess with it.  We'd rather peel/unroll it than convert all of its 
stores into memsets.

llvm-svn: 122711
2011-01-02 20:24:21 +00:00
Chris Lattner 8455b6e45e enhance loop idiom recognition to scan *all* unconditionally executed
blocks in a loop, instead of just the header block.  This makes it more
aggressive, able to handle Duncan's Ada examples.

llvm-svn: 122704
2011-01-02 19:01:03 +00:00
Chris Lattner 0cdc6f62a5 make inSubLoop much more efficient.
llvm-svn: 122703
2011-01-02 18:53:08 +00:00
Chris Lattner 27497ece96 rip out isExitBlockDominatedByBlockInLoop, calling DomTree::dominates instead.
isExitBlockDominatedByBlockInLoop is a relic of the days when domtree was 
*just* a tree and didn't have DFS numbers.  Checking DFS numbers is faster
and easier than "limiting the search of the tree".

llvm-svn: 122702
2011-01-02 18:45:39 +00:00
Chris Lattner 0469e01c02 add a list of opportunities for future improvement.
llvm-svn: 122701
2011-01-02 18:32:09 +00:00
Chris Lattner ddf58010bd Allow loop-idiom to run on multiple BB loops, but still only scan the loop
header for now for memset/memcpy opportunities.  It turns out that loop-rotate
is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for 
loops" into 2 basic block loops that loop-idiom was ignoring.

With this fix, we form many *many* more memcpy and memsets than before, including
on the "history" loops in the viterbi benchmark, which look like this:

        for (j=0; j<MAX_history; ++j) {
          history_new[i][j+1] = history[2*i][j];
        }

Transforming these loops into memcpy's speeds up the viterbi benchmark from
11.98s to 3.55s on my machine.  Woo.

llvm-svn: 122685
2011-01-02 07:58:36 +00:00
Chris Lattner 5b5a043d82 remove debugging code.
llvm-svn: 122683
2011-01-02 07:37:13 +00:00
Chris Lattner 12f91befce add some -stats output.
llvm-svn: 122682
2011-01-02 07:36:44 +00:00
Chris Lattner 679572e584 improve loop rotation to use CodeMetrics to analyze the
size of a loop header instead of its own code size estimator.
This allows it to handle bitcasts etc more precisely.

llvm-svn: 122681
2011-01-02 07:35:53 +00:00
Chris Lattner 85b6d81d41 teach loop idiom recognition to form memcpy's from simple loops.
llvm-svn: 122678
2011-01-02 03:37:56 +00:00
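
A minimal example of the kind of copy loop this recognizes; the names are invented:

/* A straightforward element-by-element copy between non-overlapping
 * buffers; the pass can replace the whole loop with memcpy(dst, src, n). */
void copy_bytes(char *dst, const char *src, unsigned n) {
  for (unsigned i = 0; i != n; ++i)
    dst[i] = src[i];
}
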
Chris Lattner a3514441e0 add a validity check that was missed, fixing a crash on the
new testcase.

llvm-svn: 122662
2011-01-01 20:12:04 +00:00
Chris Lattner 91a4435875 improve validity check to handle constant-trip-count loops more
aggressively.  In practice, this doesn't help anything though,
see the todo.

llvm-svn: 122660
2011-01-01 19:54:22 +00:00
Chris Lattner 8b3baf6d75 implement the "no aliasing accesses in loop" safety check. This pass
should be correct now.

llvm-svn: 122659
2011-01-01 19:39:01 +00:00
Chris Lattner 65a699d4d0 simplify this, isBytewiseValue handles the extra check. We still
check for "multiple of a byte" in size to make it clear that the
>> 3 below is safe.

llvm-svn: 122604
2010-12-28 18:53:48 +00:00
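
A hedged sketch of the property isBytewiseValue establishes, written as an invented helper for a 32-bit value:

#include <stdbool.h>
#include <stdint.h>

/* True when every byte of v is identical -- the condition under which a
 * store of v can be modelled as a memset with that byte.  The stored size
 * must be a whole number of bytes so that size-in-bits >> 3 is exact. */
static bool is_bytewise_u32(uint32_t v) {
  uint8_t b = (uint8_t)v;
  return v == (uint32_t)b * 0x01010101u;
}
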
Duncan Sands 5cf10e691b Silence gcc warning about an unused variable when doing a release build.
llvm-svn: 122593
2010-12-28 09:41:15 +00:00
Chris Lattner cb18bfa3d2 fix some issues Frits noticed, add AliasAnalysis as a dependency
llvm-svn: 122585
2010-12-27 18:39:08 +00:00
Benjamin Kramer 7cba269dfb SimplifyLibCalls: Use IRBuilder to simplify code.
llvm-svn: 122575
2010-12-27 00:16:46 +00:00
Chris Lattner b9fe685b9a have loop-idiom nuke instructions that feed stores that get removed.
llvm-svn: 122574
2010-12-27 00:03:23 +00:00
Chris Lattner 29e14edc8d implement enough of the memset inference algorithm to recognize and insert
memsets.  This is still missing one important validity check, but this is enough
to compile stuff like this:

void test0(std::vector<char> &X) {
  for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I)
    *I = 0;
}

void test1(std::vector<int> &X) {
  for (long i = 0, e = X.size(); i != e; ++i)
    X[i] = 0x01010101;
}

With:
 $ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc 

to:

__Z5test0RSt6vectorIcSaIcEE:            ## @_Z5test0RSt6vectorIcSaIcEE
## BB#0:                                ## %entry
	subq	$8, %rsp
	movq	(%rdi), %rax
	movq	8(%rdi), %rsi
	cmpq	%rsi, %rax
	je	LBB0_2
## BB#1:                                ## %bb.nph
	subq	%rax, %rsi
	movq	%rax, %rdi
	callq	___bzero
LBB0_2:                                 ## %for.end
	addq	$8, %rsp
	ret
...
__Z5test1RSt6vectorIiSaIiEE:            ## @_Z5test1RSt6vectorIiSaIiEE
## BB#0:                                ## %entry
	subq	$8, %rsp
	movq	(%rdi), %rax
	movq	8(%rdi), %rdx
	subq	%rax, %rdx
	cmpq	$4, %rdx
	jb	LBB1_2
## BB#1:                                ## %for.body.preheader
	andq	$-4, %rdx
	movl	$1, %esi
	movq	%rax, %rdi
	callq	_memset
LBB1_2:                                 ## %for.end
	addq	$8, %rsp
	ret

llvm-svn: 122573
2010-12-26 23:42:51 +00:00
Chris Lattner 6cf8d6cc6e start using irbuilder to make mem intrinsics in a few passes.
llvm-svn: 122572
2010-12-26 22:57:41 +00:00
Chris Lattner 7c5f9c35d1 sketch more of this out.
llvm-svn: 122567
2010-12-26 20:45:45 +00:00
Chris Lattner 9cb1035f94 move isBytewiseValue out to ValueTracking.h/cpp
llvm-svn: 122565
2010-12-26 20:15:01 +00:00
Chris Lattner 81ae3f299a actually add the file...
llvm-svn: 122563
2010-12-26 19:39:38 +00:00
Chris Lattner 2ef535a4e4 Start of a pass for recognizing memset and memcpy idioms.
No functionality yet.

llvm-svn: 122562
2010-12-26 19:32:44 +00:00
Benjamin Kramer 30342fb1fd Simplify code.
llvm-svn: 122561
2010-12-26 15:23:45 +00:00
Benjamin Kramer b90b2f0635 Fix a thinko pointed out by Frits van Bommel: looking through global variables in isBytewiseValue is not safe.
llvm-svn: 122550
2010-12-24 22:23:59 +00:00
Benjamin Kramer ea9152e551 MemCpyOpt: Turn memcpys from a constant into a memset if possible.
This allows us to compile "int cst[] = {-1, -1, -1};" into
  movl  $-1, 16(%rsp)
  movq  $-1, 8(%rsp)
instead of
  movl  _cst+8(%rip), %eax
  movl  %eax, 16(%rsp)
  movq  _cst(%rip), %rax
  movq  %rax, 8(%rsp)

llvm-svn: 122548
2010-12-24 21:17:12 +00:00
Owen Anderson 5d690d4168 It is possible for SimplifyCFG to cause PHI nodes to become redundant too late in the optimization
pipeline to be caught by instcombine, and it's not feasible to catch them in SimplifyCFG because the
use-lists are in an inconsistent state at the point where it could know that it need to simplify them.
Instead, have CodeGenPrepare look for trivially redundant PHIs as part of its general cleanup effort.

llvm-svn: 122516
2010-12-23 20:57:35 +00:00
Mon P Wang 18b762a946 Preserve the address space when generating bitcasts for MemTransferInst in ConvertToScalarInfo
llvm-svn: 122462
2010-12-23 01:41:32 +00:00
Jeffrey Yasskin 9b43f33620 Change all self assignments X=X to (void)X, so that we can turn on a
new gcc warning that complains on self-assignments and
self-initializations.

llvm-svn: 122458
2010-12-23 00:58:24 +00:00
Owen Anderson 5ab8d4b5e5 Give GVN back the ability to perform simple conditional propagation on conditional branch values.
I still think that LVI should be handling this, but that capability is some ways off in the future,
and this matters for some significant benchmarks.

llvm-svn: 122378
2010-12-21 23:54:34 +00:00
Owen Anderson 12470778d7 Remove dead code.
llvm-svn: 122371
2010-12-21 22:31:24 +00:00
Benjamin Kramer 43493c089f GVN's Expression is not POD-like (it contains a SmallVector). Simplify code while at it.
llvm-svn: 122362
2010-12-21 21:30:19 +00:00
Chris Lattner b6252a376a tidy up
llvm-svn: 122190
2010-12-19 20:24:28 +00:00
Chris Lattner 408a684d29 Enhance LICM to promote alias sets whose pointers themselves are stored,
which doesn't affect the memory address being promoted.

llvm-svn: 122172
2010-12-19 05:57:25 +00:00
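
A hedged C illustration of the distinction the commit above draws; the global and function names are invented:

int *saved;

/* The pointer value p is itself stored (into `saved`), but that does not
 * modify the memory p points at, so *p can still be promoted to a register
 * for the duration of the loop. */
void accumulate(int *p, int n) {
  for (int i = 0; i < n; ++i) {
    saved = p;   /* store of the pointer itself */
    *p += i;     /* promotable access through p */
  }
}
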
Chris Lattner 3337a81450 fix PR8602, a bug in an assertion: a volatile store *of* a pointer
does not make the alias set for that pointer volatile, just stores
*to* the pointer.

llvm-svn: 122171
2010-12-19 05:51:54 +00:00
Chris Lattner fb888622c3 revert r122164, I'm going to go with a different approach.
llvm-svn: 122168
2010-12-19 04:23:03 +00:00
Chris Lattner 583ec6fa44 first step to fixing PR8642: don't fold away empty basic blocks
which have trapping constant exprs in them due to PHI nodes.
Eliminating them can cause the constant expr to be evaluated
on new paths if the input edges are critical.

llvm-svn: 122164
2010-12-19 03:02:34 +00:00
Dan Gohman 93dc2b808f Revert r64460. strtol and friends cannot be marked readonly, even with
a null endptr argument, because they may write to errno.

This fixes a selfhost miscompile observed on Linux targets when TBAA
was enabled.

llvm-svn: 122014
2010-12-17 01:09:43 +00:00
Frits van Bommel 9bbe849fc3 Fix a bug in the loop in JumpThreading::ProcessThreadableEdges() where it could falsely produce a MultipleDestSentinel value if the first predecessor ended with an 'indirectbr'. If that happened, it caused an unnecessary FindMostPopularDest() call.
This wasn't a correctness problem, but it broke the fast path for single-predecessor blocks.

llvm-svn: 121966
2010-12-16 12:16:00 +00:00
Dan Gohman e1a17a3473 Make memcpyopt TBAA-aware.
llvm-svn: 121944
2010-12-16 02:51:19 +00:00
Dan Gohman 4467aa5294 Preserve TBAA tags when doing load PRE.
llvm-svn: 121921
2010-12-15 23:53:55 +00:00
Dan Gohman a4fcd2418d Move Value::getUnderlyingObject to be a standalone
function so that it can live in Analysis instead of
VMCore.

llvm-svn: 121885
2010-12-15 20:02:24 +00:00
Frits van Bommel 3d1803495e Teach jump threading to "look through" a select when the branch direction of a terminator depends on it.
When it sees a promising select it now tries to figure out whether the condition of the select is known in any of the predecessors and if so it maps the operands appropriately.

llvm-svn: 121859
2010-12-15 09:51:20 +00:00
Owen Anderson 35609d97ae Fix PR8790, another instance where unreachable code can cause instruction simplification to fail,
this case involves a select that simplifies to itself.

llvm-svn: 121817
2010-12-15 00:55:35 +00:00
Owen Anderson 15c85c916f Cleanup trailing whitespace.
llvm-svn: 121816
2010-12-15 00:52:44 +00:00
Chris Lattner 73a58627c3 simplify code and reduce indentation
llvm-svn: 121670
2010-12-13 02:38:13 +00:00
Chris Lattner bc4457e317 enhance memcpyopt to zap memcpy's that have the same src/dst.
llvm-svn: 121362
2010-12-09 07:45:45 +00:00
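
A minimal illustration of the no-op the commit above removes (C treats overlapping memcpy as undefined; at the IR level the call is simply deleted):

#include <string.h>

/* Source and destination are identical, so the copy changes nothing and
 * memcpyopt can delete it. */
void zap_self_copy(char *p, size_t n) {
  memcpy(p, p, n);
}
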
Chris Lattner fd51c52ef6 fix PR8753, eliminating a case where we'd infinitely make a
substitution because it doesn't actually change the IR.  Patch by
Jakub Staszak!

llvm-svn: 121361
2010-12-09 07:39:50 +00:00
Frits van Bommel d2f4b09e10 Remove some dead code from the jump threading pass.
The last uses of these functions were removed in r113852 when LazyValueInfo was permanently enabled and removed the need for them.

llvm-svn: 121133
2010-12-07 13:08:07 +00:00
Jay Foad 583abbc4df PR5207: Change APInt methods trunc(), sext(), zext(), sextOrTrunc() and
zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method
trunc(), to be const and to return a new value instead of modifying the
object in place.

llvm-svn: 121120
2010-12-07 08:25:19 +00:00
Frits van Bommel d9df6eaa9c Implement jump threading of 'indirectbr' by keeping track of whether we're looking for ConstantInt*s or BlockAddress*s.
llvm-svn: 121066
2010-12-06 23:36:56 +00:00
Chris Lattner 4dc53e37d9 Use a stronger predicate here, pointed out by Duncan
llvm-svn: 121040
2010-12-06 21:48:10 +00:00
Chris Lattner ca335e38cf add some DEBUG statements.
llvm-svn: 121038
2010-12-06 21:13:51 +00:00
Chris Lattner 94fbdf3814 Fix PR8728, a miscompilation I recently introduced. When optimizing
memcpy's like:
  memcpy(A, B)
  memcpy(A, C)

we cannot delete the first memcpy as dead if A and C might be aliases.
If so, we actually get:

  memcpy(A, B)
  memcpy(A, A)

which is not correct to transform into:

  memcpy(A, A)

This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks
Jakub!

llvm-svn: 120974
2010-12-06 01:48:06 +00:00
Frits van Bommel 76244867cf Refactor jump threading.
Should have no functional change other than the order of two transformations that are mutually-exclusive and the exact formatting of debug output.
Internally, it now stores the ConstantInt*s as Constant*s, and actual undef values instead of nulls.

llvm-svn: 120946
2010-12-05 19:06:41 +00:00
Frits van Bommel 5e75ef4a8e Remove trailing whitespace.
llvm-svn: 120945
2010-12-05 19:02:47 +00:00
Chris Lattner 1c577b54b0 fix a bozo bug I introduced in r119930, causing a miscompile of
20040709-1.c from the gcc testsuite.  I was using the size of a
pointer instead of the pointee.  This fixes rdar://8713376

llvm-svn: 120519
2010-12-01 01:24:55 +00:00
Chris Lattner 903add84d9 Enhance DSE to handle the variable index case in PR8657.
llvm-svn: 120498
2010-11-30 23:43:23 +00:00
Chris Lattner c0f3379ae0 teach DSE to use GetPointerBaseWithConstantOffset to analyze
may-aliasing stores that partially overlap with different base
pointers.  This implements PR6043 and the non-variable part of
PR8657

llvm-svn: 120485
2010-11-30 23:05:20 +00:00
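
A hedged sketch of the overlap case this enables; the offsets and sizes are made up:

#include <string.h>

/* Both writes are constant offsets from the same base pointer.  The second
 * write covers bytes 0..3, which includes bytes 2..3 written first, so DSE
 * can delete the earlier, fully-overwritten store. */
void overwrite(char *p) {
  memset(p + 2, 0, 2);
  memset(p, 0xFF, 4);
}
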
Chris Lattner e28618de59 move GetPointerBaseWithConstantOffset out of GVN into ValueTracking.h
llvm-svn: 120476
2010-11-30 22:25:26 +00:00
Chris Lattner 50162e3c2a remove a fixed fixme
llvm-svn: 120474
2010-11-30 22:18:11 +00:00
Chris Lattner 6712251f41 Make DeleteDeadInstruction be a static function, move some code around.
llvm-svn: 120471
2010-11-30 21:58:14 +00:00
Chris Lattner 51d67ce2ff switch RemoveAccessedObjects to use AliasAnalysis::Location to simplify
the code.  We now get accurate sizes on Loads, though it surely doesn't
matter in practice.

llvm-svn: 120469
2010-11-30 21:47:58 +00:00
Chris Lattner f80b39986f two improvements to RemoveAccessedObjects:
1. if the underlying pointer passed in can be resolved
   to any argument or alloca, then we don't need to scan.
   Previously we would only avoid the scan if the alloca
   or byval was actually considered dead.
2. The dead store processing code is itself completely
   dead and didn't handle volatile stores right anyway,
   so delete it.  This allows simplifying the interface
   to RemoveAccessedObjects.

llvm-svn: 120467
2010-11-30 21:38:30 +00:00
Chris Lattner 7fe08b67fa remove the "undead" terminology, which is nonstandard and never
made sense to me.  We now have a set of dead stack objects, and
they become live when loaded.  Fix a theoretical problem where
we'd pass in the wrong pointer to the alias query.

llvm-svn: 120465
2010-11-30 21:32:12 +00:00
Chris Lattner 127818d746 move call handling in handleEndBlock up a bit, and simplify it.
If the call might read all the allocas, stop scanning early.
Convert a vector to smallvector, shrink SmallPtrSet to 16 instead
of 64 to avoid crazy linear scans.

llvm-svn: 120463
2010-11-30 21:18:46 +00:00
Dale Johannesen d3a58c8fa1 Avoid exponential growth of a table. It feels like
there should be a better way to do this.  PR 8679.

llvm-svn: 120457
2010-11-30 20:23:21 +00:00
Chris Lattner 60a8b3dab8 various cleanups and code simplification
llvm-svn: 120454
2010-11-30 19:48:15 +00:00
Chris Lattner 51c28a93cc make getPointerSize a static function. Add ivars to DSE for
AA and MD pass info instead of using getAnalysis<> all over.

llvm-svn: 120453
2010-11-30 19:34:42 +00:00
Chris Lattner 77d79fa25f reduce indentation, clean up TD use a bit.
llvm-svn: 120452
2010-11-30 19:28:23 +00:00
Chris Lattner b63ba73b1b enhance isRemovable to refuse to delete volatile mem transfers
now that DSE hacks on them.  This fixes a regression I introduced,
by generalizing DSE to hack on transfers.

llvm-svn: 120445
2010-11-30 19:12:10 +00:00
Chris Lattner 58b779e9c2 Rewrite the main DSE loop to be written in terms of reasoning
about pairs of AA::Location's instead of looking for MemDep's
"Def" predicate.  This is more powerful and general, handling
memset/memcpy/store all uniformly, and implementing PR8701 and
probably obsoleting parts of memcpyoptimizer.

This also fixes an obscure bug with init.trampoline and i8
stores, but I'm not surprised it hasn't been hit yet.  Enhancing
init.trampoline to carry the size that it stores would allow
DSE to be much more aggressive about optimizing them.

llvm-svn: 120406
2010-11-30 07:23:21 +00:00
Anders Carlsson e3ea1cba79 Add a puts optimization that converts puts() to putchar('\n').
llvm-svn: 120398
2010-11-30 06:19:18 +00:00
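
A hedged example: puts of an empty string is the case where this rewrite applies.

#include <stdio.h>

/* puts("") writes only the trailing newline, so it can be rewritten as the
 * cheaper putchar('\n'). */
void print_newline(void) {
  puts("");
}
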
Chris Lattner 3590ef817c rename a function and reduce some indentation, no functionality change.
llvm-svn: 120391
2010-11-30 05:30:45 +00:00
Chris Lattner 2227a8a192 rename doesClobberMemory -> hasMemoryWrite to be more specific, and
remove an actively-wrong comment.

llvm-svn: 120378
2010-11-30 01:37:52 +00:00
Chris Lattner 9d179d911d clean up handling of 'free', detangling it from everything else.
It can be seriously improved, but at least now it isn't intertwined
with the other logic.

llvm-svn: 120377
2010-11-30 01:28:33 +00:00
Chris Lattner 9a146372b5 Teach basicaa that memset's modref set is at worst "mod" and never
contains "ref".

Enhance DSE to use a modref query instead of a store-specific hack
to generalize the "ignore may-alias stores" optimization to handle
memset and memcpy.

llvm-svn: 120368
2010-11-30 00:28:45 +00:00
Chris Lattner c3c754f750 my previous patch would cause us to start deleting some volatile
stores, fix and add a testcase.

llvm-svn: 120363
2010-11-30 00:12:39 +00:00
Chris Lattner d4f1090948 two changes to DSE that shouldn't affect anything:
1. Don't bother trying to optimize:

lifetime.end(ptr)
store(ptr)

as it is undefined, and therefore shouldn't exist.

2. Move the 'storing a loaded pointer' xform up, simplifying
  the may-aliased store code.

llvm-svn: 120359
2010-11-30 00:01:19 +00:00
Chris Lattner b4df1d5a3e prune an llvmcontext include and simplify some code.
llvm-svn: 120347
2010-11-29 23:35:33 +00:00
Chris Lattner 2e8793482c fix PR8677, patch by Jakub Staszak!
llvm-svn: 120325
2010-11-29 21:59:31 +00:00
Owen Anderson 8ba5f39f70 Second attempt at fixing the performance regressions introduced
by my recent GVN improvement.  Instead of looking through a single layer of
PHI nodes when attempting to sink GEPs, we need to iteratively
look through arbitrary PHI nests.

llvm-svn: 120202
2010-11-27 08:15:55 +00:00
Nick Lewycky b8de00ee07 Treat a call of function pointer like a load of the pointer when considering
whether the pointer can be replaced with the global variable it is a copy of.
Fixes PR8680.

llvm-svn: 120126
2010-11-24 22:04:20 +00:00
Duncan Sands 433c1679cf Replace calls to ConstantFoldInstruction with calls to SimplifyInstruction
in two places that are really interested in simplified instructions, not
constants.

llvm-svn: 120044
2010-11-23 20:26:33 +00:00
Duncan Sands bb2cd025a9 Constant folding here is pointless, because InstructionSimplify
(which does constant folding and more) is called a few lines
later.

llvm-svn: 120042
2010-11-23 20:24:21 +00:00
Chris Lattner fc9aead6fd fix comment
llvm-svn: 119948
2010-11-21 19:05:34 +00:00
Chris Lattner 5957229659 rework some DSE paths to use the newly-public "getPointerDependencyFrom"
method in MemDep instead of inserting an instruction, doing a query,
then removing it.  Neither operation is effectively cached.

llvm-svn: 119930
2010-11-21 08:06:10 +00:00
Chris Lattner e48c31ce33 implement PR8576, deleting dead stores with intervening may-alias stores.
llvm-svn: 119927
2010-11-21 07:34:32 +00:00
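
A minimal C sketch of the PR8576 situation; the names are invented:

/* The first store to *p is dead: whether or not q aliases p, its value is
 * never read before the final store to *p overwrites it. */
void overwrite_twice(int *p, int *q) {
  *p = 1;   /* dead even though the next store may alias it */
  *q = 2;
  *p = 3;
}
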
Chris Lattner 58f9f58716 Implement PR8644: forwarding a memcpy value to a byval,
allowing the memcpy to be eliminated.

Unfortunately, the requirements on byval's without explicit 
alignment are really weak and impossible to predict in the 
mid-level optimizer, so this doesn't kick in much with current
frontends.  The fix is to change clang to set alignment on all
byval arguments.

llvm-svn: 119916
2010-11-21 00:28:59 +00:00
Benjamin Kramer ddd1b7b801 Simplify code. No change in functionality.
llvm-svn: 119908
2010-11-20 18:43:35 +00:00
Owen Anderson ea326db47b Document the new GVN number table structure.
llvm-svn: 119865
2010-11-19 22:48:40 +00:00
Owen Anderson dfb8c3bbfc When folding addressing modes in CodeGenPrepare, attempt to look through PHI nodes
if all the operands of the PHI are equivalent.  This allows CodeGenPrepare to undo
unprofitable PRE transforms.

llvm-svn: 119853
2010-11-19 22:15:03 +00:00
Duncan Sands aef146b890 Factor code for testing whether replacing one value with another
preserves LCSSA form out of ScalarEvolution and into the LoopInfo
class.  Use it to check that SimplifyInstruction simplifications
are not breaking LCSSA form.  Fixes PR8622.

llvm-svn: 119727
2010-11-18 19:59:41 +00:00
Owen Anderson c21c100f3d Completely rework the datastructure GVN uses to represent the value number to leader mapping. Previously,
this was a tree of hashtables, and a query recursed into the table for the immediate dominator ad infinitum
if the initial lookup failed.  This led to really bad performance on tall, narrow CFGs.

We can instead replace it with what is conceptually a multimap of value numbers to leaders (actually
represented by a hashtable with a list of Value*'s as the value type), and then
determine which leader from that set to use very cheaply thanks to the DFS numberings maintained by
DominatorTree.  Because there are typically few duplicates of a given value, this scan tends to be
quite fast.  Additionally, we use a custom linked list and BumpPtr allocation to avoid any unnecessary
allocation in representing the value-side of the multimap.

This change brings with it a 15% (!) improvement in the total running time of GVN on 403.gcc, which I
think is pretty good considering that includes all the "real work" being done by MemDep as well.

The one downside to this approach is that we can no longer use GVN to perform simple conditional propagation,
but that seems like an acceptable loss since we now have LVI and CorrelatedValuePropagation to pick up
the slack.  If you see conditional propagation that's not happening, please file bugs against LVI or CVP.

llvm-svn: 119714
2010-11-18 18:32:40 +00:00
Chris Lattner 1385dff8c0 slightly simplify code and substantially improve comment. Instead of
saying "it would be bad", give an example of what is going on.

llvm-svn: 119695
2010-11-18 08:07:09 +00:00
Chris Lattner 731caac7c6 remove a pointless restriction from memcpyopt. It was
refusing to optimize two memcpy's like this:

copy A <- B
copy C <- A

if it couldn't prove that noalias(B,C).  We can eliminate
the copy by producing a memmove instead of memcpy.

llvm-svn: 119694
2010-11-18 08:00:57 +00:00
Chris Lattner c274a83442 remove another pointless noalias check: M is a memcpy, so the
source and dest are known to not overlap.

llvm-svn: 119692
2010-11-18 07:39:57 +00:00
Chris Lattner 75cfe98534 use AA::isNoAlias instead of open coding it. Remove an extraneous noalias check:
there is no need to check to see if the source and dest of a memcpy are noalias,
behavior is undefined if not.

llvm-svn: 119691
2010-11-18 07:38:43 +00:00
Chris Lattner 1e37bbafbb finish a thought.
llvm-svn: 119690
2010-11-18 07:32:33 +00:00
Chris Lattner 7e9b2ea3bf rearrange some code, splitting memcpy/memcpy optimization
out of processMemCpy into its own function.

llvm-svn: 119687
2010-11-18 07:02:37 +00:00
Chris Lattner ac5701319b allow eliminating an alloca that is just copied from an constant global
if it is passed as a byval argument.  The byval argument will just be a
read, so it is safe to read from the original global instead.  This allows
us to promote away the %agg.tmp alloca in PR8582

llvm-svn: 119686
2010-11-18 06:41:51 +00:00
Chris Lattner f183d5c4be enhance the "alloca is just a memcpy from constant global"
to ignore calls that obviously can't modify the alloca
because they are readonly/readnone.

llvm-svn: 119683
2010-11-18 06:26:49 +00:00
Chris Lattner 7aeae25c78 fix a small oversight in the "eliminate memcpy from constant global"
optimization.  If the alloca that is "memcpy'd from constant" also has
a memcpy from *it*, ignore it: it is a load.  We now optimize the testcase to:

define void @test2() {
  %B = alloca %T
  %a = bitcast %T* @G to i8*
  %b = bitcast %T* %B to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 124, i32 4, i1 false)
  call void @bar(i8* %b)
  ret void
}

previously we would generate:

define void @test() {
  %B = alloca %T
  %b = bitcast %T* %B to i8*
  %G.0 = getelementptr inbounds %T* @G, i32 0, i32 0
  %tmp3 = load i8* %G.0, align 4
  %G.1 = getelementptr inbounds %T* @G, i32 0, i32 1
  %G.15 = bitcast [123 x i8]* %G.1 to i8*
  %1 = bitcast [123 x i8]* %G.1 to i984*
  %srcval = load i984* %1, align 1
  %B.0 = getelementptr inbounds %T* %B, i32 0, i32 0
  store i8 %tmp3, i8* %B.0, align 4
  %B.1 = getelementptr inbounds %T* %B, i32 0, i32 1
  %B.12 = bitcast [123 x i8]* %B.1 to i8*
  %2 = bitcast [123 x i8]* %B.1 to i984*
  store i984 %srcval, i984* %2, align 1
  call void @bar(i8* %b)
  ret void
}

llvm-svn: 119682
2010-11-18 06:20:47 +00:00
Dan Gohman 20d9ce21ef Move SCEV::dominates and properlyDominates to ScalarEvolution.
llvm-svn: 119570
2010-11-17 21:41:58 +00:00
Dan Gohman afd6db9932 Move SCEV::isLoopInvariant and hasComputableLoopEvolution to be member
functions of ScalarEvolution, in preparation for memoization and
other optimizations.

llvm-svn: 119562
2010-11-17 21:23:15 +00:00
Dan Gohman 1ee6d24072 Reference ScalarEvolution by name rather than directly in LICM,
to avoid an unneeded dependence.

llvm-svn: 119557
2010-11-17 20:50:07 +00:00
Duncan Sands 72313843d5 Remove dead code in GVN: now that SimplifyInstruction is called
systematically, CollapsePhi will always return null here.  Note
that CollapsePhi did an extra check, isSafeReplacement, which
the SimplifyInstruction logic does not do.  I think that check
was bogus - I guess we will soon find out!  (It was originally
added in commit 41998 without a testcase).

llvm-svn: 119456
2010-11-17 04:05:21 +00:00
Duncan Sands 637049515f Have a few places that want to simplify phi nodes use SimplifyInstruction
rather than calling hasConstantValue.  No intended functionality change.

llvm-svn: 119352
2010-11-16 17:41:24 +00:00
Duncan Sands b99f39b9f6 If dom tree information is available, make it possible to pass
it to get better phi node simplification.

llvm-svn: 119055
2010-11-14 18:36:10 +00:00
Duncan Sands 246b71c596 Have GVN simplify instructions as it goes. For example, consider
"%z = %x and %y".  If GVN can prove that %y equals %x, then it turns
this into "%z = %x and %x".  With the new code, %z will be replaced
with %x everywhere (and then deleted).  Previously %z would be value
numbered too, which is a waste of time.  Also, while a clever value
numbering algorithm would give %z the same value number as %x, our
current one doesn't do so (at least I don't think it does).  The new
logic has an essentially equivalent effect to what you would get if
%z was given the same value number as %x, i.e. it should make value
numbering smarter.  While there, get hold of target data once at the
start rather than a gazillion times all over the place.

llvm-svn: 118923
2010-11-12 21:10:24 +00:00
Dan Gohman d4b7fff2e8 Enhance DSE to handle the case where a free call makes more than
one store dead. This is especially noticeable in
SingleSource/Benchmarks/Shootout/objinst.

llvm-svn: 118875
2010-11-12 02:19:17 +00:00
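
A hedged illustration of the pattern, with invented names:

#include <stdlib.h>

/* Both stores are dead: the block is freed immediately afterwards and the
 * stored values are never read. */
void fill_then_free(int *p) {
  p[0] = 1;
  p[1] = 2;
  free(p);
}
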
Dan Gohman 65316d6749 Add helper functions for computing the Location of load, store,
and vaarg instructions.

llvm-svn: 118845
2010-11-11 21:50:19 +00:00
Dan Gohman 0cc4c7516e Make Sink tbaa-aware.
llvm-svn: 118788
2010-11-11 16:21:47 +00:00
Dan Gohman c3b4ea7b7d It's safe to sink some instructions which are not safe to speculatively
execute. Make Sink's predicate more precise.

llvm-svn: 118787
2010-11-11 16:20:28 +00:00
Dan Gohman 0a6021a54d Enhance GVN to do more precise alias queries for non-local memory
references. For example, this allows gvn to eliminate the load in
this example:

  void foo(int n, int* p, int *q) {
    p[0] = 0;
    p[1] = 1;
    if (n) {
      *q = p[0];
    }
  }

llvm-svn: 118714
2010-11-10 20:37:15 +00:00
Dan Gohman d209911642 Use getValueOperand() and getPointerOperand() on load and store
instructions instead of hard-coding operand numbers.

llvm-svn: 118698
2010-11-10 19:03:33 +00:00
Dan Gohman 0f17507478 Teach LICM and AliasSetTracker about AccessesArgumentsReadonly.
llvm-svn: 118618
2010-11-09 19:58:21 +00:00
Owen Anderson 374e1464ae Give up on doing in-line instruction simplification during correlated value propagation. Instruction simplification
needs to be guaranteed never to be run on an unreachable block.  However, earlier block simplifications may have
changed the CFG to make blocks that were reachable when we began our iteration unreachable by the time we try to
simplify them. (Note that this also means that our depth-first iterators were potentially being invalidated).

This should not have a large impact on code quality, since later runs of instcombine should pick up these simplifications.
Fixes PR8506.

llvm-svn: 117709
2010-10-29 21:05:17 +00:00
John Thompson e8360b7182 Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support.
llvm-svn: 117667
2010-10-29 17:29:13 +00:00
Dan Gohman f372cf869b Reapply r116831 and r116839, converting AliasAnalysis to use
uint64_t, plus fixes for places I missed before.

llvm-svn: 116875
2010-10-19 22:54:46 +00:00
Dan Gohman b4aa503501 Revert r116831 and r116839, which are breaking selfhost builds.
llvm-svn: 116858
2010-10-19 21:06:16 +00:00
Owen Anderson a4fefc1949 Passes do not need to recursively initialize passes that they preserve, if
they do not also require them.  This allows us to reduce inter-pass linkage
dependencies.

llvm-svn: 116854
2010-10-19 20:08:44 +00:00
Dan Gohman 896ac62346 Oops, check in all the files for converting AliasAnalysis to
use uint64_t.

llvm-svn: 116839
2010-10-19 18:08:27 +00:00
Owen Anderson 6c18d1aac0 Get rid of static constructors for pass registration. Instead, every pass exposes an initializeMyPassFunction(), which
must be called in the pass's constructor.  This function uses static dependency declarations to recursively initialize
the pass's dependencies.

Clients that only create passes through the createFooPass() APIs will require no changes.  Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.

I have tested this with all standard configurations of clang and llvm-gcc on Darwin.  It is possible that there are problems
with the static dependencies that will only be visible with non-standard options.  If you encounter any crash in pass
registration/creation, please send the testcase to me directly.

llvm-svn: 116820
2010-10-19 17:21:58 +00:00
Dan Gohman 14fe8cf238 Consistently use AliasAnalysis::UnknownSize instead of hardcoding ~0u.
llvm-svn: 116815
2010-10-19 17:06:23 +00:00
Dan Gohman 71af9db0e8 Make AliasSetTracker TBAA-aware, enabling TBAA-enabled LICM.
llvm-svn: 116743
2010-10-18 20:44:50 +00:00
Benjamin Kramer 1dc34b48dd Eliminate some calls to Value::getNameStr.
llvm-svn: 116670
2010-10-16 11:28:23 +00:00
Owen Anderson 18e4fed3fa Generalize MemCpyOpt's handling of call slot forwarding to function properly when the call slot
forwarding is implemented with a load/store pair rather than a memcpy.

llvm-svn: 116637
2010-10-15 22:52:12 +00:00
Rafael Espindola 229e38f0fe Be more consistent in using ValueToValueMapTy.
llvm-svn: 116387
2010-10-13 01:36:30 +00:00
Owen Anderson 8ac477ffb5 Begin adding static dependence information to passes, which will allow us to
perform initialization without static constructors AND without explicit initialization
by the client.  For the moment, passes are required to initialize both their
(potential) dependencies and any passes they preserve.  I hope to be able to relax
the latter requirement in the future.

llvm-svn: 116334
2010-10-12 19:48:12 +00:00
Dan Gohman 2fd85d7cd2 Filter out illegal formulae after updating offsets, not before, so that
formulae which become illegal as a result of the offset updating don't
escape.

This is for rdar://8529692. No testcase yet, because the given cases
hit use-list ordering differences.

llvm-svn: 116093
2010-10-08 19:33:26 +00:00
Daniel Dunbar d4e9c3b43a Update CMake.
llvm-svn: 116034
2010-10-08 02:30:03 +00:00
Dan Gohman 5947e1626a Delete the FormulaSorter class and inline its one method into its
one user. This code will be restructured soon and FormulaSorter
is getting in the way.

llvm-svn: 116012
2010-10-07 23:52:18 +00:00
Dan Gohman 1b61fd9bff Fix a spello.
llvm-svn: 116011
2010-10-07 23:43:09 +00:00
Dan Gohman 34f37e0d04 Charge a formula for explicit multiplies on scaled registers too,
not just base registers.

llvm-svn: 116010
2010-10-07 23:41:58 +00:00
Dan Gohman 49d638b45a Use size_t for consistency.
llvm-svn: 116009
2010-10-07 23:37:58 +00:00
Dan Gohman 8e72611058 When merging one use into another, transfer the offsets from
the old use to the new one.

llvm-svn: 116008
2010-10-07 23:36:45 +00:00
Dan Gohman a7b68d6d95 Fix LSR to keep the RegUseTracker up to date when combining users.
This doesn't usually matter, because the other heuristics usually
succeed regardless, but it's good to keep the register use
bookkeeping consistent.

llvm-svn: 116005
2010-10-07 23:33:43 +00:00
Devang Patel 57da4caa85 Remove LoopIndexSplit pass. It is neither maintained nor used by anyone.
llvm-svn: 116004
2010-10-07 23:29:37 +00:00
Owen Anderson df7a4f2515 Now with fewer extraneous semicolons!
llvm-svn: 115996
2010-10-07 22:25:06 +00:00
Owen Anderson 4698c5d7f7 Next step on the getting-rid-of-static-ctors train: begin adding per-library
initialization functions that initialize the set of passes implemented in
that library.  Add C bindings for these functions as well.

llvm-svn: 115927
2010-10-07 17:55:47 +00:00
Owen Anderson 13a642da0b Now that the profitable bits of EnableFullLoadPRE have been enabled by default, rip out the remainder.
Anyone interested in more general PRE would be better served by implementing it separately, to get real
anticipation calculation, etc.

llvm-svn: 115337
2010-10-01 20:02:55 +00:00
Eric Christopher 3ad2f3a2f2 Fix the other half of the alignment changing issue by making sure that the
memcpy alignment is the minimum of the incoming alignments.

Fixes PR 8266.

llvm-svn: 115305
2010-10-01 09:02:05 +00:00
Dale Johannesen dd224d2333 Massive rewrite of MMX:
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.

Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics. 

MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces.  Optimizations may occur on these forms and the
result cast back to x86_mmx, provided the result feeds into a
previously existing x86_mmx operation.

The point of all this is to prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.

llvm-svn: 115243
2010-09-30 23:57:10 +00:00
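A small illustration of the EMMS problem referenced above: user-written MMX intrinsics carry an explicit _mm_empty(), but MMX operations invented by an optimizer would not (a sketch only, not code from this commit):

  #include <mmintrin.h>

  int sum_lo(int a, int b, int c, int d) {
    __m64 x = _mm_set_pi32(a, b);
    __m64 y = _mm_set_pi32(c, d);
    __m64 s = _mm_add_pi32(x, y);  // MMX add; aliases the x87 register stack
    int lo = _mm_cvtsi64_si32(s);  // low 32 bits of the packed result
    _mm_empty();                   // EMMS: without this, later x87 FP results are corrupted
    return lo;
  }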
Owen Anderson 3170a25a84 We do want to allow LoadPRE to perform LICM-like transformations: we already consider PHI nodes to be negligible for
code size (making this transform code size neutral), and it allows us to hoist values out of loops, which is always
a good thing.

llvm-svn: 115205
2010-09-30 20:53:04 +00:00
Jakob Stoklund Olesen eb12f49fb7 Try again to disable critical edge splitting in CodeGenPrepare.
The bug that broke i386 linux has been fixed in r115191.

llvm-svn: 115204
2010-09-30 20:51:52 +00:00
Benjamin Kramer 5d66e5feb8 Tighten up prototype verification of strchr and strrchr to avoid a crash in the very unlikely case that someone passes an integer > i64 to strchr.
llvm-svn: 115144
2010-09-30 11:21:59 +00:00
Benjamin Kramer 2b76c66fd6 Add constant folding for strspn and strcspn to SimplifyLibCalls.
llvm-svn: 115116
2010-09-30 00:58:35 +00:00
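For example, with both arguments constant these calls fold to plain integers (an illustrative sketch of the inputs the fold handles, not a test from this commit):

  #include <string.h>

  size_t spans(void) {
    size_t a = strspn("abcba", "abc");   // folds to 5: every leading char is in "abc"
    size_t b = strcspn("abcba", "xyz");  // folds to 5: no char from "xyz" occurs
    return a + b;
  }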
Benjamin Kramer 38d22f69fc Add strpbrk folding to SimplifyLibCalls.
llvm-svn: 115111
2010-09-29 23:52:12 +00:00
Benjamin Kramer 8e861d7eee Simplify the loop in StrChrOptimizer. FileCheckize test.
llvm-svn: 115095
2010-09-29 22:29:12 +00:00
Benjamin Kramer 824645abc9 Teach SimplifyLibCalls how to optimize strrchr.
llvm-svn: 115091
2010-09-29 21:50:51 +00:00
Owen Anderson 99c985c37d Fix PR8247: JumpThreading can cause a block to become unreachable while still having predecessors, if it is part of a self-loop.
Because of this, we cannot use the Simplify* APIs, as they can assert-fail on unreachable code.  Since it's not easy to determine
if a given threading will cause a block to become unreachable, simply defer simplification to later InstCombine and/or
DCE passes.

llvm-svn: 115082
2010-09-29 20:34:41 +00:00
Owen Anderson d67ca0ed4c Revert r114919, which caused some serious regressions on ARM.
llvm-svn: 115053
2010-09-29 18:05:19 +00:00
Oscar Fuentes b4b12535e8 Removed a bunch of unnecessary target_link_libraries.
llvm-svn: 114999
2010-09-28 22:39:14 +00:00
Owen Anderson 9c93fd5598 Weight loop unrolling counts by nesting depth. Unrolling deeply nested loops tends to cause high
register pressure and thus excess spills, which we don't currently recover from well.  This should
be re-evaluated in the future if our ability to generate good spills/splits improves.

Partial fix for <rdar://problem/7635585>.

llvm-svn: 114919
2010-09-27 22:58:54 +00:00
Jakob Stoklund Olesen 415a7a6fec Revert "Disable codegen prepare critical edge splitting. Machine instruction passes now"
This reverts revision 114633. It was breaking llvm-gcc-i386-linux-selfhost.

It seems there is a downstream bug that is exposed by
-cgp-critical-edge-splitting=0. When that bug is fixed, this patch can go back
in.

Note that the changes to tailcallfp2.ll are not reverted. They were good and are
required.

llvm-svn: 114859
2010-09-27 18:43:48 +00:00
Dan Gohman 16ef49686c Delete an unused function.
llvm-svn: 114841
2010-09-27 16:58:21 +00:00
Owen Anderson b590a927cd LoadPRE was not properly checking that the load it was PRE'ing post-dominated the block it was being hoisted to.
Splitting critical edges at the merge point only addressed part of the issue; it is also possible for non-post-domination
to occur when the path from the load to the merge has branches in it.  Unfortunately, full anticipation analysis is
time-consuming, so for now approximate it.  This is strictly more conservative than real anticipation, so we will miss
some cases that real PRE would allow, but we also no longer insert loads into paths where they didn't exist before. :-)

This is a very slight net positive on SPEC for me (0.5% on average).  Most of the benchmarks are largely unaffected, but
when it pays off it pays off decently: 181.mcf improves by 4.5% on my machine.

llvm-svn: 114785
2010-09-25 05:26:18 +00:00
Eric Christopher ebacd2b023 If we're changing the source of a memcpy we need to use the alignment
of the source, not the original alignment since it may no longer
be valid.

Fixes rdar://8400094

llvm-svn: 114781
2010-09-25 00:57:26 +00:00
Evan Cheng 794aaa79e2 Disable codegen prepare critical edge splitting. Machine instruction passes now
break critical edges on demand.

llvm-svn: 114633
2010-09-23 06:55:34 +00:00
Bob Wilson b6832a4372 When moving zext/sext to be folded with a load, ignore the issue of whether
truncates are free only in the case where the extended type is legal but the
load type is not.  If both types are illegal, such as when they are too big,
the load may not be legalized into an extended load.

llvm-svn: 114568
2010-09-22 18:44:56 +00:00
Bob Wilson 4ddcb6a6b4 Move a sign-extend or a zero-extend of a load to the same basic block as the
load when the type of the load is not legal, even if truncates are not free.
The load is going to be legalized to an extending load anyway.

llvm-svn: 114488
2010-09-21 21:54:27 +00:00
Bob Wilson ff714f9992 Clarify a comment.
llvm-svn: 114487
2010-09-21 21:44:14 +00:00
Gabor Greif a06741b356 do not rely on the implicit-dereference semantics of dyn_cast_or_null
llvm-svn: 114278
2010-09-18 11:55:34 +00:00
Gabor Greif aaa22cf1b6 do not rely on the implicit-dereference semantics of dyn_cast_or_null
llvm-svn: 114277
2010-09-18 11:53:39 +00:00
Owen Anderson d104806575 Use a depth-first iteratation in CorrelatedValuePropagation to avoid wasting time trying
to optimize unreachable blocks.

llvm-svn: 114105
2010-09-16 18:35:07 +00:00
Dale Johannesen f95f59a0c2 When substituting sunkaddrs into the indirect arguments of an asm, we were
walking the asm arguments once and stashing their Values.  This is
wrong because the same memory location can be in the list twice, and
if the first one has a sunkaddr substituted, the stashed value for the
second one will be wrong (use-after-free).  PR 8154.

llvm-svn: 114104
2010-09-16 18:30:55 +00:00
Owen Anderson d361aac3d0 Remove the option to disable LazyValueInfo in JumpThreading, as it is now
on by default and has received significant testing.

llvm-svn: 113852
2010-09-14 20:57:41 +00:00
Chris Lattner f1144f0929 fix PR8102, a case where we'd copyValue from a value that we already
deleted.  Fix this by doing the copyValue's before we delete stuff!

The testcase only repros the problem on my system with valgrind.

llvm-svn: 113820
2010-09-14 00:19:00 +00:00
Michael J. Spencer 93c9b2ea93 Revert "CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally."
This reverts commit r113632

Conflicts:

	cmake/modules/AddLLVM.cmake

llvm-svn: 113819
2010-09-13 23:59:48 +00:00
Eric Christopher e3a89f9f9c Remove unused variable.
llvm-svn: 113769
2010-09-13 18:27:59 +00:00
John Thompson 1094c80281 Added skeleton for inline asm multiple alternative constraint support.
llvm-svn: 113766
2010-09-13 18:15:37 +00:00
Gabor Greif 2f5f696b66 typos
llvm-svn: 113647
2010-09-10 22:25:58 +00:00
Michael J. Spencer dc38d36ccb CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally.
llvm-svn: 113632
2010-09-10 21:14:25 +00:00
Owen Anderson d85c9ccdba Lower the unrolling threshold to 150. Empirical tests indicate that this is a sweet spot in the performance per
code size increase curve.

llvm-svn: 113595
2010-09-10 17:57:00 +00:00
Owen Anderson 04cf3fd761 What the loop unroller cares about, rather than just not unrolling loops with calls, is
not unrolling loops that contain calls that would be better off getting inlined.  This mostly
comes up when an interleaved devirtualization pass has devirtualized a call which the inliner
will inline on a future pass.  Thus, rather than blocking all loops containing calls, add
a metric for "inline candidate calls" and block loops containing those instead.

llvm-svn: 113535
2010-09-09 20:32:23 +00:00
Owen Anderson 6270515918 Revert r113439, which relaxed the requirement that loops containing calls cannot be unrolled. After some discussion,
there seems to be a better way to achieve the same effect.

llvm-svn: 113528
2010-09-09 20:02:23 +00:00
Owen Anderson 11ab204fdc r113526 introduced an unintended change to the loop unrolling threshold. Revert it.
llvm-svn: 113527
2010-09-09 19:11:57 +00:00
Owen Anderson b61b1647e2 Fix typo in code to cap the loop code size reduction calculation.
llvm-svn: 113526
2010-09-09 19:08:59 +00:00
Owen Anderson 62ea1b718c Use code-size reduction metrics to estimate the amount of savings we'll get when we unroll a loop.
Next step is to recalculate the threshold values given this new heuristic.

llvm-svn: 113525
2010-09-09 19:07:31 +00:00
Owen Anderson 8084dbaf8e Relax the "don't unroll loops containing calls" rule. Instead, when a loop contains a call, lower the
unrolling threshold to the optimize-for-size threshold.  Basically, for loops containing calls, unrolling
can still be profitable as long as the loop is REALLY small.

llvm-svn: 113439
2010-09-08 23:10:07 +00:00
Owen Anderson a4d9c78aa1 Add a separate unrolling threshold when the current function is being optimized for size.
The threshold value of 50 is arbitrary, and I chose it simply by analogy to the inlining thresholds, where
the baseline unrolling threshold is slightly smaller than the baseline inlining threshold.  This could
undoubtedly use some tuning.

llvm-svn: 113306
2010-09-07 23:15:30 +00:00
Chris Lattner be9019090e fix PR8067, an over-aggressive assertion in LICM.
llvm-svn: 113146
2010-09-06 05:11:24 +00:00
Chris Lattner b01c24a945 Teach loop rotate to hoist trivially invariant instructions
in the duplicated block instead of duplicating them.  

Duplicating them into the end of the loop and the preheader 
means that we got a phi node in the header of the loop, 
which prevented LICM from hoisting them.  GVN would
usually come around later and merge the duplicated 
instructions so we'd get reasonable output... except that
anything dependent on the shoulda-been-hoisted value can't
be hoisted.  In PR5319 (which this fixes), a memory value
didn't get promoted.

llvm-svn: 113134
2010-09-06 01:10:22 +00:00
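A sketch of the situation described above (illustrative source, not the PR5319 testcase): the loop header computes an invariant value, and rotation now hoists it rather than duplicating it.

  void zero_prefix(int *a, int n, int m) {
    // The header computes the invariant bound n * m.  Duplicating that multiply
    // into the preheader and the latch would leave a PHI in the new header,
    // which blocks LICM; hoisting a single copy into the preheader does not.
    for (int i = 0; i < n * m; ++i)
      a[i] = 0;
  }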
Chris Lattner da24b9a49a pull a simple method out of LICM into a new
Loop::hasLoopInvariantOperands method. Remove
a useless and confusing Loop::isLoopInvariant(Instruction)
method, which didn't do what you thought it did.

No functionality change.

llvm-svn: 113133
2010-09-06 01:05:37 +00:00
Chris Lattner 1edf7434cf more cleanups
llvm-svn: 113115
2010-09-05 20:13:07 +00:00
Chris Lattner e6214557e7 Change lower atomic pass to use IntrinsicInst to simplify it a bit.
llvm-svn: 113114
2010-09-05 20:10:47 +00:00
Chris Lattner 05ef361b5e eliminate some non-obvious casts. UndefValue isa Constant.
llvm-svn: 113113
2010-09-05 20:03:09 +00:00
Chris Lattner 65b48b5dfc zap dead code.
llvm-svn: 113073
2010-09-04 18:12:00 +00:00
Chris Lattner 50506787d1 fix a bug in my licm rewrite when a load from the promoted memory
location is being re-stored to the memory location.  We would get
a dangling pointer from the SSAUpdate data structure and miss a 
use.  This fixes PR8068

llvm-svn: 113042
2010-09-04 00:12:30 +00:00
Owen Anderson c91c1a205a Propagate non-local comparisons. Fixes PR1757.
llvm-svn: 113025
2010-09-03 22:47:08 +00:00
Owen Anderson c725462245 Add support for simplifying a load from a computed value to a load from a global when it
is provable that they're equivalent.  This fixes PR4855.

llvm-svn: 112994
2010-09-03 19:08:37 +00:00
Chris Lattner affc0e42f0 fix more AST updating bugs, correcting miscompilation in PR8041
llvm-svn: 112878
2010-09-02 22:19:10 +00:00
Duncan Sands 6778149f7e Reapply commit 112699, speculatively reverted by echristo, since
I'm sure it is harmless.  Original commit message:
If PrototypeValue is erased in the middle of using the SSAUpdator
then the SSAUpdator may access freed memory.  Instead, simply pass
in the type and name explicitly, which is all that was used anyway.

llvm-svn: 112810
2010-09-02 08:14:03 +00:00
Chris Lattner 8af45a889d deepen my MMX/SRoA hack to avoid hurting non-x86 codegen.
llvm-svn: 112763
2010-09-01 23:09:27 +00:00
Dan Gohman 0ad7d9c24e Fix loop unswitching's assumption that a code path which either
infinite loops or exits will eventually exit. This fixes PR5373.

llvm-svn: 112745
2010-09-01 21:46:45 +00:00
Owen Anderson 73f988cafa JumpThreading keeps LazyValueInfo up to date, so we don't need to rerun it
if we schedule another LVI-using pass afterwards.

llvm-svn: 112722
2010-09-01 18:27:22 +00:00
Eric Christopher a5d315c665 Speculatively revert 112699 and 112702, they seem to be causing
self host errors on clang-x86-64.

llvm-svn: 112719
2010-09-01 17:29:10 +00:00
Duncan Sands f7b18437b5 If PrototypeValue is erased in the middle of using the SSAUpdator
then the SSAUpdator may access freed memory.  Instead, simply pass
in the type and name explicitly, which is all that was used anyway.

llvm-svn: 112699
2010-09-01 10:29:33 +00:00
Chris Lattner 34e5361eb5 add a gross hack to work around a problem that Argiris reported
on llvmdev: SRoA is introducing MMX datatypes like <1 x i64>,
which then cause random problems because the X86 backend is
producing mmx stuff without inserting proper emms calls.

In the short term, force off MMX datatypes.  In the long term,
the X86 backend should not select generic vector types to MMX
registers.  This is being worked on, but won't be done in time
for 2.8.  rdar://8380055

llvm-svn: 112696
2010-09-01 05:14:33 +00:00
Dan Gohman 110ed64fbb Revert 112442 and 112440 until the compile time problems introduced
by 112440 are resolved.

llvm-svn: 112692
2010-09-01 01:45:53 +00:00
Chris Lattner 030f02021b licm is wasting time hoisting constant-foldable operations;
instead of hoisting them, just fold them away.  This occurs in the
testcase for PR8041, for example.

llvm-svn: 112669
2010-08-31 23:00:16 +00:00
Chris Lattner daca6f3483 tidy up
llvm-svn: 112643
2010-08-31 21:21:25 +00:00
Owen Anderson 3c84ecb067 More cleanups of my JumpThreading transforms, including extracting some duplicated code into a helper function.
llvm-svn: 112634
2010-08-31 20:26:04 +00:00
Owen Anderson 6fdcb172a9 Add an RAII helper to make cleanup of the RecursionSet more fool-proof.
llvm-svn: 112628
2010-08-31 19:24:27 +00:00
Owen Anderson 048efbe225 Only try to clean up the current block if we changed that block already.
llvm-svn: 112625
2010-08-31 18:55:52 +00:00
Owen Anderson cd4de7f399 Refactor my fix for PR5652 to terminate the predecessor lookups after the first failure.
llvm-svn: 112620
2010-08-31 18:48:48 +00:00
Owen Anderson ce401be792 Don't perform an extra traversal of the function just to do cleanup. We can safely simplify instructions after each block has been processed without worrying about iterator invalidation.
llvm-svn: 112594
2010-08-31 07:55:56 +00:00
Owen Anderson 48d58ad64c Rename ValuePropagation to a more descriptive CorrelatedValuePropagation.
llvm-svn: 112591
2010-08-31 07:48:34 +00:00
Owen Anderson d2918a07bd Rename file to something more descriptive.
llvm-svn: 112590
2010-08-31 07:41:39 +00:00
Owen Anderson 3997a07fb9 More Chris-inspired JumpThreading fixes: use ConstantExpr to correctly constant-fold undef, and be more careful with its return value.
This actually exposed an infinite recursion bug in ComputeValueKnownInPredecessors which theoretically already existed (in JumpThreading's
handling of and/or of i1's), but never manifested before.  This patch adds a tracking set to prevent this case.

llvm-svn: 112589
2010-08-31 07:36:34 +00:00
Owen Anderson b58b3c0dda Fix a typo.
llvm-svn: 112560
2010-08-30 23:59:30 +00:00
Owen Anderson b974dbbdd7 Cleanups suggested by Chris.
llvm-svn: 112553
2010-08-30 23:34:17 +00:00
Owen Anderson c910acb54a Re-apply r112539, being more careful to respect the return values of the constant folding methods. Additionally,
use the ConstantExpr::get*() methods to simplify some constant folding.

llvm-svn: 112550
2010-08-30 23:22:36 +00:00
Owen Anderson 30bacbdfdf Add statistics to evaluate this pass.
llvm-svn: 112545
2010-08-30 22:45:55 +00:00
Owen Anderson 1ddcbbe49c Revert r112539. It accidentally introduced a miscompilation.
llvm-svn: 112543
2010-08-30 22:33:41 +00:00
Owen Anderson 75f6037c7c Fixes and cleanups pointed out by Chris. In general, be careful to handle 0 results from ComputeValueKnownInPredecessors
(indicating undef), and re-use existing constant folding APIs.

llvm-svn: 112539
2010-08-30 22:07:52 +00:00
Chris Lattner c843fca2fd rewrite DwarfEHPrepare to use SSAUpdater to promote its allocas
instead of PromoteMemToReg.  This allows it to stop using DF and DT,
eliminating a computation of DT and DF from clang -O3.  Clang is now
down to 2 runs of DomFrontier.

llvm-svn: 112457
2010-08-29 19:54:28 +00:00
Chris Lattner f58382ed87 two changes: 1) make AliasSet hold the list of call sites with an
assertingvh so we get a violent explosion if the pointer dangles.

2) Fix AliasSetTracker::deleteValue to remove call sites with
   by-pointer comparisons instead of by-alias queries.  Using
   findAliasSetForCallSite can cause alias sets to get merged
   when they shouldn't, and can also miss alias sets when the
   call is readonly.

#2 fixes PR6889, which only repros with a .c file :(

llvm-svn: 112452
2010-08-29 18:42:23 +00:00
Chris Lattner 263f804699 LICM does get dead instructions input to it. Instead of sinking them
out of loops, just delete them.

llvm-svn: 112451
2010-08-29 18:22:25 +00:00
Chris Lattner 6ac0659a1c use moveBefore instead of remove+insert; it avoids some
symtab manipulation, so it's faster (in addition to being
more elegant)

llvm-svn: 112450
2010-08-29 18:18:40 +00:00
Chris Lattner f03b4eac48 revert 112448 for now.
llvm-svn: 112449
2010-08-29 18:11:16 +00:00
Chris Lattner 11f8ad8211 optimize LICM::hoist to use moveBefore. Correct its updating
of AST to remove the hoisted instruction from the AST, since it
is no longer in the loop.

llvm-svn: 112448
2010-08-29 18:03:33 +00:00
Chris Lattner 1a1ed69435 fix some bugs (found by inspection) where LICM would not update
the AST correctly.  When sinking an instruction, it should not add
entries for the sunk instruction to the AST; it should remove
the entry for the sunk instruction.  The blocks being sunk to
are not in the loop, so their instructions shouldn't be in the
AST (yet)!

llvm-svn: 112447
2010-08-29 18:00:00 +00:00
Chris Lattner cc9cbc66a3 rework the ownership of subloop alias information: instead of
keeping them around until the pass is destroyed, keep them
around a) just when useful (not for outer loops) and b) destroy
them right after we use them.  This should reduce memory use
and fixes potential bugs where a loop is deleted and another
loop gets allocated to the same address.

llvm-svn: 112446
2010-08-29 17:46:00 +00:00
Chris Lattner bc1a65ac6c apparently unswitch had the same "Feature". Stop it from
claiming that it preserves domfrontier if it doesn't really.

llvm-svn: 112445
2010-08-29 17:23:19 +00:00
Chris Lattner d6f46b8af8 now that loop passes don't use DomFrontier, there is no reason
for the unroller to pretend it supports updating it.  It still
has a horrible hack for DomTree.

llvm-svn: 112444
2010-08-29 17:21:35 +00:00
Dan Gohman 002ff89cbd Optionally rerun dedicated-register filtering after applying
other filtering techniques, as those may allow it to filter
out more obviously unprofitable candidates.

llvm-svn: 112441
2010-08-29 16:39:22 +00:00
Dan Gohman f031792cc6 Fix several areas in LSR to do a better job keeping the main
LSRInstance data structures up to date. This fixes some
pessimizations caused by stale data which will be exposed
in an upcoming change.

llvm-svn: 112440
2010-08-29 16:32:54 +00:00
Dan Gohman e9e0873b08 Refactor the three main groups of code out of
NarrowSearchSpaceUsingHeuristics into separate functions.

llvm-svn: 112439
2010-08-29 16:09:42 +00:00
Dan Gohman 37a0f68036 Delete a bogus check.
llvm-svn: 112438
2010-08-29 15:30:29 +00:00
Dan Gohman b6a520d63c Add some comments.
llvm-svn: 112437
2010-08-29 15:27:08 +00:00
Dan Gohman bf673e0652 Move this debug output into GenerateAllReuseFormula, to declutter
the high-level logic.

llvm-svn: 112436
2010-08-29 15:21:38 +00:00
Dan Gohman d366b6d5c8 Delete an unused declaration.
llvm-svn: 112435
2010-08-29 15:19:11 +00:00
Dan Gohman 4f13bbfefc Do one lookup instead of two.
llvm-svn: 112434
2010-08-29 15:18:49 +00:00
Chris Lattner f94f6bb0ba licm preserves the cfg, so it doesn't have to explicitly say it
preserves domfrontier.  It does preserve AA though.

llvm-svn: 112419
2010-08-29 07:02:56 +00:00
Chris Lattner abe61ef3b4 now that it doesn't use the PromoteMemToReg function, LICM doesn't
require DomFrontier.  Dropping this doesn't actually save any runs
of the pass though.

llvm-svn: 112418
2010-08-29 06:49:44 +00:00
Chris Lattner 1dc98b47b5 completely rewrite the memory promotion algorithm in LICM.
Among other things, this uses SSAUpdater instead of 
PromoteMemToReg.

llvm-svn: 112417
2010-08-29 06:43:52 +00:00
Chris Lattner 9c3931a544 use getUniqueExitBlocks instead of a manual set.
llvm-svn: 112412
2010-08-29 05:12:21 +00:00
Chris Lattner 85bf5421e1 reimplement LICM::sink to use SSAUpdater instead of PromoteMemToReg.
This leads to much simpler code.

llvm-svn: 112410
2010-08-29 04:55:06 +00:00
Chris Lattner b50407f104 remove dead proto
llvm-svn: 112408
2010-08-29 04:53:24 +00:00
Chris Lattner cd96b4df56 reduce indentation in LICM::sink by using early exits, use
getUniqueExitBlocks instead of getExitBlocks and a manual
set to eliminate dupes.

llvm-svn: 112405
2010-08-29 04:28:20 +00:00
Chris Lattner 188cc5a0fc modernize this pass a bit: use efficient set/map and reduce indentation.
llvm-svn: 112404
2010-08-29 04:23:04 +00:00
Chris Lattner 504e5100d3 remove the ABCD and SSI passes. They don't have any clients that
I'm aware of, aren't maintained, and LVI will be replacing their value.
nlewycky approved this on irc.

llvm-svn: 112355
2010-08-28 03:51:24 +00:00
Chris Lattner 95bb297c26 squish dead code.
llvm-svn: 112350
2010-08-28 03:21:03 +00:00
Benjamin Kramer 83f9ff0452 Update CMake build. Add newline at end of file.
llvm-svn: 112332
2010-08-28 00:11:12 +00:00
Owen Anderson cf7f941121 Add a prototype of a new peephole optimizing pass that uses LazyValue info to simplify PHIs and select's.
This pass addresses the missed optimizations from PR2581 and PR4420.

llvm-svn: 112325
2010-08-27 23:31:36 +00:00
Owen Anderson 99d4cb861b Fix typos in comments.
llvm-svn: 112286
2010-08-27 20:32:56 +00:00
Owen Anderson 6ebbd92380 Use LVI to eliminate conditional branches where we've tested a related condition previously. Update tests for this change.
This fixes PR5652.

llvm-svn: 112270
2010-08-27 17:12:29 +00:00
Owen Anderson bd2ecc7e68 Make JumpThreading smart enough to properly thread StrSwitch when it's compiled with clang++.
llvm-svn: 112198
2010-08-26 17:40:24 +00:00
Chris Lattner 8df99b523e remove some llvmcontext arguments that are now dead post-refactoring.
llvm-svn: 112104
2010-08-25 23:00:45 +00:00
Owen Anderson 7c853e877e Turn LVI on, previously detected failures should be fixed now.
llvm-svn: 111923
2010-08-24 17:21:18 +00:00
Owen Anderson 6ffa3f2aea Turn LVI back off, I have a testcase now.
llvm-svn: 111834
2010-08-23 19:59:27 +00:00
Owen Anderson 630add39a6 Re-enable LazyValueInfo. Monitoring for failures.
llvm-svn: 111816
2010-08-23 18:12:23 +00:00
Owen Anderson d31d82d75c Now that PassInfo and Pass::ID have been separated, move the rest of the passes over to the new registration API.
llvm-svn: 111815
2010-08-23 17:52:01 +00:00
Owen Anderson aac8cbb261 Disable LVI while I evaluate a failure.
llvm-svn: 111551
2010-08-19 19:47:08 +00:00
Owen Anderson 5c87dd55d3 Tentatively enabled LVI by default. I'll be monitoring for any failures.
llvm-svn: 111543
2010-08-19 19:04:40 +00:00
Dan Gohman 129a816ee6 Process the step before the start, because it's usually the simpler
of the two.

llvm-svn: 111495
2010-08-19 01:02:31 +00:00
Owen Anderson 208636fa33 Inform LazyValueInfo whenever a block is deleted, to avoid dangling pointer issues.
llvm-svn: 111382
2010-08-18 18:39:01 +00:00
Chris Lattner 3c603024bb Fix PR7755: knowing something about an inval for a pred
from the LHS should disable reconsidering that pred on the
RHS.  However, knowing something about the pred on the RHS
shouldn't disable subsequent additions on the RHS from
happening.

llvm-svn: 111349
2010-08-18 03:14:36 +00:00
Chris Lattner b45de95345 remove some dead code.
llvm-svn: 111344
2010-08-18 02:41:56 +00:00
Chris Lattner 6aabb66139 remove dead prototype.
llvm-svn: 111342
2010-08-18 02:37:06 +00:00
Dan Gohman 5047ca0c02 When rotating loops, put the original header at the bottom of the
loop, making the resulting loop significantly less ugly.  Also, zap
its trivial PHI nodes, since it's easy.

llvm-svn: 111255
2010-08-17 17:39:21 +00:00
Evan Cheng 8b637b177c Add an option to disable codegen prepare critical edge splitting. In theory, PHI elimination is already doing all (most?) of the splitting needed. But machine-licm and machine-sink seem to miss some important optimizations when splitting is disabled.
llvm-svn: 111224
2010-08-17 01:34:49 +00:00
Dan Gohman 89fdbaf99a Instead of having CollectSubexprs categorize operands as interesting or
uninteresting, just put all the operands on one list and make
GenerateReassociations make the decision about what's interesting.
This is simpler, and it avoids an extra ScalarEvolution::getAddExpr call.

llvm-svn: 111133
2010-08-16 15:50:00 +00:00
Dan Gohman 9b7632df26 Put add operands in ScalarEvolution-canonical order, when convenient.
This isn't necessary, because ScalarEvolution sorts them anyway,
but it's tidier this way.

llvm-svn: 111132
2010-08-16 15:39:27 +00:00
Dan Gohman 4a63fad976 Teach SimplifyCFG how to simplify indirectbr instructions.
- Eliminate redundant successors.
- Convert an indirectbr with one successor into a direct branch.

Also, generalize SimplifyCFG to be able to be run on a function entry block.
It knows quite a few simplifications which are applicable to the entry
block, and it only needs a few checks to avoid trouble with the entry block.

llvm-svn: 111060
2010-08-14 00:29:42 +00:00
Dan Gohman 081ffcd00b Fix LSR's ExtractImmediate and ExtractSymbol to avoid calling
ScalarEvolution::getAddExpr, which can be pretty expensive, when nothing
has changed, which is pretty common.

llvm-svn: 111042
2010-08-13 21:17:19 +00:00
Chris Lattner 363226dfe8 fix PR7876: If ipsccp decides that a function's address is taken
before it rewrites the code, we need to use that in the post-rewrite pass.

llvm-svn: 110962
2010-08-12 22:25:23 +00:00
Owen Anderson 0398607714 Don't attempt the PRE inline asm calls, since we don't value number them yet. Fixes PR7835.
llvm-svn: 110489
2010-08-07 00:20:35 +00:00
Owen Anderson a7aed18624 Reapply r110396, with fixes to appease the Linux buildbot gods.
llvm-svn: 110460
2010-08-06 18:33:48 +00:00
Nick Lewycky 5a2849e166 Fix uninitialized variable warning.
Also move 'default' case next to a real case to help compiler optimize in
non-Debug builds.
No functionality change.

llvm-svn: 110435
2010-08-06 07:43:46 +00:00
Owen Anderson bda59bd247 Revert r110396 to fix buildbots.
llvm-svn: 110410
2010-08-06 00:23:35 +00:00
Owen Anderson 755aceb5d0 Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static
ID member as the sole unique type identifier.  Clean up APIs related to this change.

llvm-svn: 110396
2010-08-05 23:42:04 +00:00
Owen Anderson 4674dd6cf5 Give JumpThreading+LVI a long-form cl::opt so that it's easier to toggle the default.
llvm-svn: 110384
2010-08-05 22:11:31 +00:00
Owen Anderson 9f2bca02d7 Experiments show that we can safely increase our unrolling threshold without unduly impacting code size, particularly
since unrolling is not enabled at -Os.

llvm-svn: 110233
2010-08-04 18:32:46 +00:00
Dan Gohman ba81fc16a5 Fix whitespace.
llvm-svn: 110223
2010-08-04 17:43:57 +00:00
Dan Gohman 839c972102 Fix a comment.
llvm-svn: 110181
2010-08-04 01:16:35 +00:00
Peter Collingbourne ddaaf40d24 Add an atomic lowering pass
llvm-svn: 110113
2010-08-03 16:19:16 +00:00
Oscar Fuentes 40b31ad3ee Prefix `next' iterator operation with `llvm::'.
Fixes potential ambiguity problems on VS 2010.

Patch by nobled!

llvm-svn: 110029
2010-08-02 06:00:15 +00:00
Nick Lewycky 299c6dfcbf Add missing newline to debug statement.
llvm-svn: 109886
2010-07-30 20:27:01 +00:00
Gabor Greif 62f0aac99d simplify by using CallSite constructors; virtually eliminates CallSite::get from the tree
llvm-svn: 109687
2010-07-28 22:50:26 +00:00
Gabor Greif 0a970698da use Value* constructor of CallSite to create potentially improper site, and test that
llvm-svn: 109581
2010-07-28 14:28:18 +00:00
Gabor Greif f159085414 recommit simplification (r109502, backed out in r109509); seems to be innocent
llvm-svn: 109510
2010-07-27 16:44:23 +00:00
Gabor Greif 5f91b7cf3e back out this too to restore the bots
llvm-svn: 109509
2010-07-27 15:56:07 +00:00
Gabor Greif 7527b2ed5c simplify
llvm-svn: 109502
2010-07-27 13:31:22 +00:00
Owen Anderson aa7f66ba67 Add an initial implementation of LazyValueInfo updating for JumpThreading. Disabled for now.
llvm-svn: 109424
2010-07-26 18:48:03 +00:00
Dan Gohman 0141c13b22 Remove LCSSA's bogus dependence on LoopSimplify and LoopSimplify's bogus
dependence on DominanceFrontier. Instead, add an explicit DominanceFrontier
pass in StandardPasses.h to ensure that it gets scheduled at the right
time.

Declare that loop unrolling preserves ScalarEvolution, and shuffle some
getAnalysisUsages.

This eliminates one LoopSimplify and one LCSSA run in the standard
compile opts sequence.

llvm-svn: 109413
2010-07-26 18:11:16 +00:00
Dan Gohman 65b257c9d2 Use DominatorTree::properlyDominates instead of dominates with an
explicit inequality check.

llvm-svn: 109401
2010-07-26 17:37:36 +00:00
Dan Gohman 31f73ef210 A block dominates itself, by definition.
llvm-svn: 109400
2010-07-26 17:35:32 +00:00
Gabor Greif dde79d8f1a mass elimination of reliance on automatic iterator dereferencing
llvm-svn: 109103
2010-07-22 13:36:47 +00:00
Gabor Greif 3e44ea1917 undo 80 column trespassing I caused
llvm-svn: 109092
2010-07-22 10:37:47 +00:00
Owen Anderson a57b97e7e7 Fix a batch of conversions of RegisterPass<> to INITIALIZE_PASS().
llvm-svn: 109045
2010-07-21 22:09:45 +00:00
Dan Gohman 12725c7d46 Remember that the induction variable is always a PHINode and
use getIncomingValueForBlock instead of
LoopInfo::getCanonicalInductionVariableIncrement.

llvm-svn: 108865
2010-07-20 17:18:52 +00:00
Dan Gohman efd7f9c360 Reorder the contents of various getAnalysisUsage functions, eliminating
a redundant loopsimplify run from the default -O2 sequence.

llvm-svn: 108539
2010-07-16 17:58:45 +00:00
Gabor Greif 6d673953e3 eliminate CallInst::ArgOffset
llvm-svn: 108522
2010-07-16 09:38:02 +00:00
Dan Gohman 1415208292 Don't merge uses when they are targeting fixup sites with
different widths. In a use with a narrower fixup, formulae
may be wider than the fixup, in which case the high bits
aren't necessarily meaningful, so it isn't safe to reuse
them for uses with wider fixups.

This fixes PR7618, though the testcase is too large for a
reasonable regression test, since it heavily depends on
hitting LSR's heuristics in a certain way.

llvm-svn: 108455
2010-07-15 20:24:58 +00:00
Dan Gohman a1501b9c50 Use dbgs() instead of errs() in a DEBUG.
llvm-svn: 108453
2010-07-15 20:12:42 +00:00
Dan Gohman 4afd412d6b Watch out for a constant offset cancelling out a base register, forming
a zero. This situation arises in Fortran code with induction variables
that start at 1 instead of 0. This fixes PR7651.

llvm-svn: 108424
2010-07-15 15:14:45 +00:00
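A sketch of the situation (illustrative C, not the PR7651 testcase): a one-based induction variable produces a constant offset that can exactly cancel a base register in one of LSR's candidate formulae.

  void zero_one_based(double *a, int n) {
    // Fortran-style loop: i starts at 1, so the address is a + (i - 1) * 8 and
    // the scaled -1 offset can cancel out the base register, leaving a zero.
    for (int i = 1; i <= n; ++i)
      a[i - 1] = 0.0;
  }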
Duncan Sands f88a284579 Handle the case of a tail recursion in which the tail call is followed
by a return that returns a constant, while elsewhere in the function
another return instruction returns a different constant.  This is a
special case of accumulator recursion, so just generalize the existing
logic a bit.

llvm-svn: 108241
2010-07-13 15:41:41 +00:00
Gabor Greif a5fa885d47 cache results of operator*
llvm-svn: 108142
2010-07-12 14:10:24 +00:00
Gabor Greif 782f62412f cache dereferenced iterators
llvm-svn: 108138
2010-07-12 12:03:02 +00:00
Gabor Greif 433b975fe2 recommit r108131 (which had been backed out in r108135) with a fix
llvm-svn: 108137
2010-07-12 12:02:10 +00:00
Gabor Greif f9610827ce back out r108131 (of TailDuplication.cpp) for now, it causes a buildbot failure
llvm-svn: 108135
2010-07-12 11:32:39 +00:00
Gabor Greif 2a464d7308 cache dereferenced iterators
llvm-svn: 108131
2010-07-12 10:36:48 +00:00
Duncan Sands 41b4a6b36a Convert some tab stops into spaces.
llvm-svn: 108130
2010-07-12 08:16:59 +00:00
Chris Lattner bbc25ff5cc if jump threading is able to infer interesting values on both
the LHS and RHS of an and/or instruction, don't multiply add
known predecessor values.  This fixes the crash on testcase
from PR7498

llvm-svn: 108114
2010-07-12 00:47:34 +00:00
Duncan Sands 82b21c086e The accumulator tail recursion transform claims to work for any associative
operation, but the way it's implemented requires the operation to also be
commutative.  So add a check for commutativity (and tweak the corresponding
comments).  This makes no difference in practice since every associative
LLVM instruction is also commutative!  Here's an example to show the need
for commutativity: the accum_recursion.ll testcase calculates the factorial
function.  Before the transformation the result of a call is
  ((((1*1)*2)*3)...)*x
while afterwards it is
  (((1*x)*(x-1))...*2)*1
which clearly requires both associativity and commutativity of * for the
result to be equal to the original.

llvm-svn: 108056
2010-07-10 20:31:42 +00:00
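In source terms, the transform turns the recursion into an accumulator loop, which reorders the multiplications exactly as described above (an illustrative sketch, not the accum_recursion.ll test):

  // Before: accumulator recursion on '*'.
  unsigned fact(unsigned x) {
    if (x <= 1)
      return 1;
    return x * fact(x - 1);
  }

  // After (conceptually): the multiplications now associate and commute
  // differently, so the result only matches because '*' has both properties.
  unsigned fact_loop(unsigned x) {
    unsigned acc = 1;
    for (; x > 1; --x)
      acc *= x;
    return acc;
  }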
Gabor Greif e82532a1c5 cache result of operator*
llvm-svn: 107976
2010-07-09 15:40:10 +00:00
Gabor Greif d323f5e161 cache result of operator* (found by inspection)
llvm-svn: 107971
2010-07-09 14:48:08 +00:00
Gabor Greif b0d56ffc85 cache result of operator*
llvm-svn: 107969
2010-07-09 14:36:49 +00:00
Chris Lattner efa3c824cc Fix the second half of PR7437: scalarrepl wasn't preserving
address spaces when SRoA'ing memcpy's.

llvm-svn: 107846
2010-07-08 00:27:05 +00:00
Nick Lewycky dace239949 Detabify this file.
llvm-svn: 107637
2010-07-06 03:53:43 +00:00
Dan Gohman 832282e061 Don't claim to preserve AliasAnalysis. First, this is doesn't actually
have any effect, and second, deleting stores can potentially invalidate
an AliasAnalysis, and there's currently no notification for this.

llvm-svn: 107496
2010-07-02 18:43:05 +00:00
Gabor Greif 74470192d7 use ArgOperand API
llvm-svn: 107278
2010-06-30 12:42:43 +00:00
Gabor Greif 743b3fd196 use getArgOperand (corrected by CallInst::ArgOffset) instead of getOperand
llvm-svn: 107273
2010-06-30 09:19:23 +00:00
Gabor Greif f628ecd15f use getNumArgOperands instead of getNumOperands
llvm-svn: 107272
2010-06-30 09:17:53 +00:00
Gabor Greif fe252e6fa0 use getArgOperand instead of getOperand
llvm-svn: 107271
2010-06-30 09:16:16 +00:00
Gabor Greif 8ae3095286 use getArgOperand instead of getOperand
llvm-svn: 107270
2010-06-30 09:15:28 +00:00
Gabor Greif 18c5bae727 employ CallInst::ArgOffset (for now)
llvm-svn: 107015
2010-06-28 16:43:57 +00:00
Gabor Greif 4300fc77ae use cached value
llvm-svn: 107000
2010-06-28 11:20:42 +00:00
Chris Lattner 25a843fcd2 minor cleanup to SROA: when lowering type-unsafe accesses to
large integers, the first inserted value would always create
an 'or X, 0'.  Even though this is trivially zapped by
instcombine, don't bother creating this pointless instruction.

llvm-svn: 106979
2010-06-27 07:58:26 +00:00
Duncan Sands 3a5cb69cb8 Fix PR7328: when turning a tail recursion into a loop, need to preserve
the returned value after the tail call if it differs from other return
values.  The optimal thing to do would be to introduce a phi node for
the return value, but for the moment just fix the miscompile.

llvm-svn: 106947
2010-06-26 12:53:31 +00:00
Dan Gohman fb9712bdae In GenerateReassociations, don't bother thinking about individual
SCEVUnknown values which are loop-variant, as LSR can't do anything
interesting with these values in any case. This fixes very slow compile
times on loops which have large numbers of such values.

llvm-svn: 106897
2010-06-25 22:32:18 +00:00
Dale Johannesen ce97d55ad9 The hasMemory argument is irrelevant to how the argument
for an "i" constraint should get lowered; PR 6309.  While
this argument was passed around a lot, this is the only
place it was used, so it goes away from a lot of other
places.

llvm-svn: 106893
2010-06-25 21:55:36 +00:00
Gabor Greif 07e9284c75 use ArgOperand API; tighten type of handleFreeWithNonTrivialDependency to be able to use isFreeCall without a cast or new overload
llvm-svn: 106823
2010-06-25 07:40:32 +00:00
Dan Gohman 963b1c142e A few minor micro-optimizations.
llvm-svn: 106764
2010-06-24 16:57:52 +00:00
Dan Gohman 47ddf76d89 Teach getExactSDiv to evaluate x/1 to x up front, as it's a common
enough special case, and it theoretically allows more folding because
it works even when x is unanalyzable.

llvm-svn: 106763
2010-06-24 16:51:25 +00:00
Dan Gohman ab5422200b Fix copy+pasto issues in isMulSExtable.
llvm-svn: 106759
2010-06-24 16:45:11 +00:00
Gabor Greif 91f9589057 use ArgOperand API; introduce downcasted pointers into scope to facilitate this
llvm-svn: 106734
2010-06-24 12:03:56 +00:00
Gabor Greif e2f482ca0b use ArgOperand API
llvm-svn: 106731
2010-06-24 10:42:46 +00:00
Gabor Greif 2d958d4db5 use ArgOperand API
llvm-svn: 106730
2010-06-24 10:17:17 +00:00
Gabor Greif 5bcaa55761 use callsite to obtain all arguments
llvm-svn: 106729
2010-06-24 10:04:07 +00:00
Gabor Greif 0f60709f0e use getNumArgOperands
llvm-svn: 106709
2010-06-24 00:48:48 +00:00
Gabor Greif 4a39b84a9d use ArgOperand API
llvm-svn: 106707
2010-06-24 00:44:01 +00:00
Devang Patel 0dc3c2d37e Use ValueMap instead of DenseMap.
The ValueMapper used by various cloning utility maps MDNodes also.

llvm-svn: 106706
2010-06-24 00:33:28 +00:00
Dan Gohman 1081f1a0f5 Fix OptimizeMax to handle an odd case where one of the max operands
is another max which folds. This fixes PR7454.

llvm-svn: 106594
2010-06-22 23:07:13 +00:00
Dan Gohman d2d1ae105d Use pre-increment instead of post-increment when the result is not used.
llvm-svn: 106542
2010-06-22 15:08:57 +00:00
Dan Gohman dd41bba517 Use A.append(...) instead of A.insert(A.end(), ...) when A is a
SmallVector, and other SmallVector simplifications.

llvm-svn: 106452
2010-06-21 19:47:52 +00:00
Dan Gohman 32655906e4 Add a TODO comment.
llvm-svn: 106397
2010-06-19 21:30:18 +00:00
Dan Gohman 51d00092b6 Include the use kind along with the expression in the key of the
use sharing map. The reconcileNewOffset logic already forces a
separate use if the kinds differ, so incorporating the kind in the
key means we can track more sharing opportunities.

More sharing means fewer total uses to track, which means smaller
problem sizes, which means the conservative throttles don't kick
in as often.

llvm-svn: 106396
2010-06-19 21:29:59 +00:00
Dan Gohman 297fb8b9fc Don't include things in anonymous namespaces that don't need it.
llvm-svn: 106395
2010-06-19 21:21:39 +00:00
Dan Gohman f3aea7aecf Disable indvars on loops when LoopSimplify form is not available.
This fixes PR7333.

llvm-svn: 106267
2010-06-18 01:35:11 +00:00
Rafael Espindola a20e2dfe86 Make sure that simplify libcalls does not replace a call with one calling
convention with a new call with a different calling convention.

llvm-svn: 106134
2010-06-16 19:34:01 +00:00
Benjamin Kramer a13bd20396 simplify-libcalls: fold strncmp(x, y, 1) -> memcmp(x, y, 1)
The memcmp will be optimized further and even the pathological case
'strstr(x, "x") == x' generates optimal code now.

llvm-svn: 106097
2010-06-16 10:30:29 +00:00
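A sketch of the two patterns involved (illustrative only): the length-1 strncmp becomes a memcmp, which then reduces to a single byte compare.

  #include <string.h>

  int starts_with_x(const char *s) {
    // strncmp(s, "x", 1) -> memcmp(s, "x", 1) -> (*s == 'x')
    return strncmp(s, "x", 1) == 0;
  }

  int has_prefix(const char *s, const char *prefix) {
    // strstr(s, prefix) == s is the same as strncmp(s, prefix, strlen(prefix)) == 0
    return strstr(s, prefix) == s;
  }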
Benjamin Kramer 1118860e3a simplify-libcalls: fold strstr(a, b) == a -> strncmp(a, b, strlen(b)) == 0
llvm-svn: 106047
2010-06-15 21:34:25 +00:00
Chris Lattner 329ea064ed jump threading can't split a critical edge from an indirectbr. This
fixes PR7356.

llvm-svn: 105950
2010-06-14 19:45:43 +00:00
Benjamin Kramer b82de426de SimplifyCFG: don't turn volatile stores to null/undef into unreachable. Fixes PR7369.
llvm-svn: 105914
2010-06-13 14:35:54 +00:00
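The kind of code this preserves (illustrative): a deliberate volatile store through address zero, as seen on freestanding targets where address 0 is meaningful.

  void poke_address_zero(void) {
    // Volatile: SimplifyCFG must not treat this as undefined behavior and
    // replace it with unreachable.
    *(volatile int *)0 = 42;
  }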
Kenneth Uildriks 9b21208bfb Pulled CodeMetrics out of InlineCost.h and made it a bit more general, so it can be reused from PartialSpecializationCost
llvm-svn: 105725
2010-06-09 15:11:37 +00:00
Dan Gohman 67b4403101 Don't track users of undef values; they aren't interesting for
register pressure.

llvm-svn: 105501
2010-06-04 23:16:05 +00:00
Dan Gohman 826bdf8c10 Move FindAvailableLoadedValue and isSafeToLoadUnconditionally out of
lib/Transforms/Utils and into lib/Analysis so that Analysis passes
can use them.

llvm-svn: 104949
2010-05-28 16:19:17 +00:00
Benjamin Kramer 6877119ef3 Kill unneeded SExt.
llvm-svn: 104692
2010-05-26 09:45:04 +00:00
Benjamin Kramer 9439084cea Properly promote operands when optimizing a single-character memcmp.
llvm-svn: 104648
2010-05-25 22:53:43 +00:00
Dan Gohman 9b48b856ea DominatorTree.getNode can return null for unreachable blocks.
llvm-svn: 104290
2010-05-20 22:46:54 +00:00
Dan Gohman 86110fa2bb Minor code cleanups.
llvm-svn: 104287
2010-05-20 22:25:20 +00:00
Dan Gohman 6295f2ebb8 Make Solve check its own post-condition, to reduce clutter in the
top-level LSRInstance logic.

llvm-svn: 104278
2010-05-20 20:59:23 +00:00
Dan Gohman a4ca28a3ae Add comments.
llvm-svn: 104276
2010-05-20 20:52:00 +00:00
Dan Gohman 927bcaadda More code cleanups. Use iterators instead of indices when indices
aren't needed.

llvm-svn: 104273
2010-05-20 20:33:18 +00:00
Dan Gohman 4c4043cf34 Fix OptimizeShadowIV to set Changed. Change OptimizeLoopTermCond to set
Changed directly instead of using a return value.

Rename FilterOutUndesirableDedicatedRegisters's Changed variable to
distinguish it from LSRInstance's Changed member.

llvm-svn: 104269
2010-05-20 20:05:31 +00:00
Dan Gohman 8ec018cedf Add some comments.
llvm-svn: 104268
2010-05-20 20:00:41 +00:00
Dan Gohman 8ce95cc3c5 Simplify this code. Don't do a DomTreeNode lookup for each visited block.
llvm-svn: 104267
2010-05-20 20:00:25 +00:00
Dan Gohman ab5fb7f559 Minor code cleanups.
llvm-svn: 104263
2010-05-20 19:44:23 +00:00
Dan Gohman ee2fea3cd7 When canonicalizing icmp operand order to put the loop invariant
operand on the left, the interesting operand is on the right. This
fixes a bug where LSR was failing to recognize ICmpZero uses,
which led it to be unable to reverse the induction variable in the
attached testcase.

Delete test/CodeGen/X86/stack-color-with-reg-2.ll, because its test
is extremely fragile and hard to meaningfully update.

llvm-svn: 104262
2010-05-20 19:26:52 +00:00
Dan Gohman fdf9874ba7 Set Changed to true when canonicalizing ICmp operand order; even though
it isn't a very interesting change, it's a change nonetheless.

llvm-svn: 104260
2010-05-20 19:16:03 +00:00
Dan Gohman 981563d0ba Rename a variable to avoid shadowing.
llvm-svn: 104234
2010-05-20 16:41:11 +00:00
Dan Gohman 6b733fc189 Minor code simplification.
llvm-svn: 104232
2010-05-20 16:23:28 +00:00
Dan Gohman 80a9608442 Move the code for deleting BaseRegs and LSRUses into helper functions,
and fix a bug that valgrind noticed where the code would std::swap an
element with itself.

llvm-svn: 104225
2010-05-20 15:17:54 +00:00
Dan Gohman 20fab456da Teach LSR how to cope better with unrolled loops on targets where
the addressing modes don't make this trivially easy. This allows
it to avoid falling into the less precise heuristics in more
cases.

llvm-svn: 104186
2010-05-19 23:43:12 +00:00
Dan Gohman beebef4137 Add a comment.
llvm-svn: 104089
2010-05-18 23:55:57 +00:00
Dan Gohman 50f8f2c23d Fix the predicate which checks for nonsensical formulae which have
constants in registers which partially cancel out their immediate fields.

llvm-svn: 104088
2010-05-18 23:48:08 +00:00
Dan Gohman 4cf99b5303 Factor out the code for recomputing an LSRUse's Regs set after some
of its formulae have been removed into a helper function, and also
teach it how to update the RegUseTracker.

llvm-svn: 104087
2010-05-18 23:42:37 +00:00
Dan Gohman a4eca05174 Factor out code for estimating search space complexity into a helper
function.

llvm-svn: 104082
2010-05-18 22:51:59 +00:00
Dan Gohman 63e9015248 Add some more debug output.
llvm-svn: 104080
2010-05-18 22:41:32 +00:00
Dan Gohman f1c7b1b42f Factor out the code for deleting a formula from an LSRUse into
a helper function.

llvm-svn: 104079
2010-05-18 22:39:15 +00:00
Dan Gohman 8aca7ef903 Make some debug output more informative.
llvm-svn: 104078
2010-05-18 22:37:37 +00:00
Dan Gohman 06ab08f795 Print an error message in Formula::print if the HasBaseReg flag
is inconsistent with the BaseRegs field. It's not print's job to
assert on an invalid condition, but it can make one more obvious.

llvm-svn: 104077
2010-05-18 22:35:55 +00:00
Dan Gohman 248c41d108 Rename RegUseTracker's RegUses member to RegUsesMap to avoid
confusion with LSRInstance's RegUses member.

llvm-svn: 104076
2010-05-18 22:33:00 +00:00
Douglas Gregor 6739a89117 Fixes for Microsoft Visual Studio 2010, from Steven Watanabe!
llvm-svn: 103457
2010-05-11 06:17:44 +00:00
Chris Lattner 84d4618659 make simplifycfg insert an llvm.trap before the 'unreachable' it introduces
when it detects undefined behavior.  llvm.trap generally codegens into
something really small (e.g. a 2 byte ud2 instruction on x86) and debugging this
sort of thing is "nontrivial".  For example, we now compile:

void foo() { *(int*)0 = 42; }

into:

_foo:
	pushl	%ebp
	movl	%esp, %ebp
	ud2

Some may even claim that this is a security hole, though that seems dubious
to me.  This addresses rdar://7958343 - Optimizing away null dereference 
potentially allows arbitrary code execution

llvm-svn: 103356
2010-05-08 22:15:59 +00:00
Chris Lattner 5a62d6e578 Fix PR7052, patch by Jakub Staszak!
llvm-svn: 103347
2010-05-08 20:01:44 +00:00
Dan Gohman d0800241d2 When pruning candidate formulae out of an LSRUse, update the
LSRUse's Regs set after all pruning is done, rather than trying
to do it on the fly, which can produce an incomplete result.

This fixes a case where heuristic pruning was stripping all
formulae from a use, which led the solver to enter an infinite
loop.

Also, add a few asserts to diagnose this kind of situation.

llvm-svn: 103328
2010-05-07 23:36:59 +00:00
Ted Kremenek d90773ebe0 Update CMake build.
llvm-svn: 103266
2010-05-07 17:13:20 +00:00
Dan Gohman 5d5b8b1b8c Add an LLVM IR version of code sinking. This uses the same simple algorithm
as MachineSink, but it isn't constrained by MachineInstr-level details.

llvm-svn: 103257
2010-05-07 15:40:13 +00:00
Bob Wilson 0c8b29bcdb Use the right version of "append" to combine two SmallVectors.
This fixes the compile-time regressions seen in last night's tests.

llvm-svn: 103118
2010-05-05 20:44:15 +00:00
Bob Wilson a2fda8b648 Defer adding critical edges to the "toSplit" list until after checking for
indirect branches in all the predecessors.  This avoids unnecessarily
splitting edges in cases where load PRE is not possible anyway.
Thanks to Jakub Staszak for pointing this out.

llvm-svn: 103034
2010-05-04 20:03:21 +00:00
Dan Gohman 1d2ded75e2 Use getConstant instead of getIntegerSCEV. The two are basically the
same, now that getConstant has overloads consistent with ConstantInt::get.

llvm-svn: 102965
2010-05-03 22:09:21 +00:00
Devang Patel 9f5200a122 Check for side effects before splitting loop.
Patch by Jakub Staszak!

llvm-svn: 102928
2010-05-03 18:06:58 +00:00
Chris Lattner 87aa2243e2 fix PR6940: sitofp(undef) folds to 0.0, not undef.
llvm-svn: 102358
2010-04-26 18:21:23 +00:00
Dan Gohman 534ba376f6 Generalize LSR's OptimizeMax to handle the new kinds of max expressions
that indvars may use, now that indvars is recognizing le and ge loops.

llvm-svn: 102235
2010-04-24 03:13:44 +00:00
Dan Gohman 997bbc54d6 Fix LSR to tolerate cases where ScalarEvolution initially
misses an opportunity to fold add operands, but folds them
after LSR has separated them out. This fixes rdar://7886751.

llvm-svn: 102157
2010-04-23 01:55:05 +00:00
Chris Lattner 4ba01ec869 refactor the interface to InlineFunction so that most of the in/out
arguments are handled with a new InlineFunctionInfo class.  This 
makes it easier to extend InlineFunction to return more info in the
future.

llvm-svn: 102137
2010-04-22 23:07:58 +00:00
Gabor Greif 27b3d55194 use abstract accessors to CallInst
llvm-svn: 101899
2010-04-20 13:13:04 +00:00
Chris Lattner 66e809acc0 remove a bunch of ad-hoc code to simplify instructions from
loop unswitch, and use inst simplify instead.  It is more
powerful and has less duplication.

llvm-svn: 101874
2010-04-20 05:33:18 +00:00
Chris Lattner 5814d9d9da RewriteLoopBodyWithConditionConstant can end up rewriting the
condition we're unswitching on.  In this case, don't try to
simplify the second copy of the loop which may be dead or not,
but is probably a constant now.  This fixes PR6879

llvm-svn: 101870
2010-04-20 05:09:16 +00:00
Dan Gohman e637ff5e9a Remove the Expr member from IVUsers. Instead of remembering the expression,
just ask ScalarEvolution for it on demand. This helps IVUsers be more robust
in the case of expressions changing underneath it. This fixes PR6862.

llvm-svn: 101819
2010-04-19 21:48:58 +00:00
Eric Christopher 7258dcd77f Revert 101465, it broke internal OpenGL testing.
Probably the best way to know that all getOperand() calls have been handled
is to replace that API instead of updating.

llvm-svn: 101579
2010-04-16 23:37:20 +00:00
Dan Gohman 99e5327bfd Refine the detection of seemingly infinitely recursive calls where the
callee is expected to be expanded to something else by codegen, so that
normal infinitely recursive calls are still transformed.

llvm-svn: 101468
2010-04-16 15:57:50 +00:00
Gabor Greif f375520f7b reapply r101434
with a fix for self-hosting

rotate CallInst operands, i.e. move callee to the back
of the operand array

the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary

llvm-svn: 101465
2010-04-16 15:33:14 +00:00
Chris Lattner bd2d9430d6 fix comment noticed by Bob
llvm-svn: 101437
2010-04-16 02:32:17 +00:00
Gabor Greif 403e9694f9 back out r101423 and r101397, they break llvm-gcc self-host on darwin10
llvm-svn: 101434
2010-04-16 01:16:20 +00:00
Chris Lattner 1146d326a7 fix PR6832: we were using the alignment of a pointer when we
wanted the alignment of the pointee.

llvm-svn: 101432
2010-04-16 01:05:38 +00:00
Chris Lattner b73552908e improve comments.
llvm-svn: 101429
2010-04-16 00:38:19 +00:00
Chris Lattner 78d7dbbc30 pull all the ConvertToScalarInfo code together into one
place.

llvm-svn: 101427
2010-04-16 00:24:57 +00:00
Chris Lattner d69c3ee958 more refactoring: suck some stuff out of SRoA into
ConvertToScalarInfo.

llvm-svn: 101425
2010-04-16 00:20:00 +00:00
Gabor Greif 6af0ad846e shift intrinsic operand
llvm-svn: 101423
2010-04-16 00:06:45 +00:00
Chris Lattner 9ef4eae6e6 introduce a new ConvertToScalarInfo struct to simplify
CanConvertToScalar/MergeInType.  Eliminate a pointless
LLVMContext argument to MergeInType.

llvm-svn: 101422
2010-04-15 23:50:26 +00:00
Chris Lattner 9c1172d848 tidy interface to isOnlyCopiedFromConstantGlobal
llvm-svn: 101405
2010-04-15 21:59:20 +00:00
Gabor Greif 33ae80bff7 reapply r101364, which has been backed out in r101368
with a fix

rotate CallInst operands, i.e. move callee to the back
of the operand array

the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary

llvm-svn: 101397
2010-04-15 20:51:13 +00:00
Dan Gohman b29cda9b3c Fix a bunch of namespace pollution.
llvm-svn: 101376
2010-04-15 17:08:50 +00:00
Gabor Greif 9fd00c7d25 back out r101364, as it trips the linux nightlybot on some clang C++ tests
llvm-svn: 101368
2010-04-15 12:46:56 +00:00
Gabor Greif aafd209632 rotate CallInst operands, i.e. move callee to the back
of the operand array

the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary

llvm-svn: 101364
2010-04-15 10:49:53 +00:00
Gabor Greif c08e5df836 performance: cache the dereferenced use_iterator
llvm-svn: 101253
2010-04-14 16:48:56 +00:00
Gabor Greif a49686fa3e performance: cache the dereferenced use_iterator
llvm-svn: 101250
2010-04-14 16:13:56 +00:00
Owen Anderson b516f1c6cc Remove SCCVN from the CMake build system.
llvm-svn: 101125
2010-04-13 08:33:09 +00:00
Owen Anderson 9ed6abfe0b SCCVN, we hardly knew ye!
llvm-svn: 101117
2010-04-13 05:24:08 +00:00
Dan Gohman 5867a56db8 Teach IndVarSimplify how to eliminate remainder operators where the
numerator is an induction variable. For example, with code like this:

  for (i=0;i<n;++i)
    x[i%n] = 0;

IndVarSimplify will now recognize that i is always less than n inside
the loop, and eliminate the remainder.

llvm-svn: 101113
2010-04-13 01:46:36 +00:00
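In source terms, the transform rewrites the loop above to the version below once 0 <= i < n is proven (illustrative sketch):

  void before(int *x, int n) {
    for (int i = 0; i < n; ++i)
      x[i % n] = 0;    // i is always in [0, n), so i % n == i
  }

  void after(int *x, int n) {
    for (int i = 0; i < n; ++i)
      x[i] = 0;        // remainder eliminated
  }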
Dan Gohman 4a645b88ef Suppress LinearFunctionTestReplace when the computed backedge-taken
expression is a UDiv and it doesn't appear that the UDiv came from
the user's source.

ScalarEvolution has recently figured out how to compute a tripcount
expression for the inner loop in
SingleSource/Benchmarks/Shootout/sieve.c, using a udiv. Emitting a
udiv instruction dramatically slows down the enclosing loop.

llvm-svn: 101068
2010-04-12 21:13:43 +00:00
Dan Gohman 27c8e79839 Delete this code, which is no longer needed.
llvm-svn: 101033
2010-04-12 08:00:22 +00:00
Dan Gohman 07f6563e81 Move the EliminateIVUsers call back out to its original location. Now that
a ScalarEvolution bug with overflow handling is fixed, the normal analysis
code will automatically decline to operate on the icmp instructions which
are responsible for the loop exit.

llvm-svn: 101032
2010-04-12 07:56:56 +00:00
Dan Gohman 15f90c294c Use RecursivelyDeleteTriviallyDeadInstructions in EliminateIVComparisons,
instead of deleting just the user. This makes it more consistent with
other code in IndVarSimplify, and theoretically can eliminate more users
earlier.

llvm-svn: 101027
2010-04-12 07:29:15 +00:00
Dan Gohman fa5ad797e3 Re-apply r101000, with a fix: Don't eliminate an icmp which is part of
the loop exit test. This usually doesn't come up for a variety of
reasons, but it isn't impossible, so make IndVarSimplify handle it
conservatively.

llvm-svn: 101008
2010-04-12 02:21:50 +00:00
Dan Gohman c0f1efaf8d Revert 101000, which is breaking self-host builds.
llvm-svn: 101002
2010-04-12 00:17:10 +00:00
Dan Gohman af4ab1b681 Teach IndVarSimplify how to eliminate comparisons involving induction
variables. For example, with code like this:

  for (i=0;i<n;++i)
    if (i<n)
      x[i] = 0;

IndVarSimplify will now recognize that i is always less than n inside
the loop, and eliminate the if.

llvm-svn: 101000
2010-04-11 23:10:12 +00:00
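In source terms, the loop above becomes (illustrative sketch):

  void after(int *x, int n) {
    for (int i = 0; i < n; ++i)
      x[i] = 0;    // the redundant 'if (i < n)' guard is gone
  }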
Dan Gohman b50349a979 Rename isLoopGuardedByCond to isLoopEntryGuardedByCond, to emphasise
that it's only testing for the entry condition, not full loop-invariant
conditions.

llvm-svn: 100979
2010-04-11 19:27:13 +00:00
Chris Lattner 9ae28b141f fix PR6743: in some cases we'd delete an instruction before
using it.

llvm-svn: 100937
2010-04-10 18:26:57 +00:00
Dan Gohman 607e02b33a When determining a canonical insert position, don't climb deeper
into adjacent loops. Also, ensure that the insert position is
dominated by the loop latch of any loop in the post-inc set which
has a latch.

llvm-svn: 100906
2010-04-09 22:07:05 +00:00
Dan Gohman 42ec4eb351 When looking for loop-invariant users, look through no-op instructions,
so that an unfortunately placed bitcast doesn't pin a value in a
register.

llvm-svn: 100883
2010-04-09 19:12:34 +00:00
Gabor Greif ce6dd889ec const-ize a predicate
llvm-svn: 100856
2010-04-09 10:57:00 +00:00
Dan Gohman d2df643ddb Refactor the code for computing the insertion point for an expression into
a separate function.

llvm-svn: 100845
2010-04-09 02:00:38 +00:00
Chris Lattner c6c153be45 fix an SCCP miscompilation that could happen when a
forced constant is changed to a constant: we would end
up adding the instruction to the wrong worklist,
preventing it from being properly revisited.  This fixes
rdar://7832370

llvm-svn: 100837
2010-04-09 01:14:31 +00:00
Dan Gohman 9b5d0bb774 Avoid allocating a value of zero in a register if the initial formula
inputs happen to negate each other.

llvm-svn: 100828
2010-04-08 23:36:27 +00:00
Dan Gohman 4ce1fb1448 Add variants of ult, ule, etc. which take a uint64_t RHS, for convenience.
llvm-svn: 100824
2010-04-08 23:03:40 +00:00
Dan Gohman 4506539d84 When expanding expressions which are using post-inc mode for multiple loops,
ensure that the expansion is dominated by the increments of those loops.

llvm-svn: 100748
2010-04-08 05:57:57 +00:00
Dan Gohman d006ab90dd Generalize IVUsers to track arbitrary expressions rather than expressions
explicitly split into stride-and-offset pairs. Also, add the
ability to track multiple post-increment loops on the same expression.

This refines the concept of "normalizing" SCEV expressions used for
post-increment uses, and introduces a dedicated utility routine for
normalizing and denormalizing expressions.

This fixes the expansion of expressions which are post-increment users
of more than one loop at a time. More broadly, this takes LSR another
step closer to being able to reason about more than one loop at a time.

llvm-svn: 100699
2010-04-07 22:27:08 +00:00
Gabor Greif df323a51f5 performance: get rid of repeated dereferencing of use_iterator by caching its result
llvm-svn: 100550
2010-04-06 19:32:30 +00:00
Chris Lattner adca608281 fix a really nasty bug that Evan was tracking in SCCP. When resolving
undefs in branches/switches, we have two cases: a branch on a literal
undef or a branch on a symbolic value which is undef.  If we have a
literal undef, the code was correct: forcing it to a constant is the
right thing to do.

If we have a branch on a symbolic value that is undef, we should force
the symbolic value to a constant, which then makes the successor block
live.  Forcing the condition of the branch to be a constant isn't
safe if later paths become live and the value becomes overdefined.  This
is the case that 'forcedconstant' is designed to handle, so just use it.

This fixes rdar://7765019, but there is no good testcase for this; the
one I have is too insane to be useful in the future.
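
As a rough stand-in for the missing testcase, here is a tiny C sketch of
the symbolic-value-that-is-undef situation (hypothetical code, not the
actual reproducer):

  int f(int x) {
    int c;          /* uninitialized: SCCP models it as undef initially */
    if (x)
      c = 1;        /* on this path c later becomes a real constant */
    if (c)          /* a branch on a symbolic value that is currently undef */
      return 1;
    return 0;
  }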

llvm-svn: 100478
2010-04-05 22:14:48 +00:00
Chris Lattner c832c1bf69 some code cleanups, use SwitchInst::findCaseValue, reduce indentation
llvm-svn: 100468
2010-04-05 21:18:32 +00:00
Evan Cheng ba930449a9 Code clean up.
llvm-svn: 100467
2010-04-05 21:16:25 +00:00
Mon P Wang c576ee9040 Reapply address space patch after fixing an issue in MemCopyOptimizer.
Added support for address spaces and added an isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

llvm-svn: 100304
2010-04-04 03:10:48 +00:00
Chris Lattner ecb536313f require that the branch being controlled by the IV
exits the loop.  With this information we can guarantee 
the iteration count of the loop is bounded by the 
compare.  I think this xform is finally safe now.

llvm-svn: 100285
2010-04-03 07:21:39 +00:00
Chris Lattner 40060d33f6 add integer overflow check for the fp induction variable
checker.  Amusingly, we already had tests in the testsuite
that we should have rejected because they would be miscompiled.

The remaining issue with this is that we don't check that
the branch causes us to exit the loop if it fails, so we
don't actually know if we remain in bounds.

llvm-svn: 100284
2010-04-03 07:18:48 +00:00
Chris Lattner 69913466cb add a comment and fix some consistency issues.  Converting
to a signed vs unsigned value depending on the sign of the
constant fp means that we can't distinguish between a
truly negative number and a positive number so large that the
32nd bit is set.  So, don't do this!

llvm-svn: 100283
2010-04-03 06:41:49 +00:00
Chris Lattner 40ea690f39 fix PR6761, a miscompilation due to the fp->int IV conversion
stuff.  More bugs remain though.

llvm-svn: 100282
2010-04-03 06:30:03 +00:00
Chris Lattner 42202868c3 just eliminate the uitofp checks. This code isn't doing
the required validity checks in the first place, and supporting
a condition large enough to require the 32nd bit isn't worth it.

llvm-svn: 100280
2010-04-03 06:25:21 +00:00
Chris Lattner ca25b60f4e rename PH -> PN to be consistent with WeakPN and the rest
of llvm.

llvm-svn: 100276
2010-04-03 06:17:08 +00:00
Chris Lattner 774858fc38 improve comment and drop a dead check. If PH had
no uses, it would have been deleted by 
RecursivelyDeleteTriviallyDeadInstructions

llvm-svn: 100275
2010-04-03 06:16:22 +00:00
Chris Lattner 915322bc4a strength reduce a ridiculous use of APInt.
llvm-svn: 100274
2010-04-03 06:13:12 +00:00
Chris Lattner 0b941347f9 rename stuff and improve comment grammar.
llvm-svn: 100273
2010-04-03 06:11:07 +00:00
Chris Lattner d77bde5f94 simplify some code and resolve a fixme.
llvm-svn: 100272
2010-04-03 06:06:59 +00:00
Chris Lattner 2ff33f91d5 There is no guarantee that the increment and the branch
are in the same block.  Insert the new increment in the
correct location.

Also, more cleanups.

llvm-svn: 100271
2010-04-03 06:05:10 +00:00
Chris Lattner c558b49f14 first half of a pass through IndVarSimplify::HandleFloatingPointIV;
this cleans up a bunch of code and also fixes several crashes and
miscompiles.  More to come, unfortunately: this optimization
is quite broken.

llvm-svn: 100270
2010-04-03 05:54:59 +00:00
Evan Cheng ed66db3f9b Code refactoring.
llvm-svn: 100262
2010-04-03 02:23:43 +00:00
Mon P Wang 999c1b927b Revert r100191 since it breaks objc in clang
llvm-svn: 100199
2010-04-02 18:43:02 +00:00
Mon P Wang a972ab8564 Reapply address space patch after fixing an issue in MemCopyOptimizer.
Added support for address spaces and added an isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

llvm-svn: 100191
2010-04-02 18:04:15 +00:00
Dan Gohman f7239102fe Manually notify ScalarEvolution before making an operand replacement, since
it can't currently observe such changes automatically.

llvm-svn: 100186
2010-04-02 14:48:31 +00:00
Gabor Greif 5d5db5342b Introduce ImmutableCallSite, useful for contexts where no mutation
is necessary. Inherits from new templated baseclass CallSiteBase<>
which is highly customizable. Base CallSite on it too, in a configuration
that allows full mutation.
Adapt some call sites in analyses to employ ImmutableCallSite.

llvm-svn: 100100
2010-04-01 08:21:08 +00:00
Dale Johannesen b67a6e6620 Fix a nasty dangling-pointer heisenbug that could
generate wrong code pretty much anywhere AFAICT.
Constructing a testcase that hits the bug reproducibly is
impossible, but the situation was like this:
Addr = ...
Store -> Addr
Addr2 = GEP , 0, 0
Store -> Addr2
Handling the first store, the code replaced Addr
with a sunkaddr and deleted Addr, but not its table
entry.  Code in OptimizedBlock replaced Addr2 with a
bitcast; if that happened to reuse the memory of Addr,
the old table entry was erroneously found when handling
the second store.

llvm-svn: 100044
2010-03-31 20:37:15 +00:00
Bob Wilson 6f7fd28824 Revert Mon Ping's change 99928, since it broke all the llvm-gcc buildbots.
llvm-svn: 99948
2010-03-30 22:27:04 +00:00
Mon P Wang 7460571381 Added support for address spaces and added an isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
An update of LangRef will occur in a subsequent checkin.

llvm-svn: 99928
2010-03-30 20:55:56 +00:00
Jeffrey Yasskin 12fd516e51 Remove another memory leak from ABCD by using Edges by value instead of
pointer.  There was also a SmallPtrSet whose settiness wasn't being used, so I
changed it to a SmallVector.

llvm-svn: 99713
2010-03-27 09:09:17 +00:00
Jeffrey Yasskin 97e613b6da In ABCD, change the non-null Bound*s to Bound&s.
llvm-svn: 99711
2010-03-27 08:15:46 +00:00
Jeffrey Yasskin 33bc7e4cb5 Fix a memory leak in ABCD by giving ownership of Bound objects to the
MemoizedResultChart.

llvm-svn: 99710
2010-03-27 08:09:24 +00:00
Dan Gohman d42e09d91e Ignore debug intrinsics in yet more places.
llvm-svn: 99580
2010-03-26 00:33:27 +00:00
Gabor Greif c78d720f02 rename use_const_iterator to const_use_iterator for consistency's sake
llvm-svn: 99564
2010-03-25 23:06:16 +00:00
Chris Lattner 0563804982 fix PR6642, GVN forwarding from memset to load of the base of the memset.
llvm-svn: 99488
2010-03-25 05:58:19 +00:00
Evan Cheng c12c2d9bb4 Move OptChkCall off LibCallOptimization into StrCpyOpt.
llvm-svn: 99418
2010-03-24 20:19:04 +00:00
Gabor Greif a2fbc0ae1b Finally land the InvokeInst operand reordering.
I have audited all getOperandNo calls now, fixing
hidden assumptions. CallSite-related ugliness will
be eliminated successively.

Note this patch has a long and grievous history;
for all the back-and-forths, have a look at
CallSite.h's log.

llvm-svn: 99399
2010-03-24 13:21:49 +00:00
Gabor Greif 9027ffb918 increase const goodness and remove pointless getUser() calls
llvm-svn: 99395
2010-03-24 10:29:52 +00:00
Bill Wendling 04803e8ef6 Skip debugging intrinsics when sinking unused invariants.
llvm-svn: 99324
2010-03-23 21:15:59 +00:00
Evan Cheng d9e822345c Teach simplify libcall to transform __strcpy_chk to __memcpy_chk to enable optimizations downstream.
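
Roughly, the pattern being rewritten looks like this fortified copy, where
the source length is a compile-time constant (illustrative snippet only):

  #include <string.h>

  void copy_greeting(char dst[16]) {
    /* under -D_FORTIFY_SOURCE this strcpy is emitted as __strcpy_chk; since
       the source length (6 bytes including the NUL) is known, it can be
       rewritten as a sized, checked memcpy and then optimized further */
    strcpy(dst, "hello");
  }
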
llvm-svn: 99282
2010-03-23 15:48:04 +00:00
Gabor Greif e1517a084f backing out r99170 because it still fails on clang-x86_64-darwin10-fnt
llvm-svn: 99171
2010-03-22 09:11:00 +00:00
Gabor Greif 7a743e15e3 Now that hopefully all direct accesses to InvokeInst operands are fixed
we can reapply the InvokeInst operand reordering patch. (see r98957).

llvm-svn: 99170
2010-03-22 08:28:00 +00:00
Dan Gohman 1a2abe5580 Clear the SCEVExpander's insertion point after making deletions,
so that the SCEVExpander doesn't retain a dangling pointer as its
insert position. The dangling pointer in this case wasn't ever used
to insert new instructions, but it was causing trouble with
SCEVExpander's code for automatically advancing its insert position
past debug intrinsics.

This fixes use-after-free errors that valgrind noticed in
test/Transforms/IndVarSimplify/2007-06-06-DeleteDanglesPtr.ll and
test/Transforms/IndVarSimplify/exit_value_tests.ll.

llvm-svn: 99036
2010-03-20 03:53:53 +00:00
Gabor Greif 6c56ed847e back out r98957, it broke http://smooshlab.apple.com:8010/builders/clang-x86_64-darwin10-fnt/builds/703 in the nightly test suite
llvm-svn: 98958
2010-03-19 13:50:02 +00:00
Gabor Greif 8335f9c0bf Recommit r80858 again (which has been backed out in r80871).
This time I did a self-hosted bootstrap on Linux x86-64,
with no problems. Let's see how darwin 64-bit self-hosting
goes. At the first sign of failure I'll back this out.

Maybe the valgrind bots give me a hint of what may be wrong
(if at all).

llvm-svn: 98957
2010-03-19 11:55:53 +00:00
Benjamin Kramer f2e4b5dd7f str[r]chr returns its pointer argument so we cannot mark it as nocapture. Thanks to Duncan for spotting my mistake.
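
A small illustration of why the attribute is wrong (sketch, not from the
patch): the returned pointer aliases the argument, so the argument escapes
through the return value.

  #include <string.h>

  const char *first_slash(const char *path) {
    /* the result points into 'path', so 'path' is captured via the
       return value and must not be marked nocapture */
    return strchr(path, '/');
  }
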
llvm-svn: 98671
2010-03-16 20:33:15 +00:00
Benjamin Kramer 5cf5fd2ffa Mark str[r]chr readonly.
llvm-svn: 98663
2010-03-16 19:36:43 +00:00
Devang Patel 45c1505bf6 Skip debug info intrinsics.
llvm-svn: 98584
2010-03-15 22:23:03 +00:00
Devang Patel d3f41e8939 In an "empty" bb, the return instruction may not be the first instruction if dbg value intrinsics are present in the bb. Use the terminator to find return instructions.
llvm-svn: 98565
2010-03-15 19:05:46 +00:00
Bill Wendling 55e69d179b Skip over debug info when trying to merge two return BBs.
llvm-svn: 98491
2010-03-14 10:40:55 +00:00
Benjamin Kramer 7b88a49f3e Factor checked library call optimization into a common helper class and use it
to unify the almost identical code in CodeGenPrepare and InstCombineCalls.

llvm-svn: 98338
2010-03-12 09:27:41 +00:00
Nate Begeman 2e41605d4f Whoops, this already existed.
llvm-svn: 98297
2010-03-11 23:21:19 +00:00
Nate Begeman 5daa235c91 Add a handful of additional useful pass manager things to the C API
llvm-svn: 98296
2010-03-11 23:06:07 +00:00
Benjamin Kramer 2fc395659c stpcpy is so similar to strcpy, it doesn't deserve a complete copy of the __strcpy_chk -> strcpy code.
llvm-svn: 98284
2010-03-11 20:45:13 +00:00
Eric Christopher 607de1de53 Lower stpcpy_chk when possible.
llvm-svn: 98274
2010-03-11 19:24:34 +00:00
Eric Christopher 4b7948e09e Do some final lowering in CodeGenPrepare of _chk calls similar to
that in InstCombineCalls.

More call lowering needed.

llvm-svn: 98228
2010-03-11 02:41:03 +00:00
Dan Gohman 2734ebd37f Add a DominatorTree argument to isLCSSA so that it doesn't have to
compute a set of reachable blocks for itself each time it is called, which
is fairly frequently.

llvm-svn: 98179
2010-03-10 19:38:49 +00:00
Eric Christopher a7fb58f5f5 Migrate _chk call lowering from SimplifyLibCalls to InstCombine. Stub
out the remainder of the calls that we should lower in some way and
move the tests to the new correct directory. Fix up tests that are now
optimized more than they were before by -instcombine.

llvm-svn: 97875
2010-03-06 10:50:38 +00:00
Eric Christopher 87abfc506f Move SimplifyLibCalls's LibCall builders to a separate file so they
can be used in more places.  Add an argument for the TargetData that
most of them need. Update for the getInt8PtrTy() change.  Should be
no functionality change.

llvm-svn: 97844
2010-03-05 22:25:30 +00:00
Evan Cheng d214ed0e75 Safely turn memset_chk etc. to non-chk variant if the known object size is >= memset / memcpy / memmove size.
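
A sketch of the pattern this covers (hypothetical names): when the
destination's object size is known and is at least as large as the store,
the runtime check can never fire, so the checked call degrades to the
plain libc call.

  #include <string.h>

  void clear_header(void) {
    char buf[64];
    /* the object size of buf is 64 >= 32, so a fortified __memset_chk of
       32 bytes here can safely become a plain memset */
    memset(buf, 0, 32);
  }
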
llvm-svn: 97828
2010-03-05 20:59:47 +00:00
Chris Lattner c6c1523f59 fix a nice subtle reassociate bug which would only occur
in a very specific use pattern embodied in the carefully
reduced testcase.

llvm-svn: 97794
2010-03-05 07:18:54 +00:00
Eric Christopher 4899cbc77d Move GetStringLength and helper from SimplifyLibCalls to ValueTracking.
No functionality change.

llvm-svn: 97793
2010-03-05 06:58:57 +00:00
Dan Gohman 29707de4fe Make SCEVExpander and LSR more aggressive about hoisting expressions out
of loops.

llvm-svn: 97642
2010-03-03 05:29:13 +00:00
Dan Gohman 52f5563973 Non-affine post-inc SCEV expansions have more code which must be
emitted after the increment. Make sure the insert position
reflects this. This fixes PR6453.

llvm-svn: 97537
2010-03-02 01:59:21 +00:00
Bob Wilson 0fd415820b Don't attempt load PRE when there is no real redundancy (i.e., the load is in
a loop and is itself the only dependency).

llvm-svn: 97526
2010-03-02 00:09:29 +00:00
Bob Wilson 892432b7ef When GVN needs to split critical edges for load PRE, check all of the
predecessors before returning.  Otherwise, if multiple predecessor edges need
splitting, we only get one of them per iteration.  This makes a small but
measurable compile time improvement with -enable-full-load-pre.

llvm-svn: 97521
2010-03-01 23:37:32 +00:00
Evan Cheng 7263cf8431 MemoryDepAnalysis is not used if redundant load processing is disabled.
llvm-svn: 97512
2010-03-01 22:23:12 +00:00
Dan Gohman 8b0a419eb1 Spelling fixes.
llvm-svn: 97453
2010-03-01 17:49:51 +00:00
Bob Wilson 1136166ee9 Revert r97245 which seems to be causing performance problems.
llvm-svn: 97366
2010-02-28 05:34:05 +00:00
Chris Lattner 2af7e3dceb fix grammaro's pointed out by daniel
llvm-svn: 97313
2010-02-27 07:50:40 +00:00
Chris Lattner d887f1da73 fix PR6414, a nondeterminism issue in IPSCCP which was caused
by a subtle interaction in a loop operating in DenseMap order.

llvm-svn: 97288
2010-02-27 00:07:42 +00:00
Bob Wilson ed1b0c31a7 Move the EnableFullLoadPRE flag from a separate command-line option to an
argument of createGVNPass and set it automatically for -O3.

llvm-svn: 97245
2010-02-26 19:09:47 +00:00
Bob Wilson d4655991c3 Remove unused "NoPRE" parameter in GVN and createGVNPass().
llvm-svn: 97235
2010-02-26 18:35:19 +00:00
Dan Gohman a9c205cc88 Make LoopSimplify change conditional branches in loop exiting blocks
which branch on undef to branch on a boolean constant for the edge
exiting the loop. This helps ScalarEvolution compute trip counts for
loops.

Teach ScalarEvolution to recognize single-value PHIs, when safe, and
teach ForgetSymbolicName to forget such single-value PHI nodes as
appropriate.

llvm-svn: 97126
2010-02-25 06:57:05 +00:00
Daniel Dunbar 693ea89214 Reapply r97010, the speculative revert failed.
llvm-svn: 97036
2010-02-24 08:48:04 +00:00
Daniel Dunbar 0a2031e5b6 Speculatively revert r97010, "Add an argument to PHITranslateValue to specify
the DominatorTree. ...", in hopes of restoring poor old PPC bootstrap.

llvm-svn: 97027
2010-02-24 06:55:22 +00:00
Bob Wilson 66e58ac742 Add an argument to PHITranslateValue to specify the DominatorTree. If this
argument is non-null, pass it along to PHITranslateSubExpr so that it can
prefer using existing values that dominate the PredBB, instead of just
blindly picking the first equivalent value that it finds on a uselist.
Also when the DominatorTree is specified, have PHITranslateValue filter
out any result that does not dominate the PredBB.  This is basically just
refactoring the check that used to be in GetAvailablePHITranslatedSubExpr
and also in GVN.

Despite my initial expectations, this change does not affect the results
of GVN for any testcases that I could find, but it should help compile time.
Before this change, if PHITranslateSubExpr picked a value that does not
dominate, PHITranslateWithInsertion would then insert a new value, which GVN
would later determine to be redundant and would replace.  By picking a good
value to begin with, we save GVN the extra work of inserting and then
replacing a new value.

llvm-svn: 97010
2010-02-24 01:39:00 +00:00
Bob Wilson 923261bbe9 Update memdep when load PRE inserts a new load, and add some debug output.
I don't have a small testcase for this.

llvm-svn: 96890
2010-02-23 05:55:00 +00:00
Bob Wilson 1da9041913 Erase deleted instructions from GVN's ValueTable. This fixes assertion
failures from ValueTable::verifyRemoved() when using -debug.

llvm-svn: 96805
2010-02-22 21:39:41 +00:00
Dan Gohman 8c16b38262 Remove unused variables and parameters.
llvm-svn: 96780
2010-02-22 04:11:59 +00:00
Dan Gohman 4506fcb3c2 When emitting an instruction which depends on both a post-incremented
induction variable value and a loop-variant value, don't force the
insert position to be at the post-increment position, because it may
not be dominated by the loop-variant value. This fixes a
use-before-def problem noticed on PPC.

llvm-svn: 96774
2010-02-22 03:59:54 +00:00
Dan Gohman 740909be2d This cast<Instruction> is unnecessary.
llvm-svn: 96771
2010-02-22 02:07:36 +00:00
Dan Gohman 4eebb94094 Rename getSDiv to getExactSDiv to reflect its behavior in cases where
the division would have a remainder.

llvm-svn: 96693
2010-02-19 19:35:48 +00:00
Dan Gohman 85af256779 Check for overflow when scaling up an add or an addrec for
scaled reuse.

llvm-svn: 96692
2010-02-19 19:32:49 +00:00
Dale Johannesen 1d6827adef recommit 96626; the evidence that it broke things appears
to be spurious.

llvm-svn: 96662
2010-02-19 07:14:22 +00:00
Dale Johannesen 1f790c28d0 Revert 96626, which causes build failure on ppc Darwin.
llvm-svn: 96653
2010-02-19 01:54:37 +00:00
Dan Gohman 2446f57503 When determining the set of interesting reuse factors, consider
strides in foreign loops. This helps locate reuse opportunities
with existing induction variables in foreign loops and reduces
the need for inserting new ones. This fixes rdar://7657764.

llvm-svn: 96629
2010-02-19 00:05:23 +00:00
Dan Gohman 60b3326435 Indvars needs to explicitly notify ScalarEvolution when it is replacing
a loop exit value, so that if a loop gets deleted, ScalarEvolution
isn't stuck holding on to dangling SCEVAddRecExprs for that loop. This
fixes PR6339.

llvm-svn: 96626
2010-02-18 23:26:33 +00:00
Dan Gohman c43d264cc0 Hoist this loop-invariant logic out of the loop.
llvm-svn: 96614
2010-02-18 21:34:02 +00:00
Dan Gohman 13ac3b2139 Delete some unneeded casts.
llvm-svn: 96429
2010-02-17 00:42:19 +00:00
Dan Gohman 5f10d6c52c Don't attempt to divide INT_MIN by -1; consider such cases to
have overflowed.
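
For context, a minimal illustration of the arithmetic fact being guarded
against (standard C, not the pass's own code):

  #include <limits.h>

  /* INT_MIN / -1 overflows because -INT_MIN is not representable in
     two's complement, so treat that combination as "overflowed" */
  int checked_div(int a, int b) {
    if (b == 0 || (a == INT_MIN && b == -1))
      return 0;   /* caller-defined fallback for the overflow case */
    return a / b;
  }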

llvm-svn: 96428
2010-02-17 00:41:53 +00:00
Bob Wilson aff96b2132 Rename SuccessorNumber to GetSuccessorNumber.
llvm-svn: 96387
2010-02-16 21:06:42 +00:00
Dan Gohman 6deab96c81 Refactor rewriting for PHI nodes into a separate function.
llvm-svn: 96382
2010-02-16 20:25:07 +00:00
Bob Wilson 92cdb6eec5 Split critical edges as needed for load PRE.
llvm-svn: 96378
2010-02-16 19:51:59 +00:00
Bob Wilson 3de492ec35 Refactor to share code to find the position of a basic block successor in the
terminator's list of successors.

llvm-svn: 96377
2010-02-16 19:49:17 +00:00
Dan Gohman 0849ed5e26 Fix whitespace.
llvm-svn: 96372
2010-02-16 19:42:34 +00:00
Duncan Sands 19d0b47b1f There are two ways of checking for a given type, for example isa<PointerType>(T)
and T->isPointerTy().  Convert most instances of the first form to the second form.
Requested by Chris.

llvm-svn: 96344
2010-02-16 11:11:14 +00:00
Dan Gohman 521efe68ab Split the main for-each-use loop again, this time for GenerateTruncates,
as it also peeks at which registers are being used by other uses. This
makes LSR less sensitive to use-list order.

llvm-svn: 96308
2010-02-16 01:42:53 +00:00
Duncan Sands 9dff9bec31 Uniformize the names of type predicates: rather than having isFloatTy and
isInteger, we now have isFloatTy and isIntegerTy.  Requested by Chris!

llvm-svn: 96223
2010-02-15 16:12:20 +00:00
Dan Gohman e4e51a63da Fix whitespace.
llvm-svn: 96179
2010-02-14 18:51:39 +00:00
Dan Gohman e7f74bb16c Fix a comment.
llvm-svn: 96178
2010-02-14 18:51:20 +00:00
Dan Gohman bb7d52213c When complicated expressions are broken down into subexpressions
with multiplication by constants distributed through, occasionally
those subexpressions can include both x and -x. For now, if this
condition is discovered within LSR, just prune such cases away,
as they won't be profitable. This fixes a "zero allocated in a
base register" assertion failure.

llvm-svn: 96177
2010-02-14 18:50:49 +00:00
Dan Gohman 2d0f96d49a Actually, this code doesn't have to be quite so conservative in
the no-TLI case. But it should still default to declining the
transformation.

llvm-svn: 96152
2010-02-14 03:21:49 +00:00
Dan Gohman cb76a806f0 Don't attempt aggressive post-inc uses if TargetLowering is not available,
because profitability can't be sufficiently approximated.

llvm-svn: 96148
2010-02-14 02:45:21 +00:00
John McCall 0daaf13b97 Make LSR not crash if invoked without target lowering info, e.g. if invoked
from opt.

llvm-svn: 96135
2010-02-13 23:40:16 +00:00
Chris Lattner b8639bc2d1 remove dead code.
llvm-svn: 96109
2010-02-13 19:07:06 +00:00
Chris Lattner 42c66b7270 Split some code out to a helper function (FindReusablePredBB)
and add a doxygen comment.

Cache the phi entry to avoid doing tons of 
PHINode::getBasicBlockIndex calls in the common case.

On my insane testcase from re2c, this speeds up CGP from
617.4s to 7.9s (78x).

llvm-svn: 96083
2010-02-13 05:35:08 +00:00
Chris Lattner 96b8826542 speed up CGP a bit by scanning predecessors through phi operands
instead of with pred_begin/end.

llvm-svn: 96078
2010-02-13 04:04:42 +00:00
Dan Gohman 5b18f039eb Fix a pruning heuristic which implicitly assumed that SmallPtrSet is
deterministically sorted.

llvm-svn: 96071
2010-02-13 02:06:02 +00:00
Dan Gohman 2b75de97c0 Reapply 95979, a compile-time speedup, now that the bug it exposed is fixed.
llvm-svn: 96005
2010-02-12 19:35:25 +00:00
Dan Gohman 363f847ec6 Fix this code to avoid dereferencing an end() iterator in
offset distributions it doesn't expect.

llvm-svn: 96002
2010-02-12 19:20:37 +00:00
Daniel Dunbar e0b2c69d3c Revert "Reverse the order for collecting the parts of an addrec. The order", it
is breaking llvm-gcc bootstrap.

llvm-svn: 95988
2010-02-12 17:27:08 +00:00
Dan Gohman 0194f58047 Reverse the order for collecting the parts of an addrec. The order
doesn't matter, except that ScalarEvolution tends to need less time
to fold the results this way.

llvm-svn: 95979
2010-02-12 11:08:26 +00:00
Dan Gohman 45774ce0ad Reapply the new LoopStrengthReduction code, with compile time and
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.

This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.

llvm-svn: 95975
2010-02-12 10:34:29 +00:00
Chris Lattner c053cbbc4d Make DSE only scan blocks that are reachable from the entry
block.  Other blocks may have pointer cycles that will crash
basicaa and other alias analyses.  In any case, there is no
point wasting cycles optimizing dead blocks.  This fixes 
rdar://7635088

llvm-svn: 95852
2010-02-11 05:11:54 +00:00
Chris Lattner d924f63692 Make jump threading honor x|undef -> true and x&undef -> false,
instead of considering x|undef -> x, which may not be true.

llvm-svn: 95850
2010-02-11 04:40:44 +00:00
Devang Patel 03936a1880 Ignore dbg info intrinsics.
llvm-svn: 95828
2010-02-11 00:20:49 +00:00
Dan Gohman 4a618827de Fix "the the" and similar typos.
llvm-svn: 95781
2010-02-10 16:03:48 +00:00
Eric Christopher ad1aa86276 Pull these back out; they're a little too aggressive and
time-consuming for a simple optimization.

llvm-svn: 95671
2010-02-09 17:29:18 +00:00
Eric Christopher be2f0b2b7b Add file in here too.
llvm-svn: 95641
2010-02-09 01:11:03 +00:00
Eric Christopher 9f85e7eb16 Add a new pass to do llvm.objsize lowering using SCEV.
Initial skeleton and SCEVUnknown lowering implemented,
the rest should come relatively quickly.  Move testcase
to new directory.

Move pass to right before SimplifyLibCalls - which is
moved down a bit so we can take advantage of a few opts.

llvm-svn: 95628
2010-02-09 00:35:38 +00:00
Jakob Stoklund Olesen 5f9ead2714 Don't unroll loops containing function calls.
llvm-svn: 95454
2010-02-05 23:21:31 +00:00
Jakob Stoklund Olesen 916f48a054 Teach SimplifyCFG about magic pointer constants.
Weird code sometimes uses pointer constants other than null. This patch
teaches SimplifyCFG to build switch instructions in those cases.

Code like this:

void f(const char *x) {
  if (!x)
    puts("null");
  else if ((uintptr_t)x == 1)
    puts("one");
  else if (x == (char*)2 || x == (char*)3)
    puts("two");
  else if ((intptr_t)x == 4)
    puts("four");
  else
    puts(x);
}

Now becomes a switch:

define void @f(i8* %x) nounwind ssp {
entry:
  %magicptr23 = ptrtoint i8* %x to i64            ; <i64> [#uses=1]
  switch i64 %magicptr23, label %if.else16 [
    i64 0, label %if.then
    i64 1, label %if.then2
    i64 2, label %if.then9
    i64 3, label %if.then9
    i64 4, label %if.then14
  ]

Note that LLVM's own DenseMap uses magic pointers.

llvm-svn: 95439
2010-02-05 22:03:18 +00:00
Dan Gohman 4739e41ce9 Implement releaseMemory in CodeGenPrepare and free the BackEdges
container data. This prevents it from holding onto dangling
pointers and potentially behaving unpredictably.

llvm-svn: 95409
2010-02-05 19:24:11 +00:00
Bob Wilson 27dfb1e1a4 Do not reassociate expressions with i1 type. SimplifyCFG converts some
short-circuited conditions to AND/OR expressions, and those expressions
are often converted back to a short-circuited form in code gen.  The
original source order may have been optimized to take advantage of the
expected values, and if we reassociate them, we change the order and
subvert that optimization.  Radar 7497329.

llvm-svn: 95333
2010-02-04 23:32:37 +00:00
Bob Wilson 04365c5f72 Adjust the heuristics used to decide when SROA is likely to be profitable.
The SRThreshold value makes perfect sense for checking if an entire aggregate
should be promoted to a scalar integer, but it is not so good for splitting
an aggregate into its separate elements.  A struct may contain a large embedded
array along with some scalar fields that would benefit from being split apart
by SROA.  Even if the total aggregate size is large, it may still be good to
perform SROA.  Thus, the most important piece of this patch is simply moving
the aggregate size comparison vs. SRThreshold so that it guards only the
aggregate promotion.

We have also been checking the number of elements to decide if an aggregate
should be split up.  The limit of "SRThreshold/4" seemed rather arbitrary,
and I don't think it's very useful to derive this limit from SRThreshold
anyway.  I've collected some data showing that the current default limit of
32 (since SRThreshold defaults to 128) is a reasonable cutoff for struct
types.  One thing suggested by the data is that distinguishing between structs
and arrays might be useful.  There are (obviously) a lot more large arrays
than large structs (as measured by the number of elements and not the total
size -- a large array inside a struct still counts as a single element given
the way we do SROA right now).  Out of 8377 arrays where we successfully
performed SROA while compiling a large set of benchmarks, only 16 of them had
more than 8 elements.  And, for those 16 arrays, it's not at all clear that
SROA was actually beneficial.  So, to offset the compile time cost of
investigating more large structs for SROA, the patch lowers the limit on array
elements to 8.

This fixes Apple Radar 7563690.
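
A hypothetical aggregate of the shape described above, where the old
whole-aggregate size check would have blocked SROA even though splitting
off the scalar fields is clearly worthwhile:

  struct S {
    int    flag;       /* scalar fields that benefit from being split out */
    double score;
    char   blob[512];  /* large embedded array inflates the total size but
                          still counts as a single element when splitting */
  };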

llvm-svn: 95224
2010-02-03 17:23:56 +00:00
Evan Cheng 27a41d5473 Revert 94937 and move the noreturn check to codegen.
llvm-svn: 95198
2010-02-03 03:55:59 +00:00
Bob Wilson 76e8c59509 Fix some comment typos.
llvm-svn: 95170
2010-02-03 00:33:21 +00:00
Eric Christopher d86233c118 Recommit this, looks like it wasn't the cause.
llvm-svn: 95165
2010-02-03 00:21:58 +00:00
Eric Christopher e67d01a9a8 Hopefully temporarily revert this.
llvm-svn: 95154
2010-02-02 23:01:31 +00:00
Eric Christopher 4264e7e46f Re-add strcmp and known size object size checking optimization.
Passed bootstrap and nightly test run here.

llvm-svn: 95145
2010-02-02 22:10:43 +00:00
Chris Lattner 302240d73e fix a crash in loop unswitch on a loop invariant vector condition.
llvm-svn: 95055
2010-02-02 02:26:54 +00:00
Eric Christopher 14dfc3f6df Don't need to check the last argument since it'll always be bool. We also
don't use TargetData here.

llvm-svn: 95040
2010-02-02 00:51:45 +00:00
Eric Christopher 9afa973203 More indentation/tabification fixes.
llvm-svn: 95036
2010-02-02 00:13:06 +00:00
Eric Christopher 1408234753 Untabify previous commit.
llvm-svn: 95035
2010-02-02 00:06:55 +00:00
Eric Christopher 56e4182c49 Formatting.
llvm-svn: 95027
2010-02-01 23:25:03 +00:00
Bob Wilson d517b52012 Add an option to GVN to remove all partially redundant loads. This is currently
disabled by default.  This divides the existing load PRE code into 2 phases:
first it checks that it is safe to move the load to each of the predecessors
where it is unavailable, and then if it is safe, the code is changed to move
the load.  Radar 7571861.

llvm-svn: 95007
2010-02-01 21:17:14 +00:00
Evan Cheng d86d3fe0c3 Do not mark no-return calls as tail calls. It'll screw up special calls like longjmp, and it doesn't make much sense for performance reasons. If my logic is faulty, please let me know.
llvm-svn: 94937
2010-01-31 00:59:31 +00:00
Bob Wilson 56600a15ad Check alignment of loads when deciding whether it is safe to execute them
unconditionally.  Besides checking the offset, also check that the underlying
object is aligned as much as the load itself.

llvm-svn: 94875
2010-01-30 04:42:39 +00:00
Eric Christopher 5a0e174863 Revert my last couple of patches. They appear to have broken bison.
llvm-svn: 94841
2010-01-29 21:16:24 +00:00
Bob Wilson 7c42b9d51e Improve isSafeToLoadUnconditionally to recognize that GEPs with constant
indices are safe if the result is known to be within the bounds of the
underlying object.
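
A small C sketch of the kind of access this enables speculating
(hypothetical types; assumes p points at a complete object):

  struct pair { int a, b; };

  int second_or_zero(struct pair *p, int cond) {
    /* &p->b is a GEP with constant indices that stays within the bounds
       of *p, so the load can be executed unconditionally */
    return cond ? p->b : 0;
  }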

llvm-svn: 94829
2010-01-29 19:19:08 +00:00
Eric Christopher 9b3c02b7da Make strcpy_chk lower to strcpy if we have a safe size.
llvm-svn: 94783
2010-01-29 01:37:11 +00:00
Bill Wendling 48816a0b3f Generic reformatting and comment fixing. No functionality change.
llvm-svn: 94771
2010-01-29 00:52:43 +00:00
Bill Wendling 8277838cf8 Add newline to debugging output, and fix some grammar-os in comment.
llvm-svn: 94765
2010-01-29 00:27:39 +00:00
Benjamin Kramer 40582a891c Use the less expensive getName function instead of getNameStr.
llvm-svn: 94683
2010-01-27 19:46:52 +00:00
Bob Wilson 70c8fe5e4e Remove check for an impossible condition: the condition of the while loop has
already checked that TmpBB->getSinglePredecessor() is non-null.

llvm-svn: 94451
2010-01-25 21:28:05 +00:00
Bob Wilson fc060e4337 Change Value::getUnderlyingObject to have the MaxLookup value specified as a
parameter with a default value, instead of just hardcoding it in the
implementation.  The limit of MaxLookup = 6 was introduced in r69151 to fix
a performance problem with O(n^2) behavior in instcombine, but the scalarrepl
pass is relying on getUnderlyingObject to go all the way back to an AllocaInst.
Making the limit part of the method signature makes it clear that by default
the result is limited and should help avoid similar problems in the future.
This fixes pr6126.

llvm-svn: 94433
2010-01-25 18:26:54 +00:00
Chris Lattner 823aed16f9 make -fno-rtti the default unless a directory builds with REQUIRES_RTTI.
llvm-svn: 94378
2010-01-24 20:43:08 +00:00
Chris Lattner 29b15c5cfd third bug from PR6119: the xor dupe extension allows
for arbitrary terminators in predecessors; don't assume
it is a conditional or uncond branch.  The testcase shows
an example where they can happen with switches.

llvm-svn: 94323
2010-01-23 19:21:31 +00:00
Chris Lattner ba2d0b89ff add an early out to ProcessBranchOnXOR to speed it up, and
handle the case when we can infer an input to the xor
from all inputs that agree, instead of going into an
infinite loop.  Another part of PR6199

llvm-svn: 94321
2010-01-23 19:16:25 +00:00
Chris Lattner de5ab4860f fix a crash in jump threading, PR6119
llvm-svn: 94319
2010-01-23 18:56:07 +00:00
Eric Christopher ba7cd4c393 Reapply 94059 while fixing the calling convention setup
for strcpy.

llvm-svn: 94287
2010-01-23 05:29:06 +00:00
Bob Wilson 6c0c8d41b4 Revert 94059. It is breaking the MultiSource/Benchmarks/Prolangs-C/bison
test on ARM.

llvm-svn: 94198
2010-01-22 19:16:40 +00:00
Chris Lattner 7ba0661f27 Stop building RTTI information for *most* llvm libraries. Notable
missing ones are libsupport, libsystem and libvmcore.  libvmcore is
currently blocked on bugpoint, which uses EH.  Once it stops using
EH, we can switch it off.

This #if 0's out 3 unit tests, because gtest requires RTTI information.
Suggestions welcome on how to fix this.

llvm-svn: 94164
2010-01-22 06:49:46 +00:00
Dan Gohman 045f81981a Revert LoopStrengthReduce.cpp to pre-r94061 for now.
llvm-svn: 94123
2010-01-22 00:46:49 +00:00
Victor Hernandez 1df65186d1 DbgInfoIntrinsics no longer appear in an instruction's use list, so clean up the code that looks for them when iterating over uses, and remove OnlyUsedByDbgInfoIntrinsics()
llvm-svn: 94111
2010-01-21 23:05:53 +00:00
Dan Gohman b1ee154b6b When inserting expressions for post-increment users which contain
loop-variant components, adds must be inserted after the increment.
Keep track of the increment position for this case, and insert
these adds in the correct location.

llvm-svn: 94110
2010-01-21 23:01:22 +00:00
Dan Gohman cb8d577eb2 Include IVUsers information in LSR's debug output.
llvm-svn: 94108
2010-01-21 22:46:32 +00:00
Dan Gohman 29916e023d Prune the search for candidate formulae if the number of register
operands exceeds the number of registers used in the initial
solution, as that wouldn't lead to a profitable solution anyway.

llvm-svn: 94107
2010-01-21 22:42:49 +00:00
Dan Gohman c903499ff8 Add a comment.
llvm-svn: 94104
2010-01-21 21:31:09 +00:00
Dan Gohman 51ad99d2c5 Re-implement the main strength-reduction portion of LoopStrengthReduction.
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.

It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework that allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.

llvm-svn: 94061
2010-01-21 02:09:26 +00:00
Eric Christopher fa863258d0 Add strcpy_chk -> strcpy support for "don't know" object size
answers.  This will update as object size checking gets better information.

llvm-svn: 94059
2010-01-21 01:04:38 +00:00
Dan Gohman ca19445d08 When doing address-mode sinking, expand the base register first, rather
than the scaled register. This makes it more likely that subsequent
AddrModeMatcher queries will match the new address the same way as the
old, instead of accidentally matching what had been the base register
as the new scaled register, and then failing to match the scaled register.
This fixes some problems with address-mode sinking multiple muls into a
block, which will be a lot more common with some upcoming
LoopStrengthReduction changes.

llvm-svn: 93935
2010-01-19 22:45:06 +00:00
Bob Wilson 58d59fe394 Fix a crash in scalarrepl for memcpy/memmove where the source and destination
are the same.  I had already fixed a similar problem where the source and
destination were different bitcasts derived from the same alloca, but the
previous fix still did not handle the case where both operands are exactly
the same value.  Radar 7552893.
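
A reduced sketch of the degenerate input (illustrative only; the original
showed up via bitcasts in the IR):

  #include <string.h>

  struct box { int v[4]; };

  void touch(void) {
    struct box b = {{0}};
    /* source and destination of the copy are exactly the same alloca */
    memcpy(&b, &b, sizeof b);
  }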

llvm-svn: 93848
2010-01-19 04:32:48 +00:00
Owen Anderson cdea3572fa Convert some of the dynamic opcode lookups into static ones.
llvm-svn: 93693
2010-01-17 19:33:27 +00:00
Chris Lattner 573da8ac90 1) Use the new SimplifyInstructionsInBlock routine instead of the copy
in JT.

2) When cloning blocks for PHI or xor conditions, use
instsimplify to simplify the code as we go.  This allows us to 
squish common cases early in JT which opens up opportunities for
subsequent iterations, and allows it to completely simplify the
testcase.

llvm-svn: 93253
2010-01-12 20:41:47 +00:00
Chris Lattner af7855d571 tidy up
llvm-svn: 93222
2010-01-12 02:07:50 +00:00
Chris Lattner eb73bdb2e1 Teach jump threading to duplicate small blocks when the branch
condition is a xor with a phi node.  This eliminates nonsense
like this from 176.gcc in several places:

 LBB166_84:
        testl   %eax, %eax
-       setne   %al
-       xorb    %cl, %al
-       notb    %al
-       testb   $1, %al
-       je      LBB166_85
+       je      LBB166_69
+       jmp     LBB166_85

This is rdar://7391699

llvm-svn: 93221
2010-01-12 02:07:17 +00:00
Chris Lattner 6a19ed0b86 some cleanup, and make it obvious that ProcessJumpOnPHI only works
on branches by renaming it and checking for a branch at the call site.

llvm-svn: 93208
2010-01-11 23:41:09 +00:00
Chris Lattner ab7087ad66 only factor from expressions whose uses are empty and whose
base is the right expression type.  This fixes PR5981.

llvm-svn: 93045
2010-01-09 06:01:36 +00:00
Duncan Sands 4a8b15dc74 Suppress an unused variable warning when assertions are off;
remove some trailing whitespace while there.

llvm-svn: 93008
2010-01-08 17:51:48 +00:00
Benjamin Kramer 76e2766442 Use a do-while loop instead of while + boolean.
llvm-svn: 92912
2010-01-07 13:50:07 +00:00
Eric Christopher 2cdb806fd8 Move the object size intrinsic optimization to inst-combine and make
it work for any integer size return type.

llvm-svn: 92853
2010-01-06 20:04:44 +00:00
Mikhail Glushenkov 40d2429b28 Formatting.
llvm-svn: 92831
2010-01-06 09:20:39 +00:00
Benjamin Kramer d2564e3afb Move remaining stuff to the isInteger predicate.
llvm-svn: 92771
2010-01-05 21:05:54 +00:00
Benjamin Kramer a81a6dff0d Convert a ton of simple integer type equality tests to the new predicate.
llvm-svn: 92760
2010-01-05 20:07:06 +00:00
Dan Gohman b5358003fb Set Changed properly after calling DeleteDeadPHIs.
llvm-svn: 92735
2010-01-05 16:31:45 +00:00
Dan Gohman 28943873e6 Use do+while instead of while for loops which obviously have a
non-zero trip count. Use SmallVector's pop_back_val().

llvm-svn: 92734
2010-01-05 16:27:25 +00:00
Chris Lattner f741d72b84 fix an infinite loop in reassociate building emacs.
llvm-svn: 92679
2010-01-05 04:55:35 +00:00
David Greene 241992382e Change errs() to dbgs().
llvm-svn: 92624
2010-01-05 01:27:47 +00:00
David Greene e0b9789593 Change errs() to dbgs().
llvm-svn: 92623
2010-01-05 01:27:44 +00:00
David Greene 6bc0776343 Change errs() to dbgs().
llvm-svn: 92622
2010-01-05 01:27:39 +00:00
David Greene 3a79df0993 Change errs() to dbgs().
llvm-svn: 92620
2010-01-05 01:27:33 +00:00
David Greene 0fd862254e Change errs() to dbgs().
llvm-svn: 92619
2010-01-05 01:27:30 +00:00
David Greene d17c3916d0 Change errs() to dbgs().
llvm-svn: 92617
2010-01-05 01:27:24 +00:00
David Greene 9ddc6e2e12 Change errs() to dbgs().
llvm-svn: 92615
2010-01-05 01:27:21 +00:00
David Greene 1efdb45562 Change errs() to dbgs().
llvm-svn: 92614
2010-01-05 01:27:19 +00:00
David Greene 2e6efc441f Change errs() to dbgs().
llvm-svn: 92613
2010-01-05 01:27:17 +00:00
David Greene 389fc3b9f6 Change errs() to dbgs().
llvm-svn: 92612
2010-01-05 01:27:15 +00:00
David Greene 74e2d4917d Change errs() to dbgs().
llvm-svn: 92611
2010-01-05 01:27:11 +00:00
David Greene 48c86bedbd Change errs() to dbgs().
llvm-svn: 92610
2010-01-05 01:27:09 +00:00
David Greene 0dd384cfd0 Change errs() to dbgs().
llvm-svn: 92609
2010-01-05 01:27:06 +00:00
David Greene d9c355d590 Change errs() to dbgs().
llvm-svn: 92608
2010-01-05 01:27:04 +00:00
Devang Patel be94f23992 Remove dead debug info intrinsics.
  Intrinsic::dbg_stoppoint
  Intrinsic::dbg_region_start
  Intrinsic::dbg_region_end
  Intrinsic::dbg_func_start
AutoUpgrade simply ignores these intrinsics now.

llvm-svn: 92557
2010-01-05 01:10:40 +00:00
Mikhail Glushenkov 6a8ac8ce8f 80-col violations, trailing whitespace.
llvm-svn: 92470
2010-01-04 07:55:25 +00:00
Chris Lattner c0e6640d3a move instcombine to its own library, it's past time.
llvm-svn: 92459
2010-01-04 06:23:24 +00:00
Chris Lattner 2d91231d82 implement an instcombine xform needed by clang's codegen
on the example in PR4216.  This doesn't trigger in the testsuite,
so I'd really appreciate someone scrutinizing the logic for
correctness.

llvm-svn: 92458
2010-01-04 06:03:59 +00:00
Chris Lattner 48218e42cd pull my debug hooks out, I'm done with this xform for now.
llvm-svn: 92446
2010-01-03 06:58:48 +00:00
Nick Lewycky 475d3d1215 Small cleanups, refactor some duplicated code into a single method. No
functionality change.

llvm-svn: 92445
2010-01-03 04:39:07 +00:00
Chris Lattner fca0c8f93a generalize the previous transformation to handle indexing into
arrays of structs and other arrays, so long as all the subsequent
indexes are constants.  This triggers frequently for stuff like:

@divisions = internal constant [29 x [2 x i32]] [[2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 2], [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2]], align 32 ; <[29 x [2 x i32]]*> [#uses=50]

	  %623 = getelementptr inbounds [29 x [2 x i32]]* @divisions, i64 0, i64 %619, i64 0 ; <i32*> [#uses=1]
	   %684 = icmp eq i32 %683, 999 

also for the "my_defs" table in 'gs', etc.

llvm-svn: 92444
2010-01-03 03:03:27 +00:00
Nick Lewycky ff9cd7ace7 Cleanup.
llvm-svn: 92436
2010-01-03 00:55:31 +00:00
Chris Lattner 98ad2b56cc teach instcombine to optimize idioms like A[i]&42 == 0. This
occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which
is copied in multiple apps) in _sch_istable, etc.
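
The rough shape of the idiom (table name and contents hypothetical):

  static const unsigned short flags_table[8] = {
    0x0001, 0x0002, 0x0020, 0x0008, 0x0041, 0x0004, 0x0010, 0x002a
  };

  /* a load from a constant table, masked and compared against zero;
     instcombine can now fold this into a direct test on the index */
  int lacks_bits(unsigned i) {
    return (flags_table[i & 7] & 42) == 0;
  }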

llvm-svn: 92427
2010-01-02 22:08:28 +00:00
Chris Lattner b56bef45f8 Teach the table lookup optimization to generate range compares
when a consecutive sequence of elements all satisfy the
predicate.  Like the double compare case, this generates better
code than the magic constant case and generalizes to more than
32/64 element array lookups.

Here are some examples where it triggers.  From 403.gcc, most
accesses to the rtx_class array are handled, e.g.:

@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=547]
   %142 = icmp eq i8 %141, 105
@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=543]
	   %165 = icmp eq i8 %164, 60      

Also, most of the 59-element arrays (mode_class/rid_to_yy, etc) 
optimized before are actually range compares.  This lets 32-bit
machines optimize them.

400.perlbmk has stuff like this:

400.perlbmk: PL_regkind, even for 32-bit:
@PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]*> [#uses=4]
	   %811 = icmp ne i8 %810, 33 

@PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]*> [#uses=94]
	   %12 = icmp ult i8 %10, 2
           
etc.

llvm-svn: 92426
2010-01-02 21:50:18 +00:00
Chris Lattner e199d2df80 theoretically the negate we find could be in a different function, check
for this case.

llvm-svn: 92425
2010-01-02 21:46:33 +00:00
Chris Lattner 2fa4ec70fc use enums for the over/underdefined markers for clarity. Switch
to using -2/-3 instead of -1/-2 for a future xform.

llvm-svn: 92423
2010-01-02 20:20:33 +00:00
Chris Lattner 351e22aa36 remove the random sampling framework, which is not maintained anymore.
If there is interest, it can be resurrected from SVN.  PR4912.

llvm-svn: 92422
2010-01-02 20:07:03 +00:00
Nick Lewycky a67519be12 Fix logic error in previous commit. The != case needs to become an or, not an
and.

llvm-svn: 92419
2010-01-02 16:14:56 +00:00
Nick Lewycky 357d41b3c1 Optimize pointer comparison into the typesafe form, now that the backends will
handle them efficiently. This is the opposite direction of the transformation
we used to have here.

llvm-svn: 92418
2010-01-02 15:25:44 +00:00
Chris Lattner cfda435c73 Generalize the previous xform to handle cases where exactly
two elements match or don't match with two comparisons.  For
example, the testcase compiles into:

define i1 @test5(i32 %X) {
  %1 = icmp eq i32 %X, 2                          ; <i1> [#uses=1]
  %2 = icmp eq i32 %X, 7                          ; <i1> [#uses=1]
  %R = or i1 %1, %2                               ; <i1> [#uses=1]
  ret i1 %R
}

This generalizes the previous xforms when the array is larger than
64 elements (and this case matches) and generates better code for
cases where it overlaps with the magic bitshift case.

This generalizes more cases than you might expect.  For example,
400.perlbmk has:

@PL_utf8skip = constant [256 x i8] c"\01\01\01\...
%15 = icmp ult i8 %7, 7

403.gcc has:
@rid_to_yy = internal constant [114 x i16] [i16 259, i16 260, ...
%18 = icmp eq i16 %16, 295 

and xalancbmk has a bunch of examples, such as 
_ZN11xercesc_2_5L15gCombiningCharsE and _ZN11xercesc_2_5L10gBaseCharsE.

llvm-svn: 92417
2010-01-02 09:35:17 +00:00
Chris Lattner c6ac078423 fix a miscompilation of cdecl that I introduced with a late change.
llvm-svn: 92416
2010-01-02 09:22:13 +00:00
Chris Lattner 935a4a606a enhance the compare/load/index optimization to work on *any* load
from a global with 32/64 elements or less (depending on whether
i64 is native on the target), generating a bitshift idiom to 
determine the result.  For example, on test4 we produce:

define i1 @test4(i32 %X) {
  %1 = lshr i32 933, %X                           ; <i32> [#uses=1]
  %2 = and i32 %1, 1                              ; <i32> [#uses=1]
  %R = icmp ne i32 %2, 0                          ; <i1> [#uses=1]
  ret i1 %R
}

This triggers in a number of interesting cases, for example, here's an
fp case:
@A.3255 = internal constant [4 x double] [double 4.100000e+00, double -3.900000e+00, double -1.000000e+00, double 1.000000e+00], align 32 ; <[4 x double]*> [#uses=7]
...
	   %7 = fcmp olt double %3, 0.000000e+00

In this case we make the slen2_tab global dead, which is nice:
@slen2_tab = internal constant [16 x i32] [i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 2, i32 3], align 32 ; <[16 x i32]*> [#uses=1]
...
	   %204 = icmp eq i32 %46, 0     

Perl has a bunch of these, also on the 'Perl_regkind' array:
@Perl_yygindex = internal constant [51 x i16] [i16 0, i16 0, i16 0, i16 0, i16 374, i16 351, i16 0, i16 -12, i16 0, i16 946, i16 413, i16 -83, i16 0, i16 0, i16 0, i16 -311, i16 -13, i16 4007, i16 2893, i16 0, i16 0, i16 0, i16 0, i16 0, i16 372, i16 -8, i16 0, i16 0, i16 246, i16 -131, i16 43, i16 86, i16 208, i16 -45, i16 -169, i16 987, i16 0, i16 0, i16 0, i16 0, i16 308, i16 0, i16 -271, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0], align 32 ; <[51 x i16]*> [#uses=1]
...
  %1364 = icmp eq i16 %1361, 0

186.crafty really likes this on 64-bit machines, because it triggers on a bunch of globals like this:
@white_outpost = internal constant [64 x i8] c"\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\02\02\00\00\00\00\00\04\05\05\04\00\00\00\00\03\06\06\03\00\00\00\00\00\01\01\00\00\00\00\00\00\00\00\00\00\00", align 32 ; <[64 x i8]*> [#uses=2]

However the big winner is 403.gcc, which triggers hundreds of times, eliminating all the accesses to the 57-element arrays 'mode_class', mode_unit_size, mode_bitsize, regclass_map, etc.

go 64-bit machines :)

llvm-svn: 92415
2010-01-02 08:56:52 +00:00
Chris Lattner b1567bd584 enhance the previous optimization to work with fcmp in addition
to icmp.

llvm-svn: 92412
2010-01-02 08:20:51 +00:00
Chris Lattner a061859ccc Teach instcombine to fold compares of loads from constant
arrays with variable indices into a comparison of the index
with a constant.  The most common occurrence of this that
I see by far is stuff like:

if ("foobar"[i] == '\0') ...

which we compile into: if (i == 6), saving a load and 
materialization of the global address.  This also exposes 
loop trip count information to later passes in many cases.

This triggers hundreds of times in xalancbmk, which is where I first
noticed it, but it also triggers in many other apps.  Here are a few 
interesting ones from various apps:

@must_be_connected_without = internal constant [8 x i8*] [i8* getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8*]*> [#uses=2]
  %scevgep.i = getelementptr [8 x i8*]* @must_be_connected_without, i64 0, i64 %indvar.i ; <i8**> [#uses=1]
  %17 = load ...
  %18 = icmp eq i8* %17, null                     ; <i1> [#uses=1]
-> icmp eq i64 %indvar.i, 7 


@yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11*\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]*> [#uses=2]
  %57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8*> [#uses=1]
   %mode.0.in = getelementptr inbounds [9 x i32]* @mb_mode_table, i64 0, i64 %.pn ; <i32*> [#uses=1]
load ...
   %64 = icmp eq i8 %58, 4                         ; <i1> [#uses=1]
-> icmp eq i64 %.pn, 35             ; <i1> [#uses=0]


@gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767]
%scevgep.i = getelementptr [4 x i16]* @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1]
%425 = load %scevgep.i
%426 = icmp eq i16 %425, -32768                 ; <i1> [#uses=0]
-> false

llvm-svn: 92411
2010-01-02 08:12:04 +00:00
Chris Lattner 2e4be2c340 remove the instcombine transformations that are inserting nasty
pointer to int casts that confuse later optimizations.  See PR3351
for details.

This improves but doesn't completely fix 483.xalancbmk because llvm-gcc
does this xform in GCC's "fold" routine as well.  Clang++ will do
better I guess.

llvm-svn: 92408
2010-01-02 00:31:05 +00:00
Chris Lattner faf1337acb add a simple instcombine xform, simplify another one to use hasAllZeroIndices()
instead of hand rolling a loop.

llvm-svn: 92403
2010-01-01 23:09:08 +00:00
Chris Lattner 30c0a2833d generalize the pointer difference optimization to handle
a constantexpr gep on the 'base' side of the expression.
This completes comment #4 in PR3351, which comes from
483.xalancbmk.
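
A sketch of the idiom in C (hypothetical layout): one side of the
subtraction is a constant-expression GEP into a global, the other is a
variable GEP off the same base.

  #include <stddef.h>

  static struct { char pad[8]; char name[16]; } g;

  /* &g.name[0] is a constant-expression GEP; subtracting it from the
     variable GEP &g.name[i] now folds to plain i */
  ptrdiff_t name_offset(ptrdiff_t i) {
    return &g.name[i] - &g.name[0];
  }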

llvm-svn: 92402
2010-01-01 22:42:29 +00:00
Chris Lattner 4394f71752 teach instcombine to optimize pointer difference idioms involving constant
expressions.  This is a step towards comment #4 in PR3351.

llvm-svn: 92401
2010-01-01 22:29:12 +00:00
Chris Lattner 9d4c5414bb use 'match' to simplify some code.
llvm-svn: 92400
2010-01-01 22:12:03 +00:00
Chris Lattner 25c87e9cf9 implement the transform requested in PR5284
llvm-svn: 92398
2010-01-01 18:34:40 +00:00
Chris Lattner ee1f861d81 add missing line.
llvm-svn: 92384
2010-01-01 01:54:08 +00:00
Chris Lattner 8330daf733 add a few trivial instcombines for llvm.powi.
llvm-svn: 92383
2010-01-01 01:52:15 +00:00
Chris Lattner 0c59ac3f41 When factoring multiply expressions across adds, factor both
positive and negative forms of constants together.  This 
allows us to compile:

int foo(int x, int y) {
    return (x-y) + (x-y) + (x-y);
}

into:

_foo:                                                       ## @foo
	subl	%esi, %edi
	leal	(%rdi,%rdi,2), %eax
	ret

instead of (where the 3 and -3 were not factored):

_foo:
        imull   $-3, 8(%esp), %ecx
        imull   $3, 4(%esp), %eax
        addl    %ecx, %eax
        ret

this started out as:
    movl    12(%ebp), %ecx
    imull   $3, 8(%ebp), %eax
    subl    %ecx, %eax
    subl    %ecx, %eax
    subl    %ecx, %eax
    ret

This comes from PR5359.

llvm-svn: 92381
2010-01-01 01:13:15 +00:00
Chris Lattner a552683fd4 clean up some comments.
llvm-svn: 92377
2010-01-01 00:04:26 +00:00
Chris Lattner 17229a7cb8 switch from std::map to DenseMap for rank data structures.
llvm-svn: 92375
2010-01-01 00:01:34 +00:00
Chris Lattner fed3397654 reuse negates where possible instead of always creating them from scratch.
This allows us to optimize test12 into:

define i32 @test12(i32 %X) {
  %factor = mul i32 %X, -3                        ; <i32> [#uses=1]
  %Z = add i32 %factor, 6                         ; <i32> [#uses=1]
  ret i32 %Z
}

instead of:

define i32 @test12(i32 %X) {
  %Y = sub i32 6, %X                              ; <i32> [#uses=1]
  %C = sub i32 %Y, %X                             ; <i32> [#uses=1]
  %Z = sub i32 %C, %X                             ; <i32> [#uses=1]
  ret i32 %Z
}

llvm-svn: 92373
2009-12-31 20:34:32 +00:00
Chris Lattner 60c2ca743d we don't need a SmallPtrSet to detect duplicates; the values are
sorted, so we can just do a linear scan.

llvm-svn: 92372
2009-12-31 19:49:01 +00:00
Chris Lattner 1d8979422a make reassociate more careful about not leaving around dead mul's
llvm-svn: 92370
2009-12-31 19:34:45 +00:00
Chris Lattner ed18917665 remove debug
llvm-svn: 92369
2009-12-31 19:25:19 +00:00
Chris Lattner 60b71b5c4d teach reassociate to factor x+x+x -> x*3. While I'm at it,
fix RemoveDeadBinaryOp to actually do something.

llvm-svn: 92368
2009-12-31 19:24:52 +00:00
Chris Lattner 38abecbad0 change reassociate to use SmallVector for its key datastructures
instead of std::vector.

llvm-svn: 92366
2009-12-31 18:40:32 +00:00