mutated by recursive simplification. This also enhances
ReplaceAndSimplifyAllUses to actually do a real RAUW
at the end of it, which updates any value handles
pointing to "From" to start pointing to "To". This
seems useful for debug info and random other VH users.
llvm-svn: 108415
that was actually useful here.
Chris, please double check that this is the correct interpretation. I was
pretty sure, and ran it by Nick as well.
llvm-svn: 108129
interface needs implementations to be consistent, so any code which
wants to support different semantics must use a different interface.
It's not currently worthwhile to add a new interface for this new
concept.
Document that AliasAnalysis doesn't support cross-function queries.
llvm-svn: 107776
entries associated with the value being erased in the
folding set map. These entries used to be harmless, because
a SCEVUnknown doesn't store any information about its Value*,
so having a new Value allocated at the old Value's address
wasn't a problem. But now that ScalarEvolution is storing more
information about values, this is no longer safe.
llvm-svn: 107316
on ScalarEvolution successfully folding and preserving
range information for both A-B and B-A. Now, if it gets
either one, it's sufficient.
llvm-svn: 107249
instruction to an add scev, it's not safe to blindly transfer the
inbounds flag from a gep instruction to an nsw on the scev for the
gep.
llvm-svn: 107117
assuming that loops are in canonical form, as ScalarEvolution doesn't
depend on LoopSimplify itself. Also, with indirectbr not all loops can
be simplified. This fixes PR7416.
llvm-svn: 106389
scrounging through SCEVUnknown contents and SCEVNAryExpr operands;
instead just do a simple deterministic comparison of the precomputed
hash data.
Also, since this is more precise, it eliminates the need for the slow
N^2 duplicate detection code.
llvm-svn: 105540
encapsulation to force the users of these classes to know about the internal
data structure of the Operands structure. It also can lead to errors, like in
the MSIL writer.
llvm-svn: 105539
on RAUW of functions, this is a correctness issue instead of a mere memory
usage problem.
No testcase until the new MergeFunctions can land.
llvm-svn: 103653
Also, pass true for isSigned even when creating constants for unsigned
comparisons, because the point is to create an all-ones constant,
rather than UINT64_MAX, even for integers wider than 64 bits.
llvm-svn: 102946
SimplifyICmpOperands will simplify such cases to EQ or NE. This makes
the correcntess of the code independent on SimplifyICmpOperands doing
certain simplifications.
llvm-svn: 102927
that can have a big effect :). The first is to enable the
iterative SCC passmanager juice that kicks in when the
scc passmgr detects that a function pass has devirtualized
a call. In this case, it will rerun all the passes it
manages on the SCC, up to the iteration count limit (4). This
is useful because a function pass may devirualize a call, and
we want the inliner to inline it, or pruneeh to infer stuff
about it, etc.
The second patch is to add *all* call sites to the
DevirtualizedCalls list the inliner uses. This list is
about to get renamed, but the jist of this is that the
inliner now reconsiders *all* inlined call sites as candidates
for further inlining. The intuition is this that in cases
like this:
f() { g(1); } g(int x) { h(x); }
We analyze this bottom up, and may decide that it isn't
profitable to inline H into G. Next step, we decide that it is
profitable to inline G into F, and do so, which means that F
now calls H. Even though the call from G -> H may not have been
profitable to inline, the call from F -> H may be (in this case
because a constant allows folding etc).
In my spot checks, this doesn't have a big impact on code. For
example, the LLC output for 252.eon grew from 0.02% (from
317252 to 317308) and 176.gcc actually shrunk by .3% (from 1525612
to 1520964 bytes). 252.eon never iterated in the SCC Passmgr,
176.gcc iterated at most 1 time.
llvm-svn: 102823
were still inlining self-recursive functions into other functions.
Inlining a recursive function into itself has the potential to
reduce recursion depth by a factor of 2, inlining a recursive
function into something else reduces recursion depth by exactly
1. Since inlining a recursive function into something else is a
weird form of loop peeling, turn this off.
The deleted testcase was added by Dale in r62107, since then
we're leaning towards not inlining recursive stuff ever. In any
case, if we like inlining recursive stuff, it should be done
within the recursive function itself to get the algorithm
recursion depth win.
llvm-svn: 102798
doesn't dominate the header is needed, don't check whether the increment
expression has computable loop evolution. While the operands of an
addrec are required to be loop-invariant, they're not required to
dominate any part of the loop. This fixes PR6914.
llvm-svn: 102389
Also, generalize ScalarEvolutions's min and max recognition to handle
some new forms of min and max that this change makes more common.
llvm-svn: 102234
Add the instruction pointer value for debuggability.
We now get dump output that looks like this:
Call graph node for function: 'f1'<<0x1017086b0>> #uses=1
CS<0x1017046f8> calls external node
Call graph node for function: '_ZNSt6vectorIdSaIdEEC1EmRKdRKS0_'<<0x1017086f0>> #uses=1
CS<0x0> calls external node
Call graph node for function: 'f4'<<0x1017087a0>> #uses=1
CS<0x101708c88> calls function 'f3'
llvm-svn: 102194
Fix RefreshCallGraph to use CGN->replaceCallEdge instead of hand
rolling its own loop. replaceCallEdge properly maintains the
reference counts of the nodes, fixing a crash exposed by the
iterative callgraph stuff.
llvm-svn: 102120
we have RefreshCallGraph detect when a function pass devirtualizes
a call, and have CGSCCPassMgr iterate (up to a count) when this
happens. This allows (in the example) GVN to devirtualize the
call in foo, then the inliner to inline it away.
This is not currently enabled because I haven't done any analysis
on the (potentially substantial) code size or performance impact of
doing this, and guess what, it exposes callgraph updating bugs in
various passes. This is progress though, and you can play with it
by passing -max-cg-scc-iterations=5 to opt.
llvm-svn: 101973
recursive callsites, inlining can reduce the number of calls by
exponential factors, as it does in
MultiSource/Benchmarks/Olden/treeadd. More involved heuristics
will be needed.
llvm-svn: 101969
just ask ScalarEvolution for it on demand. This helps IVUsers be more robust
in the case of expressions changing underneath it. This fixes PR6862.
llvm-svn: 101819
by switching CachedFunctionInfo from a std::map to a
ValueMap (which is implemented in terms of a DenseMap).
DenseMap has different iterator invalidation semantics
than std::map.
This should hopefully fix the dragonegg builder.
llvm-svn: 101658
CGSCC can delete nodes in regions of the callgraph that
have already been visited. If new CG nodes are allocated
to the same pointer, we shouldn't abort, just handle it
correctly by assigning a new number. This should restore
stability by removing invalidated pointers that *will* be
reused from the densemap in the iterator.
llvm-svn: 101628
to keep the node entries in scc_iterator up to date instead of dangling as
the SCC mutates.
This is a really terrible problem which was causing -g to affect codegen
because it would permute the memory image of the compiler process.
Thanks to Dale for expertly hunting it down.
llvm-svn: 101565
to CallGraphSCCPass's instead of passing around a
std::vector<CallGraphNode*>. No functionality change,
but now we have a much tidier interface.
llvm-svn: 101558
with a fix for self-hosting
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101465
with a fix
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101397
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101364
The information is already available with "opt -analyze". The DominatorTree
does also not have this in its runOnFunction. So they behave now
more consistent.
llvm-svn: 101038
AddRecs so that it checks for overflow in the computation that it is
performing, rather than just checking hasNo{Signed,Unsigned}Wrap, since
those flags are for a different computation. This fixes a bug that
impacts an upcoming change.
llvm-svn: 101028
explicitly split into stride-and-offset pairs. Also, add the
ability to track multiple post-increment loops on the same expression.
This refines the concept of "normalizing" SCEV expressions used for
to post-increment uses, and introduces a dedicated utility routine for
normalizing and denormalizing expressions.
This fixes the expansion of expressions which are post-increment users
of more than one loop at a time. More broadly, this takes LSR another
step closer to being able to reason about more than one loop at a time.
llvm-svn: 100699
source addition. Apparently the buildbots were wrong about failures.
---
Add some switches helpful for debugging:
-print-before=<Pass Name>
Dump IR before running pass <Pass Name>.
-print-before-all
Dump IR before running each pass.
-print-after-all
Dump IR after running each pass.
These are helpful when tracking down a miscompilation. It is easy to
get IR dumps and do diffs on them, etc.
To make this work well, add a new getPrinterPass API to Pass so that
each kind of pass (ModulePass, FunctionPass, etc.) can create a Pass
suitable for dumping out the kind of object the Pass works on.
llvm-svn: 100249
materializing an MDNode for every debugloc. don't do that! :)
"clang -g -S t.c" really no longer makes mdnodes for location
tuples now.
llvm-svn: 100224
representation. This eliminates the 'DILocation' MDNodes for
file/line/col tuples from -O0 -g codegen.
This remove the old DebugLoc class, making it a typedef for DebugLoc,
I'll rename NewDebugLoc next.
I didn't update the JIT to use the new apis, so it will continue to
work, but be as slow as before. Someone should eventually do this
or, better yet, rip out the JIT debug info stuff and build the JIT
on top of MC.
llvm-svn: 100209
<string> include. For some reason the buildbot choked on this while my
builds did not. It's probably due to a difference in system headers.
---
Add some switches helpful for debugging:
-print-before=<Pass Name>
Dump IR before running pass <Pass Name>.
-print-before-all
Dump IR before running each pass.
-print-after-all
Dump IR after running each pass.
These are helpful when tracking down a miscompilation. It is easy to
get IR dumps and do diffs on them, etc.
To make this work well, add a new getPrinterPass API to Pass so that
each kind of pass (ModulePass, FunctionPass, etc.) can create a Pass
suitable for dumping out the kind of object the Pass works on.
llvm-svn: 100204
-print-before=<Pass Name>
Dump IR before running pass <Pass Name>.
-print-before-all
Dump IR before running each pass.
-print-after-all
Dump IR after running each pass.
These are helpful when tracking down a miscompilation. It is easy to
get IR dumps and do diffs on them, etc.
To make this work well, add a new getPrinterPass API to Pass so that
each kind of pass (ModulePass, FunctionPass, etc.) can create a Pass
suitable for dumping out the kind of object the Pass works on.
llvm-svn: 100143
I have audited all getOperandNo calls now, fixing
hidden assumptions. CallSite related uglyness will
be eliminated successively.
Note this patch has a long and griveous history,
for all the back-and-forths have a look at
CallSite.h's log.
llvm-svn: 99399
use-before-def errors in SCEVExpander-produced code in sqlite3 when debug
info with optimization is enabled, though the testcases for this are
dependent on use-list order.
llvm-svn: 99001
This time I did a self-hosted bootstrap on Linux x86-64,
with no problems. Let's see how darwin 64-bit self-hosting
goes. At the first sign of failure I'll back this out.
Maybe the valgrind bots give me a hint of what may be wrong
(it at all).
llvm-svn: 98957
BumpPtrAllocator-allocated region to allow it to be stored in a more
compact form and to avoid the need for a non-trivial destructor call.
Use this new mechanism in ScalarEvolution instead of
FastFoldingSetNode to avoid leaking memory in the case where a
FoldingSetNodeID uses heap storage, and to reduce overall memory
usage.
llvm-svn: 98829
pointer and length, and allocate the arrays in ScalarEvolution's
BumpPtrAllocator, so that they get released when their owning
SCEV gets released. SCEVs are immutable, so they don't need to worry
about operand array resizing. This fixes a memory leak reported
in PR6637.
llvm-svn: 98755
The Caller cost info would be reset everytime a callee was inlined. If the
caller has lots of calls and there is some mutual recursion going on, the
caller cost info could be calculated many times.
This patch reduces inliner runtime from 240s to 0.5s for a function with 20000
small function calls.
This is a more conservative version of r98089 that doesn't break the clang
test CodeGenCXX/temp-order.cpp. That test relies on rather extreme inlining
for constant folding.
llvm-svn: 98099
The Caller cost info would be reset everytime a callee was inlined. If the
caller has lots of calls and there is some mutual recursion going on, the
caller cost info could be calculated many times.
This patch reduces inliner runtime from 240s to 0.5s for a function with 20000
small function calls.
llvm-svn: 98089
This patch updates LLVMDebugVersion to 8.
Debug info descriptors encoded using LLVMDebugVersion 7 is supported.
Corresponding llvmgcc and clang FE commits are required.
llvm-svn: 98020
which branch on undef to branch on a boolean constant for the edge
exiting the loop. This helps ScalarEvolution compute trip counts for
loops.
Teach ScalarEvolution to recognize single-value PHIs, when safe, and
ForgetSymbolicName to forget such single-value PHI nodes as apprpriate
in ForgetSymbolicName.
llvm-svn: 97126
argument is non-null, pass it along to PHITranslateSubExpr so that it can
prefer using existing values that dominate the PredBB, instead of just
blindly picking the first equivalent value that it finds on a uselist.
Also when the DominatorTree is specified, have PHITranslateValue filter
out any result that does not dominate the PredBB. This is basically just
refactoring the check that used to be in GetAvailablePHITranslatedSubExpr
and also in GVN.
Despite my initial expectations, this change does not affect the results
of GVN for any testcases that I could find, but it should help compile time.
Before this change, if PHITranslateSubExpr picked a value that does not
dominate, PHITranslateWithInsertion would then insert a new value, which GVN
would later determine to be redundant and would replace. By picking a good
value to begin with, we save GVN the extra work of inserting and then
replacing a new value.
llvm-svn: 97010
getelementptr. Despite only doing so in the case where x is a known
array object and c can be converted to an index within range, this
could still be invalid if c is actually the address of an object
allocated outside of LLVM. Also, SCEVExpander, the original motivation
for this code, has since been improved to avoid inttoptr+ptroint in
more cases.
llvm-svn: 96950
operators.
The test difference is just due to the multiplication operands
being commuted (and thus requiring a more elaborate match). In
optimized code, that expression would be folded.
llvm-svn: 96816
true or false as its exit condition. These are usually eliminated by
SimplifyCFG, but the may be left around during a pass which wishes
to preserve the CFG.
llvm-svn: 96683
a loop exit value, so that if a loop gets deleted, ScalarEvolution
isn't stick holding on to dangling SCEVAddRecExprs for that loop. This
fixes PR6339.
llvm-svn: 96626
case where there are loop-invariant instructions somehow left
inside the loop, and in a position where they won't dominate
the IV increment position.
llvm-svn: 96448
name, test whether the SCEV itself is that temporary symbolic name,
in addition to checking whether the symbolic name appears as a
possibly-indirect operand.
llvm-svn: 96216
insert location has become an "inserted" instruction since the time
it was saved. If so, advance to the first non-"inserted" instruction.
llvm-svn: 96203
current insertion point, advance the current insertion point.
This avoids a use-before-def situation in a testcase extracted
from clang which is difficult to reduce to a reasonable-sized
regression test.
llvm-svn: 96151
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.
This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.
llvm-svn: 95975
cases, and implement target-independent folding rules for alignof and
offsetof. Also, reassociate reassociative operators when it leads to
more folding.
Generalize ScalarEvolution's isOffsetOf to recognize offsetof on
arrays. Rename getAllocSizeExpr to getSizeOfExpr, and getFieldOffsetExpr
to getOffsetOfExpr, for consistency with analagous ConstantExpr routines.
Make the target-dependent folder promote GEP array indices to
pointer-sized integers, to make implicit casting explicit and exposed
to subsequent folding.
And add a bunch of testcases for this new functionality, and a bunch
of related existing functionality.
llvm-svn: 94987
use plain SCEVUnknowns with ConstantExpr::getSizeOf and
ConstantExpr::getOffsetOf constants. This eliminates a bunch of
special-case code.
Also add code for pattern-matching these expressions, for clients that
want to recognize them.
Move ScalarEvolution's logic for expanding array and vector sizeof
expressions into an element count times the element size, to expose
the multiplication to subsequent folding, into the regular constant
folder.
llvm-svn: 94737
After running a batch of measurements, it is clear that the inliner metrics
need some adjustments:
Own argument bonus: 20 -> 5
Outgoing argument penalty: 0 -> 5
Alloca bonus: 10 -> 5
Constant instr bonus: 7 -> 5
Dead successor bonus: 40 -> 5*(avg instrs/block)
The new cost metrics are generaly 25 points higher than before, so we may need
to move thresholds.
With this change, InlineConstants::CallPenalty becomes a political correction:
if (!isa<IntrinsicInst>(II) && !callIsSmall(CS.getCalledFunction()))
NumInsts += InlineConstants::CallPenalty + CS.arg_size();
The code size is accurately modelled by CS.arg_size(). CallPenalty is added
because calls tend to take a long time, so it may not be worth it to inline a
function with lots of calls.
All of the political corrections are in the InlineConstants namespace:
IndirectCallBonus, CallPenalty, LastCallToStaticBonus, ColdccPenalty,
NoreturnPenalty.
llvm-svn: 94615
A GEP with all constant indices is already considered free by
analyzeBasicBlock(), so don't give it an extra bonus in
CountCodeReductionForAlloca().
This patch should remove a small positive bias toward inlining functions with
variable-index GEPs, and remove a smaller negative bias from functions with
all-constant index GEPs.
llvm-svn: 94591
Functions containing indirectbr are marked NeverInline by analyzeBasicBlock(),
so there is no point in giving indirectbr special treatment in
CountCodeReductionForConstant. It is never called.
No functional change intended.
llvm-svn: 94590
have trouble with an intermediate add overflowing. Also, be more conservative
about the case where the induction variable in an SLT loop exit can step past
the RHS of the SLT and overflow in a single step.
Make getSignedRange more aggressive, to recover for some common cases which
the above fixes pessimized.
This addresses rdar://7561161.
llvm-svn: 94512
missing ones are libsupport, libsystem and libvmcore. libvmcore is
currently blocked on bugpoint, which uses EH. Once it stops using
EH, we can switch it off.
This #if 0's out 3 unit tests, because gtest requires RTTI information.
Suggestions welcome on how to fix this.
llvm-svn: 94164
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.
It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.
llvm-svn: 94061
form of an expression. This is the expression without the
post-increment adjustment made, which is useful in determining
which registers will be used by the expansion.
llvm-svn: 93921
Move the DOTGraphTraits dotty printer/viewer templates, that were developed for
the dominance tree into their own header file. This will allow reuse in future
passes.
llvm-svn: 93632
This patch also cleans up code that expects there to be a bitcast in the first argument and testcases that call llvm.dbg.declare.
It also strips old llvm.dbg.declare intrinsics that did not pass metadata as the first argument.
llvm-svn: 93531
memcpy, memset and other intrinsics that only access their arguments
to be readnone if the intrinsic's arguments all point to local memory.
This improves the testcase in the README to readonly, but it could in
theory be made readnone, however this would involve more sophisticated
analysis that looks through the memcpy.
llvm-svn: 92829
after the MDNode in memory. This eliminates the operands
pointer and saves a new[] per node.
Note that the code in DIDerivedType::replaceAllUsesWith is wrong
and quite scary. A MDNode should not be RAUW'd with something
else: this changes all uses of the mdnode, which may not be debug
info related! Debug info should use something non-mdnode for
declarations.
llvm-svn: 92321
getMDKindID/getMDKindNames methods to LLVMContext (and add
convenience methods to Module), eliminating MetadataContext.
Move the state that it maintains out to LLVMContext.
llvm-svn: 92259
I asked Devang to do back on Sep 27. Instead of going through the
MetadataContext class with methods like getMD() and getMDs(), just
ask the instruction directly for its metadata with getMetadata()
and getAllMetadata().
This includes a variety of other fixes and improvements: previously
all Value*'s were bloated because the HasMetadata bit was thrown into
value, adding a 9th bit to a byte. Now this is properly sunk down to
the Instruction class (the only place where it makes sense) and it
will be folded away somewhere soon.
This also fixes some confusion in getMDs and its clients about
whether the returned list is indexed by the MDID or densely packed.
This is now returned sorted and densely packed and the comments make
this clear.
This introduces a number of fixme's which I'll follow up on.
llvm-svn: 92235
instead of stored. This reduces memdep memory usage, and also eliminates a bunch of
weakvh's. This speeds up gvn on gcc.c-torture/20001226-1.c from 23.9s to 8.45s (2.8x)
on a different machine than earlier.
llvm-svn: 91885
cache a pointer as being unavailable due to phi trans in the
wrong place. This would cause later queries to fail even when
they didn't involve phi trans.
llvm-svn: 91787
contains another loop, or an instruction. The loop form is
substantially more efficient on large loops than the typical
code it replaces.
llvm-svn: 91654
of 91296 that caused trouble -- the Processed list needs to be
preserved for the livetime of the pass, as AddUsersIfInteresting
is called from other passes.
llvm-svn: 91641
isPodLike type trait. This is a generally useful type trait for
more than just DenseMap, and we really care about whether something
acts like a pod, not whether it really is a pod.
llvm-svn: 91421
add, there is no need to scan the world to find the same add again.
This invalidates the previous testcase, which wasn't wonderful anyway,
because it needed a run of instcombine to permute the use-lists in
just the right way to before GVN was run (so it was really fragile).
Not a big loss.
llvm-svn: 90973
phi translation of complex expressions like &A[i+1]. This has the
following benefits:
1. The phi translation logic is all contained in its own class with
a strong interface and verification that it is self consistent.
2. The logic is more correct than before. Previously, if intermediate
expressions got PHI translated, we'd miss the update and scan for
the wrong pointers in predecessor blocks. @phi_trans2 is a testcase
for this.
3. We have a lot less code in memdep.
We can handle phi translation across blocks of things like @phi_trans3,
which is pretty insane :).
This patch should fix the miscompiles of 255.vortex, and I tested it
with a bootstrap of llvm-gcc, llvm-test and dejagnu of course.
llvm-svn: 90926
examines; fall back to a conservative answer if there are
more. This works around some several compile time problems
resulting from BasicAliasAnalysis calling PointerMayBeCaptured.
The value has been chosen arbitrarily.
This fixes rdar://7438917 and may partially address PR5708.
llvm-svn: 90905
The semantics of llvm.dbg.value are that starting from where it is executed, an offset into the specified user source variable is specified to get a new value.
An example:
call void @llvm.dbg.value(metadata !{ i32 7 }, i64 0, metadata !2)
Here the user source variable associated with metadata #2 gets the value "i32 7" at offset 0.
llvm-svn: 90788
was being added to the Result vector, but not being put in the
cache. This means that if the cache was reused wholesale for a
later query that it would be missing this entry and we'd do an
incorrect load elimination.
Unfortunately, it's not really possible to write a useful
testcase for this, but this unbreaks 255.vortex.
llvm-svn: 90093
if we don't have an address expression available in a predecessor,
then model this as the value being clobbered at the end of the pred
block instead of being modeled as a complete phi translation failure.
This is important for PRE of loads because we want to see that the
load is available in all but this predecessor, and complete phi
translation failure results in not getting any information about
predecessors.
This doesn't do anything until I renable code insertion since PRE
now sees that it is available in all but one predecessors, but can't
insert the addressing in the predecessor that is missing it to
eliminate the redundancy.
llvm-svn: 90037
translation of add with immediate. This allows us
to optimize this function:
void test(int N, double* G) {
long j;
G[1] = 1;
for (j = 1; j < N - 1; j++)
G[j+1] = G[j] + G[j+1];
}
to only do one load every iteration of the loop.
llvm-svn: 90013
ConstantExpr, not just the top-level operator. This allows it to
fold many more constants.
Also, make GlobalOpt call ConstantFoldConstantExpression on
GlobalVariable initializers.
llvm-svn: 89659
same object to be a non-capture; Duncan pointed out a way that such
a comparison could be a capture.
Make the rule that considers a comparison against null more specific,
and only consider noalias return values compared against null. This
still supports test/Transforms/GVN/nonescaping-malloc.ll, and is not
susceptible to the problem Duncan pointed out with noalias arguments.
llvm-svn: 89468
because if the results from getUnderlyingObject match, the values must
be from the same underlying object, even if we don't know what that
object is.
llvm-svn: 89434
careful about crazy methods of capturing pointers using comparisons.
Comparisons of identified objects with null in the default address
space are not captures. And, comparisons of two pointers within the
same identified object are not captures.
llvm-svn: 89421
if it is not ultimately captured. Teach BasicAliasAnalysis that a
local object address which does not escape and is never stored does
not alias with a value resulting from a load.
llvm-svn: 89398
cannot be folded into target cmp instruction.
- Avoid a phase ordering issue where early cmp optimization would prevent the
later count-to-zero optimization.
- Add missing checks which could cause LSR to reuse stride that does not have
users.
- Fix a bug in count-to-zero optimization code which failed to find the pre-inc
iv's phi node.
- Remove, tighten, loosen some incorrect checks disable valid transformations.
- Quite a bit of code clean up.
llvm-svn: 86969
This allows StringRef to skip controversial if(str) check in constructor.
Buildbots, wait for corresponding clang and llvm-gcc FE check-ins!
llvm-svn: 86914
start using them in a trivial way when -enable-jump-threading-lvi
is passed. enable-jump-threading-lvi will be my playground for
awhile.
llvm-svn: 86789
except that the result may not be a constant. Switch jump threading to
use it so that it gets things like (X & 0) -> 0, which occur when phi preds
are deleted and the remaining phi pred was a zero.
llvm-svn: 86637
This patch forbids implicit conversion of DenseMap::const_iterator to
DenseMap::iterator which was possible because DenseMapIterator inherited
(publicly) from DenseMapConstIterator. Conversion the other way around is now
allowed as one may expect.
The template DenseMapConstIterator is removed and the template parameter
IsConst which specifies whether the iterator is constant is added to
DenseMapIterator.
Actually IsConst parameter is not necessary since the constness can be
determined from KeyT but this is not relevant to the fix and can be addressed
later.
Patch by Victor Zverovich!
llvm-svn: 86636
takes decimated instructions and applies identities to them. This
is pretty minimal at this point, but I plan to pull some instcombine
logic out into these and similar routines.
llvm-svn: 86613
Here is the original commit message:
This commit updates malloc optimizations to operate on malloc calls that have constant int size arguments.
Update CreateMalloc so that its callers specify the size to allocate:
MallocInst-autoupgrade users use non-TargetData-computed allocation sizes.
Optimization uses use TargetData to compute the allocation size.
Now that malloc calls can have constant sizes, update isArrayMallocHelper() to use TargetData to determine the size of the malloced type and the size of malloced arrays.
Extend getMallocType() to support malloc calls that have non-bitcast uses.
Update OptimizeGlobalAddressOfMalloc() to optimize malloc calls that have non-bitcast uses. The bitcast use of a malloc call has to be treated specially here because the uses of the bitcast need to be replaced and the bitcast needs to be erased (just like the malloc call) for OptimizeGlobalAddressOfMalloc() to work correctly.
Update PerformHeapAllocSRoA() to optimize malloc calls that have non-bitcast uses. The bitcast use of the malloc is not handled specially here because ReplaceUsesOfMallocWithGlobal replaces through the bitcast use.
Update OptimizeOnceStoredGlobal() to not care about the malloc calls' bitcast use.
Update all globalopt malloc tests to not rely on autoupgraded-MallocInsts, but instead use explicit malloc calls with correct allocation sizes.
llvm-svn: 86311
MallocInst-autoupgrade users use non-TargetData-computed allocation sizes.
Optimization uses use TargetData to compute the allocation size.
Now that malloc calls can have constant sizes, update isArrayMallocHelper() to use TargetData to determine the size of the malloced type and the size of malloced arrays.
Extend getMallocType() to support malloc calls that have non-bitcast uses.
Update OptimizeGlobalAddressOfMalloc() to optimize malloc calls that have non-bitcast uses. The bitcast use of a malloc call has to be treated specially here because the uses of the bitcast need to be replaced and the bitcast needs to be erased (just like the malloc call) for OptimizeGlobalAddressOfMalloc() to work correctly.
Update PerformHeapAllocSRoA() to optimize malloc calls that have non-bitcast uses. The bitcast use of the malloc is not handled specially here because ReplaceUsesOfMallocWithGlobal replaces through the bitcast use.
Update OptimizeOnceStoredGlobal() to not care about the malloc calls' bitcast use.
Update all globalopt malloc tests to not rely on autoupgraded-MallocInsts, but instead use explicit malloc calls with correct allocation sizes.
llvm-svn: 86077
ArraySize * ElementSize
ElementSize * ArraySize
ArraySize << log2(ElementSize)
ElementSize << log2(ArraySize)
Refactor isArrayMallocHelper and delete isSafeToGetMallocArraySize, so that there is only 1 copy of the malloc array determining logic.
Update users of getMallocArraySize() to not bother calling isArrayMalloc() as well.
llvm-svn: 85421
Update all analysis passes and transforms to treat free calls just like FreeInst.
Remove RaiseAllocations and all its tests since FreeInst no longer needs to be raised.
llvm-svn: 84987
non-type-safe constant initializers. This sort of thing happens
quite a bit for 4-byte loads out of string constants, unions,
bitfields, and an interesting endianness check from sqlite, which
is something like this:
const int sqlite3one = 1;
# define SQLITE_BIGENDIAN (*(char *)(&sqlite3one)==0)
# define SQLITE_LITTLEENDIAN (*(char *)(&sqlite3one)==1)
# define SQLITE_UTF16NATIVE (SQLITE_BIGENDIAN?SQLITE_UTF16BE:SQLITE_UTF16LE)
all of these macros now constant fold away.
This implements PR3152 and is based on a patch started by Eli, but heavily
modified and extended.
llvm-svn: 84936
Analysis/ConstantFolding.cpp. This doesn't change the behavior of
instcombine but makes other clients of ConstantFoldInstruction
able to handle loads. This was partially extracted from Eli's patch
in PR3152.
llvm-svn: 84836
container of the blocks and do efficient lookups. This makes
isLoopSimplifyForm much faster on large loops, fixing a significant
compile-time issue in builds with assertions enabled.
llvm-svn: 84673
identifying the malloc as a non-array malloc. This broke GlobalOpt's optimization of stores of mallocs
to global variables.
The fix is to classify malloc's into 3 categories:
1. non-array mallocs
2. array mallocs whose array size can be determined
3. mallocs that cannot be determined to be of type 1 or 2 and cannot be optimized
getMallocArraySize() returns NULL for category 3, and all users of this function must avoid their
malloc optimization if this function returns NULL.
Eventually, currently unexpected codegen for computing the malloc's size argument will be supported in
isArrayMalloc() and getMallocArraySize(), extending malloc optimizations to those examples.
llvm-svn: 84199
cannot alias the GEP. GEP pointer alias rule states this clearly:
A pointer value formed from a getelementptr instruction is associated with the
addresses associated with the first operand of the getelementptr.
llvm-svn: 84079
question, can we get rid of the BasicBlock versions of all inserters
and use Head == 0 to indicate the old case when GetInsertBlock == 0?
llvm-svn: 83216
information. This allows arbitrary code involving DW_OP_plus_uconst
and DW_OP_deref. The scheme allows for easy extention to include,
any, or all of the DW_OP_ opcodes. I thought about just exposing all
of them, but, wasn't sure if people wanted the dwarf opcodes exposed
in the api. Is that a layering violation?
With this scheme, the entire existing block scheme used by llvm-gcc
can be switched over to the new scheme. I think that would be
cleaner, as then the compiler specific bits are not present in llvm
proper. Before the old code can be yanked however, similar code in
clang would have to be removed.
Next up, more testing.
llvm-svn: 83120
the PassManager code into a regular verifyAnalysis method.
Also, reorganize loop verification. Make the LoopPass infrastructure
call verifyLoop as needed instead of having LoopInfo::verifyAnalysis
check every loop in the function after each looop pass. Add a new
command-line argument, -verify-loop-info, to enable the expensive
full checking.
llvm-svn: 82952
code that stops the timer doesn't have to search to find the timer
object before it stops the timer. This avoids a lock acquisition
and a few other things done with the timer running.
llvm-svn: 82949
LoopPasses for that loop. This avoids trouble with the PassManager
trying to call verifyAnalysis on them, and frees up some memory
sooner rather than later.
llvm-svn: 82945
aren't in canonical loop-simplify form, since it doesn't itself depend
on LoopSimplify. This means handling loops without preheaders and loops
with multiple backedges.
llvm-svn: 82905
test whether it properly dominates the loop header. This is equivalent
when the loop has a preheader, and has the advantage of working when
the loop doesn't have a preheader. Since IVUsers doesn't Require
LoopSimplify, the loop isn't guaranteed to have a preheader.
llvm-svn: 82899
to. This can be combined with LCSSA or SSI form to store more information on a
PHINode than can be computed by looking at its incoming values.
llvm-svn: 82317
In getMallocArraySize(), fix bug in the case that array size is the product of 2 constants.
Extend isArrayMalloc() and getMallocArraySize() to handle case where malloc is used as char array.
Ensure that ArraySize in LowerAllocations::runOnBasicBlock() is correct type.
Extend Instruction::isSafeToSpeculativelyExecute() to handle malloc calls.
Add verification for malloc calls.
Reviewed by Dan Gohman.
llvm-svn: 82257
where the induction variable has a non-unit stride, such as {0,+,2}, and
there are expressions such as {1,+,2} inside the loop formed with
or or add nsw operators.
llvm-svn: 82151
argpromote to avoid invalidating an iterator. This fixes PR4977.
All clang tests now pass with expensive checking (on my system
at least).
llvm-svn: 81843
how to fold notionally-out-of-bounds array getelementptr indices instead
of just doing these in lib/Analysis/ConstantFolding.cpp, because it can
be done in a fairly general way without TargetData, and because not all
constants are visited by lib/Analysis/ConstantFolding.cpp. This enables
more constant folding.
Also, set the "inbounds" flag when the getelementptr indices are
one-past-the-end.
llvm-svn: 81483
Fixed non working -profile-verifier-noassert option.
Fixed missing newline in debugEntry().
Cleaned up assert messages. (assert(0 && Message) is still shown, but the message is printed before.)
When verifiying loaded profiles the ProfileVerifier got confused when block was a setjmp target, this is checked now.
When verifiying loaded profiles the ProfileVerifier got confused when block eventually reaching an exit(), this is checked now.
llvm-svn: 81338
that get created during loop unswitching, and fix SplitBlockPredecessors'
LCSSA updating code to create new PHIs instead of trying to just move
existing ones.
Also, optimize Loop::verifyLoop, since it gets called a lot. Use
searches on a sorted list of blocks instead of calling the "contains"
function, as is done in other places in the Loop class, since "contains"
does a linear search. Also, don't call verifyLoop from LoopSimplify or
LCSSA, as the PassManager is already calling verifyLoop as part of
LoopInfo's verifyAnalysis.
llvm-svn: 81221
D test/Analysis/Profiling
--- Reverse-merging r80907 into '.':
U lib/Analysis/ProfileInfoLoaderPass.cpp
Attempt to remove failure in the self-hosting build bot.
llvm-svn: 80966
and exact flags. Because ConstantExprs are uniqued, creating an
expression with this flag causes all expressions with the same operands
to have the same flag, which may not be safe. Add, sub, mul, and sdiv
ConstantExprs are usually folded anyway, so the main interesting flag
here is inbounds, and the constant folder already knows how to set the
inbounds flag automatically in most cases, so there isn't an urgent need
for the API support.
This can be reconsidered in the future, but for now just removing these
API bits eliminates a source of potential trouble with little downside.
llvm-svn: 80959
Optimal edge profiling is only possible when blocks with no predecessors get an
virtual edge (BB,0) that counts the execution frequencies of this
function-exiting blocks.
This patch makes the necessary changes before actually enabling optimal edge profiling.
llvm-svn: 80667
This adds a pass to verify the current profile against the flow conditions.
This is very helpful when later on trying to perserve the profiling information
during all passes.
llvm-svn: 80666
for sanity. This didn't turn up any bugs.
Change CallGraphNode to maintain its "callsite" information in the
call edges list as a WeakVH instead of as an instruction*. This fixes
a broad class of dangling pointer bugs, and makes CallGraph have a number
of useful invariants again. This fixes the class of problem indicated
by PR4029 and PR3601.
llvm-svn: 80663
SCEVUnknowns, as the non-SCEVUnknown cases in the getSCEVAtScope code
can also end up repeatedly climing through the same expression trees,
which can be unusably slow when the trees are very tall.
Also, add a quick check for SCEV pointer equality to the main
SCEV comparison routine, as the full comparison code can be expensive
in the case of large expression trees.
These fix compile-time problems in some pathlogical cases.
llvm-svn: 80623
stem from the fact that we have two types of passes that need to update it:
1. callgraphscc and module passes that are explicitly aware of it
2. Functionpasses (and loop passes etc) that are interlaced with CGSCC passes
by the CGSCC Passmgr.
In the case of #1, we can reasonably expect the passes to update the call
graph just like any analysis. However, functionpasses are not and generally
should not be CG aware. This has caused us no end of problems, so this takes
a new approach. Logically, the CGSCC Pass manager can rescan every function
after it runs a function pass over it to see if the functionpass made any
updates to the IR that affect the callgraph. This allows it to catch new calls
introduced by the functionpass.
In practice, doing this would be slow. This implementation keeps track of
whether or not the current scc is dirtied by a function pass, and, if so,
delays updating the callgraph until it is actually needed again. This was
we avoid extraneous rescans, but we still have good invariants when the
callgraph is needed.
Step #2 of the "give Callgraph some sane invariants" is to change CallGraphNode
to use a CallBackVH for the callsite entry of the CallGraphNode. This way
we can immediately remove entries from the callgraph when a FunctionPass is
active instead of having dangling pointers. The current pass tries to tolerate
these dangling pointers, but it is just an evil hack.
This is related to PR3601/4835/4029. This also reverts r80541, a hack working
around the sad lack of invariants.
llvm-svn: 80566
indirect function pointer, inline it, then go to delete the body.
The problem is that the callgraph had other references to the function,
though the inliner had no way to know it, so we got a dangling pointer
and an invalid iterator out of the deal.
The fix to this is pretty simple: stop the inliner from deleting the
function by knowing that there are references to it. Do this by making
CallGraphNodes contain a refcount. This requires moving deletion of
available_externally functions to the module-level cleanup sweep where
it belongs.
llvm-svn: 80533
argpromotion and structretpromote. Basically, when replacing
a function, they used the 'changeFunction' api which changes
the entry in the function map (and steals/reuses the callgraph
node).
This has some interesting effects: first, the problem is that it doesn't
update the "callee" edges in any callees of the function in the call graph.
Second, this covers for a major problem in all the CGSCC pass stuff, which
is that it is completely broken when functions are deleted if they *don't*
reuse a CGN. (there is a cute little fixme about this though :).
This patch changes the protocol that CGSCC passes must obey: now the CGSCC
pass manager copies the SCC and preincrements its iterator to avoid passes
invalidating it. This allows CGSCC passes to mutate the current SCC. However
multiple passes may be run on that SCC, so if passes do this, they are now
required to *update* the SCC to be current when they return.
Other less interesting parts of this patch are that it makes passes update
the CG more directly, eliminates changeFunction, and requires clients of
replaceCallSite to specify the new callee CGN if they are changing it.
llvm-svn: 80527
This is a simple AliasAnalysis implementation which works by making
ScalarEvolution queries. ScalarEvolution has a more complete understanding
of arithmetic than BasicAA's collection of ad-hoc checks, so it handles
some cases that BasicAA misses, for example p[i] and p[i+1] within the
same iteration of a loop.
This is currently experimental. It may be that the main use for this pass
will be to help find cases where BasicAA can be profitably extended, or
to help in the development of the overall AliasAnalysis infrastructure,
however it's also possible that it could grow up to become a directly
useful pass.
llvm-svn: 80098
will always return the same value. This isn't currently necessary,
since this code doesn't currently ever get called under circumstances
where it would matter, but it may some day.
llvm-svn: 80017
This is conventional command-line tool behavior. -f now just means
"enable binary output on terminals".
Add a -f option to llvm-extract and llvm-link, for consistency.
Remove F_Force from raw_fd_ostream and enable overwriting and
truncating by default. Introduce an F_Excl flag to permit users to
enable a failure when the file already exists. This flag is
currently unused.
Update Makefiles and documentation accordingly.
llvm-svn: 79990
*) introducing new data type and export function of edge info for whole function (preparation for next patch).
*) renaming variables to make clear distinction between data and containers that contain this data.
*) updated comments and whitespaces.
*) made ProfileInfo::MissingValue a double (as it should be...).
(Discussed at http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090817/084955.html.)
llvm-svn: 79940
array member of a struct, it's possible to land in an arbitrary position
inside that struct, such that attempting to find further getelementptr
indices will fail. In such cases, folding cannot be done.
llvm-svn: 79485
static extents of the static array type, it causes GlobalOpt and
other passes to be more conservative. This canonicalization also
allows the constant folder to add "inbounds" to GEPs.
llvm-svn: 79440
TargetData is not present. It still uses TargetData when available.
This generalization also fixed some limitations in the TargetData
case; the attached testcase covers this.
llvm-svn: 79344
- Part of optimal static profiling patch sequence by Andreas Neustifter.
- Store edge, block, and function information separately for each functions
(instead of in one giant map).
- Return frequencies as double instead of int, and use a sentinel value for
missing information.
llvm-svn: 78477
LoopDependenceAnalysis::getLoops is currently O(N*M) for a loop-nest of
depth N and a compound SCEV of M atomic SCEVs. As both N and M will
typically be very small, this should not be a problem. If it turns out
to be one, rewriting getLoops as SCEVVisitor will reduce complexity to
O(M).
llvm-svn: 78394
We can not simply apply ZIV testing to the pointer offsets, as this
would incorrectly return independence for e.g. (GEP x,0,i; GEP x,1,-i).
llvm-svn: 78155
affected after a PHI node has been analyzed, just remove affected
SCEVs from the Scalars map, so that they'll be (lazily) recreated as
needed. This avoids creating SCEV objects that aren't actually needed.
Also, rewrite the associated def-use walking code to be non-recursive
and to continue traversing past Instructions that don't have an
entry in the Scalars map.
llvm-svn: 77032
- Some clients which used DOUT have moved to DEBUG. We are deprecating the
"magic" DOUT behavior which avoided calling printing functions when the
statement was disabled. In addition to being unnecessary magic, it had the
downside of leaving code in -Asserts builds, and of hiding potentially
unnecessary computations.
llvm-svn: 77019
- Yay for '-'s and simplifications!
- I kept StringMap::GetOrCreateValue for compatibility purposes, this can
eventually go away. Likewise the StringMapEntry Create functions still follow
the old style.
- NIFC.
llvm-svn: 76888
This introduces an LDA-internal DependencePair class. The intention is,
that this is a place where dependence testers can store various results
such as SCEVs describing conflicting iterations, breaking conditions,
distance/direction vectors, etc.
llvm-svn: 76877
(x pred y) with more thorough code that does more complete canonicalization
before resorting to range checks. This helps it find more cases where
the canonicalized expressions match.
llvm-svn: 76671
Getelementptrs that are defined to wrap are virtually useless to
optimization, and getelementptrs that are undefined on any kind
of overflow are too restrictive -- it's difficult to ensure that
all intermediate addresses are within bounds. I'm going to take
a different approach.
Remove a few optimizations that depended on this flag.
llvm-svn: 76437
all values belonging to the intersection will belong to the resulting range.
The former was inconsistent about that point (either way is fine, just pick
one.) This is part of PR4545.
llvm-svn: 76289
GEPOperator's hasNoPointer0verflow(), and make a few places in instcombine
that create GEPs that may overflow clear the NoOverflow value. Among
other things, this partially addresses PR2831.
llvm-svn: 76252
in a convenient manner, factoring out some common code from
InstructionCombining and ValueTracking. Move the contents of
BinaryOperators.h into Operator.h and use Operator to generalize them
to support ConstantExprs as well as Instructions.
llvm-svn: 76232
isSafeToSpeculativelyExecute. The new method is a bit closer to what
the callers actually care about in that it rejects more things callers
don't want. It also adds more precise handling for integer
division, and unifies code for analyzing the legality of a speculative
load.
llvm-svn: 76150
the operands have pointer type, so that the resulting type matches
the original SCEV type, and so that unnecessary ptrtoints are
avoided in common cases.
llvm-svn: 75680
For now this only computes the allocated size of the memory pointed to by a
pointer, and offset a pointer from allocated pointer.
The actual checkLimits part will come later, after another round of review.
llvm-svn: 75657
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").
llvm-svn: 75640
works similar to isLoopInvariant, except that it will do trivial
hoisting to try to make the value loop invariant if it isn't already.
This makes it easier for transformation passes to clear trivial
instructions out of the way (the regular LICM pass doesn't run
until relatively late). This is code factored out of LoopSimplify
and other places.
llvm-svn: 75578
and related functions out of LoopBase and into Loop, since they
are specific to BasicBlock-based loops. This also allows the code
to be moved out-of-line.
llvm-svn: 75523
using the Curiously Recurring Template Pattern with LoopBase.
This will help further refactoring, and future functionality for
Loop. Also, Headers can now foward-declare Loop, instead of pulling
in LoopInfo.h or doing tricks.
llvm-svn: 75519
SCEVZeroExtendExpr ahead of the most expensive analysis. This
speeds up analysis and helps avoid pathologically bad behavior
on the testcase in PR4534.
llvm-svn: 75496
so that all code paths get it. PR4256 was about a case where the
phi translation loop would find all preds in the Visited cache, so
it could get by without re-sorting the NonLocalPointerDeps cache.
Fix this by resorting it earlier, there is no reason not to do this.
This patch inspired by Jakub Staszak's patch.
llvm-svn: 75476
This involves temporarily hard wiring some parts to use the global context. This isn't ideal, but it's
the only way I could figure out to make this process vaguely incremental.
llvm-svn: 75445
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.
llvm-svn: 75379
to a loop deletion more thorough. Don't prune the def-use tree search at
instructions that don't have SCEVs computed, because an instruction with
a user that has a computed SCEV may itself lack a computed SCEV. Also,
remove loop-related values from the ValuesAtScopes and
ConstantEvolutionLoopExitValues maps as well.
This fixes a regression in 483.xalancbmk.
llvm-svn: 75030
Constant. This lets ConstantInts be handled as SCEVConstant instead
of SCEVUnknown, as getUnknown no longer has special-case code for
ConstantInt and friends. This usually doesn't affect the final
output, since the constants end up getting folded later, but it
does make intermediate expressions more obvious in many cases.
llvm-svn: 74459
an individual exhaustive evaluation reflects only the exit value
implied by an individual exit, which may differ from the actual
exit value of the loop if there are other exits. This fixes PR4477.
llvm-svn: 74447
Also don't call finalizers for LoopPass if initialization was not called.
Add a unittest that tests that these methods are called, in the proper
order, and the correct number of times.
llvm-svn: 74438
This helps it avoid reusing an instruction that doesn't dominate all
of the users, in cases where the original instruction was inserted
before all of the users were known. This may result in redundant
expansions of sub-expressions that depend on loop-unpredictable values
in some cases, however this isn't very common, and it primarily impacts
IndVarSimplify, so GVN can be expected to clean these up.
This eliminates the need for IndVarSimplify's FixUsesBeforeDefs,
which fixes several bugs.
llvm-svn: 74352
nesting order of nested AddRec expressions to skip the transformation
if it would introduce an AddRec with operands not loop-invariant
with respect to its loop.
llvm-svn: 74343
trip counts in more cases.
Generalize ScalarEvolution's isLoopGuardedByCond code to recognize
And and Or conditions, splitting the code out into an
isNecessaryCond helper function so that it can evaluate Ands and Ors
recursively, and make SCEVExpander be much more aggressive about
hoisting instructions out of loops.
test/CodeGen/X86/pr3495.ll has an additional instruction now, but
it appears to be due to an arbitrary register allocation difference.
llvm-svn: 74048
createSCEV. Also, recognize UndefValue in createSCEV.
Change getIntegerSCEV's comment to avoid mentioning FP types,
and re-implement it in terms of getConstant instead of getUnknown.
llvm-svn: 74041
This also throws out the SCEV reference counting scheme, as the the SCEVs now have a lifetime controlled by the
ScalarEvolution pass.
Note that SCEVHandle is now a no-op, and will be remove in a future commit.
llvm-svn: 73892
counts for loops with multiple exits, replacing more conservative code
which only handled constants. This is derived from a patch by
Nick Lewycky.
This also fixes llc aborts in ClamAV and others, as
getUMinFromMismatchedTypes takes care of balancing the types before
working with them.
llvm-svn: 73884
blocks, and also exit blocks with multiple conditions (combined
with (bitwise) ands and ors). It's often infeasible to compute an
exact trip count in such cases, but a useful upper bound can often
be found.
llvm-svn: 73866
SCEVUnknowns with identical Instructions to be equal. This allows
it to analze cases such as the attached testcase, where the front-end
has cloned the loop controlling expression. Along with r73805, this
lets IndVarSimplify eliminate all the sign-extend casts in the
loop in the attached testcase.
llvm-svn: 73807
so that it can access the TargetData member (when available) and
use ValueTracking.h information to compute information for
SCEVUnknown Values.
Also add GetMinLeadingZeros and GetMinSignBits functions,
with minimal implementations.
llvm-svn: 73794
expression in IVUsers, because in the case of a use of a non-linear
addrec outside of a loop, this causes the addrec to be evaluated as
a linear addrec.
llvm-svn: 73774
casted induction variables in cases where the cast
isn't foldable. It ended up being a pessimization in
many cases. This could be fixed, but it would require
a bunch of complicated code in IVUsers' clients. The
advantages of this approach aren't visible enough to
justify it at this time.
llvm-svn: 73706
obscuring what would otherwise be a low-bits mask. Use ComputeMaskedBits
to compute what ShrinkDemandedConstant knew about to reconstruct a
low-bits mask value.
llvm-svn: 73540
failures.
To support this, add some utility functions to Type to help support
vector/scalar-independent code. Change ConstantInt::get and
ConstantFP::get to support vector types, and add an overload to
ConstantInt::get that uses a static IntegerType type, for
convenience.
Introduce a new getConstant method for ScalarEvolution, to simplify
common use cases.
llvm-svn: 73431
they contain multiplications of constants with add operations.
This helps simplify several kinds of things; in particular it
helps simplify expressions like ((-1 * (%a + %b)) + %a) to %b,
as expressions like this often come up in loop trip count
computations.
llvm-svn: 73361
even though the order doesn't matter at the top level of an expression,
it does matter when the constant is a subexpression of an n-ary
expression, because n-ary expressions are sorted lexicographically.
llvm-svn: 73358
induction variable when the addrec to be expanded does not require
a wider type. This eliminates the need for IndVarSimplify to
micro-manage SCEV expansions, because SCEVExpander now
automatically expands them in the form that IndVarSimplify considers
to be canonical. (LSR still micro-manages its SCEV expansions,
because it's optimizing for the target, rather than for
other optimizations.)
Also, this uses the new getAnyExtendExpr, which has more clever
expression simplification logic than the IndVarSimplify code it
replaces, and this cleans up some ugly expansions in code such as
the included masked-iv.ll testcase.
llvm-svn: 73294
immediately casted. At present, this is just a minor code
simplification. In the future, the expansion code may be able
to make better choices if it knows what the desired result
type will be.
llvm-svn: 73137
integer and floating-point opcodes, introducing
FAdd, FSub, and FMul.
For now, the AsmParser, BitcodeReader, and IRBuilder all preserve
backwards compatability, and the Core LLVM APIs preserve backwards
compatibility for IR producers. Most front-ends won't need to change
immediately.
This implements the first step of the plan outlined here:
http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt
llvm-svn: 72897
TargetData pointer. The only thing it's used for are
calls to ConstantFoldCompareInstOperands and
ConstantFoldInstOperands, which both already accept a
null TargetData pointer. This makes
ConstantFoldConstantExpression easier to use in clients
where TargetData is optional.
llvm-svn: 72741
possible. For example, it now emits
%p.2.ip.1 = getelementptr [3 x [3 x double]]* %p, i64 2, i64 %tmp, i64 1
instead of the equivalent but less obvious
%p.2.ip.1 = getelementptr [3 x [3 x double]]* %p, i64 0, i64 %tmp, i64 19
llvm-svn: 72452
beyond their associated static array type.
I believe that this fixes a legitimate bug, because BasicAliasAnalysis
already has code to check for this condition that works for non-constant
indices, however it was missing the case of constant indices. With this
change, it checks for both.
This fixes PR4267, and miscompiles of SPEC 188.ammp and 464.h264.href.
llvm-svn: 72451
that of the LHS. It doesn't matter for correctness, but the LHS
is more likely than the RHS to be a pointer type in exotic cases,
and it's more tidy to have it return the integer type.
llvm-svn: 72424
in the case where a loop exit value cannot be computed, instead of only in
some cases while using SCEVCouldNotCompute in others. This simplifies
getSCEVAtScope's callers.
llvm-svn: 72375
sending SCEVUnknowns to expandAddToGEP. This avoids the need for
expandAddToGEP to bend the rules and peek into SCEVUnknown
expressions.
Factor out the code for testing whether a SCEV can be factored by
a constant for use in a GEP index. This allows it to handle
SCEVAddRecExprs, by recursing.
As a result, SCEVExpander can now put more things in GEP indices,
so it emits fewer explicit mul instructions.
llvm-svn: 72366
Fix by clearing the rewriter cache before deleting the trivially dead
instructions.
Also make InsertedExpressions use an AssertingVH to catch these
bugs easier.
llvm-svn: 72364
Instcombine to be more aggressive about using SimplifyDemandedBits
on shift nodes. This allows a shift to be simplified to zero in the
included test case.
llvm-svn: 72204
instructions. It attempts to create high-level multi-operand GEPs,
though in cases where this isn't possible it falls back to casting
the pointer to i8* and emitting a GEP with that. Using GEP instructions
instead of ptrtoint+arithmetic+inttoptr helps pointer analyses that
don't use ScalarEvolution, such as BasicAliasAnalysis.
Also, make the AddrModeMatcher more aggressive in handling GEPs.
Previously it assumed that operand 0 of a GEP would require a register
in almost all cases. It now does extra checking and can do more
matching if operand 0 of the GEP is foldable. This fixes a problem
that was exposed by SCEVExpander using GEPs.
llvm-svn: 72093
IVUsers.cpp: In member function ‘bool llvm::IVUsers::AddUsersIfInteresting(llvm::Instruction*)’:
IVUsers.cpp:221: warning: ‘isSigned’ may be used uninitialized in this function
with gcc-4.3.
llvm-svn: 71654
getNoopOrSignExtend, and getTruncateOrNoop. These are similar
to getTruncateOrZeroExtend etc., except that they assert that
the conversion is either not widening or narrowing, as
appropriate. These will be used in some upcoming fixes.
llvm-svn: 71632
and generalize it so that it can be used by IndVarSimplify. Implement the
base IndVarSimplify transformation code using IVUsers. This removes
TestOrigIVForWrap and associated code, as ScalarEvolution now has enough
builtin overflow detection and folding logic to handle all the same cases,
and more. Run "opt -iv-users -analyze -disable-output" on your favorite
loop for an example of what IVUsers does.
This lets IndVarSimplify eliminate IV casts and compute trip counts in
more cases. Also, this happens to finally fix the remaining testcases
in PR1301.
Now that IndVarSimplify is being more aggressive, it occasionally runs
into the problem where ScalarEvolutionExpander's code for avoiding
duplicate expansions makes it difficult to ensure that all expanded
instructions dominate all the instructions that will use them. As a
temporary measure, IndVarSimplify now uses a FixUsesBeforeDefs function
to fix up instructions inserted by SCEVExpander. Fortunately, this code
is contained, and can be easily removed once a more comprehensive
solution is available.
llvm-svn: 71535
These values aren't analyzable, so they don't care if more information
about the loop trip count can be had. Also, SCEVUnknown is used for
a PHI while the PHI itself is being analyzed, so it needs to be left
in the Scalars map. This fixes a variety of subtle issues.
llvm-svn: 71533
return the correct value when the cast operand is all zeros. This ought
to be pretty rare, because it would mean that the regular SCEV folding
routines missed a case, though there are cases they might legitimately
miss. Also, it's unlikely anything currently using GetMinTrailingZeros
cares about this case.
llvm-svn: 71532
which are not analyzed with SCEV techniques, which can require
brute-forcing through a large number of instructions. This
fixes a massive compile-time issue on 400.perlbench (in
particular, the loop in MD5Transform).
llvm-svn: 71259
bits captured, but the pointer marked nocapture. In fact
I now recall that this problem is why only readnone functions
returning void were considered before! However keep a small
fix that was also in r70876: a readnone function returning
void can result in bits being captured if it unwinds, so
test for this.
llvm-svn: 71168
checking for bcopy... no
checking for getc_unlocked... Assertion failed: (0 && "Unknown SCEV kind!"), function operator(), file /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore.roots/llvmCore~obj/src/lib/Analysis/ScalarEvolution.cpp, line 511.
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmgcc42.roots/llvmgcc42~obj/src/libdecnumber/decUtility.c:360: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter> for instructions.
make[4]: *** [decUtility.o] Error 1
make[4]: *** Waiting for unfinished jobs....
Assertion failed: (0 && "Unknown SCEV kind!"), function operator(), file /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore.roots/llvmCore~obj/src/lib/Analysis/ScalarEvolution.cpp, line 511.
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmgcc42.roots/llvmgcc42~obj/src/libdecnumber/decNumber.c:5591: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter> for instructions.
make[4]: *** [decNumber.o] Error 1
make[3]: *** [all-stage2-libdecnumber] Error 2
make[3]: *** Waiting for unfinished jobs....
llvm-svn: 71165
to sorting SCEVs by their kind, sort SCEVs of the same kind according
to their operands. This helps avoid things like (a+b) being a distinct
expression from (b+a).
llvm-svn: 71160
CallbackVH, with fixes. allUsesReplacedWith need to
walk the def-use chains and invalidate all users of a
value that is replaced. SCEVs of users need to be
recalcualted even if the new value is equivalent. Also,
make forgetLoopPHIs walk def-use chains, since any
SCEV that depends on a PHI should be recalculated when
more information about that PHI becomes available.
llvm-svn: 70927
only capture their arguments by returning them or
throwing an exception or not based on the argument
value. Patch essentially by Frits van Bommel.
llvm-svn: 70876
makes ScalarEvolution::deleteValueFromRecords, and it's code that
subtly needed to be called before ReplaceAllUsesWith, unnecessary.
It also makes ValueDeletionListener unnecessary.
llvm-svn: 70645
artificial "ptrtoint", as it tends to clutter up complicated
expressions. The cast operators now print both source and
destination types, which is usually sufficient.
llvm-svn: 70554
compute an upper-bound value for the trip count, in addition to
the actual trip count. Use this to allow getZeroExtendExpr and
getSignExtendExpr to fold casts in more cases.
This may eventually morph into a more general value-range
analysis capability; there are certainly plenty of places where
more complete value-range information would allow more folding.
llvm-svn: 70509
(sext i8 {-128,+,1} to i64) to i64 {-128,+,1}, where the iteration
crosses from negative to positive, but is still safe if the trip
count is within range.
llvm-svn: 70421
information to simplify [sz]ext({a,+,b}) to {zext(a),+,[zs]ext(b)},
as appropriate.
These functions and the trip count code each call into the other, so
this requires careful handling to avoid infinite recursion. During
the initial trip count computation, conservative SCEVs are used,
which are subsequently discarded once the trip count is actually
known.
Among other benefits, this change lets LSR automatically eliminate
some unnecessary zext-inreg and sext-inreg operation where the
operand is an induction variable.
llvm-svn: 70241
with the persistent insertion point, and change IndVars to make
use of it. This fixes a bug where IndVars was holding on to a
stale insertion point and forcing the SCEVExpander to continue to
use it.
This fixes PR4038.
llvm-svn: 69892