same object to be a non-capture; Duncan pointed out a way that such
a comparison could be a capture.
Make the rule that considers a comparison against null more specific,
and only consider noalias return values compared against null. This
still supports test/Transforms/GVN/nonescaping-malloc.ll, and is not
susceptible to the problem Duncan pointed out with noalias arguments.
llvm-svn: 89468
because if the results from getUnderlyingObject match, the values must
be from the same underlying object, even if we don't know what that
object is.
llvm-svn: 89434
careful about crazy methods of capturing pointers using comparisons.
Comparisons of identified objects with null in the default address
space are not captures. And, comparisons of two pointers within the
same identified object are not captures.
llvm-svn: 89421
if it is not ultimately captured. Teach BasicAliasAnalysis that a
local object address which does not escape and is never stored does
not alias with a value resulting from a load.
llvm-svn: 89398
cannot be folded into target cmp instruction.
- Avoid a phase ordering issue where early cmp optimization would prevent the
later count-to-zero optimization.
- Add missing checks which could cause LSR to reuse stride that does not have
users.
- Fix a bug in count-to-zero optimization code which failed to find the pre-inc
iv's phi node.
- Remove, tighten, loosen some incorrect checks disable valid transformations.
- Quite a bit of code clean up.
llvm-svn: 86969
This allows StringRef to skip controversial if(str) check in constructor.
Buildbots, wait for corresponding clang and llvm-gcc FE check-ins!
llvm-svn: 86914
start using them in a trivial way when -enable-jump-threading-lvi
is passed. enable-jump-threading-lvi will be my playground for
awhile.
llvm-svn: 86789
except that the result may not be a constant. Switch jump threading to
use it so that it gets things like (X & 0) -> 0, which occur when phi preds
are deleted and the remaining phi pred was a zero.
llvm-svn: 86637
This patch forbids implicit conversion of DenseMap::const_iterator to
DenseMap::iterator which was possible because DenseMapIterator inherited
(publicly) from DenseMapConstIterator. Conversion the other way around is now
allowed as one may expect.
The template DenseMapConstIterator is removed and the template parameter
IsConst which specifies whether the iterator is constant is added to
DenseMapIterator.
Actually IsConst parameter is not necessary since the constness can be
determined from KeyT but this is not relevant to the fix and can be addressed
later.
Patch by Victor Zverovich!
llvm-svn: 86636
takes decimated instructions and applies identities to them. This
is pretty minimal at this point, but I plan to pull some instcombine
logic out into these and similar routines.
llvm-svn: 86613
Here is the original commit message:
This commit updates malloc optimizations to operate on malloc calls that have constant int size arguments.
Update CreateMalloc so that its callers specify the size to allocate:
MallocInst-autoupgrade users use non-TargetData-computed allocation sizes.
Optimization uses use TargetData to compute the allocation size.
Now that malloc calls can have constant sizes, update isArrayMallocHelper() to use TargetData to determine the size of the malloced type and the size of malloced arrays.
Extend getMallocType() to support malloc calls that have non-bitcast uses.
Update OptimizeGlobalAddressOfMalloc() to optimize malloc calls that have non-bitcast uses. The bitcast use of a malloc call has to be treated specially here because the uses of the bitcast need to be replaced and the bitcast needs to be erased (just like the malloc call) for OptimizeGlobalAddressOfMalloc() to work correctly.
Update PerformHeapAllocSRoA() to optimize malloc calls that have non-bitcast uses. The bitcast use of the malloc is not handled specially here because ReplaceUsesOfMallocWithGlobal replaces through the bitcast use.
Update OptimizeOnceStoredGlobal() to not care about the malloc calls' bitcast use.
Update all globalopt malloc tests to not rely on autoupgraded-MallocInsts, but instead use explicit malloc calls with correct allocation sizes.
llvm-svn: 86311
MallocInst-autoupgrade users use non-TargetData-computed allocation sizes.
Optimization uses use TargetData to compute the allocation size.
Now that malloc calls can have constant sizes, update isArrayMallocHelper() to use TargetData to determine the size of the malloced type and the size of malloced arrays.
Extend getMallocType() to support malloc calls that have non-bitcast uses.
Update OptimizeGlobalAddressOfMalloc() to optimize malloc calls that have non-bitcast uses. The bitcast use of a malloc call has to be treated specially here because the uses of the bitcast need to be replaced and the bitcast needs to be erased (just like the malloc call) for OptimizeGlobalAddressOfMalloc() to work correctly.
Update PerformHeapAllocSRoA() to optimize malloc calls that have non-bitcast uses. The bitcast use of the malloc is not handled specially here because ReplaceUsesOfMallocWithGlobal replaces through the bitcast use.
Update OptimizeOnceStoredGlobal() to not care about the malloc calls' bitcast use.
Update all globalopt malloc tests to not rely on autoupgraded-MallocInsts, but instead use explicit malloc calls with correct allocation sizes.
llvm-svn: 86077
ArraySize * ElementSize
ElementSize * ArraySize
ArraySize << log2(ElementSize)
ElementSize << log2(ArraySize)
Refactor isArrayMallocHelper and delete isSafeToGetMallocArraySize, so that there is only 1 copy of the malloc array determining logic.
Update users of getMallocArraySize() to not bother calling isArrayMalloc() as well.
llvm-svn: 85421
Update all analysis passes and transforms to treat free calls just like FreeInst.
Remove RaiseAllocations and all its tests since FreeInst no longer needs to be raised.
llvm-svn: 84987
non-type-safe constant initializers. This sort of thing happens
quite a bit for 4-byte loads out of string constants, unions,
bitfields, and an interesting endianness check from sqlite, which
is something like this:
const int sqlite3one = 1;
# define SQLITE_BIGENDIAN (*(char *)(&sqlite3one)==0)
# define SQLITE_LITTLEENDIAN (*(char *)(&sqlite3one)==1)
# define SQLITE_UTF16NATIVE (SQLITE_BIGENDIAN?SQLITE_UTF16BE:SQLITE_UTF16LE)
all of these macros now constant fold away.
This implements PR3152 and is based on a patch started by Eli, but heavily
modified and extended.
llvm-svn: 84936
Analysis/ConstantFolding.cpp. This doesn't change the behavior of
instcombine but makes other clients of ConstantFoldInstruction
able to handle loads. This was partially extracted from Eli's patch
in PR3152.
llvm-svn: 84836
container of the blocks and do efficient lookups. This makes
isLoopSimplifyForm much faster on large loops, fixing a significant
compile-time issue in builds with assertions enabled.
llvm-svn: 84673
identifying the malloc as a non-array malloc. This broke GlobalOpt's optimization of stores of mallocs
to global variables.
The fix is to classify malloc's into 3 categories:
1. non-array mallocs
2. array mallocs whose array size can be determined
3. mallocs that cannot be determined to be of type 1 or 2 and cannot be optimized
getMallocArraySize() returns NULL for category 3, and all users of this function must avoid their
malloc optimization if this function returns NULL.
Eventually, currently unexpected codegen for computing the malloc's size argument will be supported in
isArrayMalloc() and getMallocArraySize(), extending malloc optimizations to those examples.
llvm-svn: 84199
cannot alias the GEP. GEP pointer alias rule states this clearly:
A pointer value formed from a getelementptr instruction is associated with the
addresses associated with the first operand of the getelementptr.
llvm-svn: 84079
question, can we get rid of the BasicBlock versions of all inserters
and use Head == 0 to indicate the old case when GetInsertBlock == 0?
llvm-svn: 83216
information. This allows arbitrary code involving DW_OP_plus_uconst
and DW_OP_deref. The scheme allows for easy extention to include,
any, or all of the DW_OP_ opcodes. I thought about just exposing all
of them, but, wasn't sure if people wanted the dwarf opcodes exposed
in the api. Is that a layering violation?
With this scheme, the entire existing block scheme used by llvm-gcc
can be switched over to the new scheme. I think that would be
cleaner, as then the compiler specific bits are not present in llvm
proper. Before the old code can be yanked however, similar code in
clang would have to be removed.
Next up, more testing.
llvm-svn: 83120
the PassManager code into a regular verifyAnalysis method.
Also, reorganize loop verification. Make the LoopPass infrastructure
call verifyLoop as needed instead of having LoopInfo::verifyAnalysis
check every loop in the function after each looop pass. Add a new
command-line argument, -verify-loop-info, to enable the expensive
full checking.
llvm-svn: 82952
code that stops the timer doesn't have to search to find the timer
object before it stops the timer. This avoids a lock acquisition
and a few other things done with the timer running.
llvm-svn: 82949
LoopPasses for that loop. This avoids trouble with the PassManager
trying to call verifyAnalysis on them, and frees up some memory
sooner rather than later.
llvm-svn: 82945
aren't in canonical loop-simplify form, since it doesn't itself depend
on LoopSimplify. This means handling loops without preheaders and loops
with multiple backedges.
llvm-svn: 82905
test whether it properly dominates the loop header. This is equivalent
when the loop has a preheader, and has the advantage of working when
the loop doesn't have a preheader. Since IVUsers doesn't Require
LoopSimplify, the loop isn't guaranteed to have a preheader.
llvm-svn: 82899
to. This can be combined with LCSSA or SSI form to store more information on a
PHINode than can be computed by looking at its incoming values.
llvm-svn: 82317
In getMallocArraySize(), fix bug in the case that array size is the product of 2 constants.
Extend isArrayMalloc() and getMallocArraySize() to handle case where malloc is used as char array.
Ensure that ArraySize in LowerAllocations::runOnBasicBlock() is correct type.
Extend Instruction::isSafeToSpeculativelyExecute() to handle malloc calls.
Add verification for malloc calls.
Reviewed by Dan Gohman.
llvm-svn: 82257
where the induction variable has a non-unit stride, such as {0,+,2}, and
there are expressions such as {1,+,2} inside the loop formed with
or or add nsw operators.
llvm-svn: 82151
argpromote to avoid invalidating an iterator. This fixes PR4977.
All clang tests now pass with expensive checking (on my system
at least).
llvm-svn: 81843
how to fold notionally-out-of-bounds array getelementptr indices instead
of just doing these in lib/Analysis/ConstantFolding.cpp, because it can
be done in a fairly general way without TargetData, and because not all
constants are visited by lib/Analysis/ConstantFolding.cpp. This enables
more constant folding.
Also, set the "inbounds" flag when the getelementptr indices are
one-past-the-end.
llvm-svn: 81483
Fixed non working -profile-verifier-noassert option.
Fixed missing newline in debugEntry().
Cleaned up assert messages. (assert(0 && Message) is still shown, but the message is printed before.)
When verifiying loaded profiles the ProfileVerifier got confused when block was a setjmp target, this is checked now.
When verifiying loaded profiles the ProfileVerifier got confused when block eventually reaching an exit(), this is checked now.
llvm-svn: 81338
that get created during loop unswitching, and fix SplitBlockPredecessors'
LCSSA updating code to create new PHIs instead of trying to just move
existing ones.
Also, optimize Loop::verifyLoop, since it gets called a lot. Use
searches on a sorted list of blocks instead of calling the "contains"
function, as is done in other places in the Loop class, since "contains"
does a linear search. Also, don't call verifyLoop from LoopSimplify or
LCSSA, as the PassManager is already calling verifyLoop as part of
LoopInfo's verifyAnalysis.
llvm-svn: 81221
D test/Analysis/Profiling
--- Reverse-merging r80907 into '.':
U lib/Analysis/ProfileInfoLoaderPass.cpp
Attempt to remove failure in the self-hosting build bot.
llvm-svn: 80966
and exact flags. Because ConstantExprs are uniqued, creating an
expression with this flag causes all expressions with the same operands
to have the same flag, which may not be safe. Add, sub, mul, and sdiv
ConstantExprs are usually folded anyway, so the main interesting flag
here is inbounds, and the constant folder already knows how to set the
inbounds flag automatically in most cases, so there isn't an urgent need
for the API support.
This can be reconsidered in the future, but for now just removing these
API bits eliminates a source of potential trouble with little downside.
llvm-svn: 80959
Optimal edge profiling is only possible when blocks with no predecessors get an
virtual edge (BB,0) that counts the execution frequencies of this
function-exiting blocks.
This patch makes the necessary changes before actually enabling optimal edge profiling.
llvm-svn: 80667
This adds a pass to verify the current profile against the flow conditions.
This is very helpful when later on trying to perserve the profiling information
during all passes.
llvm-svn: 80666
for sanity. This didn't turn up any bugs.
Change CallGraphNode to maintain its "callsite" information in the
call edges list as a WeakVH instead of as an instruction*. This fixes
a broad class of dangling pointer bugs, and makes CallGraph have a number
of useful invariants again. This fixes the class of problem indicated
by PR4029 and PR3601.
llvm-svn: 80663
SCEVUnknowns, as the non-SCEVUnknown cases in the getSCEVAtScope code
can also end up repeatedly climing through the same expression trees,
which can be unusably slow when the trees are very tall.
Also, add a quick check for SCEV pointer equality to the main
SCEV comparison routine, as the full comparison code can be expensive
in the case of large expression trees.
These fix compile-time problems in some pathlogical cases.
llvm-svn: 80623
stem from the fact that we have two types of passes that need to update it:
1. callgraphscc and module passes that are explicitly aware of it
2. Functionpasses (and loop passes etc) that are interlaced with CGSCC passes
by the CGSCC Passmgr.
In the case of #1, we can reasonably expect the passes to update the call
graph just like any analysis. However, functionpasses are not and generally
should not be CG aware. This has caused us no end of problems, so this takes
a new approach. Logically, the CGSCC Pass manager can rescan every function
after it runs a function pass over it to see if the functionpass made any
updates to the IR that affect the callgraph. This allows it to catch new calls
introduced by the functionpass.
In practice, doing this would be slow. This implementation keeps track of
whether or not the current scc is dirtied by a function pass, and, if so,
delays updating the callgraph until it is actually needed again. This was
we avoid extraneous rescans, but we still have good invariants when the
callgraph is needed.
Step #2 of the "give Callgraph some sane invariants" is to change CallGraphNode
to use a CallBackVH for the callsite entry of the CallGraphNode. This way
we can immediately remove entries from the callgraph when a FunctionPass is
active instead of having dangling pointers. The current pass tries to tolerate
these dangling pointers, but it is just an evil hack.
This is related to PR3601/4835/4029. This also reverts r80541, a hack working
around the sad lack of invariants.
llvm-svn: 80566
indirect function pointer, inline it, then go to delete the body.
The problem is that the callgraph had other references to the function,
though the inliner had no way to know it, so we got a dangling pointer
and an invalid iterator out of the deal.
The fix to this is pretty simple: stop the inliner from deleting the
function by knowing that there are references to it. Do this by making
CallGraphNodes contain a refcount. This requires moving deletion of
available_externally functions to the module-level cleanup sweep where
it belongs.
llvm-svn: 80533
argpromotion and structretpromote. Basically, when replacing
a function, they used the 'changeFunction' api which changes
the entry in the function map (and steals/reuses the callgraph
node).
This has some interesting effects: first, the problem is that it doesn't
update the "callee" edges in any callees of the function in the call graph.
Second, this covers for a major problem in all the CGSCC pass stuff, which
is that it is completely broken when functions are deleted if they *don't*
reuse a CGN. (there is a cute little fixme about this though :).
This patch changes the protocol that CGSCC passes must obey: now the CGSCC
pass manager copies the SCC and preincrements its iterator to avoid passes
invalidating it. This allows CGSCC passes to mutate the current SCC. However
multiple passes may be run on that SCC, so if passes do this, they are now
required to *update* the SCC to be current when they return.
Other less interesting parts of this patch are that it makes passes update
the CG more directly, eliminates changeFunction, and requires clients of
replaceCallSite to specify the new callee CGN if they are changing it.
llvm-svn: 80527
This is a simple AliasAnalysis implementation which works by making
ScalarEvolution queries. ScalarEvolution has a more complete understanding
of arithmetic than BasicAA's collection of ad-hoc checks, so it handles
some cases that BasicAA misses, for example p[i] and p[i+1] within the
same iteration of a loop.
This is currently experimental. It may be that the main use for this pass
will be to help find cases where BasicAA can be profitably extended, or
to help in the development of the overall AliasAnalysis infrastructure,
however it's also possible that it could grow up to become a directly
useful pass.
llvm-svn: 80098
will always return the same value. This isn't currently necessary,
since this code doesn't currently ever get called under circumstances
where it would matter, but it may some day.
llvm-svn: 80017
This is conventional command-line tool behavior. -f now just means
"enable binary output on terminals".
Add a -f option to llvm-extract and llvm-link, for consistency.
Remove F_Force from raw_fd_ostream and enable overwriting and
truncating by default. Introduce an F_Excl flag to permit users to
enable a failure when the file already exists. This flag is
currently unused.
Update Makefiles and documentation accordingly.
llvm-svn: 79990
*) introducing new data type and export function of edge info for whole function (preparation for next patch).
*) renaming variables to make clear distinction between data and containers that contain this data.
*) updated comments and whitespaces.
*) made ProfileInfo::MissingValue a double (as it should be...).
(Discussed at http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090817/084955.html.)
llvm-svn: 79940
array member of a struct, it's possible to land in an arbitrary position
inside that struct, such that attempting to find further getelementptr
indices will fail. In such cases, folding cannot be done.
llvm-svn: 79485
static extents of the static array type, it causes GlobalOpt and
other passes to be more conservative. This canonicalization also
allows the constant folder to add "inbounds" to GEPs.
llvm-svn: 79440
TargetData is not present. It still uses TargetData when available.
This generalization also fixed some limitations in the TargetData
case; the attached testcase covers this.
llvm-svn: 79344
- Part of optimal static profiling patch sequence by Andreas Neustifter.
- Store edge, block, and function information separately for each functions
(instead of in one giant map).
- Return frequencies as double instead of int, and use a sentinel value for
missing information.
llvm-svn: 78477
LoopDependenceAnalysis::getLoops is currently O(N*M) for a loop-nest of
depth N and a compound SCEV of M atomic SCEVs. As both N and M will
typically be very small, this should not be a problem. If it turns out
to be one, rewriting getLoops as SCEVVisitor will reduce complexity to
O(M).
llvm-svn: 78394
We can not simply apply ZIV testing to the pointer offsets, as this
would incorrectly return independence for e.g. (GEP x,0,i; GEP x,1,-i).
llvm-svn: 78155
affected after a PHI node has been analyzed, just remove affected
SCEVs from the Scalars map, so that they'll be (lazily) recreated as
needed. This avoids creating SCEV objects that aren't actually needed.
Also, rewrite the associated def-use walking code to be non-recursive
and to continue traversing past Instructions that don't have an
entry in the Scalars map.
llvm-svn: 77032
- Some clients which used DOUT have moved to DEBUG. We are deprecating the
"magic" DOUT behavior which avoided calling printing functions when the
statement was disabled. In addition to being unnecessary magic, it had the
downside of leaving code in -Asserts builds, and of hiding potentially
unnecessary computations.
llvm-svn: 77019
- Yay for '-'s and simplifications!
- I kept StringMap::GetOrCreateValue for compatibility purposes, this can
eventually go away. Likewise the StringMapEntry Create functions still follow
the old style.
- NIFC.
llvm-svn: 76888
This introduces an LDA-internal DependencePair class. The intention is,
that this is a place where dependence testers can store various results
such as SCEVs describing conflicting iterations, breaking conditions,
distance/direction vectors, etc.
llvm-svn: 76877
(x pred y) with more thorough code that does more complete canonicalization
before resorting to range checks. This helps it find more cases where
the canonicalized expressions match.
llvm-svn: 76671
Getelementptrs that are defined to wrap are virtually useless to
optimization, and getelementptrs that are undefined on any kind
of overflow are too restrictive -- it's difficult to ensure that
all intermediate addresses are within bounds. I'm going to take
a different approach.
Remove a few optimizations that depended on this flag.
llvm-svn: 76437
all values belonging to the intersection will belong to the resulting range.
The former was inconsistent about that point (either way is fine, just pick
one.) This is part of PR4545.
llvm-svn: 76289
GEPOperator's hasNoPointer0verflow(), and make a few places in instcombine
that create GEPs that may overflow clear the NoOverflow value. Among
other things, this partially addresses PR2831.
llvm-svn: 76252
in a convenient manner, factoring out some common code from
InstructionCombining and ValueTracking. Move the contents of
BinaryOperators.h into Operator.h and use Operator to generalize them
to support ConstantExprs as well as Instructions.
llvm-svn: 76232
isSafeToSpeculativelyExecute. The new method is a bit closer to what
the callers actually care about in that it rejects more things callers
don't want. It also adds more precise handling for integer
division, and unifies code for analyzing the legality of a speculative
load.
llvm-svn: 76150
the operands have pointer type, so that the resulting type matches
the original SCEV type, and so that unnecessary ptrtoints are
avoided in common cases.
llvm-svn: 75680
For now this only computes the allocated size of the memory pointed to by a
pointer, and offset a pointer from allocated pointer.
The actual checkLimits part will come later, after another round of review.
llvm-svn: 75657
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").
llvm-svn: 75640
works similar to isLoopInvariant, except that it will do trivial
hoisting to try to make the value loop invariant if it isn't already.
This makes it easier for transformation passes to clear trivial
instructions out of the way (the regular LICM pass doesn't run
until relatively late). This is code factored out of LoopSimplify
and other places.
llvm-svn: 75578
and related functions out of LoopBase and into Loop, since they
are specific to BasicBlock-based loops. This also allows the code
to be moved out-of-line.
llvm-svn: 75523
using the Curiously Recurring Template Pattern with LoopBase.
This will help further refactoring, and future functionality for
Loop. Also, Headers can now foward-declare Loop, instead of pulling
in LoopInfo.h or doing tricks.
llvm-svn: 75519
SCEVZeroExtendExpr ahead of the most expensive analysis. This
speeds up analysis and helps avoid pathologically bad behavior
on the testcase in PR4534.
llvm-svn: 75496
so that all code paths get it. PR4256 was about a case where the
phi translation loop would find all preds in the Visited cache, so
it could get by without re-sorting the NonLocalPointerDeps cache.
Fix this by resorting it earlier, there is no reason not to do this.
This patch inspired by Jakub Staszak's patch.
llvm-svn: 75476
This involves temporarily hard wiring some parts to use the global context. This isn't ideal, but it's
the only way I could figure out to make this process vaguely incremental.
llvm-svn: 75445
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.
llvm-svn: 75379
to a loop deletion more thorough. Don't prune the def-use tree search at
instructions that don't have SCEVs computed, because an instruction with
a user that has a computed SCEV may itself lack a computed SCEV. Also,
remove loop-related values from the ValuesAtScopes and
ConstantEvolutionLoopExitValues maps as well.
This fixes a regression in 483.xalancbmk.
llvm-svn: 75030
Constant. This lets ConstantInts be handled as SCEVConstant instead
of SCEVUnknown, as getUnknown no longer has special-case code for
ConstantInt and friends. This usually doesn't affect the final
output, since the constants end up getting folded later, but it
does make intermediate expressions more obvious in many cases.
llvm-svn: 74459
an individual exhaustive evaluation reflects only the exit value
implied by an individual exit, which may differ from the actual
exit value of the loop if there are other exits. This fixes PR4477.
llvm-svn: 74447
Also don't call finalizers for LoopPass if initialization was not called.
Add a unittest that tests that these methods are called, in the proper
order, and the correct number of times.
llvm-svn: 74438
This helps it avoid reusing an instruction that doesn't dominate all
of the users, in cases where the original instruction was inserted
before all of the users were known. This may result in redundant
expansions of sub-expressions that depend on loop-unpredictable values
in some cases, however this isn't very common, and it primarily impacts
IndVarSimplify, so GVN can be expected to clean these up.
This eliminates the need for IndVarSimplify's FixUsesBeforeDefs,
which fixes several bugs.
llvm-svn: 74352
nesting order of nested AddRec expressions to skip the transformation
if it would introduce an AddRec with operands not loop-invariant
with respect to its loop.
llvm-svn: 74343
trip counts in more cases.
Generalize ScalarEvolution's isLoopGuardedByCond code to recognize
And and Or conditions, splitting the code out into an
isNecessaryCond helper function so that it can evaluate Ands and Ors
recursively, and make SCEVExpander be much more aggressive about
hoisting instructions out of loops.
test/CodeGen/X86/pr3495.ll has an additional instruction now, but
it appears to be due to an arbitrary register allocation difference.
llvm-svn: 74048
createSCEV. Also, recognize UndefValue in createSCEV.
Change getIntegerSCEV's comment to avoid mentioning FP types,
and re-implement it in terms of getConstant instead of getUnknown.
llvm-svn: 74041
This also throws out the SCEV reference counting scheme, as the the SCEVs now have a lifetime controlled by the
ScalarEvolution pass.
Note that SCEVHandle is now a no-op, and will be remove in a future commit.
llvm-svn: 73892
counts for loops with multiple exits, replacing more conservative code
which only handled constants. This is derived from a patch by
Nick Lewycky.
This also fixes llc aborts in ClamAV and others, as
getUMinFromMismatchedTypes takes care of balancing the types before
working with them.
llvm-svn: 73884
blocks, and also exit blocks with multiple conditions (combined
with (bitwise) ands and ors). It's often infeasible to compute an
exact trip count in such cases, but a useful upper bound can often
be found.
llvm-svn: 73866
SCEVUnknowns with identical Instructions to be equal. This allows
it to analze cases such as the attached testcase, where the front-end
has cloned the loop controlling expression. Along with r73805, this
lets IndVarSimplify eliminate all the sign-extend casts in the
loop in the attached testcase.
llvm-svn: 73807
so that it can access the TargetData member (when available) and
use ValueTracking.h information to compute information for
SCEVUnknown Values.
Also add GetMinLeadingZeros and GetMinSignBits functions,
with minimal implementations.
llvm-svn: 73794
expression in IVUsers, because in the case of a use of a non-linear
addrec outside of a loop, this causes the addrec to be evaluated as
a linear addrec.
llvm-svn: 73774
casted induction variables in cases where the cast
isn't foldable. It ended up being a pessimization in
many cases. This could be fixed, but it would require
a bunch of complicated code in IVUsers' clients. The
advantages of this approach aren't visible enough to
justify it at this time.
llvm-svn: 73706
obscuring what would otherwise be a low-bits mask. Use ComputeMaskedBits
to compute what ShrinkDemandedConstant knew about to reconstruct a
low-bits mask value.
llvm-svn: 73540
failures.
To support this, add some utility functions to Type to help support
vector/scalar-independent code. Change ConstantInt::get and
ConstantFP::get to support vector types, and add an overload to
ConstantInt::get that uses a static IntegerType type, for
convenience.
Introduce a new getConstant method for ScalarEvolution, to simplify
common use cases.
llvm-svn: 73431
they contain multiplications of constants with add operations.
This helps simplify several kinds of things; in particular it
helps simplify expressions like ((-1 * (%a + %b)) + %a) to %b,
as expressions like this often come up in loop trip count
computations.
llvm-svn: 73361
even though the order doesn't matter at the top level of an expression,
it does matter when the constant is a subexpression of an n-ary
expression, because n-ary expressions are sorted lexicographically.
llvm-svn: 73358
induction variable when the addrec to be expanded does not require
a wider type. This eliminates the need for IndVarSimplify to
micro-manage SCEV expansions, because SCEVExpander now
automatically expands them in the form that IndVarSimplify considers
to be canonical. (LSR still micro-manages its SCEV expansions,
because it's optimizing for the target, rather than for
other optimizations.)
Also, this uses the new getAnyExtendExpr, which has more clever
expression simplification logic than the IndVarSimplify code it
replaces, and this cleans up some ugly expansions in code such as
the included masked-iv.ll testcase.
llvm-svn: 73294
immediately casted. At present, this is just a minor code
simplification. In the future, the expansion code may be able
to make better choices if it knows what the desired result
type will be.
llvm-svn: 73137
integer and floating-point opcodes, introducing
FAdd, FSub, and FMul.
For now, the AsmParser, BitcodeReader, and IRBuilder all preserve
backwards compatability, and the Core LLVM APIs preserve backwards
compatibility for IR producers. Most front-ends won't need to change
immediately.
This implements the first step of the plan outlined here:
http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt
llvm-svn: 72897
TargetData pointer. The only thing it's used for are
calls to ConstantFoldCompareInstOperands and
ConstantFoldInstOperands, which both already accept a
null TargetData pointer. This makes
ConstantFoldConstantExpression easier to use in clients
where TargetData is optional.
llvm-svn: 72741
possible. For example, it now emits
%p.2.ip.1 = getelementptr [3 x [3 x double]]* %p, i64 2, i64 %tmp, i64 1
instead of the equivalent but less obvious
%p.2.ip.1 = getelementptr [3 x [3 x double]]* %p, i64 0, i64 %tmp, i64 19
llvm-svn: 72452
beyond their associated static array type.
I believe that this fixes a legitimate bug, because BasicAliasAnalysis
already has code to check for this condition that works for non-constant
indices, however it was missing the case of constant indices. With this
change, it checks for both.
This fixes PR4267, and miscompiles of SPEC 188.ammp and 464.h264.href.
llvm-svn: 72451
that of the LHS. It doesn't matter for correctness, but the LHS
is more likely than the RHS to be a pointer type in exotic cases,
and it's more tidy to have it return the integer type.
llvm-svn: 72424
in the case where a loop exit value cannot be computed, instead of only in
some cases while using SCEVCouldNotCompute in others. This simplifies
getSCEVAtScope's callers.
llvm-svn: 72375
sending SCEVUnknowns to expandAddToGEP. This avoids the need for
expandAddToGEP to bend the rules and peek into SCEVUnknown
expressions.
Factor out the code for testing whether a SCEV can be factored by
a constant for use in a GEP index. This allows it to handle
SCEVAddRecExprs, by recursing.
As a result, SCEVExpander can now put more things in GEP indices,
so it emits fewer explicit mul instructions.
llvm-svn: 72366
Fix by clearing the rewriter cache before deleting the trivially dead
instructions.
Also make InsertedExpressions use an AssertingVH to catch these
bugs easier.
llvm-svn: 72364
Instcombine to be more aggressive about using SimplifyDemandedBits
on shift nodes. This allows a shift to be simplified to zero in the
included test case.
llvm-svn: 72204
instructions. It attempts to create high-level multi-operand GEPs,
though in cases where this isn't possible it falls back to casting
the pointer to i8* and emitting a GEP with that. Using GEP instructions
instead of ptrtoint+arithmetic+inttoptr helps pointer analyses that
don't use ScalarEvolution, such as BasicAliasAnalysis.
Also, make the AddrModeMatcher more aggressive in handling GEPs.
Previously it assumed that operand 0 of a GEP would require a register
in almost all cases. It now does extra checking and can do more
matching if operand 0 of the GEP is foldable. This fixes a problem
that was exposed by SCEVExpander using GEPs.
llvm-svn: 72093
IVUsers.cpp: In member function ‘bool llvm::IVUsers::AddUsersIfInteresting(llvm::Instruction*)’:
IVUsers.cpp:221: warning: ‘isSigned’ may be used uninitialized in this function
with gcc-4.3.
llvm-svn: 71654
getNoopOrSignExtend, and getTruncateOrNoop. These are similar
to getTruncateOrZeroExtend etc., except that they assert that
the conversion is either not widening or narrowing, as
appropriate. These will be used in some upcoming fixes.
llvm-svn: 71632
and generalize it so that it can be used by IndVarSimplify. Implement the
base IndVarSimplify transformation code using IVUsers. This removes
TestOrigIVForWrap and associated code, as ScalarEvolution now has enough
builtin overflow detection and folding logic to handle all the same cases,
and more. Run "opt -iv-users -analyze -disable-output" on your favorite
loop for an example of what IVUsers does.
This lets IndVarSimplify eliminate IV casts and compute trip counts in
more cases. Also, this happens to finally fix the remaining testcases
in PR1301.
Now that IndVarSimplify is being more aggressive, it occasionally runs
into the problem where ScalarEvolutionExpander's code for avoiding
duplicate expansions makes it difficult to ensure that all expanded
instructions dominate all the instructions that will use them. As a
temporary measure, IndVarSimplify now uses a FixUsesBeforeDefs function
to fix up instructions inserted by SCEVExpander. Fortunately, this code
is contained, and can be easily removed once a more comprehensive
solution is available.
llvm-svn: 71535
These values aren't analyzable, so they don't care if more information
about the loop trip count can be had. Also, SCEVUnknown is used for
a PHI while the PHI itself is being analyzed, so it needs to be left
in the Scalars map. This fixes a variety of subtle issues.
llvm-svn: 71533
return the correct value when the cast operand is all zeros. This ought
to be pretty rare, because it would mean that the regular SCEV folding
routines missed a case, though there are cases they might legitimately
miss. Also, it's unlikely anything currently using GetMinTrailingZeros
cares about this case.
llvm-svn: 71532
which are not analyzed with SCEV techniques, which can require
brute-forcing through a large number of instructions. This
fixes a massive compile-time issue on 400.perlbench (in
particular, the loop in MD5Transform).
llvm-svn: 71259
bits captured, but the pointer marked nocapture. In fact
I now recall that this problem is why only readnone functions
returning void were considered before! However keep a small
fix that was also in r70876: a readnone function returning
void can result in bits being captured if it unwinds, so
test for this.
llvm-svn: 71168
checking for bcopy... no
checking for getc_unlocked... Assertion failed: (0 && "Unknown SCEV kind!"), function operator(), file /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore.roots/llvmCore~obj/src/lib/Analysis/ScalarEvolution.cpp, line 511.
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmgcc42.roots/llvmgcc42~obj/src/libdecnumber/decUtility.c:360: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter> for instructions.
make[4]: *** [decUtility.o] Error 1
make[4]: *** Waiting for unfinished jobs....
Assertion failed: (0 && "Unknown SCEV kind!"), function operator(), file /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore.roots/llvmCore~obj/src/lib/Analysis/ScalarEvolution.cpp, line 511.
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmgcc42.roots/llvmgcc42~obj/src/libdecnumber/decNumber.c:5591: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter> for instructions.
make[4]: *** [decNumber.o] Error 1
make[3]: *** [all-stage2-libdecnumber] Error 2
make[3]: *** Waiting for unfinished jobs....
llvm-svn: 71165
to sorting SCEVs by their kind, sort SCEVs of the same kind according
to their operands. This helps avoid things like (a+b) being a distinct
expression from (b+a).
llvm-svn: 71160
CallbackVH, with fixes. allUsesReplacedWith need to
walk the def-use chains and invalidate all users of a
value that is replaced. SCEVs of users need to be
recalcualted even if the new value is equivalent. Also,
make forgetLoopPHIs walk def-use chains, since any
SCEV that depends on a PHI should be recalculated when
more information about that PHI becomes available.
llvm-svn: 70927
only capture their arguments by returning them or
throwing an exception or not based on the argument
value. Patch essentially by Frits van Bommel.
llvm-svn: 70876
makes ScalarEvolution::deleteValueFromRecords, and it's code that
subtly needed to be called before ReplaceAllUsesWith, unnecessary.
It also makes ValueDeletionListener unnecessary.
llvm-svn: 70645
artificial "ptrtoint", as it tends to clutter up complicated
expressions. The cast operators now print both source and
destination types, which is usually sufficient.
llvm-svn: 70554
compute an upper-bound value for the trip count, in addition to
the actual trip count. Use this to allow getZeroExtendExpr and
getSignExtendExpr to fold casts in more cases.
This may eventually morph into a more general value-range
analysis capability; there are certainly plenty of places where
more complete value-range information would allow more folding.
llvm-svn: 70509
(sext i8 {-128,+,1} to i64) to i64 {-128,+,1}, where the iteration
crosses from negative to positive, but is still safe if the trip
count is within range.
llvm-svn: 70421
information to simplify [sz]ext({a,+,b}) to {zext(a),+,[zs]ext(b)},
as appropriate.
These functions and the trip count code each call into the other, so
this requires careful handling to avoid infinite recursion. During
the initial trip count computation, conservative SCEVs are used,
which are subsequently discarded once the trip count is actually
known.
Among other benefits, this change lets LSR automatically eliminate
some unnecessary zext-inreg and sext-inreg operation where the
operand is an induction variable.
llvm-svn: 70241
with the persistent insertion point, and change IndVars to make
use of it. This fixes a bug where IndVars was holding on to a
stale insertion point and forcing the SCEVExpander to continue to
use it.
This fixes PR4038.
llvm-svn: 69892
instructions in order to avoid inserting new ones. However, if
the cast instruction is the SCEVExpander's InsertPt, this
causes subsequently emitted instructions to be inserted near
the cast, and not at the location of the original insert point.
Fix this by adjusting the insert point in such cases.
This fixes PR4009.
llvm-svn: 69808
type to truncate to should be the number of bits of the value that are
preserved, not the number that are clobbered with sign-extension.
This fixes regressions in ldecod.
llvm-svn: 69704
size from the integer, requiring zero extension or truncation. Don't
create ZExtInsts with pointer types. This fixes a regression in
consumer-jpeg.
llvm-svn: 69307
have pointer types, though in contrast to C pointer types, SCEV
addition is never implicitly scaled. This not only eliminates the
need for special code like IndVars' EliminatePointerRecurrence
and LSR's own GEP expansion code, it also does a better job because
it lets the normal optimizations handle pointer expressions just
like integer expressions.
Also, since LLVM IR GEPs can't directly index into multi-dimensional
VLAs, moving the GEP analysis out of client code and into the SCEV
framework makes it easier for clients to handle multi-dimensional
VLAs the same way as other arrays.
Some existing regression tests show improved optimization.
test/CodeGen/ARM/2007-03-13-InstrSched.ll in particular improved to
the point where if-conversion started kicking in; I turned it off
for this test to preserve the intent of the test.
llvm-svn: 69258
- Make type declarations match the struct/class keyword of the definition.
- Move AddSignalHandler into the namespace where it belongs.
- Correctly call functions from template base.
- Some other small changes.
With this patch, LLVM and Clang should build properly and with far less noise under VS2008.
llvm-svn: 67347
the inliner; prevents nondeterministic behavior
when the same address is reallocated.
Don't build call graph nodes for debug intrinsic calls;
they're useless, and there were typically a lot of them.
llvm-svn: 67311
the set of blocks in which values are used, the set in which
values are live-through, and the set in which values are
killed. For the live-through and killed sets, conservative
approximations are used.
llvm-svn: 67309
changes.
For InvokeInst now all arguments begin at op_begin().
The Callee, Cont and Fail are now faster to get by
access relative to op_end().
This patch introduces some temporary uglyness in CallSite.
Next I'll bring CallInst up to a similar scheme and then
the uglyness will magically vanish.
This patch also exposes all the reliance of the libraries
on InvokeInst's operand ordering. I am thinking of taking
care of that too.
llvm-svn: 66920
to obtain debug info about them.
Introduce helpers to access debug info for global variables. Also introduce a
helper that works for both local and global variables.
llvm-svn: 66541
and extern_weak_odr. These are the same as the non-odr versions,
except that they indicate that the global will only be overridden
by an *equivalent* global. In C, a function with weak linkage can
be overridden by a function which behaves completely differently.
This means that IP passes have to skip weak functions, since any
deductions made from the function definition might be wrong, since
the definition could be replaced by something completely different
at link time. This is not allowed in C++, thanks to the ODR
(One-Definition-Rule): if a function is replaced by another at
link-time, then the new function must be the same as the original
function. If a language knows that a function or other global can
only be overridden by an equivalent global, it can give it the
weak_odr linkage type, and the optimizers will understand that it
is alright to make deductions based on the function body. The
code generators on the other hand map weak and weak_odr linkage
to the same thing.
llvm-svn: 66339
get nice and happy stack traces when we crash in an optimizer or codegen. For
example, an abort put in UnswitchLoops now looks like this:
Stack dump:
0. Program arguments: clang pr3399.c -S -O3
1. <eof> parser at end of file
2. per-module optimization passes
3. Running pass 'CallGraph Pass Manager' on module 'pr3399.c'.
4. Running pass 'Loop Pass Manager' on function '@foo'
5. Running pass 'Unswitch loops' on basic block '%for.inc'
Abort
llvm-svn: 66260
to more accurately describe what it does. Expand its doxygen comment
to describe what the backedge-taken count is and how it differs
from the actual iteration count of the loop. Adjust names and
comments in associated code accordingly.
llvm-svn: 65382
trip count value when the original loop iteration condition is
signed and the canonical induction variable won't undergo signed
overflow. This isn't required for correctness; it just preserves
more information about original loop iteration values.
Add a getTruncateOrSignExtend method to ScalarEvolution,
following getTruncateOrZeroExtend.
llvm-svn: 64918
modified in a way that may effect the trip count calculation. Change
IndVars to use this method when it rewrites pointer or floating-point
induction variables instead of using a doInitialization method to
sneak these changes in before ScalarEvolution has a chance to see
the loop. This eliminates the need for LoopPass to depend on
ScalarEvolution.
llvm-svn: 64810
Make sure the SCC pass manager initializes any contained
function pass managers. Without this, simplify-libcalls
would add nocapture attributes when run on its own, but
not when run as part of -std-compile-opts or similar.
llvm-svn: 64443
couldn't ever be the return of call instruction. However, it's quite possible
that said local allocation is itself the return of a function call. That's
what malloc and calloc are for, actually.
llvm-svn: 64442
loop induction on LP64 targets. When the induction variable is
used in addressing, IndVars now is usually able to inserst a
64-bit induction variable and eliminates the sign-extending cast.
This is also useful for code using C "short" types for
induction variables on targets with 32-bit addressing.
Inserting a wider induction variable is easy; the tricky part is
determining when trunc(sext(i)) expressions are no-ops. This
requires range analysis of the loop trip count. A common case is
when the original loop iteration starts at 0 and exits when the
induction variable is signed-less-than a fixed value; this case
is now handled.
This replaces IndVarSimplify's OptimizeCanonicalIVType. It was
doing the same optimization, but it was limited to loops with
constant trip counts, because it was running after the loop
rewrite, and the information about the original induction
variable is lost by that point.
Rename ScalarEvolution's executesAtLeastOnce to
isLoopGuardedByCond, generalize it to be able to test for
ICMP_NE conditions, and move it to be a public function so that
IndVars can use it.
llvm-svn: 64407
function pass managers. Without this, simplify-libcalls
would add nocapture attributes when run on its own, but
not when run as part of -std-compile-opts or similar.
llvm-svn: 64300
they are useful to analyses other than BasicAliasAnalysis.cpp. Include
the full comment for isIdentifiedObject in the header file. Thanks to
Chris for suggeseting this.
llvm-svn: 63589
information output. However, many target specific tool chains prefer to encode
only one compile unit in an object file. In this situation, the LLVM code
generator will include debugging information entities in the compile unit
that is marked as main compile unit. The code generator accepts maximum one main
compile unit per module. If a module does not contain any main compile unit
then the code generator will emit multiple compile units in the output object
file.
[Part 1]
Update DebugInfo APIs to accept optional boolean value while creating DICompileUnit to mark the unit as "main" unit. By defaults all units are considered non-main. Update SourceLevelDebugging.html to document "main" compile unit.
Update DebugInfo APIs to not accept and encode separate source file/directory entries while creating various llvm.dbg.* entities. There was a recent, yet to be documented, change to include this additional information so no documentation changes are required here.
Update DwarfDebug to handle "main" compile unit. If "main" compile unit is seen then all DIEs are inserted into "main" compile unit. All other compile units are used to find source location for llvm.dbg.* values. If there is not any "main" compile unit then create unique compile unit DIEs for each llvm.dbg.compile_unit.
[Part 2]
Create separate llvm.dbg.compile_unit for each input file. Mark compile unit create for main_input_filename as "main" compile unit. Use appropriate compile unit, based on source location information collected from the tree node, while creating llvm.dbg.* values using DebugInfo APIs.
---
This is Part 1.
llvm-svn: 63400
If a MachineInstr doesn't have a memoperand but has an opcode that
is known to load or store, assume its memory reference may alias
*anything*, including stack slots which the compiler completely
controls.
To partially compensate for this, teach the ScheduleDAG building
code to do basic getUnderlyingValue analysis. This greatly
reduces the number of instructions that require restrictive
dependencies. This code will need to be revisited when we start
doing real alias analysis, but it should suffice for now.
llvm-svn: 63370
DW_AT_APPLE_optimized flag is set when a compile_unit is optimized. The debugger takes advantage of this information some way.
DW_AT_APPLE_flags encodes command line options when certain env. variable is set. This is used by build engineers to track various gcc command lines used by by a project, irrespective of whether the project used makefile, Xcode or something else.
llvm-gcc patch is next.
llvm-svn: 62888
This avoids using a dangling pointer.
Reset NumSortedEntries after restoring Cache to avoid extraneous sorts.
This fixes the reduced sqlite3 testcase, but apparently not the whole app.
llvm-svn: 62838
analyses could be run without the caches properly sorted. This
can fix all sorts of weirdness. Many thanks to Bill for coming
up with the 'issorted' verification idea.
llvm-svn: 62757
doing very similar pointer capture analysis.
Factor out the common logic. The new version
is from FunctionAttrs since it does a better
job than the version in BasicAliasAnalysis
llvm-svn: 62461