Fixed non working -profile-verifier-noassert option.
Fixed missing newline in debugEntry().
Cleaned up assert messages. (assert(0 && Message) is still shown, but the message is printed before.)
When verifiying loaded profiles the ProfileVerifier got confused when block was a setjmp target, this is checked now.
When verifiying loaded profiles the ProfileVerifier got confused when block eventually reaching an exit(), this is checked now.
llvm-svn: 81338
that get created during loop unswitching, and fix SplitBlockPredecessors'
LCSSA updating code to create new PHIs instead of trying to just move
existing ones.
Also, optimize Loop::verifyLoop, since it gets called a lot. Use
searches on a sorted list of blocks instead of calling the "contains"
function, as is done in other places in the Loop class, since "contains"
does a linear search. Also, don't call verifyLoop from LoopSimplify or
LCSSA, as the PassManager is already calling verifyLoop as part of
LoopInfo's verifyAnalysis.
llvm-svn: 81221
D test/Analysis/Profiling
--- Reverse-merging r80907 into '.':
U lib/Analysis/ProfileInfoLoaderPass.cpp
Attempt to remove failure in the self-hosting build bot.
llvm-svn: 80966
and exact flags. Because ConstantExprs are uniqued, creating an
expression with this flag causes all expressions with the same operands
to have the same flag, which may not be safe. Add, sub, mul, and sdiv
ConstantExprs are usually folded anyway, so the main interesting flag
here is inbounds, and the constant folder already knows how to set the
inbounds flag automatically in most cases, so there isn't an urgent need
for the API support.
This can be reconsidered in the future, but for now just removing these
API bits eliminates a source of potential trouble with little downside.
llvm-svn: 80959
Optimal edge profiling is only possible when blocks with no predecessors get an
virtual edge (BB,0) that counts the execution frequencies of this
function-exiting blocks.
This patch makes the necessary changes before actually enabling optimal edge profiling.
llvm-svn: 80667
This adds a pass to verify the current profile against the flow conditions.
This is very helpful when later on trying to perserve the profiling information
during all passes.
llvm-svn: 80666
for sanity. This didn't turn up any bugs.
Change CallGraphNode to maintain its "callsite" information in the
call edges list as a WeakVH instead of as an instruction*. This fixes
a broad class of dangling pointer bugs, and makes CallGraph have a number
of useful invariants again. This fixes the class of problem indicated
by PR4029 and PR3601.
llvm-svn: 80663
SCEVUnknowns, as the non-SCEVUnknown cases in the getSCEVAtScope code
can also end up repeatedly climing through the same expression trees,
which can be unusably slow when the trees are very tall.
Also, add a quick check for SCEV pointer equality to the main
SCEV comparison routine, as the full comparison code can be expensive
in the case of large expression trees.
These fix compile-time problems in some pathlogical cases.
llvm-svn: 80623
stem from the fact that we have two types of passes that need to update it:
1. callgraphscc and module passes that are explicitly aware of it
2. Functionpasses (and loop passes etc) that are interlaced with CGSCC passes
by the CGSCC Passmgr.
In the case of #1, we can reasonably expect the passes to update the call
graph just like any analysis. However, functionpasses are not and generally
should not be CG aware. This has caused us no end of problems, so this takes
a new approach. Logically, the CGSCC Pass manager can rescan every function
after it runs a function pass over it to see if the functionpass made any
updates to the IR that affect the callgraph. This allows it to catch new calls
introduced by the functionpass.
In practice, doing this would be slow. This implementation keeps track of
whether or not the current scc is dirtied by a function pass, and, if so,
delays updating the callgraph until it is actually needed again. This was
we avoid extraneous rescans, but we still have good invariants when the
callgraph is needed.
Step #2 of the "give Callgraph some sane invariants" is to change CallGraphNode
to use a CallBackVH for the callsite entry of the CallGraphNode. This way
we can immediately remove entries from the callgraph when a FunctionPass is
active instead of having dangling pointers. The current pass tries to tolerate
these dangling pointers, but it is just an evil hack.
This is related to PR3601/4835/4029. This also reverts r80541, a hack working
around the sad lack of invariants.
llvm-svn: 80566
indirect function pointer, inline it, then go to delete the body.
The problem is that the callgraph had other references to the function,
though the inliner had no way to know it, so we got a dangling pointer
and an invalid iterator out of the deal.
The fix to this is pretty simple: stop the inliner from deleting the
function by knowing that there are references to it. Do this by making
CallGraphNodes contain a refcount. This requires moving deletion of
available_externally functions to the module-level cleanup sweep where
it belongs.
llvm-svn: 80533
argpromotion and structretpromote. Basically, when replacing
a function, they used the 'changeFunction' api which changes
the entry in the function map (and steals/reuses the callgraph
node).
This has some interesting effects: first, the problem is that it doesn't
update the "callee" edges in any callees of the function in the call graph.
Second, this covers for a major problem in all the CGSCC pass stuff, which
is that it is completely broken when functions are deleted if they *don't*
reuse a CGN. (there is a cute little fixme about this though :).
This patch changes the protocol that CGSCC passes must obey: now the CGSCC
pass manager copies the SCC and preincrements its iterator to avoid passes
invalidating it. This allows CGSCC passes to mutate the current SCC. However
multiple passes may be run on that SCC, so if passes do this, they are now
required to *update* the SCC to be current when they return.
Other less interesting parts of this patch are that it makes passes update
the CG more directly, eliminates changeFunction, and requires clients of
replaceCallSite to specify the new callee CGN if they are changing it.
llvm-svn: 80527
This is a simple AliasAnalysis implementation which works by making
ScalarEvolution queries. ScalarEvolution has a more complete understanding
of arithmetic than BasicAA's collection of ad-hoc checks, so it handles
some cases that BasicAA misses, for example p[i] and p[i+1] within the
same iteration of a loop.
This is currently experimental. It may be that the main use for this pass
will be to help find cases where BasicAA can be profitably extended, or
to help in the development of the overall AliasAnalysis infrastructure,
however it's also possible that it could grow up to become a directly
useful pass.
llvm-svn: 80098
will always return the same value. This isn't currently necessary,
since this code doesn't currently ever get called under circumstances
where it would matter, but it may some day.
llvm-svn: 80017
This is conventional command-line tool behavior. -f now just means
"enable binary output on terminals".
Add a -f option to llvm-extract and llvm-link, for consistency.
Remove F_Force from raw_fd_ostream and enable overwriting and
truncating by default. Introduce an F_Excl flag to permit users to
enable a failure when the file already exists. This flag is
currently unused.
Update Makefiles and documentation accordingly.
llvm-svn: 79990
*) introducing new data type and export function of edge info for whole function (preparation for next patch).
*) renaming variables to make clear distinction between data and containers that contain this data.
*) updated comments and whitespaces.
*) made ProfileInfo::MissingValue a double (as it should be...).
(Discussed at http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090817/084955.html.)
llvm-svn: 79940
array member of a struct, it's possible to land in an arbitrary position
inside that struct, such that attempting to find further getelementptr
indices will fail. In such cases, folding cannot be done.
llvm-svn: 79485
static extents of the static array type, it causes GlobalOpt and
other passes to be more conservative. This canonicalization also
allows the constant folder to add "inbounds" to GEPs.
llvm-svn: 79440
TargetData is not present. It still uses TargetData when available.
This generalization also fixed some limitations in the TargetData
case; the attached testcase covers this.
llvm-svn: 79344
- Part of optimal static profiling patch sequence by Andreas Neustifter.
- Store edge, block, and function information separately for each functions
(instead of in one giant map).
- Return frequencies as double instead of int, and use a sentinel value for
missing information.
llvm-svn: 78477
LoopDependenceAnalysis::getLoops is currently O(N*M) for a loop-nest of
depth N and a compound SCEV of M atomic SCEVs. As both N and M will
typically be very small, this should not be a problem. If it turns out
to be one, rewriting getLoops as SCEVVisitor will reduce complexity to
O(M).
llvm-svn: 78394
We can not simply apply ZIV testing to the pointer offsets, as this
would incorrectly return independence for e.g. (GEP x,0,i; GEP x,1,-i).
llvm-svn: 78155
affected after a PHI node has been analyzed, just remove affected
SCEVs from the Scalars map, so that they'll be (lazily) recreated as
needed. This avoids creating SCEV objects that aren't actually needed.
Also, rewrite the associated def-use walking code to be non-recursive
and to continue traversing past Instructions that don't have an
entry in the Scalars map.
llvm-svn: 77032
- Some clients which used DOUT have moved to DEBUG. We are deprecating the
"magic" DOUT behavior which avoided calling printing functions when the
statement was disabled. In addition to being unnecessary magic, it had the
downside of leaving code in -Asserts builds, and of hiding potentially
unnecessary computations.
llvm-svn: 77019
- Yay for '-'s and simplifications!
- I kept StringMap::GetOrCreateValue for compatibility purposes, this can
eventually go away. Likewise the StringMapEntry Create functions still follow
the old style.
- NIFC.
llvm-svn: 76888
This introduces an LDA-internal DependencePair class. The intention is,
that this is a place where dependence testers can store various results
such as SCEVs describing conflicting iterations, breaking conditions,
distance/direction vectors, etc.
llvm-svn: 76877
(x pred y) with more thorough code that does more complete canonicalization
before resorting to range checks. This helps it find more cases where
the canonicalized expressions match.
llvm-svn: 76671
Getelementptrs that are defined to wrap are virtually useless to
optimization, and getelementptrs that are undefined on any kind
of overflow are too restrictive -- it's difficult to ensure that
all intermediate addresses are within bounds. I'm going to take
a different approach.
Remove a few optimizations that depended on this flag.
llvm-svn: 76437
all values belonging to the intersection will belong to the resulting range.
The former was inconsistent about that point (either way is fine, just pick
one.) This is part of PR4545.
llvm-svn: 76289
GEPOperator's hasNoPointer0verflow(), and make a few places in instcombine
that create GEPs that may overflow clear the NoOverflow value. Among
other things, this partially addresses PR2831.
llvm-svn: 76252
in a convenient manner, factoring out some common code from
InstructionCombining and ValueTracking. Move the contents of
BinaryOperators.h into Operator.h and use Operator to generalize them
to support ConstantExprs as well as Instructions.
llvm-svn: 76232
isSafeToSpeculativelyExecute. The new method is a bit closer to what
the callers actually care about in that it rejects more things callers
don't want. It also adds more precise handling for integer
division, and unifies code for analyzing the legality of a speculative
load.
llvm-svn: 76150
the operands have pointer type, so that the resulting type matches
the original SCEV type, and so that unnecessary ptrtoints are
avoided in common cases.
llvm-svn: 75680
For now this only computes the allocated size of the memory pointed to by a
pointer, and offset a pointer from allocated pointer.
The actual checkLimits part will come later, after another round of review.
llvm-svn: 75657
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").
llvm-svn: 75640
works similar to isLoopInvariant, except that it will do trivial
hoisting to try to make the value loop invariant if it isn't already.
This makes it easier for transformation passes to clear trivial
instructions out of the way (the regular LICM pass doesn't run
until relatively late). This is code factored out of LoopSimplify
and other places.
llvm-svn: 75578
and related functions out of LoopBase and into Loop, since they
are specific to BasicBlock-based loops. This also allows the code
to be moved out-of-line.
llvm-svn: 75523
using the Curiously Recurring Template Pattern with LoopBase.
This will help further refactoring, and future functionality for
Loop. Also, Headers can now foward-declare Loop, instead of pulling
in LoopInfo.h or doing tricks.
llvm-svn: 75519
SCEVZeroExtendExpr ahead of the most expensive analysis. This
speeds up analysis and helps avoid pathologically bad behavior
on the testcase in PR4534.
llvm-svn: 75496
so that all code paths get it. PR4256 was about a case where the
phi translation loop would find all preds in the Visited cache, so
it could get by without re-sorting the NonLocalPointerDeps cache.
Fix this by resorting it earlier, there is no reason not to do this.
This patch inspired by Jakub Staszak's patch.
llvm-svn: 75476
This involves temporarily hard wiring some parts to use the global context. This isn't ideal, but it's
the only way I could figure out to make this process vaguely incremental.
llvm-svn: 75445
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.
llvm-svn: 75379
to a loop deletion more thorough. Don't prune the def-use tree search at
instructions that don't have SCEVs computed, because an instruction with
a user that has a computed SCEV may itself lack a computed SCEV. Also,
remove loop-related values from the ValuesAtScopes and
ConstantEvolutionLoopExitValues maps as well.
This fixes a regression in 483.xalancbmk.
llvm-svn: 75030
Constant. This lets ConstantInts be handled as SCEVConstant instead
of SCEVUnknown, as getUnknown no longer has special-case code for
ConstantInt and friends. This usually doesn't affect the final
output, since the constants end up getting folded later, but it
does make intermediate expressions more obvious in many cases.
llvm-svn: 74459
an individual exhaustive evaluation reflects only the exit value
implied by an individual exit, which may differ from the actual
exit value of the loop if there are other exits. This fixes PR4477.
llvm-svn: 74447
Also don't call finalizers for LoopPass if initialization was not called.
Add a unittest that tests that these methods are called, in the proper
order, and the correct number of times.
llvm-svn: 74438
This helps it avoid reusing an instruction that doesn't dominate all
of the users, in cases where the original instruction was inserted
before all of the users were known. This may result in redundant
expansions of sub-expressions that depend on loop-unpredictable values
in some cases, however this isn't very common, and it primarily impacts
IndVarSimplify, so GVN can be expected to clean these up.
This eliminates the need for IndVarSimplify's FixUsesBeforeDefs,
which fixes several bugs.
llvm-svn: 74352
nesting order of nested AddRec expressions to skip the transformation
if it would introduce an AddRec with operands not loop-invariant
with respect to its loop.
llvm-svn: 74343
trip counts in more cases.
Generalize ScalarEvolution's isLoopGuardedByCond code to recognize
And and Or conditions, splitting the code out into an
isNecessaryCond helper function so that it can evaluate Ands and Ors
recursively, and make SCEVExpander be much more aggressive about
hoisting instructions out of loops.
test/CodeGen/X86/pr3495.ll has an additional instruction now, but
it appears to be due to an arbitrary register allocation difference.
llvm-svn: 74048
createSCEV. Also, recognize UndefValue in createSCEV.
Change getIntegerSCEV's comment to avoid mentioning FP types,
and re-implement it in terms of getConstant instead of getUnknown.
llvm-svn: 74041
This also throws out the SCEV reference counting scheme, as the the SCEVs now have a lifetime controlled by the
ScalarEvolution pass.
Note that SCEVHandle is now a no-op, and will be remove in a future commit.
llvm-svn: 73892
counts for loops with multiple exits, replacing more conservative code
which only handled constants. This is derived from a patch by
Nick Lewycky.
This also fixes llc aborts in ClamAV and others, as
getUMinFromMismatchedTypes takes care of balancing the types before
working with them.
llvm-svn: 73884
blocks, and also exit blocks with multiple conditions (combined
with (bitwise) ands and ors). It's often infeasible to compute an
exact trip count in such cases, but a useful upper bound can often
be found.
llvm-svn: 73866
SCEVUnknowns with identical Instructions to be equal. This allows
it to analze cases such as the attached testcase, where the front-end
has cloned the loop controlling expression. Along with r73805, this
lets IndVarSimplify eliminate all the sign-extend casts in the
loop in the attached testcase.
llvm-svn: 73807
so that it can access the TargetData member (when available) and
use ValueTracking.h information to compute information for
SCEVUnknown Values.
Also add GetMinLeadingZeros and GetMinSignBits functions,
with minimal implementations.
llvm-svn: 73794
expression in IVUsers, because in the case of a use of a non-linear
addrec outside of a loop, this causes the addrec to be evaluated as
a linear addrec.
llvm-svn: 73774
casted induction variables in cases where the cast
isn't foldable. It ended up being a pessimization in
many cases. This could be fixed, but it would require
a bunch of complicated code in IVUsers' clients. The
advantages of this approach aren't visible enough to
justify it at this time.
llvm-svn: 73706
obscuring what would otherwise be a low-bits mask. Use ComputeMaskedBits
to compute what ShrinkDemandedConstant knew about to reconstruct a
low-bits mask value.
llvm-svn: 73540
failures.
To support this, add some utility functions to Type to help support
vector/scalar-independent code. Change ConstantInt::get and
ConstantFP::get to support vector types, and add an overload to
ConstantInt::get that uses a static IntegerType type, for
convenience.
Introduce a new getConstant method for ScalarEvolution, to simplify
common use cases.
llvm-svn: 73431
they contain multiplications of constants with add operations.
This helps simplify several kinds of things; in particular it
helps simplify expressions like ((-1 * (%a + %b)) + %a) to %b,
as expressions like this often come up in loop trip count
computations.
llvm-svn: 73361
even though the order doesn't matter at the top level of an expression,
it does matter when the constant is a subexpression of an n-ary
expression, because n-ary expressions are sorted lexicographically.
llvm-svn: 73358
induction variable when the addrec to be expanded does not require
a wider type. This eliminates the need for IndVarSimplify to
micro-manage SCEV expansions, because SCEVExpander now
automatically expands them in the form that IndVarSimplify considers
to be canonical. (LSR still micro-manages its SCEV expansions,
because it's optimizing for the target, rather than for
other optimizations.)
Also, this uses the new getAnyExtendExpr, which has more clever
expression simplification logic than the IndVarSimplify code it
replaces, and this cleans up some ugly expansions in code such as
the included masked-iv.ll testcase.
llvm-svn: 73294
immediately casted. At present, this is just a minor code
simplification. In the future, the expansion code may be able
to make better choices if it knows what the desired result
type will be.
llvm-svn: 73137
integer and floating-point opcodes, introducing
FAdd, FSub, and FMul.
For now, the AsmParser, BitcodeReader, and IRBuilder all preserve
backwards compatability, and the Core LLVM APIs preserve backwards
compatibility for IR producers. Most front-ends won't need to change
immediately.
This implements the first step of the plan outlined here:
http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt
llvm-svn: 72897
TargetData pointer. The only thing it's used for are
calls to ConstantFoldCompareInstOperands and
ConstantFoldInstOperands, which both already accept a
null TargetData pointer. This makes
ConstantFoldConstantExpression easier to use in clients
where TargetData is optional.
llvm-svn: 72741
possible. For example, it now emits
%p.2.ip.1 = getelementptr [3 x [3 x double]]* %p, i64 2, i64 %tmp, i64 1
instead of the equivalent but less obvious
%p.2.ip.1 = getelementptr [3 x [3 x double]]* %p, i64 0, i64 %tmp, i64 19
llvm-svn: 72452
beyond their associated static array type.
I believe that this fixes a legitimate bug, because BasicAliasAnalysis
already has code to check for this condition that works for non-constant
indices, however it was missing the case of constant indices. With this
change, it checks for both.
This fixes PR4267, and miscompiles of SPEC 188.ammp and 464.h264.href.
llvm-svn: 72451
that of the LHS. It doesn't matter for correctness, but the LHS
is more likely than the RHS to be a pointer type in exotic cases,
and it's more tidy to have it return the integer type.
llvm-svn: 72424
in the case where a loop exit value cannot be computed, instead of only in
some cases while using SCEVCouldNotCompute in others. This simplifies
getSCEVAtScope's callers.
llvm-svn: 72375
sending SCEVUnknowns to expandAddToGEP. This avoids the need for
expandAddToGEP to bend the rules and peek into SCEVUnknown
expressions.
Factor out the code for testing whether a SCEV can be factored by
a constant for use in a GEP index. This allows it to handle
SCEVAddRecExprs, by recursing.
As a result, SCEVExpander can now put more things in GEP indices,
so it emits fewer explicit mul instructions.
llvm-svn: 72366
Fix by clearing the rewriter cache before deleting the trivially dead
instructions.
Also make InsertedExpressions use an AssertingVH to catch these
bugs easier.
llvm-svn: 72364
Instcombine to be more aggressive about using SimplifyDemandedBits
on shift nodes. This allows a shift to be simplified to zero in the
included test case.
llvm-svn: 72204
instructions. It attempts to create high-level multi-operand GEPs,
though in cases where this isn't possible it falls back to casting
the pointer to i8* and emitting a GEP with that. Using GEP instructions
instead of ptrtoint+arithmetic+inttoptr helps pointer analyses that
don't use ScalarEvolution, such as BasicAliasAnalysis.
Also, make the AddrModeMatcher more aggressive in handling GEPs.
Previously it assumed that operand 0 of a GEP would require a register
in almost all cases. It now does extra checking and can do more
matching if operand 0 of the GEP is foldable. This fixes a problem
that was exposed by SCEVExpander using GEPs.
llvm-svn: 72093
IVUsers.cpp: In member function ‘bool llvm::IVUsers::AddUsersIfInteresting(llvm::Instruction*)’:
IVUsers.cpp:221: warning: ‘isSigned’ may be used uninitialized in this function
with gcc-4.3.
llvm-svn: 71654
getNoopOrSignExtend, and getTruncateOrNoop. These are similar
to getTruncateOrZeroExtend etc., except that they assert that
the conversion is either not widening or narrowing, as
appropriate. These will be used in some upcoming fixes.
llvm-svn: 71632
and generalize it so that it can be used by IndVarSimplify. Implement the
base IndVarSimplify transformation code using IVUsers. This removes
TestOrigIVForWrap and associated code, as ScalarEvolution now has enough
builtin overflow detection and folding logic to handle all the same cases,
and more. Run "opt -iv-users -analyze -disable-output" on your favorite
loop for an example of what IVUsers does.
This lets IndVarSimplify eliminate IV casts and compute trip counts in
more cases. Also, this happens to finally fix the remaining testcases
in PR1301.
Now that IndVarSimplify is being more aggressive, it occasionally runs
into the problem where ScalarEvolutionExpander's code for avoiding
duplicate expansions makes it difficult to ensure that all expanded
instructions dominate all the instructions that will use them. As a
temporary measure, IndVarSimplify now uses a FixUsesBeforeDefs function
to fix up instructions inserted by SCEVExpander. Fortunately, this code
is contained, and can be easily removed once a more comprehensive
solution is available.
llvm-svn: 71535
These values aren't analyzable, so they don't care if more information
about the loop trip count can be had. Also, SCEVUnknown is used for
a PHI while the PHI itself is being analyzed, so it needs to be left
in the Scalars map. This fixes a variety of subtle issues.
llvm-svn: 71533
return the correct value when the cast operand is all zeros. This ought
to be pretty rare, because it would mean that the regular SCEV folding
routines missed a case, though there are cases they might legitimately
miss. Also, it's unlikely anything currently using GetMinTrailingZeros
cares about this case.
llvm-svn: 71532
which are not analyzed with SCEV techniques, which can require
brute-forcing through a large number of instructions. This
fixes a massive compile-time issue on 400.perlbench (in
particular, the loop in MD5Transform).
llvm-svn: 71259
bits captured, but the pointer marked nocapture. In fact
I now recall that this problem is why only readnone functions
returning void were considered before! However keep a small
fix that was also in r70876: a readnone function returning
void can result in bits being captured if it unwinds, so
test for this.
llvm-svn: 71168
checking for bcopy... no
checking for getc_unlocked... Assertion failed: (0 && "Unknown SCEV kind!"), function operator(), file /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore.roots/llvmCore~obj/src/lib/Analysis/ScalarEvolution.cpp, line 511.
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmgcc42.roots/llvmgcc42~obj/src/libdecnumber/decUtility.c:360: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter> for instructions.
make[4]: *** [decUtility.o] Error 1
make[4]: *** Waiting for unfinished jobs....
Assertion failed: (0 && "Unknown SCEV kind!"), function operator(), file /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore.roots/llvmCore~obj/src/lib/Analysis/ScalarEvolution.cpp, line 511.
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmgcc42.roots/llvmgcc42~obj/src/libdecnumber/decNumber.c:5591: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter> for instructions.
make[4]: *** [decNumber.o] Error 1
make[3]: *** [all-stage2-libdecnumber] Error 2
make[3]: *** Waiting for unfinished jobs....
llvm-svn: 71165
to sorting SCEVs by their kind, sort SCEVs of the same kind according
to their operands. This helps avoid things like (a+b) being a distinct
expression from (b+a).
llvm-svn: 71160
CallbackVH, with fixes. allUsesReplacedWith need to
walk the def-use chains and invalidate all users of a
value that is replaced. SCEVs of users need to be
recalcualted even if the new value is equivalent. Also,
make forgetLoopPHIs walk def-use chains, since any
SCEV that depends on a PHI should be recalculated when
more information about that PHI becomes available.
llvm-svn: 70927
only capture their arguments by returning them or
throwing an exception or not based on the argument
value. Patch essentially by Frits van Bommel.
llvm-svn: 70876
makes ScalarEvolution::deleteValueFromRecords, and it's code that
subtly needed to be called before ReplaceAllUsesWith, unnecessary.
It also makes ValueDeletionListener unnecessary.
llvm-svn: 70645
artificial "ptrtoint", as it tends to clutter up complicated
expressions. The cast operators now print both source and
destination types, which is usually sufficient.
llvm-svn: 70554
compute an upper-bound value for the trip count, in addition to
the actual trip count. Use this to allow getZeroExtendExpr and
getSignExtendExpr to fold casts in more cases.
This may eventually morph into a more general value-range
analysis capability; there are certainly plenty of places where
more complete value-range information would allow more folding.
llvm-svn: 70509
(sext i8 {-128,+,1} to i64) to i64 {-128,+,1}, where the iteration
crosses from negative to positive, but is still safe if the trip
count is within range.
llvm-svn: 70421
information to simplify [sz]ext({a,+,b}) to {zext(a),+,[zs]ext(b)},
as appropriate.
These functions and the trip count code each call into the other, so
this requires careful handling to avoid infinite recursion. During
the initial trip count computation, conservative SCEVs are used,
which are subsequently discarded once the trip count is actually
known.
Among other benefits, this change lets LSR automatically eliminate
some unnecessary zext-inreg and sext-inreg operation where the
operand is an induction variable.
llvm-svn: 70241
with the persistent insertion point, and change IndVars to make
use of it. This fixes a bug where IndVars was holding on to a
stale insertion point and forcing the SCEVExpander to continue to
use it.
This fixes PR4038.
llvm-svn: 69892
instructions in order to avoid inserting new ones. However, if
the cast instruction is the SCEVExpander's InsertPt, this
causes subsequently emitted instructions to be inserted near
the cast, and not at the location of the original insert point.
Fix this by adjusting the insert point in such cases.
This fixes PR4009.
llvm-svn: 69808
type to truncate to should be the number of bits of the value that are
preserved, not the number that are clobbered with sign-extension.
This fixes regressions in ldecod.
llvm-svn: 69704
size from the integer, requiring zero extension or truncation. Don't
create ZExtInsts with pointer types. This fixes a regression in
consumer-jpeg.
llvm-svn: 69307
have pointer types, though in contrast to C pointer types, SCEV
addition is never implicitly scaled. This not only eliminates the
need for special code like IndVars' EliminatePointerRecurrence
and LSR's own GEP expansion code, it also does a better job because
it lets the normal optimizations handle pointer expressions just
like integer expressions.
Also, since LLVM IR GEPs can't directly index into multi-dimensional
VLAs, moving the GEP analysis out of client code and into the SCEV
framework makes it easier for clients to handle multi-dimensional
VLAs the same way as other arrays.
Some existing regression tests show improved optimization.
test/CodeGen/ARM/2007-03-13-InstrSched.ll in particular improved to
the point where if-conversion started kicking in; I turned it off
for this test to preserve the intent of the test.
llvm-svn: 69258
- Make type declarations match the struct/class keyword of the definition.
- Move AddSignalHandler into the namespace where it belongs.
- Correctly call functions from template base.
- Some other small changes.
With this patch, LLVM and Clang should build properly and with far less noise under VS2008.
llvm-svn: 67347
the inliner; prevents nondeterministic behavior
when the same address is reallocated.
Don't build call graph nodes for debug intrinsic calls;
they're useless, and there were typically a lot of them.
llvm-svn: 67311
the set of blocks in which values are used, the set in which
values are live-through, and the set in which values are
killed. For the live-through and killed sets, conservative
approximations are used.
llvm-svn: 67309
changes.
For InvokeInst now all arguments begin at op_begin().
The Callee, Cont and Fail are now faster to get by
access relative to op_end().
This patch introduces some temporary uglyness in CallSite.
Next I'll bring CallInst up to a similar scheme and then
the uglyness will magically vanish.
This patch also exposes all the reliance of the libraries
on InvokeInst's operand ordering. I am thinking of taking
care of that too.
llvm-svn: 66920
to obtain debug info about them.
Introduce helpers to access debug info for global variables. Also introduce a
helper that works for both local and global variables.
llvm-svn: 66541
and extern_weak_odr. These are the same as the non-odr versions,
except that they indicate that the global will only be overridden
by an *equivalent* global. In C, a function with weak linkage can
be overridden by a function which behaves completely differently.
This means that IP passes have to skip weak functions, since any
deductions made from the function definition might be wrong, since
the definition could be replaced by something completely different
at link time. This is not allowed in C++, thanks to the ODR
(One-Definition-Rule): if a function is replaced by another at
link-time, then the new function must be the same as the original
function. If a language knows that a function or other global can
only be overridden by an equivalent global, it can give it the
weak_odr linkage type, and the optimizers will understand that it
is alright to make deductions based on the function body. The
code generators on the other hand map weak and weak_odr linkage
to the same thing.
llvm-svn: 66339
get nice and happy stack traces when we crash in an optimizer or codegen. For
example, an abort put in UnswitchLoops now looks like this:
Stack dump:
0. Program arguments: clang pr3399.c -S -O3
1. <eof> parser at end of file
2. per-module optimization passes
3. Running pass 'CallGraph Pass Manager' on module 'pr3399.c'.
4. Running pass 'Loop Pass Manager' on function '@foo'
5. Running pass 'Unswitch loops' on basic block '%for.inc'
Abort
llvm-svn: 66260
to more accurately describe what it does. Expand its doxygen comment
to describe what the backedge-taken count is and how it differs
from the actual iteration count of the loop. Adjust names and
comments in associated code accordingly.
llvm-svn: 65382
trip count value when the original loop iteration condition is
signed and the canonical induction variable won't undergo signed
overflow. This isn't required for correctness; it just preserves
more information about original loop iteration values.
Add a getTruncateOrSignExtend method to ScalarEvolution,
following getTruncateOrZeroExtend.
llvm-svn: 64918
modified in a way that may effect the trip count calculation. Change
IndVars to use this method when it rewrites pointer or floating-point
induction variables instead of using a doInitialization method to
sneak these changes in before ScalarEvolution has a chance to see
the loop. This eliminates the need for LoopPass to depend on
ScalarEvolution.
llvm-svn: 64810
Make sure the SCC pass manager initializes any contained
function pass managers. Without this, simplify-libcalls
would add nocapture attributes when run on its own, but
not when run as part of -std-compile-opts or similar.
llvm-svn: 64443
couldn't ever be the return of call instruction. However, it's quite possible
that said local allocation is itself the return of a function call. That's
what malloc and calloc are for, actually.
llvm-svn: 64442
loop induction on LP64 targets. When the induction variable is
used in addressing, IndVars now is usually able to inserst a
64-bit induction variable and eliminates the sign-extending cast.
This is also useful for code using C "short" types for
induction variables on targets with 32-bit addressing.
Inserting a wider induction variable is easy; the tricky part is
determining when trunc(sext(i)) expressions are no-ops. This
requires range analysis of the loop trip count. A common case is
when the original loop iteration starts at 0 and exits when the
induction variable is signed-less-than a fixed value; this case
is now handled.
This replaces IndVarSimplify's OptimizeCanonicalIVType. It was
doing the same optimization, but it was limited to loops with
constant trip counts, because it was running after the loop
rewrite, and the information about the original induction
variable is lost by that point.
Rename ScalarEvolution's executesAtLeastOnce to
isLoopGuardedByCond, generalize it to be able to test for
ICMP_NE conditions, and move it to be a public function so that
IndVars can use it.
llvm-svn: 64407
function pass managers. Without this, simplify-libcalls
would add nocapture attributes when run on its own, but
not when run as part of -std-compile-opts or similar.
llvm-svn: 64300
they are useful to analyses other than BasicAliasAnalysis.cpp. Include
the full comment for isIdentifiedObject in the header file. Thanks to
Chris for suggeseting this.
llvm-svn: 63589
information output. However, many target specific tool chains prefer to encode
only one compile unit in an object file. In this situation, the LLVM code
generator will include debugging information entities in the compile unit
that is marked as main compile unit. The code generator accepts maximum one main
compile unit per module. If a module does not contain any main compile unit
then the code generator will emit multiple compile units in the output object
file.
[Part 1]
Update DebugInfo APIs to accept optional boolean value while creating DICompileUnit to mark the unit as "main" unit. By defaults all units are considered non-main. Update SourceLevelDebugging.html to document "main" compile unit.
Update DebugInfo APIs to not accept and encode separate source file/directory entries while creating various llvm.dbg.* entities. There was a recent, yet to be documented, change to include this additional information so no documentation changes are required here.
Update DwarfDebug to handle "main" compile unit. If "main" compile unit is seen then all DIEs are inserted into "main" compile unit. All other compile units are used to find source location for llvm.dbg.* values. If there is not any "main" compile unit then create unique compile unit DIEs for each llvm.dbg.compile_unit.
[Part 2]
Create separate llvm.dbg.compile_unit for each input file. Mark compile unit create for main_input_filename as "main" compile unit. Use appropriate compile unit, based on source location information collected from the tree node, while creating llvm.dbg.* values using DebugInfo APIs.
---
This is Part 1.
llvm-svn: 63400
If a MachineInstr doesn't have a memoperand but has an opcode that
is known to load or store, assume its memory reference may alias
*anything*, including stack slots which the compiler completely
controls.
To partially compensate for this, teach the ScheduleDAG building
code to do basic getUnderlyingValue analysis. This greatly
reduces the number of instructions that require restrictive
dependencies. This code will need to be revisited when we start
doing real alias analysis, but it should suffice for now.
llvm-svn: 63370
DW_AT_APPLE_optimized flag is set when a compile_unit is optimized. The debugger takes advantage of this information some way.
DW_AT_APPLE_flags encodes command line options when certain env. variable is set. This is used by build engineers to track various gcc command lines used by by a project, irrespective of whether the project used makefile, Xcode or something else.
llvm-gcc patch is next.
llvm-svn: 62888
This avoids using a dangling pointer.
Reset NumSortedEntries after restoring Cache to avoid extraneous sorts.
This fixes the reduced sqlite3 testcase, but apparently not the whole app.
llvm-svn: 62838
analyses could be run without the caches properly sorted. This
can fix all sorts of weirdness. Many thanks to Bill for coming
up with the 'issorted' verification idea.
llvm-svn: 62757
doing very similar pointer capture analysis.
Factor out the common logic. The new version
is from FunctionAttrs since it does a better
job than the version in BasicAliasAnalysis
llvm-svn: 62461
will get its preferred alignment. It has to be careful and cautiously assume
it will just get the ABI alignment. This prevents instcombine from rounding
up the alignment of a load/store without adjusting the alignment of the alloca.
llvm-svn: 61934
The problematic part of this patch is that we were out of attribute bits,
requiring some fancy bit hacking to make it fit (by shrinking alignment)
without breaking existing users or the file format.
This change will require users to rebuild llvm-gcc to match llvm.
llvm-svn: 61239
visited set before they are used. If used, their blocks need to be
added to the visited set so that subsequent queries don't use conflicting
pointer values in the cache result blocks.
llvm-svn: 61080
memdep keeps track of how PHIs affect the pointer in dep queries, which
allows it to eliminate the load in cases like rle-phi-translate.ll, which
basically end up being:
BB1:
X = load P
br BB3
BB2:
Y = load Q
br BB3
BB3:
R = phi [P] [Q]
load R
turning "load R" into a phi of X/Y. In addition to additional exposed
opportunities, this makes memdep safe in many cases that it wasn't before
(which is required for load PRE) and also makes it substantially more
efficient. For example, consider:
bb1: // has many predecessors.
P = some_operator()
load P
In this example, previously memdep would scan all the predecessors of BB1
to see if they had something that would mustalias P. In some cases (e.g.
test/Transforms/GVN/rle-must-alias.ll) it would actually find them and end
up eliminating something. In many other cases though, it would scan and not
find anything useful. MemDep now stops at a block if the pointer is defined
in that block and cannot be phi translated to predecessors. This causes it
to miss the (rare) cases like rle-must-alias.ll, but makes it faster by not
scanning tons of stuff that is unlikely to be useful. For example, this
speeds up GVN as a whole from 3.928s to 2.448s (60%)!. IMO, scalar GVN
should be enhanced to simplify the rle-must-alias pointer base anyway, which
would allow the loads to be eliminated.
In the future, this should be enhanced to phi translate through geps and
bitcasts as well (as indicated by FIXMEs) making memdep even more powerful.
llvm-svn: 61022
parallel, allowing it to decide that P/Q must alias if A/B
must alias in things like:
P = gep A, 0, i, 1
Q = gep B, 0, i, 1
This allows GVN to delete 62 more instructions out of 403.gcc.
llvm-svn: 60820
of a pointer. This allows is to catch more equivalencies. For example,
the type_lists_compatible_p function used to require two iterations of
the gvn pass (!) to delete its 18 redundant loads because the first pass
would CSE all the addressing computation cruft, which would unblock the
second memdep/gvn passes from recognizing them. This change allows
memdep/gvn to catch all 18 when run just once on the function (as is
typical :) instead of just 3.
On all of 403.gcc, this bumps up the # reundandancies found from:
63 gvn - Number of instructions PRE'd
153991 gvn - Number of instructions deleted
50069 gvn - Number of loads deleted
to:
63 gvn - Number of instructions PRE'd
154137 gvn - Number of instructions deleted
50185 gvn - Number of loads deleted
+120 loads deleted isn't bad.
llvm-svn: 60799
tricks based on readnone/readonly functions.
Teach memdep to look past readonly calls when analyzing
deps for a readonly call. This allows elimination of a
few more calls from 403.gcc:
before:
63 gvn - Number of instructions PRE'd
153986 gvn - Number of instructions deleted
50069 gvn - Number of loads deleted
after:
63 gvn - Number of instructions PRE'd
153991 gvn - Number of instructions deleted
50069 gvn - Number of loads deleted
5 calls isn't much, but this adds plumbing for the next change.
llvm-svn: 60794
load dependence queries. This allows GVN to eliminate a few more
instructions on 403.gcc:
152598 gvn - Number of instructions deleted
49240 gvn - Number of loads deleted
after:
153986 gvn - Number of instructions deleted
50069 gvn - Number of loads deleted
llvm-svn: 60786
the first block of a query specially. This makes the "complete query
caching" subsystem more effective, avoiding predecessor queries. This
speeds up GVN another 4%.
llvm-svn: 60752
track of whether the CachedNonLocalPointerInfo for a block is specific
to a block. If so, just return it without any pred scanning. This is
good for a 6% speedup on GVN (when it uses this lookup method, which
it doesn't right now).
llvm-svn: 60695
method. This will eventually take over load/store dep
queries from getNonLocalDependency. For now it works
fine, but is incredibly slow because it does no caching.
Lets not switch GVN to use it until that is fixed :)
llvm-svn: 60649
clobber with the current implementation. Instead of returning
a "precise clobber" just return a fuzzy one. This doesn't
matter to any clients anyway and should speed up analysis time
very very slightly.
llvm-svn: 60641
1. Merge the 'None' result into 'Normal', making loads
and stores return their dependencies on allocations as Normal.
2. Split the 'Normal' result into 'Clobber' and 'Def' to
distinguish between the cases when memdep knows the value is
produced from when we just know if may be changed.
3. Move some of the logic for determining whether readonly calls
are CSEs into memdep instead of it being in GVN. This still
leaves verification that the arguments are hte same to GVN to
let it know about value equivalences in different contexts.
4. Change memdep's call/call dependency analysis to use
getModRefInfo(CallSite,CallSite) instead of doing something
very weak. This only really matters for things like DSA, but
someday maybe we'll have some other decent context sensitive
analyses :)
5. This reimplements the guts of memdep to handle the new results.
6. This simplifies GVN significantly:
a) readonly call CSE is slightly simpler
b) I eliminated the "getDependencyFrom" chaining for load
elimination and load CSE doesn't have to worry about
volatile (they are always clobbers) anymore.
c) GVN no longer does any 'lastLoad' caching, leaving it to
memdep.
7. The logic in DSE is simplified a bit and sped up. A potentially
unsafe case was eliminated.
llvm-svn: 60607
vector instead of a densemap. This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster. This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc
This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case. Here's the stats
for 403.gcc:
6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses
yay for caching :)
llvm-svn: 60313
ReverseLocalDeps when we update it. This fixes a regression test
failure from my last commit.
Second, for each non-local cached information structure, keep a bit that
indicates whether it is dirty or not. This saves us a scan over the whole
thing in the common case when it isn't dirty.
llvm-svn: 60274
instead of containing them by value. This increases the density
(!) of NonLocalDeps as well as making the reallocation case
faster. This speeds up gvn on 403.gcc by 2% and makes room for
future improvements.
I'm not super thrilled with having to explicitly manage the new/delete
of the map, but it is necesary for the next change.
llvm-svn: 60271
If we see that a load depends on the allocation of its memory with no
intervening stores, we now return a 'None' depedency instead of "Normal".
This tweaks GVN to do its optimization with the new result.
llvm-svn: 60267
dependencies. The basic situation was this: consider if we had:
store1
...
store2
...
store3
Where memdep thinks that store3 depends on store2 and store2 depends
on store1. The problem happens when we delete store2: The code in
question was updating dep info for store3 to be store1. This is a
spiffy optimization, but is not safe at all, because aliasing isn't
transitive. This bug isn't exposed today with DSE because DSE will only
zap store2 if it is identifical to store 3, and in this case, it is
safe to update it to depend on store1. However, memcpyopt is not so
fortunate, which is presumably why the "dropInstruction" code used to
exist.
Since this doesn't actually provide a speedup in practice, just rip the
code out.
llvm-svn: 60263
an entry in the nonlocal deps map, don't reset entries
referencing that instruction to [dirty, null], instead, set
them to [dirty,next] where next is the instruction after the
deleted one. Use this information in the non-local deps
code to avoid rescanning entire blocks.
This speeds up GVN slightly by avoiding pointless work. On
403.gcc this makes GVN 1.5% faster.
llvm-svn: 60256
Put a some code back to handle buggy behavior that GVN expects: it wants
loads to depend on each other, and accesses to depend on their allocations.
llvm-svn: 60240
Document the Dirty value more precisely, use it for the uninitialized
DepResultTy value. Change reverse mappings to be from an instruction*
instead of DepResultTy, and stop tracking other forms. This makes it more
clear that we only care about the instruction cases.
Eliminate a DepResultTy,bool pair by using Dirty in the local case as well,
shrinking the map and simplifying the code.
This speeds up GVN by ~3% on 403.gcc.
llvm-svn: 60232
query. This makes it crystal clear what cases can escape from MemDep that
the clients have to handle. This also gives the clients a nice simplified
interface to it that is easy to poke at.
This patch also makes DepResultTy and MemoryDependenceAnalysis::DepType
private, yay.
llvm-svn: 60231
of a pointer/int pair instead of a manually bitmangled pointer.
This forces clients to think a little more about checking the
appropriate pieces and will be useful for internal
implementation improvements later.
I'm not particularly happy with this. After going through this
I don't think that the clients of memdep should be exposed to
the internal type at all. I'll fix this in a subsequent commit.
This has no functionality change.
llvm-svn: 60230
properly updates the reverse dependency map when it installs updated
dependencies for instructions that depend on the removed instruction.
llvm-svn: 60222
value. It must now be as if the pointer were allocated and has not escaped to
the caller. Thanks to Dan Gohman for pointing out the error in the original
and helping devise this definition.
llvm-svn: 59940
indicate functions that allocate, such as operator new, or list::insert. The
actual definition is slightly less strict (for now).
No changes to the bitcode reader/writer, asm printer or verifier were needed.
llvm-svn: 59934
g++ -m32 -c -g -DIN_GCC -W -Wall -Wwrite-strings -Wmissing-format-attribute -fno-common -mdynamic-no-pic -DHAVE_CONFIG_H -Wno-unused -DTARGET_NAME=\"i386-apple-darwin9.5.0\" -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include ../../llvm-gcc.src/gcc/llvm-types.cpp -o llvm-types.o
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemCpy(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemMove(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemSet(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i64' is not a member of 'llvm::Intrinsic'
make[3]: *** [llvm-convert.o] Error 1
make[3]: *** Waiting for unfinished jobs....
rm fsf-funding.pod gcov.pod gfdl.pod cpp.pod gpl.pod gcc.pod
make[2]: *** [all-stage1-gcc] Error 2
make[1]: *** [stage1-bubble] Error 2
make: *** [all] Error 2
llvm-svn: 59809
Use it to safely handle less-than-or-equals-to exit conditions in loops. These
also occur when the loop exit branch is exit on true because SCEV inverses the
icmp predicate.
Use it again to handle non-zero strides, but only with an unsigned comparison
in the exit condition.
llvm-svn: 59528
If this patch causes a performance regression for anyone, please let me know,
and it can be fixed in a different way with much more effort.
llvm-svn: 59384
information. This logically replaces the "Desc" classes in
MachineModuleInfo. Nice features of these classes are that they:
1. Are much more efficient than MMI because they don't create a
temporary parallel data structure for debug info that has to be
'serialized' and 'deserialized' into/out of the module.
2. These provide a much cleaner abstraction for debug info than
MMI, which will make it easier to change the implementation in
the future (to be MDNode-based).
3. These are much easier to use than the MMI interfaces, requiring
a lot less code in the front-ends.
4. These can be used to both create (for frontends) and read (for
codegen) debug information. DebugInfoBuilder can only be used
to create the nodes.
So far, this is implemented just enough to support the debug info
generation needs of clang. This can and should be extended to
support the full set of debug info constructs, and we should switch
llvm-gcc and llc over to using this in the near future.
This code also has a ton of FIXMEs in it, because the way we
currently represent debug info in LLVM IR is basically insane in a
variety of details. This sort of issue should be fixed when we
eventually reimplement debug info on top of MDNodes.
llvm-svn: 58954
pointer bitcasts and GEP's", and centralize the
logic in Value::getUnderlyingObject. The
difference with stripPointerCasts is that
stripPointerCasts only strips GEPs if all
indices are zero, while getUnderlyingObject
strips GEPs no matter what the indices are.
llvm-svn: 56922
- Recognize expressions like "x > -1 ? x : 0" as min/max and turn them
into expressions like "x < 0 ? 0 : x", which is easily recognizable
as a min/max operation.
- Refrain from folding expression like "y/2 < 1" to "y < 2" when the
comparison is being used as part of a min or max idiom, like
"y/2 < 1 ? 1 : y/2". In that case, the division has another use, so
folding doesn't eliminate it, and obfuscates the min/max, making it
harder to recognize as a min/max operation.
These benefit ScalarEvolution, CodeGen, and anything else that wants to
recognize integer min and max.
llvm-svn: 56246
its callers to emit a space character before calling it when a
space is needed.
This fixes several spurious whitespace issues in
ScalarEvolution's debug dumps. See the test changes for
examples.
This also fixes odd space-after-tab indentation in the output
for switch statements, and changes calls from being printed like
this:
call void @foo( i32 %x )
to this:
call void @foo(i32 %x)
llvm-svn: 56196
when a readonly declaration is called, set a
flag. This is faster and uses less memory.
In theory it is less accurate, because before
only those internal globals that were read
by someone were being marked "Ref", but now
all are. But in practice, thanks to other
passes, all internal globals of the kind
considered here will be both read and stored
to: those only read will have been turned
into constants, and those only stored to will
have been deleted.
llvm-svn: 56143
(1) code left over from the days of ConstantPointerRef:
if a use of a function is a GlobalValue then that is
not considered a reason to add an edge from the external
node, even though the use may be as an initializer for
an externally visible global! There might be some point
to this behaviour when the use is by an alias (though the
code predated aliases by some centuries), but I think
PR2782 is a better way of handling that. (2) If function
F calls function G, and also G is a parameter to the
call, then an F->G edge is not added to the callgraph.
While this doesn't seem to matter much, adding such an
edge makes the callgraph more regular.
In addition, the new code should be faster as well as
simpler.
llvm-svn: 55987
call (thus changing the call site) it didn't
inform the callgraph about this. But the
call site does matter - as shown by the testcase,
the callgraph become invalid after the inliner
ran (with an edge between two functions simply
missing), resulting in wrong deductions by
GlobalsModRef.
llvm-svn: 55872
because it does not maintain a correct list
of callsites. I discovered (see following
commit) that the inliner will create a wrong
callgraph if it is fed a callgraph with
correct edges but incorrect callsites. These
were created by Prune-EH, and while it wasn't
done via removeCallEdgeTo, it could have been
done via removeCallEdgeTo, which is an accident
waiting to happen. Use removeCallEdgeFor
instead.
llvm-svn: 55859
analysis would bail out without removing function
records for other members of the SCC (which may exist
if those functions read or wrote global variables).
Since these are initialized to "readnone", this
resulted in incorrect alias analysis results.
llvm-svn: 55714
callgraph, when one member of a SCC calls another
then the analysis would drop to mod-ref because
there is (usually) no function info for the callee
yet; fix this. Teach the analysis about function
attributes, in particular the readonly attribute
(which requires being careful about globals).
llvm-svn: 55696
use raw_ostream instead of std::ostream. Among other goodness,
this speeds up llvm-dis of kc++ with a release build from 0.85s
to 0.49s (88% faster).
Other interesting changes:
1) This makes Value::print be non-virtual.
2) AP[S]Int and ConstantRange can no longer print to ostream directly,
use raw_ostream instead.
3) This fixes a bug in raw_os_ostream where it didn't flush itself
when destroyed.
4) This adds a new SDNode::print method, instead of only allowing "dump".
A lot of APIs have both std::ostream and raw_ostream versions, it would
be useful to go through and systematically anihilate the std::ostream
versions.
This passes dejagnu, but there may be minor fallout, plz let me know if
so and I'll fix it.
llvm-svn: 55263
continue past the first conditional branch when looking for a
relevant test. This helps it avoid using MAX expressions in
loop trip counts in more cases.
llvm-svn: 54697
type lattice value for an Argument*, giving clients the opportunity to
use something other than Top for it if they choose to."
Patch by John McCall!
llvm-svn: 54589
version uses a new algorithm for evaluating the binomial coefficients
which is significantly more efficient for AddRecs of more than 2 terms
(see the comments in the code for details on how the algorithm works).
It also fixes some bugs: it removes the arbitrary length restriction for
AddRecs, it fixes the silent generation of incorrect code for AddRecs
which require a wide calculation width, and it fixes an issue where we
were incorrectly truncating the iteration count too far when evaluating
an AddRec expression narrower than the induction variable.
There are still a few related issues I know of: I think there's
still an issue with the SCEVExpander expansion of AddRec in terms of
the width of the induction variable used. The hack to avoid generating
too-wide integers shouldn't be necessary; instead, the callers should be
considering the cost of the expansion before expanding it (in addition
to not expanding too-wide integers, we might not want to expand
expressions that are really expensive, especially when optimizing for
size; calculating an length-17 32-bit AddRec currently generates about 250
instructions of straight-line code on X86). Also, for long 32-bit
AddRecs on X86, CodeGen really sucks at scheduling the code. I'm planning on
filing follow-up PRs for these issues.
llvm-svn: 54332
time applying to the implicit comparison in smin expressions. The
correct way to transform an inequality into the opposite
inequality, either signed or unsigned, is with a not expression.
I looked through the SCEV code, and I don't think there are any more
occurrences of this issue.
llvm-svn: 54194
SGT exit condition. Essentially, the correct way to flip an inequality
in 2's complement is the not operator, not the negation operator.
That said, the difference only affects cases involving INT_MIN.
Also, enhance the pre-test search logic to be a bit smarter about
inequalities flipped with a not operator, so it can eliminate the smax
from the iteration count for simple loops.
llvm-svn: 54184
circumstances we could end up remapping a dependee to the same instruction
that we're trying to remove. Handle this properly by just falling back to
a conservative solution.
llvm-svn: 54132
bail after 256-bits to avoid producing code that the backends can't handle.
Previously, we capped it at 64-bits, preferring to miscompile in those cases.
This change also reverts much of r52248 because the invariants the code was
expecting are now being met.
llvm-svn: 53812
Move GetConstantStringInfo to lib/Analysis. Remove
string output routine from Constant. Update all
callers. Change debug intrinsic api slightly to
accomodate move of routine, these now return values
instead of strings.
This unbreaks llvm-gcc bootstrap.
llvm-svn: 52884
string output routine from Constant. Update all
callers. Change debug intrinsic api slightly to
accomodate move of routine, these now return values
instead of strings.
llvm-svn: 52748
inserting extractvalues. In particular, this prevents the insertion of
extractvalues that can't be folded away later. Also add an example of when this
stuff is needed.
llvm-svn: 52328
I'm at it, rename it to FindInsertedValue.
The only functional change is that newly created instructions are no longer
added to instcombine's worklist, but that is not really necessary anyway (and
I'll commit some improvements next that will completely remove the need).
llvm-svn: 52315
This fixes several minor bugs (such as returning noalias
for comparisons between external weak functions an null) but
is mostly a cleanup.
llvm-svn: 52299
take into account the instrucion pointed by InsertPt. Thanks to it,
returning the new value of InsertPt to the InsertBinop() caller can be
avoided. The bug was, actually, in visitAddRecExpr() method which wasn't
correctly handling changes of InsertPt. There shouldn't be any
performance regression, as -gvn pass (run after -indvars) removes any
redundant binops.
llvm-svn: 52291
Add a safety measure. It isn't safe to assume in ScalarEvolutionExpander that
all loops are in canonical form (but it should be safe for loops that have
AddRecs).
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
llvm-svn: 52275
with code that was expecting different bit widths for different values.
Make getTruncateOrZeroExtend a method on ScalarEvolution, and use it.
llvm-svn: 52248
is longer than the second one) should stop after finding one. Added break
instruction guarantees it. It also changes difference between offsets to
absolute value of this difference in the condition.
llvm-svn: 51875
out of instcombine into a new file in libanalysis. This also teaches
ComputeNumSignBits about the number of sign bits in a constantint.
llvm-svn: 51863
Analysis/ConstantFolding to fold ConstantExpr's, then make instcombine use it
to try to use targetdata to fold constant expressions on void instructions.
Also extend the icmp(inttoptr, inttoptr) folding to handle the case where
int size != ptr size.
llvm-svn: 51559
SCCP like sparse lattice analysis with relative ease. Just pick your
lattice function and implement the transfer function and you're good.
Just make sure you don't break monotonicity ;-)
llvm-svn: 50961
by an instance of LibCallInfo to provide mod/ref info of
standard library functions. This is powerful enough to
say that 'sqrt' is readonly except that it modifies errno,
or that "printf doesn't store to memory unless the %n
constraint is present" etc.
llvm-svn: 50827
Currently is sufficient to describe mod/ref behavior but will hopefully
eventually be extended for other purposes.
This isn't used by anything yet.
llvm-svn: 50820
manually performing the comparison. This allows the special
case to work correctly even in the case where someone is
experimenting with a different comparison function :-).
llvm-svn: 49670
Parse reversed smax and umax as smin and umin and express them with negative
or binary-not SCEVs (which are really just subtract under the hood).
Parse 'xor %x, -1' as (-1 - %x).
Remove dead code (ConstantInt::get always returns a ConstantInt).
Don't use getIntegerSCEV(-1, Ty). The first value is an int, then it gets
passed into a uint64_t. Instead, create the -1 directly from
ConstantInt::getAllOnesValue().
llvm-svn: 47360
Also, noalias arguments are be considered "like" stack allocated ones for this purpose, because
the only way they can be modref'ed is if they escape somewhere in the current function.
llvm-svn: 47247
variable (with step 1) and m is its final value. Then, the correct trip
count is SMAX(m,n)-n. Previously, we used SMAX(0,m-n), but m-n may
overflow and can't in general be interpreted as signed.
Patch by Nick Lewycky.
llvm-svn: 47007
to the RHS. This simple change allows to compute loop iteration count
for loops with condition similar to the one in the testcase (which seems
to be quite common).
llvm-svn: 46959
arbitrary iteration.
The patch:
1) changes SCEVSDivExpr into SCEVUDivExpr,
2) replaces PartialFact() function with BinomialCoefficient(); the
computations (essentially, the division) in BinomialCoefficient() are
performed with the apprioprate bitwidth necessary to avoid overflow;
unsigned division is used instead of the signed one.
Computations in BinomialCoefficient() require support from the code
generator for APInts. Currently, we use a hack rounding up the
neccessary bitwidth to the nearest power of 2. The hack is easy to turn
off in future.
One remaining issue: we assume the divisor of the binomial coefficient
formula can be computed accurately using 16 bits. It means we can handle
AddRecs of length up to 9. In future, we should use APInts to evaluate
the divisor.
Thanks to Nicholas for cooperation!
llvm-svn: 46955
and readnone for functions with bodies because it
broke llvm-gcc-4.2 bootstrap. It turns out that,
because of LLVM's array_ref hack, gcc was computing
pure/const attributes wrong (now fixed by turning
off the gcc ipa-pure-const pass).
llvm-svn: 44937
Reimplement the xform in Analysis/ConstantFolding.cpp where we can use
targetdata to validate that it is safe. While I'm in there, fix some const
correctness issues and generalize the interface to the "operand folder".
llvm-svn: 44817
throw exceptions", just mark intrinsics with the nounwind
attribute. Likewise, mark intrinsics as readnone/readonly
and get rid of special aliasing logic (which didn't use
anything more than this anyway).
llvm-svn: 44544
into alias analysis. This meant updating the API
which now has versions of the getModRefBehavior,
doesNotAccessMemory and onlyReadsMemory methods
which take a callsite parameter. These should be
used unless the callsite is not known, since in
general they can do a better job than the versions
that take a function. Also, users should no longer
call the version of getModRefBehavior that takes
both a function and a callsite. To reduce the
chance of misuse it is now protected.
llvm-svn: 44487
the function type, instead they belong to functions
and function calls. This is an updated and slightly
corrected version of Reid Spencer's original patch.
The only known problem is that auto-upgrading of
bitcode files doesn't seem to work properly (see
test/Bitcode/AutoUpgradeIntrinsics.ll). Hopefully
a bitcode guru (who might that be? :) ) will fix it.
llvm-svn: 44359
OnlyReadsMemoryFns tables are dead! We
get more, and more accurate, information
from gcc via the readnone and readonly
function attributes.
llvm-svn: 44288
The meaning of getTypeSize was not clear - clarifying it is important
now that we have x86 long double and arbitrary precision integers.
The issue with long double is that it requires 80 bits, and this is
not a multiple of its alignment. This gives a primitive type for
which getTypeSize differed from getABITypeSize. For arbitrary precision
integers it is even worse: there is the minimum number of bits needed to
hold the type (eg: 36 for an i36), the maximum number of bits that will
be overwriten when storing the type (40 bits for i36) and the ABI size
(i.e. the storage size rounded up to a multiple of the alignment; 64 bits
for i36).
This patch removes getTypeSize (not really - it is still there but
deprecated to allow for a gradual transition). Instead there is:
(1) getTypeSizeInBits - a number of bits that suffices to hold all
values of the type. For a primitive type, this is the minimum number
of bits. For an i36 this is 36 bits. For x86 long double it is 80.
This corresponds to gcc's TYPE_PRECISION.
(2) getTypeStoreSizeInBits - the maximum number of bits that is
written when storing the type (or read when reading it). For an
i36 this is 40 bits, for an x86 long double it is 80 bits. This
is the size alias analysis is interested in (getTypeStoreSize
returns the number of bytes). There doesn't seem to be anything
corresponding to this in gcc.
(3) getABITypeSizeInBits - this is getTypeStoreSizeInBits rounded
up to a multiple of the alignment. For an i36 this is 64, for an
x86 long double this is 96 or 128 depending on the OS. This is the
spacing between consecutive elements when you form an array out of
this type (getABITypeSize returns the number of bytes). This is
TYPE_SIZE in gcc.
Since successive elements in a SequentialType (arrays, pointers
and vectors) need to be aligned, the spacing between them will be
given by getABITypeSize. This means that the size of an array
is the length times the getABITypeSize. It also means that GEP
computations need to use getABITypeSize when computing offsets.
Furthermore, if an alloca allocates several elements at once then
these too need to be aligned, so the size of the alloca has to be
the number of elements multiplied by getABITypeSize. Logically
speaking this doesn't have to be the case when allocating just
one element, but it is simpler to also use getABITypeSize in this
case. So alloca's and mallocs should use getABITypeSize. Finally,
since gcc's only notion of size is that given by getABITypeSize, if
you want to output assembler etc the same as gcc then getABITypeSize
is the size you want.
Since a store will overwrite no more than getTypeStoreSize bytes,
and a read will read no more than that many bytes, this is the
notion of size appropriate for alias analysis calculations.
In this patch I have corrected all type size uses except some of
those in ScalarReplAggregates, lib/Codegen, lib/Target (the hard
cases). I will get around to auditing these too at some point,
but I could do with some help.
Finally, I made one change which I think wise but others might
consider pointless and suboptimal: in an unpacked struct the
amount of space allocated for a field is now given by the ABI
size rather than getTypeStoreSize. I did this because every
other place that reserves memory for a type (eg: alloca) now
uses getABITypeSize, and I didn't want to make an exception
for unpacked structs, i.e. I did it to make things more uniform.
This only effects structs containing long doubles and arbitrary
precision integers. If someone wants to pack these types more
tightly they can always use a packed struct.
llvm-svn: 43620
and time usage.
Fixup operator == to make this work, and add a resize method to DenseMap
so we can resize our hashtable once we know how big it should be.
llvm-svn: 42269