Instructions defined in the original inner loop preheader may depend on
values defined in the outer loop header, but the inner loop header will
become the entry block in the loop nest. Move the instructions from the
preheader to the outer loop header, so we do not break dominance. We
also have to check for unsafe instructions in the preheader. If there
are no unsafe instructions, all instructions should be movable.
Currently we move all instructions except the terminator and rely on
LICM to hoist out invariant instructions later.
Fixes PR45743
Values defined in the outer loop header could be used in the inner loop
latch. In that case, we need to create LCSSA phis for them, because after
interchanging they will be defined in the new inner loop and used in the
new outer loop.
The PHI node checks for inner loop exits are too permissive currently.
As indicated by an existing comment, we should only allow LCSSA PHI
nodes that are part of reductions or are only used outside of the loop
nest. We ensure this by checking the users of the LCSSA PHIs.
Specifically, it is not safe to use an exiting value from the inner loop in the latch of the outer
loop.
It also moves the inner loop exit check before the outer loop exit
check.
Fixes PR43473.
Reviewers: efriedma, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D68144
Currently the assertion in updateSuccessor is overly strict in some
cases and overly relaxed in other cases. For branches to the inner and
outer loop preheader it is too strict, because they can either be
unconditional branches or conditional branches with duplicate targets.
Both cases are fine and we can allow updating multiple successors.
On the other hand, we have to at least update one successor. This patch
adds such an assertion.
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
Currently we have limited support for outer loops with multiple basic
blocks after the inner loop exit. But the current checks for creating
PHIs for loop exit values only assumes the header and latches of the
outer loop. It is better to just skip incoming values defined in the
original inner loops. Those are handled earlier.
Reviewers: efriedma, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D70059
Currently we only rely on the induction increment to come before the
condition to ensure the required instructions get moved to the new
latch.
This patch duplicates and moves the required instructions to the
newly created latch. We move the condition to the end of the new block,
then process its operands. We stop at operands that are defined
outside the loop, or are the induction PHI.
We duplicate the instructions and update the uses in the moved
instructions, to ensure other users remain intact. See the added
test2 for such an example.
Reviewers: efriedma, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D67367
llvm-svn: 371595
The code to preserve LCSSA PHIs currently only properly supports
reduction PHIs and PHIs for values defined outside the latches.
This patch improves the LCSSA PHI handling to cover PHIs for values
defined in the latches.
Fixes PR41725.
Reviewers: efriedma, mcrosier, davide, jdoerfert
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D61576
llvm-svn: 361743
Summary:
This PR extends the loop object with more utilities to get loop bounds, step, induction variable, and guard branch. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. Moreover, loop fusion (https://reviews.llvm.org/D55851) is planning to use getGuard() to extend the kind of loops it is able to fuse, e.g. rotated loop with non-constant upper bound, which would have a loop guard.
/// Example:
/// for (int i = lb; i < ub; i+=step)
/// <loop body>
/// --- pseudo LLVMIR ---
/// beforeloop:
/// guardcmp = (lb < ub)
/// if (guardcmp) goto preheader; else goto afterloop
/// preheader:
/// loop:
/// i1 = phi[{lb, preheader}, {i2, latch}]
/// <loop body>
/// i2 = i1 + step
/// latch:
/// cmp = (i2 < ub)
/// if (cmp) goto loop
/// exit:
/// afterloop:
///
/// getBounds
/// getInitialIVValue --> lb
/// getStepInst --> i2 = i1 + step
/// getStepValue --> step
/// getFinalIVValue --> ub
/// getCanonicalPredicate --> '<'
/// getDirection --> Increasing
/// getGuard --> if (guardcmp) goto loop; else goto afterloop
/// getInductionVariable --> i1
/// getAuxiliaryInductionVariable --> {i1}
/// isCanonical --> false
Committed on behalf of @Whitney (Whitney Tsang).
Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn
Reviewed By: kbarton
Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60565
llvm-svn: 361517
Summary:
Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available.
Do not preserve it in the new pass manager.
Update tests.
Subscribers: nemanjai, jlebar, javed.absar, Prazek, kbarton, zzheng, jsji, llvm-commits, george.burgess.iv, chandlerc
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60833
llvm-svn: 360270
Summary:
It is a common thing to loop over every `PHINode` in some `BasicBlock`
and change old `BasicBlock` incoming block to a new `BasicBlock` incoming block.
`replaceSuccessorsPhiUsesWith()` already had code to do that,
it just wasn't a function.
So outline it into a new function, and use it.
Reviewers: chandlerc, craig.topper, spatel, danielcdh
Reviewed By: craig.topper
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61013
llvm-svn: 359996
Summary:
There is `PHINode::getBasicBlockIndex()`, `PHINode::setIncomingBlock()`
and `PHINode::getNumOperands()`, but no function to replace every
specified `BasicBlock*` predecessor with some other specified `BasicBlock*`.
Clearly, there are a lot of places that could use that functionality.
Reviewers: chandlerc, craig.topper, spatel, danielcdh
Reviewed By: craig.topper
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61011
llvm-svn: 359995
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
This patch adds logic to detect reductions across the inner and outer
loop by following the incoming values of PHI nodes in the outer loop. If
the incoming values take part in a reduction in the inner loop or come
from outside the outer loop, we found a reduction spanning across inner
and outer loop.
With this change, ~10% more loops are interchanged in the LLVM
test-suite + SPEC2006.
Fixes https://bugs.llvm.org/show_bug.cgi?id=30472
Reviewers: mcrosier, efriedma, karthikthecool, davide, hfinkel, dmgreen
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D43245
llvm-svn: 346438
Inner-loop only reductions require additional checks to make sure they
form a load-phi-store cycle across inner and outer loop. Otherwise the
reduction value is not properly preserved. This patch disables
interchanging such loops for now, as it causes miscompiles in some
cases and it seems to apply only for a tiny amount of loops. Across the
test-suite, SPEC2000 and SPEC2006, 61 instead of 62 loops are
interchange with inner loop reduction support disabled. With
-loop-interchange-threshold=-1000, 3256 instead of 3267.
See the discussion and history of D53027 for an outline of how such legality
checks could look like.
Reviewers: efriedma, mcrosier, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D53027
llvm-svn: 345877
This patch turns LoopInterchange into a loop pass. It now only
considers top-level loops and tries to move the innermost loop to the
optimal position within the loop nest. By only looking at top-level
loops, we might miss a few opportunities the function pass would get
(e.g. if we have a loop nest of 3 loops, in the function pass
we might process loops at level 1 and 2 and move the inner most loop to
level 1, and then we process loops at levels 0, 1, 2 and interchange
again, because we now have a different inner loop). But I think it would
be better to handle such cases by picking the best inner loop from the
start and avoid re-visiting the same loops again.
The biggest advantage of it being a function pass is that it interacts
nicely with the other loop passes. Without this patch, there are some
performance regressions on AArch64 with loop interchanging enabled,
where no loops were interchanged, but we missed out on some other loop
optimizations.
It also removes the SimplifyCFG run. We are just changing branches, so
the CFG should not be more complicated, besides the additional 'unique'
preheaders this pass might create.
Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00
Reviewed By: xbolva00
Differential Revision: https://reviews.llvm.org/D51702
llvm-svn: 343308
This patch extends LoopInterchange to move LCSSA to the right place
after interchanging. This is required for LoopInterchange to become a
function pass.
An alternative to the manual moving of the PHIs, we could also re-form
the LCSSA phis for a set of interchanged loops, but that's more
expensive.
Reviewers: efriedma, mcrosier, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D52154
llvm-svn: 343132
As preparation for LoopInterchange becoming a loop pass, it needs to
preserve ScalarEvolution. Even though interchanging should not change
the trip count of the loop, it modifies loop entry, latch and exit
blocks.
I added -verify-scev to some loop interchange tests, but the verification does
not catch problems caused by missing invalidation of SE in loop interchange, as
the trip counts themselves do not change. So there might be potential to
make the SE verification covering more stuff in the future.
Reviewers: mkazantsev, efriedma, karthikthecool
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D52026
llvm-svn: 342209
There is no need to create preheaders in the analysis stage, we only
need them when adjusting the branches. Also, the only cases we need to
create our own preheaders is when they have more than 1 predecessors or
PHI nodes (even with only 1 predecessor, we could have an LCSSA phi
node). I have simplified the conditions and added some assertions to be
sure. Because we know the inner and outer loop need to be tightly
nested, it is sufficient to check if the inner loop preheader is the
outer loop header to check if we need to create a new preheader.
Reviewers: efriedma, mcrosier, karthikthecool
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D51703
llvm-svn: 341533
The core get and set routines move to the `Instruction` class. These
routines are only valid to call on instructions which are terminators.
The iterator and *generic* range based access move to `CFG.h` where all
the other generic successor and predecessor access lives. While moving
the iterator here, simplify it using the iterator utilities LLVM
provides and updates coding style as much as reasonable. The APIs remain
pointer-heavy when they could better use references, and retain the odd
behavior of `operator*` and `operator->` that is common in LLVM
iterators. Adjusting this API, if desired, should be a follow-up step.
Non-generic range iteration is added for the two instructions where
there is an especially easy mechanism and where there was code
attempting to use the range accessor from a specific subclass:
`indirectbr` and `br`. In both cases, the successors are contiguous
operands and can be easily iterated via the operand list.
This is the first major patch in removing the `TerminatorInst` type from
the IR's instruction type hierarchy. This change was discussed in an RFC
here and was pretty clearly positive:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html
There will be a series of much more mechanical changes following this
one to complete this move.
Differential Revision: https://reviews.llvm.org/D47467
llvm-svn: 340698
This patch moves the logic to handle reduction PHI nodes to the end of
adjustLoopBranches. Reduction PHI nodes in the outer loop header can be
moved to the inner loop header and reduction PHI nodes from the inner loop
header can be moved to the outer loop header. In the latter situation,
we have to deal with 1 kind of PHI nodes:
PHI nodes that are part of inner loop-only reductions.
We can replace the PHI node with the value coming from outside
the inner loop.
Reviewers: mcrosier, efriedma, karthikthecool
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D46198
llvm-svn: 335027
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
We currently support LCSSA PHI nodes in the outer loop exit, if their
incoming values do not come from the outer loop latch or if the
outer loop latch has a single predecessor. In that case, the outer loop latch
will be executed only if the inner loop gets executed. If we have multiple
predecessors for the outer loop latch, it may be executed even if the inner
loop does not get executed.
This is a first step to support the case described in
https://bugs.llvm.org/show_bug.cgi?id=30472
Reviewers: efriedma, karthikthecool, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D43237
llvm-svn: 331037
This also means we have to check if the latch is the exiting block now,
as `transform` expects the latches to be the exiting blocks too.
https://bugs.llvm.org/show_bug.cgi?id=36586
Reviewers: efriedma, davide, karthikthecool
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D45279
llvm-svn: 330806
After D43236, we started interchanging loops with empty dependence
matrices. In isProfitableForVectorization, we try to determine if
interchanging makes the loop dependences more friendly to the
vectorizer. If there are no dependences, we should not interchange,
based on that heuristic.
Reviewers: efriedma, mcrosier, karthikthecool, blitz.opensource
Reviewed By: mcrosier
Differential Revision: https://reviews.llvm.org/D45208
llvm-svn: 330738
If a loop with child loops becomes our new inner loop after
interchanging, we only need to update LoopInfo for the blocks defined in
the old outer loop. BBs in child loops will stay there.
Reviewers: efriedma, karthikthecool, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D45970
llvm-svn: 330653
LoopInterchange relies on LoopInfo being up-to-date, so we should
preserve it after interchanging. This patch updates restructureLoops to
move the BBs of the interchanged loops to the right place.
Reviewers: davide, efriedma, karthikthecool, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D45278
llvm-svn: 329264
It also updates test/Transforms/LoopInterchange/call-instructions.ll
to use accesses where we can prove dependence after D35430.
Reviewers: sebpop, karthikthecool, blitz.opensource
Reviewed By: sebpop
Differential Revision: https://reviews.llvm.org/D45206
llvm-svn: 329111
It's been quite some time the Dependence Analysis (DA) is broken,
as it uses the GEP representation to "identify" multi-dimensional arrays.
It even wrongly detects multi-dimensional arrays in single nested loops:
from test/Analysis/DependenceAnalysis/Coupled.ll, example @couple6
;; for (long int i = 0; i < 50; i++) {
;; A[i][3*i - 6] = i;
;; *B++ = A[i][i];
DA used to detect two subscripts, which makes no sense in the LLVM IR
or in C/C++ semantics, as there are no guarantees as in Fortran of
subscripts not overlapping into a next array dimension:
maximum nesting levels = 1
SrcPtrSCEV = %A
DstPtrSCEV = %A
using GEPs
subscript 0
src = {0,+,1}<nuw><nsw><%for.body>
dst = {0,+,1}<nuw><nsw><%for.body>
class = 1
loops = {1}
subscript 1
src = {-6,+,3}<nsw><%for.body>
dst = {0,+,1}<nuw><nsw><%for.body>
class = 1
loops = {1}
Separable = {}
Coupled = {1}
With the current patch, DA will correctly work on only one dimension:
maximum nesting levels = 1
SrcSCEV = {(-2424 + %A)<nsw>,+,1212}<%for.body>
DstSCEV = {%A,+,404}<%for.body>
subscript 0
src = {(-2424 + %A)<nsw>,+,1212}<%for.body>
dst = {%A,+,404}<%for.body>
class = 1
loops = {1}
Separable = {0}
Coupled = {}
This change removes all uses of GEP from DA, and we now only rely
on the SCEV representation.
The patch does not turn on -da-delinearize by default, and so the DA analysis
will be more conservative in the case of multi-dimensional memory accesses in
nested loops.
I disabled some interchange tests, as the DA is not able to disambiguate
the dependence anymore. To make DA stronger, we may need to
compute a bound on the number of iterations based on the access functions
and array dimensions.
The patch cleans up all the CHECKs in test/Transforms/LoopInterchange/*.ll to
avoid checking for snippets of LLVM IR: this form of checking is very hard to
maintain. Instead, we now check for output of the pass that are more meaningful
than dozens of lines of LLVM IR. Some tests now require -debug messages and thus
only enabled with asserts.
Patch written by Sebastian Pop and Aditya Kumar.
Differential Revision: https://reviews.llvm.org/D35430
llvm-svn: 326837
The dependency matrix is only empty if no conflicting load/store
instructions have been found. In that case, it is safe to interchange.
For the LLVM test-suite, after this change around 1900 loops are
interchanged, whereas it is 15 before this change. On cortex-a57,
this gives an improvement of -0.57% on the geomean execution
time of SPEC2006, SPEC2000 and the test-suite. There are a
few small perf regressions, but I think we can improve on those
by making the cost model better.
Reviewers: karthikthecool, mcrosier
Reviewed by: karthikthecool
Differential Revision: https://reviews.llvm.org/D43236
llvm-svn: 326077
We can use incremental dominator tree updates to avoid re-calculating
the dominator tree after interchanging 2 loops.
Reviewers: dmgreen, kuhar
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D43176
llvm-svn: 325122
In cases where the OuterMostLoopLatchBI only has a single successor,
accessing the second successor will fail.
This fixes a failure when building the test-suite with loop-interchange
enabled.
Reviewers: mcrosier, karthikthecool, davide
Reviewed by: karthikthecool
Differential Revision: https://reviews.llvm.org/D42906
llvm-svn: 324994
The way that splitInnerLoopHeader splits blocks requires that
the induction PHI will be the first PHI in the inner loop
header. This makes sure that is actually the case when there
are both IV and reduction phis.
Differential Revision: https://reviews.llvm.org/D38682
llvm-svn: 316261
parameterized emit() calls
Summary: This is not functional change to adopt new emit() API added in r313691.
Reviewed By: anemet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38285
llvm-svn: 315476
Summary:
SimplifyIndVar may introduce zext instructions to widen arguments of the
loop exit check. They should not prevent us from splitting the loop at
the induction variable, but maybe the check should be more conservative,
e.g. making sure it only extends arguments used by a comparison?
Reviewers: karthikthecool, mcrosier, mzolotukhin
Reviewed By: mcrosier
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D34879
llvm-svn: 311783
Summary:
Without any information about the called function, we cannot be sure
that it is safe to interchange loops which contain function calls. For
example there could be dependences that prevent interchanging between
accesses in the called function and the loops. Even functions without any
parameters could cause problems, as they could access memory using
global pointers.
For now, I think it is only safe to interchange loops with calls marked
as readnone.
With this patch, the LLVM test suite passes with `-O3 -mllvm
-enable-loopinterchange` and LoopInterchangeProfitability::isProfitable
returning true for all loops. check-llvm and check-clang also pass when
bootstrapped in a similar fashion, although only 3 loops got
interchanged.
Reviewers: karthikthecool, blitz.opensource, hfinkel, mcrosier, mkuper
Reviewed By: mcrosier
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D35489
llvm-svn: 309547
Summary:
The remaining non range-based for loops do not iterate over full ranges,
so leave them as they are.
Reviewers: karthikthecool, blitz.opensource, mcrosier, mkuper, aemerson
Reviewed By: aemerson
Subscribers: aemerson, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D35777
llvm-svn: 308872
Summary: This makes it easier to find out which limitation prevented this pass from doing its work.
Reviewers: karthikthecool, mzolotukhin, efriedma, mcrosier
Reviewed By: mcrosier
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D34940
llvm-svn: 307035
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...
llvm-svn: 289756
This condition is trivially always true prior to the change. The comment
at the call site makes it clear that we expect *all* of these to be '=',
'S', or 'I' so fix the code.
We have a bug I will update to track the fact that Clang doesn't warn on
this: http://llvm.org/PR13101
llvm-svn: 285930
Currently, we give up on loop interchange if we encounter a flow dependency
anywhere in the loop list. Worse yet, we don't even track output dependencies.
This patch updates the dependency matrix computation to track flow and output
dependencies in the same way we track anti dependencies.
This improves an internal workload by 2.2x.
Note the loop interchange pass is off by default and it can be enabled with
'-mllvm -enable-loopinterchange'
Differential Revision: https://reviews.llvm.org/D24564
llvm-svn: 282101
Allowed loop vectorization with secondary FP IVs. Like this:
float *A;
float x = init;
for (int i=0; i < N; ++i) {
A[i] = x;
x -= fp_inc;
}
The auto-vectorization is possible when the induction binary operator is "fast" or the function has "unsafe" attribute.
Differential Revision: https://reviews.llvm.org/D21330
llvm-svn: 276554
Ported DA to the new PM by splitting the former DependenceAnalysis Pass
into a DependenceInfo result type and DependenceAnalysisWrapperPass type
and adding a new PM-style DependenceAnalysis analysis pass returning the
DependenceInfo.
Patch by Philip Pfaffe, most of the review by Justin.
Differential Revision: http://reviews.llvm.org/D18834
llvm-svn: 269370
A large number of loop utility functions take a `Pass *` and reach
into it to find out which analyses to preserve. There are a number of
problems with this:
- The APIs have access to pretty well any Pass state they want, so
it's hard to tell what they may or may not do.
- Other APIs have copied these and pass around a `Pass *` even though
they don't even use it. Some of these just hand a nullptr to the API
since the callers don't even have a pass available.
- Passes in the new pass manager don't work like the current ones, so
the APIs can't be used as is there.
Instead, we should explicitly thread the analysis results that we
actually care about through these APIs. This is both simpler and more
reusable.
llvm-svn: 255669
Remove remaining `ilist_iterator` implicit conversions from
LLVMScalarOpts.
This change exposed some scary behaviour in
lib/Transforms/Scalar/SCCP.cpp around line 1770. This patch changes a
call from `Function::begin()` to `&Function::front()`, since the return
was immediately being passed into another function that takes a
`Function*`. `Function::front()` started to assert, since the function
was empty. Note that `Function::end()` does not point at a legal
`Function*` -- it points at an `ilist_half_node` -- so the other
function was getting garbage before. (I added the missing check for
`Function::isDeclaration()`.)
Otherwise, no functionality change intended.
llvm-svn: 250211
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
... and move it into LoopUtils where it can be used by other passes, just like ReductionDescriptor. The API is very similar to ReductionDescriptor - that is, not very nice at all. Sorting these both out will come in a followup.
NFC
llvm-svn: 246145
This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.
I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.
But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK as everything in SCEV that can uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't realy called during normal runs of the pipeline as
far as I can see.
To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.
To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.
With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.
Differential Revision: http://reviews.llvm.org/D12063
llvm-svn: 245193
This reverts commit r243167.
Duncan pointed out that dyn_cast can return null in these cases, so this
was an unsafe commit to make. Sorry for the noise.
Worryingly there were no tests which fail...
llvm-svn: 243302
Since both places which set this variable do so with dyn_cast, and not
dyn_cast_or_null, its impossible to get a nullptr here, so we can remove
the check.
llvm-svn: 243167
Passes should never modify it, just use the const version. While there
reduce copying in LoopInterchange. No functional change intended.
llvm-svn: 242041
A reduction is a special kind of recurrence. In the loop vectorizer we currently
identify basic reductions. Future patches will extend this to identifying basic
recurrences.
llvm-svn: 239835