The patch has been reverted because it ended up prohibiting propagation
of a constant to exit value. For such values, we should skip all checks
related to hard uses because propagating a constant is always profitable.
Differential Revision: https://reviews.llvm.org/D53691
llvm-svn: 346397
This reverts commit 2f425e9c7946b9d74e64ebbfa33c1caa36914402.
It seems that the check that we still should do the transform if we
know the result is constant is missing in this code. So the logic that
has been deleted by this change is still sometimes accidentally useful.
I revert the change to see what can be done about it. The motivating
case is the following:
@Y = global [400 x i16] zeroinitializer, align 1
define i16 @foo() {
entry:
br label %for.body
for.body: ; preds = %entry, %for.body
%i = phi i16 [ 0, %entry ], [ %inc, %for.body ]
%arrayidx = getelementptr inbounds [400 x i16], [400 x i16]* @Y, i16 0, i16 %i
store i16 0, i16* %arrayidx, align 1
%inc = add nuw nsw i16 %i, 1
%cmp = icmp ult i16 %inc, 400
br i1 %cmp, label %for.body, label %for.end
for.end: ; preds = %for.body
%inc.lcssa = phi i16 [ %inc, %for.body ]
ret i16 %inc.lcssa
}
We should be able to figure out that the result is constant, but the patch
breaks it.
Differential Revision: https://reviews.llvm.org/D51584
llvm-svn: 346198
When rewriting loop exit values, IndVars considers this transform not profitable if
the loop instruction has a loop user which it believes cannot be optimized away.
In current implementation only calls that immediately use the instruction are considered
as such.
This patch extends the definition of "hard" users to any side-effecting instructions
(which usually cannot be optimized away from the loop) and also allows handling
of not just immediate users, but use chains.
Differentlai Revision: https://reviews.llvm.org/D51584
Reviewed By: etherzhhb
llvm-svn: 345814
For some unclear reason rewriteLoopExitValues considers recalculation
after the loop profitable if it has some "soft uses" outside the loop (i.e. any
use other than call and return), even if we have proved that it has a user inside
the loop which we think will not be optimized away.
There is no existing unit test that would explain this. This patch provides an
example when rematerialisation of exit value is not profitable but it passes
this check due to presence of a "soft use" outside the loop.
It makes no sense to recalculate value on exit if we are going to compute it
due to some irremovable within the loop. This patch disallows applying this
transform in the described situation.
Differential Revision: https://reviews.llvm.org/D51581
Reviewed By: etherzhhb
llvm-svn: 345708
A piece of logic in rewriteLoopExitValues has a weird check on number of
users which allowed an unprofitable transform in case if an instruction has
more than 6 users.
Differential Revision: https://reviews.llvm.org/D51404
Reviewed By: etherzhhb
llvm-svn: 342444
IndVarSimplify's design is somewhat odd in the way how it reports that
some transform has made a change. It has a `Changed` field which can
be set from within any function, which makes it hard to track whether or
not it was set properly after a transform was made. It leads to oversights
in setting this flag where needed, see example in PR38855.
This patch removes the `Changed` field, turns it into a local and unifies
the signatures of all relevant transform functions to return boolean value
which designates whether or not this transform has made a change.
Differential Revision: https://reviews.llvm.org/D51850
Reviewed By: skatkov
llvm-svn: 341893
Currently, `rewriteFirstIterationLoopExitValues` does not set Changed flag even if it makes
changes in the IR. There is no clear evidence that it can cause a crash, but it
looks highly suspicious and likely invalid.
Differential Revision: https://reviews.llvm.org/D51779
Reviewed By: skatkov
llvm-svn: 341779
Currently, `sinkUnusedInvariants` does not set Changed flag even if it makes
changes in the IR. There is no clear evidence that it can cause a crash, but it
looks highly suspicious and likely invalid.
Differential Revision: https://reviews.llvm.org/D51777
Reviewed By: skatkov
llvm-svn: 341777
IndVars does not set `Changed` flag when it eliminates dead instructions. As result,
it may make IR modifications and report that it has done nothing. It leads to inconsistent
preserved analyzes results.
Differential Revision: https://reviews.llvm.org/D51770
Reviewed By: skatkov
llvm-svn: 341633
Function rewriteLoopExitValues contains a check on isValidRewrite which
is needed to make sure that SCEV does not convert the pattern
`gep Base, (&p[n] - &p[0])` into `gep &p[n], Base - &p[0]`. This problem
has been fixed in SCEV long ago, so this check is just obsolete.
This patch converts it into an assertion to make sure that the SCEV will
not mess up this case in the future.
Differential Revision: https://reviews.llvm.org/D51582
Reviewed By: atrick
llvm-svn: 341516
This patch removes the function `expandSCEVIfNeeded` which behaves not as
it was intended. This function tries to make a lookup for exact existing expansion
and only goes to normal expansion via `expandCodeFor` if this lookup hasn't found
anything. As a result of this, if some instruction above the loop has a `SCEVConstant`
SCEV, this logic will return this instruction when asked for this `SCEVConstant` rather
than return a constant value. This is both non-profitable and in some cases leads to
breach of LCSSA form (as in PR38674).
Whether or not it is possible to break LCSSA with this algorithm and with some
non-constant SCEVs is still in question, this is still being investigated. I wasn't
able to construct such a test so far, so maybe this situation is impossible. If it is,
it will go as a separate fix.
Rather than do it, it is always correct to just invoke `expandCodeFor` unconditionally:
it behaves smarter about insertion points, and as side effect of this it will choose a
constant value for SCEVConstants. For other SCEVs it may end up finding a better insertion
point. So it should not be worse in any case.
NOTE: So far the only known case for which this transform may break LCSSA is mapping
of SCEVConstant to an instruction. However there is a suspicion that the entire algorithm
can compromise LCSSA form for other cases as well (yet not proved).
Differential Revision: https://reviews.llvm.org/D51286
Reviewed By: etherzhhb
llvm-svn: 341345
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)
llvm-svn: 333954
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering.
Transforms depends on Transforms/Utils, not the other way around. So
remove the header and the "createStripGCRelocatesPass" function
declaration (& definition) that is unused and motivated this dependency.
Move Transforms/Utils/Local.h into Analysis because it's used by
Analysis/MemoryBuiltins.cpp.
llvm-svn: 328165
This change adds an option disable-lftr to be able to disable Linear Function Test Replace optimization.
By default option is off so current behavior is not changed.
Reviewers: reames, sanjoy, wmi, andreadb, apilipenko
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33979
llvm-svn: 305055
Commits were:
"Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts"
"Add a new WeakVH value handle; NFC"
"Rename WeakVH to WeakTrackingVH; NFC"
The changes assumed pointers are 8 byte aligned on all architectures.
llvm-svn: 301429
Summary:
I plan to use WeakVH to mean "nulls itself out on deletion, but does
not track RAUW" in a subsequent commit.
Reviewers: dblaikie, davide
Reviewed By: davide
Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle
Differential Revision: https://reviews.llvm.org/D32266
llvm-svn: 301424
a function's CFG when that CFG is unchanged.
This allows transformation passes to simply claim they preserve the CFG
and analysis passes to check for the CFG being preserved to remove the
fanout of all analyses being listed in all passes.
I've gone through and removed or cleaned up as many of the comments
reminding us to do this as I could.
Differential Revision: https://reviews.llvm.org/D28627
llvm-svn: 292054
the latter to the Transforms library.
While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.
Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.
We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.
This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.
I haven't split the unittest though because testing one component
without the other seems nearly intractable.
Differential Revision: https://reviews.llvm.org/D28452
llvm-svn: 291662
arguments much like the CGSCC pass manager.
This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.
An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.
This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.
While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.
I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.
Differential Revision: https://reviews.llvm.org/D28292
llvm-svn: 291651
When both WidenIV::getWideRecurrence and WidenIV::getExtendedOperandRecurrence
return non-null but different WideAddRec, if getWideRecurrence is called
before getExtendedOperandRecurrence, we won't bother to call
getExtendedOperandRecurrence again. But As we know it is possible that after
SCEV folding, we cannot prove the legality using the SCEVAddRecExpr returned
by getWideRecurrence. Meanwhile if getExtendedOperandRecurrence returns non-null
WideAddRec, we know for sure that it is legal to do widening for current instruction.
So it is better to put getExtendedOperandRecurrence before getWideRecurrence, which
will increase the chance of successful widening.
Differential Revision: https://reviews.llvm.org/D26059
llvm-svn: 286987
When the loop exit condition is canonicalized as a != compaison, reuse the
debug location of the original (non canonical) comparison.
Before this patch, the debug location of the new icmp was obtained from the
loop latch terminator. This patch fixes the issue by correctly setting the
IRBuilder's "current debug location" to the location of the original compare.
Differential Revision: https://reviews.llvm.org/D25953
llvm-svn: 285185
When indvars widened an induction variable, the debug location for the loop
increment computation was incorrectly set equal to the debug loc of the loop
latch terminator.
This patch fixes the issue by propagating the correct location from the
original loop increment instruction to the new widened increment.
Differential Revision: https://reviews.llvm.org/D25872
llvm-svn: 285083
This change is motivated by the case when IndVarSimplify doesn't widen a comparison of IV increment because it can't prove IV increment being non-negative. We end up with a redundant trunc of the widened increment on this example.
for.body:
%i = phi i32 [ %start, %for.body.lr.ph ], [ %i.inc, %for.inc ]
%within_limits = icmp ult i32 %i, 64
br i1 %within_limits, label %continue, label %for.end
continue:
%i.i64 = zext i32 %i to i64
%arrayidx = getelementptr inbounds i32, i32* %base, i64 %i.i64
%val = load i32, i32* %arrayidx, align 4
br label %for.inc
for.inc:
%i.inc = add nsw nuw i32 %i, 1
%cmp = icmp slt i32 %i.inc, %limit
br i1 %cmp, label %for.body, label %for.end
There is a range check inside of the loop which guarantees the IV to be non-negative. NSW on the increment guarantees that the increment is also non-negative. Teach IndVarSimplify to use the range check to prove non-negativity of loop increments.
Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D25738
llvm-svn: 284629
For each block check that it doesn't have any uses outside of it's innermost loop.
Differential Revision: https://reviews.llvm.org/D25364
llvm-svn: 283877
The core of the change is supposed to be NFC, however it also fixes
what I believe was an undefined behavior when calling:
va_start(ValueArgs, Desc);
with Desc being a StringRef.
Differential Revision: https://reviews.llvm.org/D25342
llvm-svn: 283671
Summary:
The patch fixes regression caused by two earlier patches D18777 and D18867.
Reviewers: reames, sanjoy
Differential Revision: http://reviews.llvm.org/D24280
From: Li Huang
llvm-svn: 282650
This change needs to be reverted in order to revert -r278267 which cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks.
See comments on https://reviews.llvm.org/D18777 for details.
llvm-svn: 279432
IndVarSimplify::sinkUnusedInvariants calls
BasicBlock::getFirstInsertionPt on the ExitBlock and moves instructions
before it. This can return end(), so it's not safe to dereference. Add
an iterator-based overload to Instruction::moveBefore to avoid the UB.
llvm-svn: 278886
`IVVisitor::visitCast` used to have the invariant that if the
instruction it was passed was a sext or zext instruction, the result of
the instruction would be wider than the induction variable. This is no
longer true after rL275037, so this change teaches `IndVarSimplify` s
implementation of `IVVisitor::visitCast` to work with the relaxed
invariant.
A corresponding change to SimplifyIndVar to preserve the said invariant
after rL275037 would also work, but given how `IVVisitor::visitCast` is
spelled (no indication of said invariant), I figured the current fix is
cleaner.
Fixes PR28935.
llvm-svn: 278584
When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because
(1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant).
(2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change).
I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere.
To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of boths sides of (1) and apply the proven equivalence.
This commit contains changes in a newly added testcase which was not included in the previous commit (which was reverted later on).
https://reviews.llvm.org/D23075
llvm-svn: 278421
When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because
(1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant).
(2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change).
I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere.
To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of boths sides of (1) and apply the proven equivalence.
https://reviews.llvm.org/D23075
llvm-svn: 278334
The patch is to fix the bug in PR28705. It was caused by setting wrong return
value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion
have different meanings when the function is used in different ways so it is easy to make
mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion,
and specifies where each interface is expected to be used.
Differential Revision: https://reviews.llvm.org/D22942
llvm-svn: 278161
One exception here is LoopInfo which must forward-declare it (because
the typedef is in LoopPassManager.h which depends on LoopInfo).
Also, some includes for LoopPassManager.h were needed since that file
provides the typedef.
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.
Thanks to David for the suggestion.
llvm-svn: 278079
Summary:
There are some rough corners, since the new pass manager doesn't have
(as far as I can tell) LoopSimplify and LCSSA, so I've updated the
tests to run them separately in the old pass manager in the lit tests.
We also don't have an equivalent for AU.setPreservesCFG() in the new
pass manager, so I've left a FIXME.
Reviewers: bogner, chandlerc, davide
Subscribers: sanjoy, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D20783
llvm-svn: 271846
Since we already assert that the outgoing IR is in LCSSA, it is easy to
get misled into thinking that -indvars broke LCSSA if the incoming IR is
non-LCSSA. Checking this pre-condition will make such cases break in
more obvious ways.
Inspired by (but does _not_ fix) PR26682.
llvm-svn: 271196
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).
Differential Revision: http://reviews.llvm.org/D19172
llvm-svn: 267231
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.
The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.
The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way.
Differential Revision: http://reviews.llvm.org/D19172
llvm-svn: 267022
Removed some unused headers, replaced some headers with forward class declarations.
Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
Patch by Eugene Kosov <claprix@yandex.ru>
Differential Revision: http://reviews.llvm.org/D19219
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
Widening a PHI requires us to insert a trunc.
The logical place for this trunc is in the same BB as the PHI.
This is not possible if the BB is terminated by a catchswitch.
This fixes PR27133.
llvm-svn: 264926
routine.
We were getting this wrong in small ways and generally being very
inconsistent about it across loop passes. Instead, let's have a common
place where we do this. One minor downside is that this will require
some analyses like SCEV in more places than they are strictly needed.
However, this seems benign as these analyses are complete no-ops, and
without this consistency we can in many cases end up with the legacy
pass manager scheduling deciding to split up a loop pass pipeline in
order to run the function analysis half-way through. It is very, very
annoying to fix these without just being very pedantic across the board.
The only loop passes I've not updated here are ones that use
AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer.
They seemed less relevant.
With this patch, almost all of the problems in PR24804 around loop pass
pipelines are fixed. The one remaining issue is that we run simplify-cfg
and instcombine in the middle of the loop pass pipeline. We've recently
added some loop variants of these passes that would seem substantially
cleaner to use, but this at least gets us much closer to the previous
state. Notably, the seven loop pass managers is down to three.
I've not updated the loop passes using LoopAccessAnalysis because that
analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't
clear that those transforms want to support those forms anyways. They
all run late anyways, so this is harmless. Similarly, LSR is left alone
because it already carefully manages its forms and doesn't need to get
fused into a single loop pass manager with a bunch of other loop passes.
LoopReroll didn't use loop simplified form previously, and I've updated
the test case to match the trivially different output.
Finally, I've also factored all the pass initialization for the passes
that use this technique as well, so that should be done regularly and
reliably.
Thanks to James for the help reviewing and thinking about this stuff,
and Ben for help thinking about it as well!
Differential Revision: http://reviews.llvm.org/D17435
llvm-svn: 261316
Summary:
This is a revised version of D13974, and the following quoted summary are from D13974
"This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch."
D13974 was committed but failed one lnt test. The bug was that we only checked the condition from loop exit's incoming block was a loop invariant. But there could be another condition from loop header to that incoming block not being a loop invariant. This would produce miscompiled code.
This patch fixes the issue by checking if the incoming block is loop header, and if not, don't perform the rewrite. The could be further improved by recursively checking all conditions leading to loop exit block, but I'd like to check in this simple version first and improve it with future patches.
Reviewers: sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16570
llvm-svn: 258912
A large number of loop utility functions take a `Pass *` and reach
into it to find out which analyses to preserve. There are a number of
problems with this:
- The APIs have access to pretty well any Pass state they want, so
it's hard to tell what they may or may not do.
- Other APIs have copied these and pass around a `Pass *` even though
they don't even use it. Some of these just hand a nullptr to the API
since the callers don't even have a pass available.
- Passes in the new pass manager don't work like the current ones, so
the APIs can't be used as is there.
Instead, we should explicitly thread the analysis results that we
actually care about through these APIs. This is both simpler and more
reusable.
llvm-svn: 255669
Summary:
Also add a stricter post-condition for IndVarSimplify.
Fixes PR25578. Test case by Michael Zolotukhin.
Reviewers: hfinkel, atrick, mzolotukhin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15059
llvm-svn: 254977
Commit 251839 triggers miscompiles on some bots:
http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/13723
(The commit is listed in 13722, but due to an existing failure introduced in
13721 and reverted in 13723 the failure is only visible in 13723)
To verify r251839 is indeed the only change that triggered the buildbot failures
and to ensure the buildbots remain green while investigating I temporarily
revert this commit. At the current state it is unclear if this commit introduced
some miscompile or if it only exposed code to Polly that is subsequently
miscompiled by Polly.
llvm-svn: 251901
Summary:
This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch.
Reviewers: sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13974
llvm-svn: 251839
Summary:
This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch.
Reviewers: sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13974
llvm-svn: 251492
Summary:
`cloneArithmeticIVUser` currently trips over expression like `add %iv,
-1` when `%iv` is being zero extended -- it tries to construct the
widened use as `add %iv.zext, zext(-1)` and (correctly) fails to prove
equivalence to `zext(add %iv, -1)` (here the SCEV for `%iv` is
`{1,+,1}`).
This change teaches `IndVars` to try sign extending the non-IV operand
if that makes the newly constructed IV use equivalent to the widened
narrow IV use.
Reviewers: atrick, hfinkel, reames
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13717
llvm-svn: 250483
Summary:
This NFC splitting is intended to make a later diff easier to follow.
It just tail duplicates `cloneIVUser` into `cloneArithmeticIVUser` and
`cloneBitwiseIVUser`.
Reviewers: atrick, hfinkel, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13716
llvm-svn: 250481
Summary:
After r249211, `getSCEV(X) == getSCEV(Y)` does not guarantee that X and
Y are related in the dominator tree, even if X is an operand to Y (I've
included a toy example in comments, and a real example as a test case).
This commit changes `SimplifyIndVar` to require a `DominatorTree`. I
don't think this is a problem because `ScalarEvolution` requires it
anyway.
Fixes PR25051.
Depends on D13459.
Reviewers: atrick, hfinkel
Subscribers: joker.eph, llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D13460
llvm-svn: 249471
Summary:
It is fairly common to call SE->getConstant(Ty, 0) or
SE->getConstant(Ty, 1); this change makes such uses a little bit
briefer.
I've refactored the call sites I could find easily to use getZero /
getOne.
Reviewers: hfinkel, majnemer, reames
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12947
llvm-svn: 248362
Because -indvars widens induction variables through arithmetic,
`NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV`
manages information for all transitive uses of an IV being widened,
including uses of `-1 * IV`). Instead it must live on `NarrowIVDefUse`
which manages information for a specific def-use edge in the transitive
use list of an induction variable.
This change also adds a test case that demonstrates the problem with
r248045.
llvm-svn: 248107
Summary:
If an induction variable is provably non-negative, its sign extension is
equal to its zero extension. This means narrow uses like
icmp slt iNarrow %indvar, %rhs
can be widened into
icmp slt iWide zext(%indvar), sext(%rhs)
Reviewers: atrick, mcrosier, hfinkel
Subscribers: hfinkel, reames, llvm-commits
Differential Revision: http://reviews.llvm.org/D12745
llvm-svn: 248045
In `IndVarSimplify::ExpandSCEVIfNeeded`,
`SCEVExpander::findExistingExpansion` may return an `llvm::Value` that
differs in type from the SCEV it was asked to find an expansion for (but
computes the same value). In such cases, we fall back on
`expandCodeFor`; and rely on LLVM to CSE the two equivalent
expressions (different only by a no-op cast) into a single computation.
I tried a few other approaches to fixing PR24783, all of which turned
out to be more complex than this current version:
1. Move the `ExpandSCEVIfNeeded` logic into `expandCodeFor`. This got
problematic because currently we do not pass in the `Loop *` into
`expandCodeFor`. Changing the interface to do this is a more
invasive change, and really does not make much semantic sense unless
the SCEV being passed in is an add recurrence.
There is also the problem of `expandCodeFor` being used in places
other than `indvars` -- there may be performance / correctness
issues elsewhere if `expandCodeFor` is moved from always generating
IR from scratch to cache-like model.
2. Have `findExistingExpansion` only return expression with the correct
type. This would make `isHighCostExpansionHelper` and thus
`isHighCostExpansion` more conservative than necessary.
3. Insert casts on the value returned by `findExistingExpansion` if
needed using `InsertNoopCastOfTo`. This is complicated because
`InsertNoopCastOfTo` depends on internal state of its
`SCEVExpander` (specifically `Builder.GetInserPoint()`), and this
may not be set up when `ExpandSCEVIfNeeded` is called.
4. Manually insert casts on the value returned by
`findExistingExpansion` if needed using `InsertNoopCastOfTo` via
`CastInst::Create`. This is probably workable, but figuring out the
location where the cast instruction needs to be inserted has enough
edge cases (arguments, constants, invokes, LCSSA must be preserved)
makes me feel what I have right now is simplest solution.
llvm-svn: 247749
GlobalsAA must by definition be preserved in function passes, but the passmanager doesn't know that. Make each pass explicitly preserve GlobalsAA.
llvm-svn: 247263