Commit Graph

346 Commits

Author SHA1 Message Date
Max Kazantsev 98838527c6 Revert "Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"""
It appears that the problem is still there. Needs more analysis to understand why
SaturatedMultiply test fails.

llvm-svn: 307249
2017-07-06 10:47:13 +00:00
Max Kazantsev c8db20b78c Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars""
It seems that the patch was reverted by mistake. Clang testing showed failure of the
MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the
fresh code base and was able to confirm that the transformation introduced by the change
does not happen in the said test. This gives a strong confidence that the actual reason of
the failure of the initial patch was somewhere else, and that problem now seems to be
fixed. Re-submitting the change to confirm that.

llvm-svn: 307244
2017-07-06 09:57:41 +00:00
Max Kazantsev ebe56283bc Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"
This patch seems to cause failures of test MathExtras.SaturatingMultiply on
multiple buildbots. Reverting until the reason of that is clarified.

Differential Revision: https://reviews.llvm.org/rL307126

llvm-svn: 307135
2017-07-05 09:44:41 +00:00
Max Kazantsev 80bc4a5554 [IndVars] Canonicalize comparisons between non-negative values and indvars
-If there is a IndVar which is known to be non-negative, and there is a value which is also non-negative,
then signed and unsigned comparisons between them produce the same result. Both of those can be
seen in the same loop. To allow other optimizations to simplify them, we turn all instructions like

  %c = icmp slt i32 %iv, %b
to

  %c = icmp ult i32 %iv, %b

if both %iv and %b are known to be non-negative.

Differential Revision: https://reviews.llvm.org/D34979

llvm-svn: 307126
2017-07-05 06:38:49 +00:00
Max Kazantsev 8d0322e612 [SCEV] Use depth limit instead of local cache for SExt and ZExt
In rL300494 there was an attempt to deal with excessive compile time on
invocations of getSign/ZeroExtExpr using local caching. This approach only
helps if we request the same SCEV multiple times throughout recursion. But
in the bug PR33431 we see a case where we request different values all the time,
so caching does not help and the size of the cache grows enormously.

In this patch we remove the local cache for this methods and add the recursion
depth limit instead, as we do for arithmetics. This gives us a guarantee that the
invocation sequence is limited and reasonably short.

Differential Revision: https://reviews.llvm.org/D34273

llvm-svn: 306785
2017-06-30 05:04:09 +00:00
Alexandre Isoard 41044876fc Reverting r306695 while investigating failing test case.
Failing test case:
    Transforms/LoopVectorize.iv_outside_user.ll

llvm-svn: 306723
2017-06-29 18:48:56 +00:00
Alexandre Isoard aa29afc756 ScalarEvolution: Add URem support
In LLVM IR the following code:

    %r = urem <ty> %t, %b

is equivalent to:

    %q = udiv <ty> %t, %b
    %s = mul <ty> nuw %q, %b
    %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t

As UDiv, Mul and Sub are already supported by SCEV, URem can be
implemented with minimal effort this way.

Note: While SRem and SDiv are also related this way, SCEV does not
provides SDiv yet.

llvm-svn: 306695
2017-06-29 16:29:04 +00:00
Max Kazantsev dc80366d52 [ScalarEvolution] Apply Depth limit to getMulExpr
This is a fix for PR33292 that shows a case of extremely long compilation
of a single .c file with clang, with most time spent within SCEV.

We have a mechanism of limiting recursion depth for getAddExpr to avoid
long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr
and back that do not propagate the info about depth. As result of this, a chain

  getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr

can be extremely long, with every segment of getAddExpr's being up to max depth long.
This leads either to long compilation or crash by stack overflow. We face this situation while
analyzing big SCEVs in the test of PR33292.

This patch applies the same limit on max expression depth for getAddExpr and getMulExpr.

Differential Revision: https://reviews.llvm.org/D33984

llvm-svn: 305463
2017-06-15 11:48:21 +00:00
Max Kazantsev 41450329f7 Re-enable "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"
The patch rL303730 was reverted because test lsr-expand-quadratic.ll failed on
many non-X86 configs with this patch. The reason of this is that the patch
makes a correctless fix that changes optimizer's behavior for this test.
Without the change, LSR was making an overconfident simplification basing on a
wrong SCEV. Apparently it did not need the IV analysis to do this. With the
change, it chose a different way to simplify (that wasn't so confident), and
this way required the IV analysis. Now, following the right execution path,
LSR tries to make a transformation relying on IV Users analysis. This analysis
is target-dependent due to this code:

  // LSR is not APInt clean, do not touch integers bigger than 64-bits.
  // Also avoid creating IVs of non-native types. For example, we don't want a
  // 64-bit IV in 32-bit code just because the loop has one 64-bit cast.
  uint64_t Width = SE->getTypeSizeInBits(I->getType());
  if (Width > 64 || !DL.isLegalInteger(Width))
    return false;

To make a proper transformation in this test case, the type i32 needs to be
legal for the specified data layout. When the test runs on some non-X86
configuration (e.g. pure ARM 64), opt gets confused by the specified target
and does not use it, rejecting the specified data layout as well. Instead,
it uses some default layout that does not treat i32 as a legal type
(currently the layout that is used when it is not specified does not have
legal types at all). As result, the transformation we expect to happen does
not happen for this test.

This re-enabling patch does not have any source code changes compared to the
original patch rL303730. The only difference is that the failing test is
moved to X86 directory and now has requirement of running on x86 only to comply
with the specified target triple and data layout.

Differential Revision: https://reviews.llvm.org/D33543

llvm-svn: 303971
2017-05-26 06:47:04 +00:00
Diana Picus 183863fc3b Revert "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"
This reverts commit r303730 because it broke all the buildbots.

llvm-svn: 303747
2017-05-24 14:16:04 +00:00
Max Kazantsev 13e016bf48 [SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start
When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that
the loop of our base recurrency is the bottom-lost in terms of domination. This assumption
may be broken by an expression which is treated as invariant, and which depends on a complex
Phi for which SCEVUnknown was created. If such Phi is a loop Phi, and this loop is lower than
the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence.

Another reason why it might be invalid to fold SCEVUnknown into Phi start value is that unlike
other SCEVs, SCEVUnknown are sometimes position-bound. For example, here:

for (...) { // loop
  phi = {A,+,B}
}
X = load ...
Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot
exist while we are iterating in loop (this memory can be even not allocated and not filled by this moment).
It is only valid to make such folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop>
may be existant.

This patch prohibits folding of SCEVUnknown (and those who use them) into the start value of an AddRecExpr,
if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that
relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer
expect such behavior.

llvm-svn: 303730
2017-05-24 08:52:18 +00:00
Sanjoy Das 036dda25a5 [SCEV] Clarify behavior around max backedge taken count
This is a re-application of a r303497 that was reverted in r303498.
I thought it had broken a bot when it had not (the breakage did not
go away with the revert).

This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious.  Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.

There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.

At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision.  If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.

llvm-svn: 303531
2017-05-22 06:46:04 +00:00
Sanjoy Das 8963650cfa Revert "[SCEV] Clarify behavior around max backedge taken count"
This reverts commit r303497 since it breaks the msan bootstrap bot:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1379/

llvm-svn: 303498
2017-05-21 05:02:12 +00:00
Sanjoy Das 5207168383 [SCEV] Clarify behavior around max backedge taken count
This change makes the split between the "exact" backedge taken count
and the "maximum" backedge taken count a bit more obvious.  Both of
these are upper bounds on the number of times the loop header
executes (since SCEV does not account for most kinds of abnormal
control flow), but the latter is guaranteed to be a constant.

There were a few places where the max backedge taken count *was* a
non-constant; I've changed those to compute constants instead.

At this point, I'm not sure if the constant max backedge count can be
computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without
losing precision.  If it can, we can simplify even further by making
`getMaxBackedgeTakenCount` a thin wrapper around
`getBackedgeTakenCount` and `getUnsignedRange`.

llvm-svn: 303497
2017-05-21 01:47:50 +00:00
Max Kazantsev b09b5db793 [SCEV] Fix sorting order for AddRecExprs
The existing sorting order in defined CompareSCEVComplexity sorts AddRecExprs
by loop depth, but does not pay attention to dominance of loops. This can
lead us to the following buggy situation:

for (...) { // loop1
  op1 = {A,+,B}
}
for (...) { // loop2
  op2 = {A,+,B}
  S = add op1, op2
}

In this case there is no guarantee that in operand list of S the op2 comes
before op1 (loop depth is the same, so they will be sorted just
lexicographically), so we can incorrectly treat S as a recurrence of loop1,
which is wrong.

This patch changes the sorting logic so that it places the dominated recs
before the dominating recs. This ensures that when we pick the first recurrency
in the operands order, it will be the bottom-most in terms of domination tree.
The attached test set includes some tests that produce incorrect SCEV
estimations and crashes with oldlogic.

Reviewers: sanjoy, reames, apilipenko, anna

Reviewed By: sanjoy

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D33121

llvm-svn: 303148
2017-05-16 07:27:06 +00:00
Michael Zolotukhin 37162adf3e [SCEV] createAddRecFromPHI: Optimize for the most common case.
Summary:
The existing implementation creates a symbolic SCEV expression every
time we analyze a phi node and then has to remove it, when the analysis
is finished. This is very expensive, and in most of the cases it's also
unnecessary. According to the data I collected, ~60-70% of analyzed phi
nodes (measured on SPEC) have the following form:
  PN = phi(Start, OP(Self, Constant))
Handling such cases separately significantly speeds this up.

Reviewers: sanjoy, pete

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32663

llvm-svn: 302096
2017-05-03 23:53:38 +00:00
Sanjoy Das 08989c7ecd Rename isKnownNotFullPoison to programUndefinedIfPoison; NFC
Summary:
programUndefinedIfPoison makes more sense, given what the function
does; and I'm about to add a function with a name similar to
isKnownNotFullPoison (so do the rename to avoid confusion).

Reviewers: broune, majnemer, bjarke.roune

Reviewed By: broune

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D30444

llvm-svn: 301776
2017-04-30 19:41:19 +00:00
Sanjoy Das bdbc4938f9 [SCEV] Fix exponential time complexity by caching
llvm-svn: 301149
2017-04-24 00:09:46 +00:00
Eli Friedman d0e6ae5678 Revert r300746 (SCEV analysis for or instructions).
There have been multiple reports of this causing problems: a
compile-time explosion on the LLVM testsuite, and a stack
overflow for an opencl kernel.

llvm-svn: 300928
2017-04-20 23:59:05 +00:00
Eli Friedman e77d2b86b4 [SCEV] Make SCEV or modeling more aggressive.
Use haveNoCommonBitsSet to figure out whether an "or" instruction
is equivalent to addition. This handles more cases than just
checking for a constant on the RHS.

Differential Revision: https://reviews.llvm.org/D32239

llvm-svn: 300746
2017-04-19 20:19:58 +00:00
Max Kazantsev 2e44d2969a [ScalarEvolution] Re-enable Predicate implication from operations
The patch rL298481 was reverted due to crash on clang-with-lto-ubuntu build.
The reason of the crash was type mismatch between either a or b and RHS in the following situation:

  LHS = sext(a +nsw b) > RHS.

This is quite rare, but still possible situation. Normally we need to cast all {a, b, RHS} to their widest type.
But we try to avoid creation of new SCEV that are not constants to avoid initiating recursive analysis that
can take a lot of time and/or cache a bad value for iterations number. To deal with this, in this patch we
reject this case and will not try to analyze it if the type of sum doesn't match with the type of RHS. In this
situation we don't need to create any non-constant SCEVs.

This patch also adds an assertion to the method IsProvedViaContext so that we could fail on it and not
go further into range analysis etc (because in some situations these analyzes succeed even when the passed
arguments have wrong types, what should not normally happen).

The patch also contains a fix for a problem with too narrow scope of the analysis caused by wrong
usage of predicates in recursive invocations.

The regression test on the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll

Reviewers: reames, apilipenko, anna, sanjoy

Reviewed By: sanjoy

Subscribers: mzolotukhin, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D31238

llvm-svn: 299205
2017-03-31 12:05:30 +00:00
Max Kazantsev 7696a7edf9 Revert "[ScalarEvolution] Re-enable Predicate implication from operations"
This reverts commit rL298690

Causes failures on clang.

llvm-svn: 298693
2017-03-24 07:04:31 +00:00
Max Kazantsev 89554446e7 [ScalarEvolution] Re-enable Predicate implication from operations
The patch rL298481 was reverted due to crash on clang-with-lto-ubuntu build.
The reason of the crash was type mismatch between either a or b and RHS in the following situation:

  LHS = sext(a +nsw b) > RHS.

This is quite rare, but still possible situation. Normally we need to cast all {a, b, RHS} to their widest type.
But we try to avoid creation of new SCEV that are not constants to avoid initiating recursive analysis that
can take a lot of time and/or cache a bad value for iterations number. To deal with this, in this patch we
reject this case and will not try to analyze it if the type of sum doesn't match with the type of RHS. In this
situation we don't need to create any non-constant SCEVs.

This patch also adds an assertion to the method IsProvedViaContext so that we could fail on it and not
go further into range analysis etc (because in some situations these analyzes succeed even when the passed
arguments have wrong types, what should not normally happen).

The patch also contains a fix for a problem with too narrow scope of the analysis caused by wrong
usage of predicates in recursive invocations.

The regression test on the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll

llvm-svn: 298690
2017-03-24 06:19:00 +00:00
Zhaoshi Zheng e3c9070f06 Model ashr(shl(x, n), m) as mul(x, 2^(n-m)) when n > m
Given below case:

  %y = shl %x, n
  %z = ashr %y, m

when n = m, SCEV models it as sext(trunc(x)). This patch tries to handle
the case where n > m by using sext(mul(trunc(x), 2^(n-m)))) as the SCEV
expression.

llvm-svn: 298631
2017-03-23 18:06:09 +00:00
Max Kazantsev c6effaa495 Revert "[ScalarEvolution] Predicate implication from operations"
This reverts commit rL298481

Fails clang-with-lto-ubuntu build.

llvm-svn: 298489
2017-03-22 07:50:33 +00:00
Max Kazantsev 15e76aa0f8 [ScalarEvolution] Predicate implication from operations
This patch allows SCEV predicate analysis to prove implication of some expression predicates
from context predicates related to arguments of those expressions.
It introduces three new rules:

For addition:
  (A >X && B >= 0) || (B >= 0 && A > X) ===> (A + B) > X.

For division:
  (A > X) && (0 < B <= X + 1) ===> (A / B > 0).
  (A > X) && (-B <= X < 0) ===> (A / B >= 0).

Using these rules, SCEV is able to prove facts like "if X > 1 then X / 2 > 0".
They can also be combined with the same context, to prove more complex expressions like
"if X > 1 then X/2 + 1 > 1".

Diffirential Revision: https://reviews.llvm.org/D30887

Reviewed by: sanjoy

llvm-svn: 298481
2017-03-22 04:48:46 +00:00
Eli Friedman b1578d3612 [SCEV] Fix trip multiple calculation
If loop bound containing calculations like min(a,b), the Scalar
Evolution API getSmallConstantTripMultiple returns 4294967295 "-1"
as the trip multiple. The problem is that, SCEV use -1 * umax to
represent umin. The multiple constant -1 was returned, and the logic
of guarding against huge trip counts was skipped. Because -1 has 32
active bits.

The fix attempt to factor more general cases. First try to get the
greatest power of two divisor of trip count expression. In case
overflow happens, the trip count expression is still divisible by the
greatest power of two divisor returned. Returns 1 if not divisible by 2.

Patch by Huihui Zhang <huihuiz@codeaurora.org>

Differential Revision: https://reviews.llvm.org/D30840

llvm-svn: 298301
2017-03-20 20:25:46 +00:00
Michael Zolotukhin 99de88d1f3 [SCEV] Compute affine range in another way to avoid bitwidth extending.
Summary:
This approach has two major advantages over the existing one:
1. We don't need to extend bitwidth in our computations. Extending
bitwidth is a big issue for compile time as we often end up working with
APInts wider than 64bit, which is a slow case for APInt.
2. When we zero extend a wrapped range, we lose some information (we
replace the range with [0, 1 << src bit width)). Thus, avoiding such
extensions better preserves information.

Correctness testing:
I ran 'ninja check' with assertions that the new implementation of
getRangeForAffineAR gives the same results as the old one (this
functionality is not present in this patch). There were several failures
- I inspected them manually and found out that they all are caused by
the fact that we're returning more accurate results now (see bullet (2)
above).
Without such assertions 'ninja check' works just fine, as well as
SPEC2006.

Compile time testing:
CTMark/Os:
 - mafft/pairlocalalign	-16.98%
 - tramp3d-v4/tramp3d-v4	-12.72%
 - lencod/lencod	-11.51%
 - Bullet/bullet	-4.36%
 - ClamAV/clamscan	-3.66%
 - 7zip/7zip-benchmark	-3.19%
 - sqlite3/sqlite3	-2.95%
 - SPASS/SPASS	-2.74%
 - Average	-5.81%

Performance testing:
The changes are expected to be neutral for runtime performance.

Reviewers: sanjoy, atrick, pete

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30477

llvm-svn: 297992
2017-03-16 21:07:38 +00:00
Sanjoy Das 5cd6c5cacf [ValueTracking] Make poison propagation more aggressive
Summary:
Motivation: fix PR31181 without regression (the actual fix is still in
progress).  However, the actual content of PR31181 is not relevant
here.

This change makes poison propagation more aggressive in the following
cases:

 1. poision * Val == poison, for any Val.  In particular, this changes
    existing intentional and documented behavior in these two cases:
     a. Val is 0
     b. Val is 2^k * N
 2. poison << Val == poison, for any Val
 3. getelementptr is poison if any input is poison

I think all of these are justified (and are axiomatically true in the
new poison / undef model):

1a: we need poison * 0 to be poison to allow transforms like these:

  A * (B + C) ==> A * B + A * C

If poison * 0 were 0 then the above transform could not be allowed
since e.g. we could have A = poison, B = 1, C = -1, making the LHS

  poison * (1 + -1) = poison * 0 = 0

and the RHS

  poison * 1 + poison * -1 = poison + poison = poison

1b: we need e.g. poison * 4 to be poison since we want to allow

  A * 4 ==> A + A + A + A

If poison * 4 were a value with all of their bits poison except the
last four; then we'd not be able to do this transform since then if A
were poison the LHS would only be "partially" poison while the RHS
would be "full" poison.

2: Same reasoning as (1b), we'd like have the following kinds
transforms be legal:

  A << 1 ==> A + A

Reviewers: majnemer, efriedma

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D30185

llvm-svn: 295809
2017-02-22 06:52:32 +00:00
Igor Laevsky c11c1ed909 [SCEV] Cache results during GetMinTrailingZeros query
Differential Revision: https://reviews.llvm.org/D29759

llvm-svn: 295060
2017-02-14 15:53:12 +00:00
Eli Friedman 10d1ff64fe [SCEV] Simplify/generalize howFarToZero solving.
Make SolveLinEquationWithOverflow take the start as a SCEV, so we can
solve more cases. With that implemented, get rid of the special case
for powers of two.

The additional functionality probably isn't particularly useful,
but it might help a little for certain cases involving pointer
arithmetic.

Differential Revision: https://reviews.llvm.org/D28884

llvm-svn: 293576
2017-01-31 00:42:42 +00:00
Daniil Fukalov b09dac59fc [SCEV] Introduce add operation inlining limit
Inlining in getAddExpr() can cause abnormal computational time in some cases.
New parameter -scev-addops-inline-threshold is intruduced with default value 500.

Reviewers: sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D28812

llvm-svn: 293176
2017-01-26 13:33:17 +00:00
Chandler Carruth d501b18990 This test apparently requires an x86 target and is failing on numerous
bots ever since d0k fixed the CHECK lines so that it did something at
all.

It isn't actually testing SCEV directly but LSR, so move it into LSR and
the x86-specific tree of tests that already exists there. Target
dependence is common and unavoidable with the current design of LSR.

llvm-svn: 292774
2017-01-23 08:33:29 +00:00
Benjamin Kramer 1fd0d44e9b Attempt to fix test in release builds.
llvm-svn: 292762
2017-01-22 21:01:19 +00:00
Benjamin Kramer db9e0b659d Fix some broken CHECK lines.
The colon is important.

llvm-svn: 292761
2017-01-22 20:28:56 +00:00
Eli Friedman f1f49c8265 [SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly.
To avoid regressions, make ScalarEvolution::createSCEV a bit more
clever.

Also get rid of some useless code in ScalarEvolution::howFarToZero
which was hiding this bug.

No new testcase because it's impossible to actually expose this bug:
we don't have any in-tree users of getUDivExactExpr besides the two
functions I just mentioned, and they both dodged the problem. I'll
try to add some interesting users in a followup.

Differential Revision: https://reviews.llvm.org/D28587

llvm-svn: 292449
2017-01-18 23:56:42 +00:00
Chandler Carruth 0952750fae [PM] Clean up the testing for IVUsers, especially with the new PM.
First, I've moved a test of IVUsers from the LSR tree to a dedicated
IVUsers test directory. I've also simplified its RUN line now that the
new pass manager's loop PM is providing analyses on their own.

No functionality changed, but it makes subsequent changes cleaner.

llvm-svn: 292060
2017-01-15 09:29:27 +00:00
Chandler Carruth 2f19a324cb [PM] The assumption cache is fundamentally designed to be self-updating,
mark it as never invalidated in the new PM.

The old PM already required this to work, and after a discussion with
Hal this seems to really be the only sensible answer. The cache
gracefully degrades as the IR is mutated, and most things which do this
should already be incrementally updating the cache.

This gets rid of a bunch of logic preserving and testing the
invalidation of this analysis.

llvm-svn: 292039
2017-01-15 00:26:18 +00:00
Eli Friedman bd6dedaa7f [SCEV] Make howFarToZero max backedge-taken count check for precondition.
Refines max backedge-taken count if a loop like
"for (int i = 0; i != n; ++i) { /* body */ }" is rotated.

Differential Revision: https://reviews.llvm.org/D28536

llvm-svn: 291704
2017-01-11 21:07:15 +00:00
Eli Friedman 8396265655 [SCEV] Make howFarToZero use a simpler formula for max backedge-taken count.
This is both easier to understand, and produces a tighter bound in certain
cases.

Differential Revision: https://reviews.llvm.org/D28393

llvm-svn: 291701
2017-01-11 20:55:48 +00:00
Chandler Carruth 082c183f06 [PM] Teach SCEV to invalidate itself when its dependencies become
invalid.

This fixes use-after-free bugs that will arise with any interesting use
of SCEV.

I've added a dedicated test that works diligently to trigger these kinds
of bugs in the new pass manager and also checks for them explicitly as
well as triggering ASan failures when things go squirly.

llvm-svn: 291426
2017-01-09 07:44:34 +00:00
Daniel Jasper aec2fa352f Revert @llvm.assume with operator bundles (r289755-r289757)
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086
2016-12-19 08:22:17 +00:00
Hal Finkel cb9f78e1c3 Make processing @llvm.assume more efficient by using operand bundles
There was an efficiency problem with how we processed @llvm.assume in
ValueTracking (and other places). The AssumptionCache tracked all of the
assumptions in a given function. In order to find assumptions relevant to
computing known bits, etc. we searched every assumption in the function. For
ValueTracking, that means that we did O(#assumes * #values) work in InstCombine
and other passes (with a constant factor that can be quite large because we'd
repeat this search at every level of recursion of the analysis).

Several of us discussed this situation at the last developers' meeting, and
this implements the discussed solution: Make the values that an assume might
affect operands of the assume itself. To avoid exposing this detail to
frontends and passes that need not worry about it, I've used the new
operand-bundle feature to add these extra call "operands" in a way that does
not affect the intrinsic's signature. I think this solution is relatively
clean. InstCombine adds these extra operands based on what ValueTracking, LVI,
etc. will need and then those passes need only search the users of the values
under consideration. This should fix the computational-complexity problem.

At this point, no passes depend on the AssumptionCache, and so I'll remove
that as a follow-up change.

Differential Revision: https://reviews.llvm.org/D27259

llvm-svn: 289755
2016-12-15 02:53:42 +00:00
Li Huang faa857dba7 [SCEV] Memoize visitMulExpr results in SCEVRewriteVisitor.
Summary:
When SCEVRewriteVisitor traverses the SCEV DAG, it may visit the same SCEV
multiple times if this SCEV is referenced by multiple other SCEVs. This has
exponential time complexity in the worst case. Memoizing the results will
avoid re-visiting the same SCEV. Add a map to save the results, and override
the visit function of SCEVVisitor. Now SCEVRewriteVisitor only visit each
SCEV once and thus returns the same result for the same input SCEV.

This patch fixes PR18606, PR18607.

Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin

Differential Revision: https://reviews.llvm.org/D25810

llvm-svn: 284868
2016-10-21 20:05:21 +00:00
John Brawn 84b21835f1 [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops
When we have a loop with a known upper bound on the number of iterations, and
furthermore know that either the number of iterations will be either exactly
that upper bound or zero, then we can fully unroll up to that upper bound
keeping only the first loop test to check for the zero iteration case.

Most of the work here is in plumbing this 'max-or-zero' information from the
part of scalar evolution where it's detected through to loop unrolling. I've
also gone for the safe default of 'false' everywhere but howManyLessThans which
could probably be improved.

Differential Revision: https://reviews.llvm.org/D25682

llvm-svn: 284818
2016-10-21 11:08:48 +00:00
Li Huang fcfe8cd3ae [SCEV] Add a threshold to restrict number of mul operands to be inlined into SCEV
This is to avoid inlining too many multiplication operands into a SCEV, which could 
take exponential time in the worst case.

Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin

Differential Revision: https://reviews.llvm.org/D25794

llvm-svn: 284784
2016-10-20 21:38:39 +00:00
John Brawn ecf79300dd [SCEV] More accurate calculation of max backedge count of some less-than loops
In loops that look something like
 i = n;
 do {
  ...
 } while(i++ < n+k);
where k is a constant, the maximum backedge count is k (in fact the backedge
count will be either 0 or k, depending on whether n+k wraps). More generally
for LHS < RHS if RHS-(LHS of first comparison) is a constant then the loop will
iterate either 0 or that constant number of times.

This allows for more loop unrolling with the recent upper bound loop unrolling
changes, and I'm working on a patch that will let loop unrolling additionally
make use of the loop being executed either 0 or k times (we need to retain the
loop comparison only on the first unrolled iteration).

Differential Revision: https://reviews.llvm.org/D25607

llvm-svn: 284465
2016-10-18 10:10:53 +00:00
David L Kreitzer 8bbabee21a Reapplying r278731 after fixing the problem that caused it to be reverted.
Enhance SCEV to compute the trip count for some loops with unknown stride.

Patch by Pankaj Chawla

Differential Revision: https://reviews.llvm.org/D22377

llvm-svn: 281732
2016-09-16 14:38:13 +00:00
Wei Mi 24662395df Create a getelementptr instead of sub expr for ValueOffsetPair if the
value is a pointer.

This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair,
if the value is of pointer type, we can only create a getelementptr instead
of sub expr.

Differential Revision: https://reviews.llvm.org/D24088

llvm-svn: 281439
2016-09-14 04:39:50 +00:00
Wei Mi 59ca96636d [UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expanded
when unroll runtime iteration loop.

In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner
loop inside a loop nest, the scalar evolution needs to be dropped for its
parent loop which is done by ScalarEvolution::forgetLoop. However, we can
postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so TripCountSC
expansion can still reuse existing value.

Differential Revision: https://reviews.llvm.org/D23572

llvm-svn: 279748
2016-08-25 16:17:18 +00:00