Commit Graph

1625 Commits

Author SHA1 Message Date
Nikita Popov 2e3f4694d6 [IR] Add GEPOperator::indices() (NFC)
In order to mirror the GetElementPtrInst::indices() API.

Wanted to use this in the IRForTarget code, and was surprised to
find that it didn't exist yet.
2021-07-09 21:41:20 +02:00
Martin Storsjö e479777d3c Revert "[ScalarEvolution] Fix overflow in computeBECount."
This reverts commit 5b350183cd (and
also "[NFC][ScalarEvolution] Cleanup howManyLessThans.",
009436e9c1, to make it apply).

See https://reviews.llvm.org/D105216 for discussion on various
miscompilations caused by that commit.
2021-07-09 14:26:48 +03:00
Eli Friedman 009436e9c1 [NFC][ScalarEvolution] Cleanup howManyLessThans.
In preparation for D104075. Some NFC cleanup, and some test coverage for
planned changes.
2021-07-08 17:56:26 -07:00
Eli Friedman 5b350183cd [ScalarEvolution] Fix overflow in computeBECount.
There are two issues with the current implementation of computeBECount:

1. It doesn't account for the possibility that adding "Stride - 1" to
Delta might overflow. For almost all loops, it doesn't, but it's not
actually proven anywhere.
2. It doesn't account for the possibility that Stride is zero. If Delta
is zero, the backedge is never taken; the value of Stride isn't
relevant. To handle this, we have to make sure that the expression
returned by computeBECount evaluates to zero.

To deal with this, add two new checks:

1. Use a variety of tricks to try to prove that the addition doesn't
overflow.  If the proof is impossible, use an alternate sequence which
never overflows.
2. Use umax(Stride, 1) to handle the possibility that Stride is zero.

Differential Revision: https://reviews.llvm.org/D105216
2021-07-08 10:09:55 -07:00
Eli Friedman f5603aa050 [ScalarEvolution] Make sure getMinusSCEV doesn't negate pointers.
Add a function removePointerBase that returns, essentially, S -
getPointerBase(S).  Use it in getMinusSCEV instead of actually
subtracting pointers.

Differential Revision: https://reviews.llvm.org/D105503
2021-07-07 10:27:10 -07:00
Eli Friedman 7ac1c7bead Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.
As part of making ScalarEvolution's handling of pointers consistent, we
want to forbid multiplying a pointer by -1 (or any other value). This
means we can't blindly subtract pointers.

There are a few ways we could deal with this:
1. We could completely forbid subtracting pointers in getMinusSCEV()
2. We could forbid subracting pointers with different pointer bases
(this patch).
3. We could try to ptrtoint pointer operands.

The option in this patch is more friendly to non-integral pointers: code
that works with normal pointers will also work with non-integral
pointers. And it seems like there are very few places that actually
benefit from the third option.

As a minimal patch, the ScalarEvolution implementation of getMinusSCEV
still ends up subtracting pointers if they have the same base.  This
should eliminate the shared pointer base, but eventually we'll need to
rewrite it to avoid negating the pointer base. I plan to do this as a
separate step to allow measuring the compile-time impact.

This doesn't cause obvious functional changes in most cases; the one
case that is significantly affected is ICmpZero handling in LSR (which
is the source of almost all the test changes).  The resulting changes
seem okay to me, but suggestions welcome.  As an alternative, I tried
explicitly ptrtoint'ing the operands, but the result doesn't seem
obviously better.

I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out
how to repair it to test what it was actually trying to test.

Recommitting with fix to MemoryDepChecker::isDependent.

Differential Revision: https://reviews.llvm.org/D104806
2021-07-06 12:16:05 -07:00
Eli Friedman a6d081b2cb Revert "[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers."
This reverts commit 74d6ce5d5f.

Seeing crashes on buildbots in MemoryDepChecker::isDependent.
2021-07-06 11:17:13 -07:00
Eli Friedman 74d6ce5d5f [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.
As part of making ScalarEvolution's handling of pointers consistent, we
want to forbid multiplying a pointer by -1 (or any other value). This
means we can't blindly subtract pointers.

There are a few ways we could deal with this:
1. We could completely forbid subtracting pointers in getMinusSCEV()
2. We could forbid subracting pointers with different pointer bases
(this patch).
3. We could try to ptrtoint pointer operands.

The option in this patch is more friendly to non-integral pointers: code
that works with normal pointers will also work with non-integral
pointers. And it seems like there are very few places that actually
benefit from the third option.

As a minimal patch, the ScalarEvolution implementation of getMinusSCEV
still ends up subtracting pointers if they have the same base.  This
should eliminate the shared pointer base, but eventually we'll need to
rewrite it to avoid negating the pointer base. I plan to do this as a
separate step to allow measuring the compile-time impact.

This doesn't cause obvious functional changes in most cases; the one
case that is significantly affected is ICmpZero handling in LSR (which
is the source of almost all the test changes).  The resulting changes
seem okay to me, but suggestions welcome.  As an alternative, I tried
explicitly ptrtoint'ing the operands, but the result doesn't seem
obviously better.

I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out
how to repair it to test what it was actually trying to test.

Differential Revision: https://reviews.llvm.org/D104806
2021-07-06 10:54:41 -07:00
Philip Reames 14d8f1546a [SCEV] Fold (0 udiv %x) to 0
We have analogous rules in instsimplify, etc.., but were missing the same in SCEV.  The fold is near trivial, but came up in the context of a larger change.
2021-06-30 08:31:13 -07:00
Eli Friedman 8d5bf0709d [NFC] Prefer ConstantRange::makeExactICmpRegion over makeAllowedICmpRegion
The implementation is identical, but it makes the semantics a bit more
obvious.
2021-06-25 14:43:13 -07:00
Florian Hahn 6478f3fb78
[SCEV] Support single-cond range check idiom in applyLoopGuards.
This patch extends applyLoopGuards to detect a single-cond range check
idiom that InstCombine generates.

It extends applyLoopGuards to detect conditions of the form
(-C1 + X < C2). InstCombine will create this form when combining two
checks of the form (X u< C2 + C1) and (X >=u C1).

In practice, this enables us to correctly compute a tight trip count
bounds for code as in the function below. InstCombine will fold the
minimum iteration check created by LoopRotate with the user check (< 8).

    void unsigned_check(short *pred, unsigned width) {
        if (width < 8) {
            for (int x = 0; x < width; x++)
                pred[x] = pred[x] * pred[x];
        }
    }

As a consequence, LLVM creates dead vector loops for the code above,
e.g. see https://godbolt.org/z/cb8eTcqET

https://alive2.llvm.org/ce/z/SHHW4d

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D104741
2021-06-25 10:24:40 +01:00
Florian Hahn 121ecb05e7
[SCEV] Generalize MatchBinaryAddToConst to support non-add expressions.
This patch generalizes MatchBinaryAddToConst to support matching
(A + C1), (A + C2), instead of just matching (A + C1), A.

The existing cases can be handled by treating non-add expressions A as
A + 0.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D104634
2021-06-24 12:16:15 +01:00
Eli Friedman b12192f7cd [ScalarEvolution] Clarify implementation of getPointerBase().
getPointerBase should only be looking through Add and AddRec
expressions; other expressions either aren't pointers, or can't be
looked through.

Technically, this is a functional change. For a multiply or min/max
expression, if they have exactly one pointer operand, and that operand
is the first operand, the behavior here changes. Similarly, if an AddRec
has a pointer-type step, the behavior changes. But that shouldn't be
happening in practice, and we plan to make such expressions illegal.
2021-06-23 12:55:59 -07:00
Eli Friedman fdaf304e0d [NFC][ScalarEvolution] Fix SCEVNAryExpr::getType().
SCEVNAryExpr::getType() could return the wrong type for a SCEVAddExpr.
Remove it, and add getType() methods to the relevant subclasses.

NFC because nothing uses it directly, as far as I know; this is just
future-proofing.
2021-06-23 12:55:59 -07:00
Florian Hahn adee485adf
[SCEV] Support signed predicates in applyLoopGuards.
This adds handling for signed predicates, similar to how unsigned
predicates are already handled.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D104732
2021-06-23 10:21:05 +01:00
Florian Hahn 6c782e6eb0
[SCEV] Reduce code to handle predicates in applyLoopGuards (NFC).
Hoist out common recurrence check and sink updating the map, to reduce
the code required to support additional predicates.
2021-06-22 15:56:45 +01:00
Florian Hahn d17798823c
[SCEV] Retain AddExpr flags when subtracting a foldable constant.
Currently we drop wrapping flags for expressions like (A + C1)<flags> - C2.

But we can retain flags under certain conditions:

* Adding a smaller constant is NUW if the original AddExpr was NUW.

* Adding a constant with the same sign and small magnitude is NSW, if the
  original AddExpr was NSW.

This can improve results after using `SimplifyICmpOperands`, which may
subtract one in order to use stricter predicates, as is the case for
`isKnownPredicate`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D104319
2021-06-22 11:27:51 +01:00
Eli Friedman 8f3d16905d [ScalarEvolution] Ensure backedge-taken counts are not pointers.
A backedge-taken count doesn't refer to memory; returning a pointer type
is nonsense. So make sure we always return an integer.

The obvious way to do this would be to just convert the operands of the
icmp to integers, but that doesn't quite work out at the moment:
isLoopEntryGuardedByCond currently gets confused by ptrtoint operations.
So we perform the ptrtoint conversion late for lt/gt operations.

The test changes are mostly innocuous. The most interesting changes are
more complex SCEV expressions of the form "(-1 * (ptrtoint i8* %ptr to
i64)) + %ptr)". This is expected: we can't fold this to zero because we
need to preserve the pointer base.

The call to isLoopEntryGuardedByCond in howFarToZero is less precise
because of ptrtoint operations; this shows up in the function
pr46786_c26_char in ptrtoint.ll. Fixing it here would require more
complex refactoring.  It should eventually be fixed by future
improvements to isImpliedCond.

See https://bugs.llvm.org/show_bug.cgi?id=46786 for context.

Differential Revision: https://reviews.llvm.org/D103656
2021-06-21 16:24:16 -07:00
Eli Friedman 62ed024c74 [NFC][ScalarEvolution] Clean up ExitLimit constructors.
Make all the constructors forward to one constructor.  Remove redundant
assertions.
2021-06-20 17:40:30 -07:00
Eli Friedman 8a567e5f22 [ScalarEvolution] Fix pointer/int type handling converting select/phi to min/max.
The old version of this code would blindly perform arithmetic without
paying attention to whether the types involved were pointers or
integers.  This could lead to weird expressions like negating a pointer.

Explicitly handle simple cases involving pointers, like "x < y ? x : y".
In all other cases, coerce the operands of the comparison to integer
types.  This avoids the weird cases, while handling most of the
interesting cases.

Differential Revision: https://reviews.llvm.org/D103660
2021-06-17 14:05:12 -07:00
Eli Friedman 27963ccf07 [NFC][ScalarEvolution] Refactor createNodeForSelectOrPHI
In preparation for D103660.
2021-06-16 12:32:32 -07:00
Roman Lebedev a3113df219
[SCEV] PtrToInt on non-integral pointers is allowed
As per (committed without review) @reames's rGac81cb7e6dde9b0890ee1780eae94ab96743569b change,
we are now allowed to produce `ptrtoint` for non-integral pointers.
This will unblock further unbreaking of SCEV regarding int-vs-pointer type confusion.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D104322
2021-06-16 10:24:25 +03:00
Philip Reames 7629b2a09c [LI] Add a cover function for checking if a loop is mustprogress [nfc]
Essentially, the cover function simply combines the loop level check and the function level scope into one call.  This simplifies several callers and is (subjectively) less error prone.
2021-06-10 13:37:32 -07:00
Philip Reames aaaeb4b160 [SCEV] Use mustprogress flag on loops (in addition to function attribute)
This addresses a performance regression reported against 3c6e4191.  That change (correctly) limited a transform based on assumed finiteness to mustprogress loops, but the previous change (38540d7) which introduced the mustprogress check utility only handled function attributes, not the loop metadata form.

It turns out that clang uses the function attribute form for C++, and the loop metadata form for C.  As a result, 3c6e4191 ended up being a large regression in practice for C code as loops weren't being considered mustprogress despite the language semantics.
2021-06-10 13:20:28 -07:00
Philip Reames b65f30d6fb [SCEV] Minor code motion to simplify a later patch [nfc] 2021-06-09 14:17:06 -07:00
Florian Hahn b76f1f1202
[SCEV] Keep common NUW flags when inlining Add operands.
Currently, NoWrapFlags are dropped if we inline operands of SCEVAddExpr
operands. As a consequence, we always drop flags when building
expressions like `getAddExpr(A, getAddExpr(B, C, NUW), NUW)`.

We should be able to retain NUW flags common among all inlined
SCEVAddExpr and the original flags.

Reviewed By: nikic, mkazantsev

Differential Revision: https://reviews.llvm.org/D103877
2021-06-09 17:13:21 +01:00
Philip Reames 3c6e419198 [SCEV] Properly guard reasoning about infinite loops being UB on mustprogress
Noticed via code inspection. We changed the semantics of the IR when we added mustprogress, and we appear to have not updated this location.

Differential Revision: https://reviews.llvm.org/D103834
2021-06-07 14:47:36 -07:00
Philip Reames 38540d71c7 [SCEV] Compute exit counts for unsigned IVs using mustprogress semantics
The motivation here is simple loops with unsigned induction variables w/non-one steps. A toy example would be:
for (unsigned i = 0; i < N; i += 2) { body; }

Given C/C++ semantics, we do not get the nuw flag on the induction variable. Given that lack, we currently can't compute a bound for this loop. We can do better for many cases, depending on the contents of "body".

The basic intuition behind this patch is as follows:
* A step which evenly divides the iteration space must wrap through the same numbers repeatedly. And thus, we can ignore potential cornercases where we exit after the n-th wrap through uint32_max.
* Per C++ rules, infinite loops without side effects are UB. We already have code in SCEV which relies on this.  In LLVM, this is tied to the mustprogress attribute.

Together, these let us conclude that the trip count of this loop must come before unsigned overflow unless the body would form a well defined infinite loop.

A couple notes for those reading along:
* I reused the loop properties code which is overly conservative for this case. I may follow up in another patch to generalize it for the actual UB rules.
* We could cache the n(s/u)w facts. I left that out because doing a pre-patch which cached existing inference showed a lot of diffs I had trouble fully explaining. I plan to get back to this, but I don't want it on the critical path.

Differential Revision: https://reviews.llvm.org/D103118
2021-06-07 11:24:00 -07:00
Roman Lebedev e350494fb0
[NFC] Promote willNotOverflow() / getStrengthenedNoWrapFlagsFromBinOp() from IndVars into SCEV proper
We might want to use it when creating SCEV proper in createSCEV(),
now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`,
which might have caused us to loose some optimization potential.
2021-06-05 12:17:51 +03:00
Eli Friedman fd229caa01 [polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs.
When we're remapping an AddRec, the AddRec constructed by a partial
rewrite might not make sense.  This triggers an assertion complaining
it's not loop-invariant.

Instead of constructing the partially rewritten AddRec, just skip
straight to calling evaluateAtIteration.

Testcase was automatically reduced using llvm-reduce, so it's a little
messy, but hopefully makes sense.

Differential Revision: https://reviews.llvm.org/D102959
2021-06-01 09:51:05 -07:00
Roman Lebedev f7c95c3322
[NFC] ScalarEvolution: apply SSO to the ExprValueMap value
ExprValueMap is a map from SCEV * to a set-vector of (Value *, ConstantInt *) pair,
and while the map itself will likely be big-ish (have many keys),
it is a reasonable assumption that each key will refer to a small-ish
number of pairs.

In particular looking at n=512 case from
https://bugs.llvm.org/show_bug.cgi?id=50384,
the small-size of 4 appears to be the sweet spot,
it results in the least allocations while minimizing memory footprint.
```
$ for i in $(ls heaptrack.opt.*.gz); do echo $i; heaptrack_print $i | tail -n 6; echo ""; done
heaptrack.opt.0-orig.gz
total runtime: 14.32s.
calls to allocation functions: 8222442 (574192/s)
temporary memory allocations: 2419000 (168924/s)
peak heap memory consumption: 190.98MB
peak RSS (including heaptrack overhead): 239.65MB
total memory leaked: 67.58KB

heaptrack.opt.1-n1.gz
total runtime: 13.72s.
calls to allocation functions: 7184188 (523705/s)
temporary memory allocations: 2419017 (176338/s)
peak heap memory consumption: 191.38MB
peak RSS (including heaptrack overhead): 239.64MB
total memory leaked: 67.58KB

heaptrack.opt.2-n2.gz
total runtime: 12.24s.
calls to allocation functions: 6146827 (502355/s)
temporary memory allocations: 2418997 (197695/s)
peak heap memory consumption: 163.31MB
peak RSS (including heaptrack overhead): 211.01MB
total memory leaked: 67.58KB

heaptrack.opt.3-n4.gz
total runtime: 12.28s.
calls to allocation functions: 6068532 (494260/s)
temporary memory allocations: 2418985 (197017/s)
peak heap memory consumption: 155.43MB
peak RSS (including heaptrack overhead): 201.77MB
total memory leaked: 67.58KB

heaptrack.opt.4-n8.gz
total runtime: 12.06s.
calls to allocation functions: 6068042 (503321/s)
temporary memory allocations: 2418992 (200646/s)
peak heap memory consumption: 166.03MB
peak RSS (including heaptrack overhead): 213.55MB
total memory leaked: 67.58KB

heaptrack.opt.5-n16.gz
total runtime: 12.14s.
calls to allocation functions: 6067993 (499958/s)
temporary memory allocations: 2418999 (199307/s)
peak heap memory consumption: 187.24MB
peak RSS (including heaptrack overhead): 233.69MB
total memory leaked: 67.58KB
```

While that test may be an edge worst-case scenario,
https://llvm-compile-time-tracker.com/compare.php?from=dee85d47d9f15fc268f7b18f279dac2774836615&to=98a57e31b1947d5bcdf4a5605ac2ab32b4bd5f63&stat=instructions
agrees that this also results in improvements in the usual situations.
2021-05-31 15:34:03 +03:00
Philip Reames ff08c3468f [SCEV] Compute trip multiple for multiple exit loops
This patch implements getSmallConstantTripMultiple(L) correctly for multiple exit loops. The previous implementation was both imprecise, and violated the specified behavior of the method. This was fine in practice, because it turns out the function was both dead in real code, and not tested for the multiple exit case.

Differential Revision: https://reviews.llvm.org/D103189
2021-05-26 11:52:25 -07:00
Philip Reames 9306bb638f [SCEV] Generalize getSmallConstantTripCount(L) for multiple exit loops
This came up in review for another patch, see https://reviews.llvm.org/D102982#2782407 for full context.

I've reviewed the callers to make sure they can handle multiple exit loops w/non-zero returns.  There's two cases in target cost models where results might change (Hexagon and PowerPC), but the results looked legal and reasonable.  If a target maintainer wishes to back out the effect of the costing change, they should explicitly check for multiple exit loops and handle them as desired.

Differential Revision: https://reviews.llvm.org/D103182
2021-05-26 11:18:25 -07:00
Philip Reames 921d3f7af0 [SCEV] Add a utility for converting from "exit count" to "trip count"
(Mostly as a logical place to put a comment since this is a reoccuring confusion.)
2021-05-26 10:41:49 -07:00
Philip Reames fb14577d0c [SCEV] Extract out a helper for computing trip multiples 2021-05-26 10:15:03 -07:00
Vitaly Buka f44f2e0afc [NFC] Fix 'unused' warning 2021-05-25 12:23:57 -07:00
Nikita Popov 6300c37a46 [SCEV] Cache operands used in BEInfo (NFC)
When memoized values for a SCEV expressions are dropped, we also
drop all BECounts that make use of the SCEV expression. This is done
by iterating over all the ExitNotTaken counts and (recursively)
checking whether they use the SCEV expression. If there are many
exits, this will take a lot of time.

This patch improves the situation by pre-computing a set of all
used operands, so that we can determine whether a certain BEInfo
needs to be invalidated using a simple set lookup. Will still need
to loop over all BEInfos though.

This makes for a mild improvement on non-degenerate cases:
https://llvm-compile-time-tracker.com/compare.php?from=b661a55a253f4a1cf5a0fbcb86e5ba7b9fb1387b&to=be1393f450e594c53f0ad7e62339a6bc831b16f6&stat=instructions

For the degenerate case from https://bugs.llvm.org/show_bug.cgi?id=50384,
for n=128 I'm seeing run time drop from 1.6s to 1.1s.

Differential Revision: https://reviews.llvm.org/D102796
2021-05-25 21:03:33 +02:00
Philip Reames aabca2d1da [SCEV] Cleanup doesIVOverflowOnX checks [NFC]
Stylistic changes only.
1) Don't pass a parameter just to do an early exit.
2) Use a name which matches actual behavior.
2021-05-25 10:12:24 -07:00
Philip Reames a47b2d4567 [SCEV] Remove unused parameter from computeBECount [NFC]
All callers pass "false" for the Equality parameter.  Kill the dead code, and update the function block comment.
2021-05-25 09:58:56 -07:00
Nikita Popov b661a55a25 [ScalarEvolution] Remove unused ExitLimit::hasOperand() method (NFC)
We only use BackedgeTakenInfo::hasOperand().
2021-05-19 18:42:14 +02:00
Florian Hahn e2759f110b
[SCEV] Apply guards to max with non-unitary steps.
We already apply loop-guards when computing the maximum with unitary
steps. This extends the code to also do so when dealing with non-unitary
steps.

This allows us to infer a tighter maximum in some cases.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D102267
2021-05-13 09:47:29 +01:00
Nikita Popov d26ca78c18 [SCEV] Handle and/or in applyLoopGuards()
applyLoopGuards() already combines conditions from multiple nested
guards. However, it cannot use multiple conditions on the same guard,
combined using and/or. Add support for this by recursing into either
`and` or `or`, depending on the direction of the branch.

Differential Revision: https://reviews.llvm.org/D101692
2021-05-09 21:34:28 +02:00
Florian Hahn 6c99e63120 [SCEV] By more careful when traversing phis in isImpliedViaMerge.
I think currently isImpliedViaMerge can incorrectly return true for phis
in a loop/cycle, if the found condition involves the previous value of

Consider the case in exit_cond_depends_on_inner_loop.

At some point, we call (modulo simplifications)
isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1).

The existing code tries to prove IncV <= -1 for all incoming values
InvV using the found condition (%call <= -1). At the moment this succeeds,
but only because it does not compare the same runtime value. The found
condition checks the value of the last iteration, but the incoming value
is from the *previous* iteration.

Hence we incorrectly determine that the *previous* value was <= -1,
which may not be true.

I think we need to be more careful when looking at the incoming values
here. In particular, we need to rule out that a found condition refers to
any value that may refer to one of the previous iterations. I'm not sure
there's a reliable way to do so (that also works of irreducible control
flow).

So for now this patch adds an additional requirement that the incoming
value must properly dominate the phi block. This should ensure the
values do not change in a cycle. I am not entirely sure if will catch
all cases and I appreciate a through second look in that regard.

Alternatively we could also unconditionally bail out in this case,
instead of checking the incoming values

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101829
2021-05-07 19:52:29 +01:00
Nikita Popov cc58e8918b [SCEV] Simplify backedge count clearing (NFC)
This seems to be a leftover from when the BackedgeTakenInfo
stored multiple exit counts with manual memory management. At
some point this was switchted to a simple vector, and there should
be no need to micro-manage the clearing anymore. We can simply
drop the loop from the map and the the destructor do its job.
2021-05-01 17:50:01 +02:00
Philip Reames 0cc3e10f5e [SCEV] Avoid range intersection idiom in getRangeForUnkownRecurrence [NFC]
Addresses a review comment from D101181
2021-04-28 12:48:17 -07:00
Philip Reames a836de0bde [SCEV] Compute ranges for ashr recurrences
Straight forward extension to the recently added infrastructure which was pioneered with shl. This was originally posted as part of D99687, but split off for ease of review.

(I also decided to exclude the unknown start sign case explicitly for simplicity of understanding.)

Differential Revision: https://reviews.llvm.org/D101181
2021-04-28 12:36:20 -07:00
Nikita Popov e45168c4fa [SCEV] Handle uge/ugt predicates in applyLoopGuards()
These can be handled the same way as ule/ult, just using umax
instead of umin. This is useful in cases where the umax prevents
the upper bound from overflowing.

Differential Revision: https://reviews.llvm.org/D101196
2021-04-27 22:41:05 +02:00
Nikita Popov a5051f2fa2 [SCEV] Fix applyLoopGuards() chaining for ne predicates
ICMP_NE predicates directly overwrote the rewritten result,
instead of chaining it with previous rewrites, as was done for
ICMP_ULT and ICMP_ULE. This means that some guards were effectively
discarded, depending on their order.
2021-04-24 21:43:46 +02:00
Philip Reames 424d6cb902 [SCEV] Compute ranges for lshr recurrences
Straight forward extension to the recently added infrastructure which was pioneered with shl.

Differential Revision: https://reviews.llvm.org/D99687
2021-04-22 11:06:31 -07:00
Yang Fan 4307446e9f
[SCEV] Fix -Wunused-variable warning (NFC)
GCC warning:
```
/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp: In member function ‘const llvm::SCEV* llvm::ScalarEvolution::getLosslessPtrToIntExpr(const llvm::SCEV*, unsigned int)::SCEVPtrToIntSinkingRewriter::visitUnknown(const llvm::SCEVUnknown*)’:
/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:1152:13: warning: unused variable ‘ExprPtrTy’ [-Wunused-variable]
 1152 |       Type *ExprPtrTy = Expr->getType();
      |             ^~~~~~~~~
```
2021-04-21 16:01:46 +08:00
Philip Reames 9c1a145aeb Rearrange code to reduce diff for D99687 [nfc]
Adding the switches to reduce diffs.  I'm about to split that into an lshr part and an ashr part, doing the NFC part first makes it easier to maintain both diffs.
2021-04-20 11:40:15 -07:00
Roman Lebedev 7186764884
[NFC][SCEV] Split getLosslessPtrToIntExpr out of getPtrToIntExpr() 2021-04-20 21:29:21 +03:00
Roman Lebedev 41c22acc22
[NFC][SCEV] Assert that we don't try to create SCEVPtrToIntExpr of a non-integral pointer
ptr<->int casts are only valid for integral pointes,
defensively assert that we don't try to break that here.
2021-04-19 18:38:38 +03:00
Roman Lebedev d480f968ad
Revert "[SCEV] Model `ashr exact x, C` as `(abs(x) EXACT/u (1<<C)) * signum(x)`"
As being discussed in https://reviews.llvm.org/D100721,
this modelling is lossy, we can't reconstruct `ash`/`ashr exact`
from it, which means that whenever we actually expand the IR,
we've just pessimized the code..

It would be good to model this pattern, after all it comes up every time
you want to compute a distance between two pointers, but not at this cost.

This reverts commit ec54867df5.
2021-04-18 16:26:45 +03:00
Nikita Popov a1ed025d0e Revert "[SCEV] Don't walk uses of phis without SCEV expression when forgetting"
This reverts commit faf9f11589.

Issues with this patch have been reported in
https://reviews.llvm.org/D100264#2689917 and
https://bugs.llvm.org/show_bug.cgi?id=49967.
2021-04-15 09:43:52 +02:00
Nikita Popov faf9f11589 [SCEV] Don't walk uses of phis without SCEV expression when forgetting
I've run into some cases where a large fraction of compile-time is
spent invalidating SCEV. One of the causes is forgetLoop(), which
walks all values that are def-use reachable from the loop header
phis. When invalidating a topmost loop, that might be close to all
values in a function. Additionally, it's fairly common for there to
not actually be anything to invalidate, but we'll still be performing
this walk again and again.

My first thought was that we don't need to continue walking the uses
if the current value doesn't have a SCEV expression. However, this
isn't quite right, because SCEV construction can skip over values
(e.g. for a chain of adds, we might only create a SCEV expression
for the final value).

What this patch does instead is to only walk the (full) def-use chain
of loop phis that have a SCEV expression. If there's no expression
for a phi, then we also don't have any dependent expressions to
invalidate.

Differential Revision: https://reviews.llvm.org/D100264
2021-04-13 20:28:17 +02:00
Roman Lebedev e8c7f43e2c
[NFC][ConstantRange] Add 'icmp' helper method
"Does the predicate hold between two ranges?"

Not very surprisingly, some places were already doing this check,
without explicitly naming the algorithm, cleanup them all.
2021-04-10 19:38:55 +03:00
Roman Lebedev 7b12c8c59d
Revert "[NFC][ConstantRange] Add 'icmp' helper method"
This reverts commit 17cf2c9423.
2021-04-10 19:37:53 +03:00
Roman Lebedev 17cf2c9423
[NFC][ConstantRange] Add 'icmp' helper method
"Does the predicate hold between two ranges?"

Not very surprisingly, some places were already doing this check,
without explicitly naming the algorithm, cleanup them all.
2021-04-10 19:09:52 +03:00
Max Kazantsev fee330824a [SCEV] Fix false-positive recognition of simple recurrences. PR49856
A value from reachable block may come to a Phi node as its input from
unreachable block. This may confuse matchSimpleRecurrence  which
has no access to DomTree and can falsely recognize something as a recurrency
because of this effect, as the attached test shows.

Patch `ae7b1e` deals with half of this problem, but it only accounts from
the case when an unreachable instruction comes to Phi as an input.

This patch provides a generalization by checking that no Phi block's
predecessor is unreachable (no matter what the input is).

Differential Revision: https://reviews.llvm.org/D99929
Reviewed By: reames
2021-04-07 13:55:17 +07:00
Philip Reames ae7b1e8823 [SCEV] Handle unreachable binop when matching shift recurrence
This fixes an issue introduced with my change d4648e, and reported in pr49768.

The root problem is that dominance collapses in unreachable code, and that LoopInfo explicitly only models reachable code.  Since the recurrence matcher doesn't filter by reachability (and can't easily because not all consumers have domtree), we need to bailout before assuming that finding a recurrence implies we found a loop.
2021-03-31 10:33:34 -07:00
Nikita Popov a7efed5a20 [SCEV] Improve handling of not expressions in isImpliedCond()
SCEV currently tries to prove implications of x pred y by also
trying to imply ~y pred ~x. This is expensive in terms of
compile-time (in fact, the majority of isImpliedCond compile-time
is spent here) and generally not fruitful. The issue is that this
also swaps the operands and thus breaks canonical ordering. If
originally we were trying to prove an implication like
X > C1 -> Y > C2, then we'll now try to prove X > C1 -> C3 > ~Y,
which will not work.

The only real case where we can get some use out of this transform
is if the original conditions were in the form X > C1 -> Y < C2, were
then swapped to X > C1 -> C2 > Y and are then swapped again here to
X > C1 -> ~Y > C3.

As such, handle this at a higher level, where we are doing the
swapping in the first place. There's four different ways that we
can line up a predicate and a swapped predicate, so we use some
heuristics to pick some profitable way.

Because we now try this transform at a higher level
(isImpliedCondOperands rather than isImpliedCondOperandsHelper),
we can also prove additional facts. Of the added tests, one was
proven previously while the other wasn't.

Differential Revision: https://reviews.llvm.org/D90926
2021-03-24 21:53:02 +01:00
Juneyoung Lee b00209ed10 [SCEV] Use logical and/or matcher
This is a minor patch that updates ScalarEvolution::isImpliedCond to use logical and/or matcher.
2021-03-23 06:00:54 +09:00
Philip Reames 93ce855d4b 2nd attempt at a speculative fix for windows builders after d4648eea 2021-03-22 10:32:57 -07:00
Philip Reames 6ba73c4743 Speculative fix for windows builders after d4648eea 2021-03-22 10:22:01 -07:00
Philip Reames d4648eeaa2 [SCEV] Use trip count information to improve shift recurrence ranges
This patch exploits the knowledge that we may be running many fewer than bitwidth iterations of the loop, and may be able to disallow the overflow case. This patch specifically implements only the shl case, but this can be generalized to ashr and lshr without difficulty.

Differential Revision: https://reviews.llvm.org/D98222
2021-03-22 09:38:43 -07:00
Philip Reames 00d0315a7c [SCEV] Factor out a lambda for strict condition splitting [NFC] 2021-03-19 10:07:12 -07:00
Max Kazantsev fff1363ba0 [SCEV] Add false->any implication
By definition of Implication operator, `false -> true` and `false -> false`. It means that
`false` implies any predicate, no matter true or false. We don't need to go any further
trying to prove the statement we need and just always say that `false` implies it in this case.

In practice it means that we are trying to prove something guarded by `false` condition,
which means that this code is unreachable, and we can safely prove any fact or perform any
transform in this code.

Differential Revision: https://reviews.llvm.org/D98706
Reviewed By: lebedev.ri
2021-03-19 11:29:48 +07:00
Max Kazantsev b3a1500ea8 [SCEV][NFC] API for predicate evaluation
Provides API that allows to check predicate for being true or
false with one call. Current implementation is naive and just
calls isKnownPredicate twice, but further we can rework this
logic trying to use one check to prove both facts.
2021-03-18 19:21:29 +07:00
Max Kazantsev 5097143f0e [SCEV][NFC] Move check up the stack
One of (and primary) callers of isBasicBlockEntryGuardedByCond is
isKnownPredicateAt, which makes isKnownPredicate check before it.
It already makes non-recursive check inside. So, on this execution
path this check is made twice. The only other caller is
isLoopEntryGuardedByCond. Moving the check there should save some
compile time.
2021-03-16 22:09:17 +07:00
Roman Lebedev 78b8ce40ef
Reland [SCEV] Improve modelling for (null) pointer constants
This reverts commit 329aeb5db4,
and relands commit 61f006ac65.

This is a continuation of D89456.

As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.

This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D98147
2021-03-13 16:05:34 +03:00
Roman Lebedev 329aeb5db4
Temporairly evert "[SCEV] Improve modelling for (null) pointer constants"
This appears to have broken ubsan bot:
https://lab.llvm.org/buildbot/#/builders/85/builds/3062
https://reviews.llvm.org/D98147#2623549

It looks like LSR needs some kind of a change around insertion point handling.
Reverting until i have a fix.

This reverts commit 61f006ac65.
2021-03-13 09:10:28 +03:00
Roman Lebedev 61f006ac65
[SCEV] Improve modelling for (null) pointer constants
This is a continuation of D89456.

As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.

This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D98147
2021-03-12 22:11:58 +03:00
Philip Reames a25b537bf4 [SCEV] Infer known bits from known sign bits
This was suggested by lebedev.ri over on D96534.  You'll note lack of tests.  During review, we weren't actually able to find a case which exercises it, but both I and lebedev.ri feel it's a reasonable change, straight forward, and near free.

Differential Revision: https://reviews.llvm.org/D97064
2021-03-09 12:37:17 -08:00
Philip Reames 4a5edea193 [SCEV] Use both known bits and sign bits when computing range of SCEV unknowns
When computing a range for a SCEVUnknown, today we use computeKnownBits for unsigned ranges, and computeNumSignBots for signed ranges. This means we miss opportunities to improve range results.

One common missed pattern is that we have a signed range of a value which CKB can determine is positive, but CNSB doesn't convey that information. The current range includes the negative part, and is thus double the size.

Per the removed comment, the original concern which delayed using both (after some code merging years back) was a compile time concern. CTMark results (provided by Nikita, thanks!) showed a geomean impact of about 0.1%. This doesn't seem large enough to avoid higher quality results.

Differential Revision: https://reviews.llvm.org/D96534
2021-02-19 08:29:12 -08:00
Kazu Hirata df35a183d7 [SCEV] Use ListSeparator (NFC) 2021-02-16 23:23:05 -08:00
Michael Kruse 606aa622b2 Revert "[AssumptionCache] Avoid dangling llvm.assume calls in the cache"
This reverts commit b7d870eae7 and the
subsequent fix "[Polly] Fix build after AssumptionCache change (D96168)"
(commit e6810cab09).

It caused indeterminism in the output, such that e.g. the
polly-x86_64-linux buildbot failed accasionally.
2021-02-11 12:17:38 -06:00
Philip Reames 9bf3cfa77b [SCEV] Add a missing AssumptionCache parameter
The AssumptionCache mechanism is used to feed assumes into known bits computations.  Most places in SCEV passed it in, but one place appears to have been missed.

Spotted via inspection, don't have a test case which actually exercises this, but it seemed like an obvious fixit.
2021-02-10 12:08:55 -08:00
Johannes Doerfert b7d870eae7 [AssumptionCache] Avoid dangling llvm.assume calls in the cache
PR49043 exposed a problem when it comes to RAUW llvm.assumes. While
D96106 would fix it for GVNSink, it seems a more general concern. To
avoid future problems this patch moves away from the vector of weak
reference model used in the assumption cache. Instead, we track the
llvm.assume calls with a callback handle which will remove itself from
the cache if the call is deleted.

Fixes PR49043.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D96168
2021-02-06 12:18:39 -06:00
Gil Rapaport d475030dc2 [SCEV] Apply loop guards to divisibility tests
Extend applyLoopGuards() to take into account conditions/assumes proving some
value %v to be divisible by D by rewriting %v to (%v / D) * D. This lets the
loop unroller and the loop vectorizer identify more loops as not requiring
remainder loops.

Differential Revision: https://reviews.llvm.org/D95521
2021-02-02 08:09:39 +02:00
Florian Hahn f1e8136115
[SCEV] Bail out if URem operand cannot be zero-extended.
In some cases, LHS is larger than the target expression type. Bail out
in that case for now, to avoid crashing
2021-02-01 13:50:54 +00:00
Max Kazantsev 8a4ad8849f [SCEV] Do not cache comparison result upon reached max depth as "equivalence". PR48725
We use `EquivalenceClasses` to cache the notion that two SCEVs are equivalent,
so save time in situation when `A` is equivalent to `B` and `B` is equivalent to `C`,
making check "if `A` is equivalent to `C`?" cheaper.

We also return `0` in the comparator when we reach max analysis depth to save
compile time. After doing this, we also cache them as being equivalent.

Now, imagine the following situation:
- `A` is proved equivalent to `B`;
- `C` is proved equivalent to `D`;
- Comparison of `A` against `D` is proved non-zero;
- Comparison of `B` against `C` reaches max depth (and gets cached as equivalence).

Now, before the invocation of compare(`B`, `C`), `A` and `D` belonged
to different equivalence classes, and their comparison returned non-zero.
After the the invocation of compare(`B`, `C`), equivalence classes get merged
and `A`, `B`, `C` and `D` all fall into the same equivalence class. So the comparator
will change its behavior for couple `A` and `D`, with weird consequences following it.
This comparator is finally used in `std::stable_sort`, and this behavior change
makes it crash (looks like it's causing a memory corruption).

Solution: this patch changes `CompareSCEVComplexity` to return `None`
when the max depth is reached. So in this case, we do not cache these SCEVs
(and their parents in the tree) as being equivalent.

Differential Revision: https://reviews.llvm.org/D94654
Reviewed By: lebedev.ri
2021-01-29 12:08:34 +07:00
Mindong Chen 00fcc03687 [SCEV] Fix incorrect loop exit count analysis.
In computeLoadConstantCompareExitLimit, the addrec used to compute the
exit count should be from the loop which the exiting block belongs to.

Reviewed by: mkazantsev

Differential Revision: https://reviews.llvm.org/D92367
2021-01-27 19:36:05 +08:00
Kazu Hirata 8f5da41c4d [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-20 21:35:52 -08:00
Kazu Hirata 23b0ab2acb [llvm] Use the default value of drop_begin (NFC) 2021-01-18 10:16:36 -08:00
Kazu Hirata 19aacdb715 [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-16 09:40:53 -08:00
Kazu Hirata 848e8f938f [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-04 11:42:44 -08:00
Gil Rapaport d9c0b128e3 [SCEV] Simplify trunc to zero based on known bits
Let getTruncateExpr() short-circuit to zero when the value being truncated is
known to have at least as many trailing zeros as the target type.

Differential Revision: https://reviews.llvm.org/D93973
2021-01-03 13:57:12 +02:00
Juneyoung Lee 509fa8e02e [SCEV] recognize logical and/or pattern
This patch makes SCEV recognize 'select A, B, false' and 'select A, true, B'.
This is a performance improvement that will be helpful after unsound select -> and/or transformation is removed, as discussed in D93065.

SCEV's answers for the select form should be a bit more conservative than the equivalent `and A, B` / `or A, B`.
Take this example: https://alive2.llvm.org/ce/z/NsP9ue .
To check whether it is valid for SCEV's computeExitLimit to return min(n, m) as ExactNotTaken value, I put llvm.assume at tgt.
It fails because the exit limit becomes poison if n is zero and m is poison. This is problematic if e.g. the exit value of i is replaced with min(n, m).
If either n or m is constant, we can revive the analysis again. I added relevant tests and put alive2 links there.

If and is used instead, this is okay: https://alive2.llvm.org/ce/z/K9rbJk . Hence the existing analysis is sound.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93882
2021-01-01 04:37:57 +09:00
Kazu Hirata f76e83bfbb [Analysis] Use llvm::append_range (NFC) 2020-12-29 19:23:21 -08:00
Kazu Hirata 3285ee143b [Analysis, IR, CodeGen] Use llvm::erase_if (NFC) 2020-12-20 09:19:35 -08:00
Max Kazantsev 8b330f1f69 [SCEV] Add missing type check into getRangeForAffineNoSelfWrappingAR
We make type widening without checking if it's needed. Bail if the max
iteration count is wider than AR's type.
2020-12-15 14:50:32 +07:00
Kazu Hirata eb44682d67 [Analysis] Use is_contained (NFC) 2020-12-11 21:19:31 -08:00
Max Kazantsev 035955f925 Revert "Return "[SCEV] Use isBasicBlockEntryGuardedByCond in isLoopBackedgeGuardedByCond", 2nd try"
This reverts commit f690986f31.

Compile time then and again...
2020-11-26 18:12:51 +07:00
Max Kazantsev f690986f31 Return "[SCEV] Use isBasicBlockEntryGuardedByCond in isLoopBackedgeGuardedByCond", 2nd try
Reverted because the compile time impact is still too high.

isKnownViaNonRecursiveReasoning is used twice, we can do it just once.

Differential Revision: https://reviews.llvm.org/D92152
2020-11-26 17:45:13 +07:00
Max Kazantsev 91d6b6b5fb Revert "[SCEV] Use isBasicBlockEntryGuardedByCond in isLoopBackedgeGuardedByCond"
This reverts commit 3d4c0460ec.

Compile time impact is still high. Need to understand why.

Differential Revision: https://reviews.llvm.org/D92153
2020-11-26 17:28:30 +07:00
Max Kazantsev 3d4c0460ec [SCEV] Use isBasicBlockEntryGuardedByCond in isLoopBackedgeGuardedByCond
Previously we tried to using isKnownPredicateAt, but it makes an
extra query to isKnownPredicate, which has negative impact on compile
time. Let's try to use more lightweight isBasicBlockEntryGuardedByCond.

Differential Revision: https://reviews.llvm.org/D92152
2020-11-26 17:08:38 +07:00
Max Kazantsev 3b6481eae2 Revert "[SCEV] Use isKnownPredicateAt in isLoopBackedgeGuardedByCond"
This reverts commit 14f2ad0e3c.

Reverting to investigate compile time drop.

Differential Revision: https://reviews.llvm.org/D92152
2020-11-26 16:42:43 +07:00
Max Kazantsev 14f2ad0e3c [SCEV] Use isKnownPredicateAt in isLoopBackedgeGuardedByCond
A piece of code in `isLoopBackedgeGuardedByCond` basically duplicates
the dominators traversal from `isBlockEntryGuardedByCond` called from
`isKnownPredicateAt`, but it's less powerful because it does not give context
to `isImpliedCond`. This patch reuses the `isKnownPredicateAt `function there,
reducing the amount of code duplication and making it more powerful.

Differential Revision: https://reviews.llvm.org/D92152
Reviewed By: skatkov
2020-11-26 13:20:02 +07:00
Max Kazantsev f10500e220 [IndVars] Use isLoopBackedgeGuardedByCond for last iteration check
Use more context to prove contextual facts about the last iteration. It is
only executed when the backedge is taken, so we can use `isLoopBackedgeGuardedByCond`
to make this check.

Differential Revision: https://reviews.llvm.org/D91535
Reviewed By: skatkov
2020-11-26 12:37:21 +07:00