In 35676a4f9a I've added handling for
non-trivial dominating conditions that imply non-zero on the true
branch. This adds the same support for the false branch.
The changes in pr45360.ll change block ordering and naming, but
don't change the control flow. The urem is still guaraded by a
non-zero check correctly.
And that exposes that a number of tests don't *actually* manage to
maintain DomTree validity, which is inline with my observations.
Once again, SimlifyCFG pass currently does not require/preserve DomTree
by default, so this is effectively NFC.
... so just ensure that we pass DomTreeUpdater it into it.
Fixes DomTree preservation for a large number of tests,
all of which are marked as such so that they do not regress.
... so just ensure that we pass DomTreeUpdater it into it.
Fixes DomTree preservation for a large number of tests,
all of which are marked as such so that they do not regress.
First step after e113317958,
in these tests, DomTree is valid afterwards, so mark them as such,
so that they don't regress.
In further steps, SimplifyCFG transforms shall taught to preserve DomTree,
in as small steps as possible.
The NPM runs loop passes over loops in forward program order, rather
than the legacy loop PM's reverse program order. This seems to produce
better results as shown here.
I verified that changing the loop order to reverse program order results
in the same IR with the NPM.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D92817
This reverts commit 4bd35cdc3a.
The patch was reverted during the investigation. The investigation
shown that the patch did not cause any trouble, but just exposed
the existing problem that is addressed by the previous patch
"[IndVars] Quick fix LHS/RHS bug". Returning without changes.
This reverts commit 0c9c6ddf17.
We are seeing some failures with this patch locally. Not clear
if it's causing them or just triggering a problem in another
place. Reverting while investigating.
If we decided to widen IV with zext, then unsigned comparisons
should not prevent widening (same for sext/sign comparisons).
The result of comparison in wider type does not change in this case.
Differential Revision: https://reviews.llvm.org/D92207
Reviewed By: nikic
When widening an IndVar that has LCSSA Phi users outside
the loop, we can safely widen it as usual and then truncate
the result outside the loop without hurting the performance.
Differential Revision: https://reviews.llvm.org/D91593
Reviewed By: skatkov
When bailing out in rewriteLoopExitValues() you could be left with PHI
nodes in the DeadInsts vector. Those would be not handled by the use of
RecursivelyDeleteTriviallyDeadInstructions() in IndVarSimplify. This
resulted in the IndVarSimplify pass returning an incorrect modified
status. This was caught by the expensive check introduced in D86589.
This patches changes IndVarSimplify so that it deletes those PHI nodes,
using RecursivelyDeleteDeadPHINode().
This fixes PR47486.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D91153
Use more context to prove contextual facts about the last iteration. It is
only executed when the backedge is taken, so we can use `isLoopBackedgeGuardedByCond`
to make this check.
Differential Revision: https://reviews.llvm.org/D91535
Reviewed By: skatkov
When deciding to widen narrow use, we may need to prove some facts
about it. For proof, the context is used. Currently we take the instruction
being widened as the context.
However, we may be more precise here if we take as context the point that
dominates all users of instruction being widened.
Differential Revision: https://reviews.llvm.org/D90456
Reviewed By: skatkov
Some nested loops may share the same ExitingBB, so after we finishing FoldExit,
we need to notify OuterLoop and SCEV to drop any stored trip count.
Patched by: guopeilin
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D91325
In some cases we can handle IV and iter count of different types. It's a typical situation
after IV have been widened. This patch adds support for such cases, when legal.
Differential Revision: https://reviews.llvm.org/D88528
Reviewed By: skatkov
In an effort to make code around flag determination more readable, and (possibly) prepare for a follow up change, factor out some of the flag detection logic. In the process, reduce the number of locations we mutate wrap flags by a couple.
Note that this isn't NFC. The old code tried for NSW xor (NUW || NW). This is, two different paths computed different sets of wrap flags. The new code will try for all three. The result is that some expressions end up with a few extra flags set.
If we cannot prove that the check is trivially true, but can prove that it either
fails on the 1st iteration or never fails, we can replace it with first iteration check.
Differential Revision: https://reviews.llvm.org/D88527
Reviewed By: skatkov
Sometimes the an instruction we are trying to widen is used by the IV
(which means the instruction is the IV increment). Currently this may
prevent its widening. We should ignore such user because it will be
dead once the transform is done anyways.
Differential Revision: https://reviews.llvm.org/D90920
Reviewed By: fhahn
InstCombine canonicalizes 'sub nuw' instructions to 'add' without the
`nuw` flag. The typical case where we see it is decrementing induction
variables. For them, IndVars fails to prove that it's legal to widen them,
and inserts unprofitable `zext`'s.
This patch adds recognition of such pattern using SCEV.
Differential Revision: https://reviews.llvm.org/D89550
Reviewed By: fhahn, skatkov
Our range computation methods benefit from no-wrap flags. But if the ranges
were first computed before the flags were set, the cached range will be too
pessimistic.
We need to drop cached ranges whenever we sharpen AddRec's no wrap flags.
Differential Revision: https://reviews.llvm.org/D89847
Reviewed By: fhahn
If we know that some check will not be executed on the last iteration, we can use this
fact to eliminate its check.
Differential Revision: https://reviews.llvm.org/D88210
Reviwed By: ebrevnov
This reverts commit e038b60d91.
This reverts commit a0d84d8031.
This revert was a mistake. The reason of the failures was
"Use uint64_t for branch weights instead of uint32_t"
Differential Revision: https://reviews.llvm.org/D87832
This reverts commit c6ca26c0bf.
This breaks stage2 builds due to hitting this assert:
```
Assertion failed: (WeightSum <= UINT32_MAX && "Expected weights to scale down to 32 bits"), function calcMetadataWeights
```
when compiling AArch64RegisterBankInfo.cpp in LLVM.
Even if the exact exit count is unknown, we can still prove that this
exit will not be taken. If we can prove that the predicate is monotonic,
fulfilled on first & last iteration, and no overflow happened in between,
then the check can be removed.
Differential Revision: https://reviews.llvm.org/D87832
Reviewed By: apilipenko
Same change as 0dda633317, but for
mul expressions. We want to first fold any constant operans and
then strengthen the nowrap flags, as we can compute more precise
flags at that point.
We should first try to constant fold the add expression and only
strengthen nowrap flags afterwards. This allows us to determine
stronger flags if e.g. only two operands are left after constant
folding (and thus "guaranteed no wrap region" code applies) or the
resulting operands are non-negative and thus nsw->nuw strengthening
applies.