Use more context to prove contextual facts about the last iteration. It is
only executed when the backedge is taken, so we can use `isLoopBackedgeGuardedByCond`
to make this check.
Differential Revision: https://reviews.llvm.org/D91535
Reviewed By: skatkov
The TypeSize warning would occur because RuntimePointerChecking::insert
was not scalable vector aware. The fix is to use
ScalarEvolution::getSizeOfExpr to grab the size of types.
Differential Revision: https://reviews.llvm.org/D90171
This reverts commit 7dcc889917.
This patch introduced a logical error that breaks whole logic of this analysis.
All checks we are making are supposed to be loop-independent, so that we could
safely remove the range check. The 'nw' fact is loop-dependent, so we can remove
the check basing on facts from this very check.
Motivating examples will follow-up.
This reverts commit 2734a9ebf4.
This patch appeared to not be a NFC. It introduced an execution path where
monotonicity check on limited space started relying in existing nsw/nuw
flags, which is illegal. The motivating test will follow-up.
SCEV makes a logical mistake when handling EitherMayExit in
case when both conditions must be met to exit the loop. The
mistake looks like follows: "if condition `A` fails within at most `X` first
iterations, and `B` fails within at most `Y` first iterations, then `A & B`
fails at most within `min (X, Y)` first iterations". This is wrong, because
both of them must fail at the same time.
Simple example illustrating this is following: we have an IV with step 1,
condition `A` = "IV is even", condition `B` = "IV is odd". Both `A` and `B`
will fail within first two iterations. But it doesn't mean that both of them
will fail within first two first iterations at the same time, which would mean
that IV is neither even nor odd at the same time within first 2 iterations.
We can only do so for known exact BE counts, but not for max.
Differential Revision: https://reviews.llvm.org/D91942
Reviewed By: nikic
Handling of `and` and `or` vastly uses copy-paste. Factored out into
a helper function as preparation step for further fix (see PR48225).
Differential Revision: https://reviews.llvm.org/D91864
Reviewed By: nikic
This is a cut down version of 1ec6e1 which was reverted due to a compile time issue. The key changes made from that patch: 1) only infer the flags needed along each path, 2) be careful to preserve order of checks, and 3) avoid computing NW flags at all since we need to prove the stronger property (does not cross 0) in the caller anyways.
Assuming this doesn't trip regressions, I'm going to try weakening (1). My end objective is to move flag inference into addrec construction. If I can't weaken (1) without compile time impact, I'll have a problem.
In an effort to make code around flag determination more readable, and (possibly) prepare for a follow up change, factor out some of the flag detection logic. In the process, reduce the number of locations we mutate wrap flags by a couple.
Note that this isn't NFC. The old code tried for NSW xor (NUW || NW). This is, two different paths computed different sets of wrap flags. The new code will try for all three. The result is that some expressions end up with a few extra flags set.
The SCEV code for constructing GEP expressions currently assumes
that the addition of the base and all the offsets is nsw if the GEP
is inbounds. While the addition of the offsets is indeed nsw, the
addition to the base address is not, as the base address is
interpreted as an unsigned value.
Fix the GEP expression code to not assume nsw for the base+offset
calculation. However, do assume nuw if we know that the offset is
non-negative. With this, we use the same behavior as the
construction of GEP addrecs does. (Modulo the fact that we
disregard SCEV unification, as the pre-existing FIXME points out).
Differential Revision: https://reviews.llvm.org/D90648
A piece of logic of `isLoopInvariantExitCondDuringFirstIterations` is actually
a generalized predicate monotonicity check. This patch moves it into the
corresponding method and generalizes it a bit.
Differential Revision: https://reviews.llvm.org/D90395
Reviewed By: apilipenko
Lift limitation on step being `+/- 1`. In fact, the only thing it is needed for
is proving no-self-wrap. We can instead check this flag directly.
Theoretically it can increase the scope of the transform, but I could not
construct such test easily.
Differential Revision: https://reviews.llvm.org/D91126
Reviewed By: apilipenko
This changes the definition of t2DoLoopStart from
t2DoLoopStart rGPR
to
GPRlr = t2DoLoopStart rGPR
This will hopefully mean that low overhead loops are more tied together,
and we can more reliably generate loops without reverting or being at
the whims of the register allocator.
This is a fairly simple change in itself, but leads to a number of other
required alterations.
- The hardware loop pass, if UsePhi is set, now generates loops of the
form:
%start = llvm.start.loop.iterations(%N)
loop:
%p = phi [%start], [%dec]
%dec = llvm.loop.decrement.reg(%p, 1)
%c = icmp ne %dec, 0
br %c, loop, exit
- For this a new llvm.start.loop.iterations intrinsic was added, identical
to llvm.set.loop.iterations but produces a value as seen above, gluing
the loop together more through def-use chains.
- This new instrinsic conceptually produces the same output as input,
which is taught to SCEV so that the checks in MVETailPredication are not
affected.
- Some minor changes are needed to the ARMLowOverheadLoop pass, but it has
been left mostly as before. We should now more reliably be able to tell
that the t2DoLoopStart is correct without having to prove it, but
t2WhileLoopStart and tail-predicated loops will remain the same.
- And all the tests have been updated. There are a lot of them!
This patch on it's own might cause more trouble that it helps, with more
tail-predicated loops being reverted, but some additional patches can
hopefully improve upon that to get to something that is better overall.
Differential Revision: https://reviews.llvm.org/D89881
Our range computation methods benefit from no-wrap flags. But if the ranges
were first computed before the flags were set, the cached range will be too
pessimistic.
We need to drop cached ranges whenever we sharpen AddRec's no wrap flags.
Differential Revision: https://reviews.llvm.org/D89847
Reviewed By: fhahn
Strengthening nowrap flags is relatively expensive. Make sure we
only do it if we're actually going to use the flags -- we don't
use them for many recursive invocations. Additionally, if we're
reusing an existing SCEV node, there's no point in trying to
strengthen the flags if we don't have any new baseline facts.
This change falls slightly short of being NFC, because the way
flags during add+addrec / mul+addrec folding are handled may be
more precise (as less operands are included in the calculation).
Instead of performing a sequence of pairwise additions, directly
construct a multi-operand add expression.
This should be NFC modulo any SCEV canonicalization deficiencies.
This is functionally-identical to the previous implementation,
just using a generic interface to do that instead of hand-rolled one,
with caching as a bonus. Thought the sinking is still recursive..
Note that SCEVRewriteVisitor<>'s default implementations
don't preserve NoWrap flags on Add/Mul (but does on AddRec!),
but here we know we can preserve them,
so `visitAddExpr()`/`visitMulExpr()` are specialized.
If we've got an SCEVPtrToIntExpr(op), where op is not an SCEVUnknown,
we want to sink the SCEVPtrToIntExpr into an operand,
so that the operation is performed on integers,
and eventually we end up with just an `SCEVPtrToIntExpr(SCEVUnknown)`.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D89692
And use it to model LLVM IR's `ptrtoint` cast.
This is essentially an alternative to D88806, but with no chance for
all the problems it caused due to having the cast as implicit there.
(see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3)
As we've established by now, there are at least two reasons why we want this:
* It will allow SCEV to actually model the `ptrtoint` casts
and their operands, instead of treating them as `SCEVUnknown`
* It should help with initial problem of PR46786 - this should eventually allow us
to not loose pointer-ness of an expression in more cases
As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 | PR46786 ]], in principle,
we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()`
should sink the cast as far down into the expression as possible,
so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`.
But i think that it isn't the best solution, because it doesn't really matter
from memory consumption side - there probably won't be *that* many `SCEVPtrToIntExpr`s
for it to matter, and it allows for much better discoverability.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D89456
URem operations with constant power-of-2 second operands are modeled as
such. This patch on its own has very little impact (e.g. no changes in
CodeGen for MultiSource/SPEC2000/SPEC2006 on X86 -O3 -flto), but I'll
soon post follow-up patches that make use of it to more accurately
determine the trip multiple.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D89821
This reverts commit e038b60d91.
This reverts commit a0d84d8031.
This revert was a mistake. The reason of the failures was
"Use uint64_t for branch weights instead of uint32_t"
Differential Revision: https://reviews.llvm.org/D87832
When we need to prove implication of expressions of different type width,
the default strategy is to widen everything to wider type and prove in this
type. This does not interact well with AddRecs with negative steps and
unsigned predicates: such AddRec will likely not have a `nuw` flag, and its
`zext` to wider type will not be an AddRec. In contraty, `trunc` of an AddRec
in some cases can easily be proved to be an `AddRec` too.
This patch introduces an alternative way to handling implications of different
type widths. If we can prove that wider type values actually fit in the narrow type,
we truncate them and prove the implication in narrow type.
The return was due to revert of underlying patch that this one depends on.
Unit test temporarily disabled because the required logic in SCEV is switched
off due to compile time reasons.
Differential Revision: https://reviews.llvm.org/D89548
We can sharpen the range of a AddRec if we know that it does not
self-wrap and know the symbolic iteration count in the loop. If we can
evaluate the value of AddRec on the last iteration and prove that at least
one its intermediate value lies between start and end, then no-wrap flag
allows us to conclude that all of them also lie between start and end. So
the estimate of range can be improved to union of ranges of start and end.
Switched off by default, can be turned on by flag.
Differential Revision: https://reviews.llvm.org/D89381
Reviewed By: lebedev.ri, nikic
This reverts commit c6ca26c0bf.
This breaks stage2 builds due to hitting this assert:
```
Assertion failed: (WeightSum <= UINT32_MAX && "Expected weights to scale down to 32 bits"), function calcMetadataWeights
```
when compiling AArch64RegisterBankInfo.cpp in LLVM.
Even if the exact exit count is unknown, we can still prove that this
exit will not be taken. If we can prove that the predicate is monotonic,
fulfilled on first & last iteration, and no overflow happened in between,
then the check can be removed.
Differential Revision: https://reviews.llvm.org/D87832
Reviewed By: apilipenko
Same change as 0dda633317, but for
mul expressions. We want to first fold any constant operans and
then strengthen the nowrap flags, as we can compute more precise
flags at that point.
Establish parity with the handling of add expressions, by always
constant folding mul expression operands before checking the depth
limit (this is a non-recursive simplification). The code was already
unconditionally constant folding the case where all operands were
constants, but was not folding multiple constant operands together
if there were also non-constant operands.
This requires picking out a different demonstration for depth-based
folding differences in the limit-depth.ll test.
Separate out the code handling constant folding into a separate
block, that is independent of other folds that need a constant
first operand. Also make some minor adjustments to make the
constant folding look nearly identical to the same code in
getAddExpr().
The only reason this change is not strictly NFC is that the
C1*(C2+V) fold is moved below the constant folding, which means
that it now also applies to C1*C2*(C3+V), as it should.
We should first try to constant fold the add expression and only
strengthen nowrap flags afterwards. This allows us to determine
stronger flags if e.g. only two operands are left after constant
folding (and thus "guaranteed no wrap region" code applies) or the
resulting operands are non-negative and thus nsw->nuw strengthening
applies.
We want to have a caching version of symbolic BE exit count
rather than recompute it every time we need it.
Differential Revision: https://reviews.llvm.org/D89954
Reviewed By: nikic, efriedma
When we need to prove implication of expressions of different type width,
the default strategy is to widen everything to wider type and prove in this
type. This does not interact well with AddRecs with negative steps and
unsigned predicates: such AddRec will likely not have a `nuw` flag, and its
`zext` to wider type will not be an AddRec. In contraty, `trunc` of an AddRec
in some cases can easily be proved to be an `AddRec` too.
This patch introduces an alternative way to handling implications of different
type widths. If we can prove that wider type values actually fit in the narrow type,
we truncate them and prove the implication in narrow type.
Differential Revision: https://reviews.llvm.org/D89548
Reviewed By: fhahn
This reverts commit a10a64e7e3.
It broke polly/test/ScopInfo/NonAffine/non-affine-loop-condition-dependent-access_3.ll
The difference suggests that this may be a serious issue.
Fixed wrapping range case & proof methods reduced to constant range
checks to save compile time.
Differential Revision: https://reviews.llvm.org/D89381
The main tricky thing here is forward-declaring the enum:
we have to specify it's underlying data type.
In particular, this avoids the danger of switching over the SCEVTypes,
but actually switching over an integer, and not being notified
when some case is not handled.
I have updated most of such switches to be exaustive and not have
a default case, where it's pretty obvious to be the intent,
however not all of them.
All existing SCEV cast types operate on integers.
D89456 will add SCEVPtrToIntExpr cast expression type.
I believe this is best for consistency.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D89455
It's not pretty, but probably better than modelling it
as an opaque SCEVUnknown, i guess.
It is relevant e.g. for the loop that was brought up in
https://bugs.llvm.org/show_bug.cgi?id=46786#c26
as an example of what we'd be able to better analyze
once SCEV handles `ptrtoint` (D89456).
But as it is evident, even if we deal with `ptrtoint` there,
we also fail to model such an `ashr`.
Also, modeling of mul-of-exact-shr/div could use improvement.
As per alive2:
https://alive2.llvm.org/ce/z/tnfZKd
```
define i8 @src(i8 %0) {
%2 = ashr exact i8 %0, 4
ret i8 %2
}
declare i8 @llvm.abs(i8, i1)
declare i8 @llvm.smin(i8, i8)
declare i8 @llvm.smax(i8, i8)
define i8 @tgt(i8 %x) {
%abs_x = call i8 @llvm.abs(i8 %x, i1 false)
%div = udiv exact i8 %abs_x, 16
%t0 = call i8 @llvm.smax(i8 %x, i8 -1)
%t1 = call i8 @llvm.smin(i8 %t0, i8 1)
%r = mul nsw i8 %div, %t1
ret i8 %r
}
```
Transformation seems to be correct!
It was reverted because of negative compile time impact. In this version,
less powerful proof methods are used (non-recursive reasoning only), and
scope limited to constant End values to avoid explision of complex proofs.
Differential Revision: https://reviews.llvm.org/D89381
We can sharpen the range of a AddRec if we know that it does not
self-wrap and know the symbolic iteration count in the loop. If we can
evaluate the value of AddRec on the last iteration and prove that at least
one its intermediate value lies between start and end, then no-wrap flag
allows us to conclude that all of them also lie between start and end. So
the estimate of range can be improved to union of ranges of start and end.
Differential Revision: https://reviews.llvm.org/D89381
Reviewed By: efriedma
While we haven't encountered an earth-shattering problem with this yet,
by now it is pretty evident that trying to model the ptr->int cast
implicitly leads to having to update every single place that assumed
no such cast could be needed. That is of course the wrong approach.
Let's back this out, and re-attempt with some another approach,
possibly one originally suggested by Eli Friedman in
https://bugs.llvm.org/show_bug.cgi?id=46786#c20
which should hopefully spare us this pain and more.
This reverts commits 1fb6104293,
7324616660,
aaafe350bb,
e92a8e0c74.
I've kept&improved the tests though.
As being pointed out by @efriedma in
https://reviews.llvm.org/rGaaafe350bb65#inline-4883
of course we can't just call ptrtoint in sign-extending case
and be done with it, because it will zero-extend.
I'm not sure what i was thinking there.
This is very much not an NFC, however looking at the user of
BuildConstantFromSCEV() i'm not sure how to actually show that
it results in a different constant expression.
Much similar to the ZExt/Trunc handling.
Thanks goes to Alexander Richardson for nudging towards noticing this one proactively.
The appropriate (currently crashing) test coverage added.
This relands commit 1c021c64ca which was
reverted in commit 17cec6a11a because
an assertion was being triggered, since `BuildConstantFromSCEV()`
wasn't updated to handle the case where the constant we want to truncate
is actually a pointer. I was unsuccessful in coming up with a test case
where we'd end there with constant zext/sext of a pointer,
so i didn't handle those cases there until there is a test case.
Original commit message:
While we indeed can't treat them as no-ops, i believe we can/should
do better than just modelling them as `unknown`. `inttoptr` story
is complicated, but for `ptrtoint`, it seems straight-forward
to model it just as a zext-or-trunc of unknown.
This may be important now that we track towards
making inttoptr/ptrtoint casts not no-op,
and towards preventing folding them into loads/etc
(see D88979/D88789/D88788)
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D88806
> While we indeed can't treat them as no-ops, i believe we can/should
> do better than just modelling them as `unknown`. `inttoptr` story
> is complicated, but for `ptrtoint`, it seems straight-forward
> to model it just as a zext-or-trunc of unknown.
>
> This may be important now that we track towards
> making inttoptr/ptrtoint casts not no-op,
> and towards preventing folding them into loads/etc
> (see D88979/D88789/D88788)
>
> Reviewed By: mkazantsev
>
> Differential Revision: https://reviews.llvm.org/D88806
It caused the following assert during Chromium builds:
llvm/lib/IR/Constants.cpp:1868:
static llvm::Constant *llvm::ConstantExpr::getTrunc(llvm::Constant *, llvm::Type *, bool):
Assertion `C->getType()->isIntOrIntVectorTy() && "Trunc operand must be integer"' failed.
See code review for a link to a reproducer.
This reverts commit 1c021c64ca.
While we indeed can't treat them as no-ops, i believe we can/should
do better than just modelling them as `unknown`. `inttoptr` story
is complicated, but for `ptrtoint`, it seems straight-forward
to model it just as a zext-or-trunc of unknown.
This may be important now that we track towards
making inttoptr/ptrtoint casts not no-op,
and towards preventing folding them into loads/etc
(see D88979/D88789/D88788)
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D88806
We cannot guarantee that the replacement expression is loop-invariant in
all AddRecs in the source expression. Use a rewriter that skips
AddRecExpr for now.
Fixes PR47776.
The initial version of the patch was reverted because it missed the check that
the predicate being proved is actually guarded by this check on 1st iteration.
If it was not executed on 1st iteration (but possibly executes after that), then
it is incorrect to use reasoning about IV start to prove it.
Added the test where the miscompile was seen. Unfortunately, my attempts
to reduce it with bugpoint did not succeed; it can further be reduced when
we understand how to do it without losing the initial bug's notion.
Returning assuming the miscompiles are now gone.
Differential Revision: https://reviews.llvm.org/D88208
The logic there only considers `SLT/SGT` predicates. We can use the same logic
for proving `ULT/UGT` predicates if all involved values are non-negative.
Adding full-scale support for unsigned might be challenging because of code amount,
so we can consider this in the future.
Differential Revision: https://reviews.llvm.org/D88087
Reviewed By: reames
If we know that some predicate is true for AddRec and an invariant
(w.r.t. this AddRec's loop), this fact is, in particular, true on the first
iteration. We can try to prove the facts we need using the start value.
The motivating example is proving things like
```
isImpliedCondOperands(>=, X, 0, {X,+,-1}, 0}
```
Differential Revision: https://reviews.llvm.org/D88208
Reviewed By: reames
This check helps to guard against cases where expressions referring to
invalidated/deleted loops are not properly invalidated.
The additional check is motivated by the reproducer shared for 8fdac7cb7a
and I think in general make sense as a sanity check.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D88166
Currently, we have `isLoopEntryGuardedByCond` method in SCEV, which
checks that some fact is true if we enter the loop. In fact, this is just a
particular case of more general concept `isBasicBlockEntryGuardedByCond`
applied to given loop's header. In fact, the logic if this code is largely
independent on the given loop and only cares code above it.
This patch makes this generalization. Now we can query it for any block,
and `isBasicBlockEntryGuardedByCond` is just a particular case.
Differential Revision: https://reviews.llvm.org/D87828
Reviewed By: fhahn
Similar to collecting information from branches guarding a loop, we can
also collect information from assumes dominating the loop header.
Fixes PR47247.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D87854
For some expressions, we can use information from loop guards when
we are looking for a maximum. This patch applies information from
loop guards to the expression used to compute the maximum backedge
taken count in howFarToZero. It currently replaces an unknown
expression X with UMin(X, Y), if the loop is guarded by
X ult Y.
This patch is minimal in what conditions it applies, and there
are a few TODOs to generalize.
This partly addresses PR40961. We will also need an update to
LV to address it completely.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D67178
Currently these predicates are ignored, yet their handling is
pretty simple. I could not find a single test where it would
actually change something, but it's only because isImpliedCondOperands
is not smart enough to prove it further on. Yet the situation when
we come there with `less` predicate is pretty common.
Differential Revision: https://reviews.llvm.org/D87890
Reviewed By: fhahn
This commit was originally because it was suspected to cause a crash,
but a reproducer did not surface.
A crash that was exposed by this change was fixed in 1d8f2e5292.
This reverts the revert commit 0581c0b0ee.
This patch adds isGuaranteedNotToBePoison and programUndefinedIfUndefOrPoison.
isGuaranteedNotToBePoison will be used at D75808. The latter function is used at isGuaranteedNotToBePoison.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D84242
Recognize umin/umax/smin/smax intrinsics and convert them to the
already existing SCEV nodes of the same name.
In the future we'll want SCEVExpander to also produce the intrinsics,
but we're not ready for that yet.
Differential Revision: https://reviews.llvm.org/D87160
This reverts commit e441b7a7a0.
This patch causes a compile error in tensorflow opensource project. The stack trace looks like:
Point of crash:
llvm/include/llvm/Analysis/LoopInfoImpl.h : line 35
(gdb) ptype *this
type = const class llvm::LoopBase<llvm::BasicBlock, llvm::Loop> [with BlockT = llvm::BasicBlock, LoopT = llvm::Loop]
(gdb) p *this
$1 = {ParentLoop = 0x0, SubLoops = std::vector of length 0, capacity 0, Blocks = std::vector of length 0, capacity 1,
DenseBlockSet = {<llvm::SmallPtrSetImpl<llvm::BasicBlock const*>> = {<llvm::SmallPtrSetImplBase> = {<llvm::DebugEpochBase> = {Epoch = 3}, SmallArray = 0x1b2bf6c8, CurArray = 0x1b2bf6c8,
CurArraySize = 8, NumNonEmpty = 0, NumTombstones = 0}, <No data fields>}, SmallStorage = {0xfffffffffffffffe, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, IsInvalid = true}
(gdb) p *this->DenseBlockSet->CurArray
$2 = (const void *) 0xfffffffffffffffe
I will try to get a case from tensorflow or use creduce to get a small case.
Now that SCEVExpander can preserve LCSSA form,
we do not have to worry about LCSSA form when
trying to look through PHIs. SCEVExpander will take
care of inserting LCSSA PHI nodes as required.
This increases precision of the analysis in some cases.
Reviewed By: mkazantsev, bmahjour
Differential Revision: https://reviews.llvm.org/D71539
This is the max version of D85046.
This change causes binary changes in 44 out of 237 benchmarks (out of
MultiSource/SPEC2000/SPEC2006)
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D85189
In some cases, it seems like we can get rid of unnecessary s/umins by
using information from the loop guards (unless I am missing something).
One place where this seems to be helpful in practice is when computing
loop trip counts. This patch just changes howManyGreaterThans for now.
Note that this requires a loop for which we can check 'is guarded'.
On SPEC2000/SPEC2006/MultiSource, there are some notable changes for
some programs in the number of loops unrolled and trip counts computed.
```
Same hash: 179 (filtered out)
Remaining: 58
Metric: scalar-evolution.NumTripCountsComputed
Program base patch diff
test-suite...langs-C/compiler/compiler.test 25.00 31.00 24.0%
test-suite.../Applications/SPASS/SPASS.test 2020.00 2323.00 15.0%
test-suite...langs-C/allroots/allroots.test 29.00 32.00 10.3%
test-suite.../Prolangs-C/loader/loader.test 17.00 18.00 5.9%
test-suite...fice-ispell/office-ispell.test 253.00 265.00 4.7%
test-suite...006/450.soplex/450.soplex.test 3552.00 3692.00 3.9%
test-suite...chmarks/MallocBench/gs/gs.test 453.00 470.00 3.8%
test-suite...ngs-C/assembler/assembler.test 29.00 30.00 3.4%
test-suite.../Benchmarks/Ptrdist/bc/bc.test 263.00 270.00 2.7%
test-suite...rks/FreeBench/pifft/pifft.test 722.00 741.00 2.6%
test-suite...count/automotive-bitcount.test 41.00 42.00 2.4%
test-suite...0/253.perlbmk/253.perlbmk.test 1417.00 1451.00 2.4%
test-suite...000/197.parser/197.parser.test 387.00 396.00 2.3%
test-suite...lications/sqlite3/sqlite3.test 1168.00 1189.00 1.8%
test-suite...000/255.vortex/255.vortex.test 173.00 176.00 1.7%
Metric: loop-unroll.NumUnrolled
Program base patch diff
test-suite...langs-C/compiler/compiler.test 1.00 3.00 200.0%
test-suite.../Applications/SPASS/SPASS.test 134.00 234.00 74.6%
test-suite...count/automotive-bitcount.test 3.00 4.00 33.3%
test-suite.../Prolangs-C/loader/loader.test 3.00 4.00 33.3%
test-suite...langs-C/allroots/allroots.test 3.00 4.00 33.3%
test-suite...Source/Benchmarks/sim/sim.test 10.00 12.00 20.0%
test-suite...fice-ispell/office-ispell.test 21.00 25.00 19.0%
test-suite.../Benchmarks/Ptrdist/bc/bc.test 32.00 38.00 18.8%
test-suite...006/450.soplex/450.soplex.test 300.00 352.00 17.3%
test-suite...rks/FreeBench/pifft/pifft.test 60.00 69.00 15.0%
test-suite...chmarks/MallocBench/gs/gs.test 57.00 63.00 10.5%
test-suite...ngs-C/assembler/assembler.test 10.00 11.00 10.0%
test-suite...0/253.perlbmk/253.perlbmk.test 145.00 157.00 8.3%
test-suite...000/197.parser/197.parser.test 43.00 46.00 7.0%
test-suite...TimberWolfMC/timberwolfmc.test 205.00 214.00 4.4%
Geomean difference 7.6%
```
Fixes https://bugs.llvm.org/show_bug.cgi?id=46939
Fixes https://bugs.llvm.org/show_bug.cgi?id=46924 on X86.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D85046