Commit Graph

10733 Commits

Author SHA1 Message Date
Caroline Concatto 6fd1604d14 [InstCombine] Add instcombine fold for extractelement + splat for scalable vectors
This patch allows that scalable vector can also use the fold that already
exists for fixed vector, only when the lane index is lower than the minimum
number of elements of the vector.

Differential Revision: https://reviews.llvm.org/D102404
2021-06-08 10:43:38 +01:00
Philip Reames 3c6e419198 [SCEV] Properly guard reasoning about infinite loops being UB on mustprogress
Noticed via code inspection. We changed the semantics of the IR when we added mustprogress, and we appear to have not updated this location.

Differential Revision: https://reviews.llvm.org/D103834
2021-06-07 14:47:36 -07:00
Daniil Suchkov d32cc150fe [BasicAA] Handle PHIs without incoming values gracefully
Fix a bug introduced by f6f6f6375d.
Now for empty PHIs, instead of crashing on assert(hasVal()) in
Optional's internals, we'll return NoAlias, as we did before that patch.

Differential Revision: https://reviews.llvm.org/D103831
2021-06-07 21:39:01 +00:00
Philip Reames 38540d71c7 [SCEV] Compute exit counts for unsigned IVs using mustprogress semantics
The motivation here is simple loops with unsigned induction variables w/non-one steps. A toy example would be:
for (unsigned i = 0; i < N; i += 2) { body; }

Given C/C++ semantics, we do not get the nuw flag on the induction variable. Given that lack, we currently can't compute a bound for this loop. We can do better for many cases, depending on the contents of "body".

The basic intuition behind this patch is as follows:
* A step which evenly divides the iteration space must wrap through the same numbers repeatedly. And thus, we can ignore potential cornercases where we exit after the n-th wrap through uint32_max.
* Per C++ rules, infinite loops without side effects are UB. We already have code in SCEV which relies on this.  In LLVM, this is tied to the mustprogress attribute.

Together, these let us conclude that the trip count of this loop must come before unsigned overflow unless the body would form a well defined infinite loop.

A couple notes for those reading along:
* I reused the loop properties code which is overly conservative for this case. I may follow up in another patch to generalize it for the actual UB rules.
* We could cache the n(s/u)w facts. I left that out because doing a pre-patch which cached existing inference showed a lot of diffs I had trouble fully explaining. I plan to get back to this, but I don't want it on the critical path.

Differential Revision: https://reviews.llvm.org/D103118
2021-06-07 11:24:00 -07:00
Simon Pilgrim 76a1be05fa AssumeBundleQueries.cpp - don't dereference a dyn_cast<> result. NFCI.
Use cast<> instead which will assert that the cast is correct and not just return null - the match() should have already failed if the cast isn't valid anyhow.

Fixes static analysis warning.
2021-06-06 15:25:03 +01:00
Roman Lebedev e350494fb0
[NFC] Promote willNotOverflow() / getStrengthenedNoWrapFlagsFromBinOp() from IndVars into SCEV proper
We might want to use it when creating SCEV proper in createSCEV(),
now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`,
which might have caused us to loose some optimization potential.
2021-06-05 12:17:51 +03:00
Fangrui Song 06e7de795b Fix some -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build 2021-06-04 23:34:43 -07:00
Artur Pilipenko a06e63fa52 NFC. Refactor DOTGraphTraits::isNodeHidden
Restructure handling of cfg-hide-unreachable-paths and
cfg-hide-deoptimize-paths options so as to make it easier
to introduce new types of hidden blocks.
2021-06-03 11:27:06 -07:00
Qunyan Mangus cbde248736 Add getDemandedBits for uses.
Add getDemandedBits method for uses so we can query demanded bits for each use.  This can help getting better use information. For example, for the code below
define i32 @test_use(i32 %a) {
  %1 = and i32 %a, -256
  %2 = or i32 %1, 1
  %3 = trunc i32 %2 to i8 (didn't optimize this to 1 for illustration purpose)
  ... some use of %3
  ret %2
}
if we look at the demanded bit of %2 (which is all 32 bits because of the return), we would conclude that %a is used regardless of how its return is used. However, if we look at each use separately, we will see that the demanded bit of %2 in trunc only uses the lower 8 bits of %a which is redefined, therefore %a's usage depends on how the function return is used.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D97074
2021-06-02 10:07:40 -04:00
Daniil Fukalov 0195e594fe [TTI] NFC: Change getIntImmCodeSizeCost to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D102915
2021-06-02 16:04:11 +03:00
Bjorn Pettersson 9c54ee4378 [SimplifyLibCalls] Take size of int into consideration when emitting ldexp/ldexpf
When rewriting
  powf(2.0, itofp(x)) -> ldexpf(1.0, x)
  exp2(sitofp(x)) -> ldexp(1.0, sext(x))
  exp2(uitofp(x)) -> ldexp(1.0, zext(x))

the wrong type was used for the second argument in the ldexp/ldexpf
libc call, for target architectures with 16 bit "int" type.
The transform incorrectly used a bitcasted function pointer with
a 32-bit argument when emitting the ldexp/ldexpf call for such
targets.

The fault is solved by using the correct function prototype
in the call, by asking TargetLibraryInfo about the size of "int".
TargetLibraryInfo by default derives the size of the int type by
assuming that it is 16 bits for 16-bit architectures, and
32 bits otherwise. If this isn't true for a target it should be
possible to override that default in the TargetLibraryInfo
initializer.

Differential Revision: https://reviews.llvm.org/D99438
2021-06-02 11:40:34 +02:00
Arthur Eubanks 8961293851 [OpaquePtr] Create API to make a copy of a PointerType with some address space
Some existing places use getPointerElementType() to create a copy of a
pointer type with some new address space.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103429
2021-06-01 16:52:32 -07:00
Arthur Eubanks 26044c6a54 [InstSimplify] Treat invariant group insts as bitcasts for load operands
We can look through invariant group intrinsics for the purposes of
simplifying the result of a load.

Since intrinsics can't be constants, but we also don't want to
completely rewrite load constant folding, we convert the load operand to
a constant. For GEPs and bitcasts we just treat them as constants. For
invariant group intrinsics, we treat them as a bitcast.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D101103
2021-06-01 16:33:06 -07:00
Eli Friedman fd229caa01 [polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs.
When we're remapping an AddRec, the AddRec constructed by a partial
rewrite might not make sense.  This triggers an assertion complaining
it's not loop-invariant.

Instead of constructing the partially rewritten AddRec, just skip
straight to calling evaluateAtIteration.

Testcase was automatically reduced using llvm-reduce, so it's a little
messy, but hopefully makes sense.

Differential Revision: https://reviews.llvm.org/D102959
2021-06-01 09:51:05 -07:00
Florian Hahn aa00b1d763
[LV] Try to sink users recursively for first-order recurrences.
Update isFirstOrderRecurrence to  explore all uses of a recurrence phi
and check if we can sink them. If there are multiple users to sink, they
are all mapped to the previous instruction.

Fixes PR44286 (and another PR or two).

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D84951
2021-05-31 19:55:33 +01:00
Daniil Fukalov e853d3b274 [NFC] MemoryDependenceAnalysis cleanup.
1. Removed redundant includes,
2. Removed never defined and used `releaseMemory()`.
3. Fixed member functions names first letter case.
4. Renamed duplicate (in nested struct `NonLocalPointerInfo`) name
   `NonLocalDeps` to `NonLocalDepsMap`.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D102358
2021-05-31 18:07:55 +03:00
Roman Lebedev f7c95c3322
[NFC] ScalarEvolution: apply SSO to the ExprValueMap value
ExprValueMap is a map from SCEV * to a set-vector of (Value *, ConstantInt *) pair,
and while the map itself will likely be big-ish (have many keys),
it is a reasonable assumption that each key will refer to a small-ish
number of pairs.

In particular looking at n=512 case from
https://bugs.llvm.org/show_bug.cgi?id=50384,
the small-size of 4 appears to be the sweet spot,
it results in the least allocations while minimizing memory footprint.
```
$ for i in $(ls heaptrack.opt.*.gz); do echo $i; heaptrack_print $i | tail -n 6; echo ""; done
heaptrack.opt.0-orig.gz
total runtime: 14.32s.
calls to allocation functions: 8222442 (574192/s)
temporary memory allocations: 2419000 (168924/s)
peak heap memory consumption: 190.98MB
peak RSS (including heaptrack overhead): 239.65MB
total memory leaked: 67.58KB

heaptrack.opt.1-n1.gz
total runtime: 13.72s.
calls to allocation functions: 7184188 (523705/s)
temporary memory allocations: 2419017 (176338/s)
peak heap memory consumption: 191.38MB
peak RSS (including heaptrack overhead): 239.64MB
total memory leaked: 67.58KB

heaptrack.opt.2-n2.gz
total runtime: 12.24s.
calls to allocation functions: 6146827 (502355/s)
temporary memory allocations: 2418997 (197695/s)
peak heap memory consumption: 163.31MB
peak RSS (including heaptrack overhead): 211.01MB
total memory leaked: 67.58KB

heaptrack.opt.3-n4.gz
total runtime: 12.28s.
calls to allocation functions: 6068532 (494260/s)
temporary memory allocations: 2418985 (197017/s)
peak heap memory consumption: 155.43MB
peak RSS (including heaptrack overhead): 201.77MB
total memory leaked: 67.58KB

heaptrack.opt.4-n8.gz
total runtime: 12.06s.
calls to allocation functions: 6068042 (503321/s)
temporary memory allocations: 2418992 (200646/s)
peak heap memory consumption: 166.03MB
peak RSS (including heaptrack overhead): 213.55MB
total memory leaked: 67.58KB

heaptrack.opt.5-n16.gz
total runtime: 12.14s.
calls to allocation functions: 6067993 (499958/s)
temporary memory allocations: 2418999 (199307/s)
peak heap memory consumption: 187.24MB
peak RSS (including heaptrack overhead): 233.69MB
total memory leaked: 67.58KB
```

While that test may be an edge worst-case scenario,
https://llvm-compile-time-tracker.com/compare.php?from=dee85d47d9f15fc268f7b18f279dac2774836615&to=98a57e31b1947d5bcdf4a5605ac2ab32b4bd5f63&stat=instructions
agrees that this also results in improvements in the usual situations.
2021-05-31 15:34:03 +03:00
Sanjay Patel 7bb8bfa062 [InstCombine] fix miscompile from vector select substitution
This is similar to the fix in c590a9880d ( PR49832 ), but
we missed handling the pattern for select of bools (no compare
inst).

We can't substitute a vector value because the equality condition
replacement that we are attempting requires that the condition
is true/false for the entire value. Vector select can be partly
true/false.

I added an assert for vector types, so we shouldn't hit this again.
Fixed formatting while auditing the callers.

https://llvm.org/PR50500
2021-05-30 07:11:58 -04:00
Mindong Chen 71acce68da [NFCI] Move DEBUG_TYPE definition below #includes
When you try to define a new DEBUG_TYPE in a header file, DEBUG_TYPE
definition defined around the #includes in files include it could
result in redefinition warnings even compile errors.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D102594
2021-05-30 17:31:01 +08:00
Florian Hahn ec1f6f7e3f
Revert "[LAA] Support pointer phis in loop by analyzing each incoming pointer."
This reverts commit 1ed7f8ede5.

This change can cause loop-distribute to crash in some cases. Revert
until I have more time to wrap up a fix.

See  PR50296, PR5028 and D102266.
2021-05-28 10:33:52 +01:00
Yang Fan f2264ebb08
[ConstantFolding] Fix -Wunused-variable warning (NFC)
GCC warning:
```
/llvm-project/llvm/lib/Analysis/ConstantFolding.cpp: In function ‘llvm::Constant* llvm::ConstantFoldLoadFromConstPtr(llvm::Constant*, llvm::Type*, const llvm::DataLayout&)’:
/llvm-project/llvm/lib/Analysis/ConstantFolding.cpp:713:19: warning: unused variable ‘SimplifiedGEP’ [-Wunused-variable]
  713 |         if (auto *SimplifiedGEP = dyn_cast<GEPOperator>(Simplified)) {
      |                   ^~~~~~~~~~~~~
```
2021-05-28 16:17:12 +08:00
Arthur Eubanks 8086f9d87e [ConstFold] Simplify a load's GEP operand through local aliases
MSVC-style RTTI produces loads through a GEP of a local alias which
itself is a GEP. Currently we aren't able to devirtualize any virtual
calls when MSVC RTTI is enabled.

This patch attempts to simplify a load's GEP operand by calling
SymbolicallyEvaluateGEP() with an option to look through local aliases.

Differential Revision: https://reviews.llvm.org/D101100
2021-05-27 16:04:19 -07:00
Philip Reames ff08c3468f [SCEV] Compute trip multiple for multiple exit loops
This patch implements getSmallConstantTripMultiple(L) correctly for multiple exit loops. The previous implementation was both imprecise, and violated the specified behavior of the method. This was fine in practice, because it turns out the function was both dead in real code, and not tested for the multiple exit case.

Differential Revision: https://reviews.llvm.org/D103189
2021-05-26 11:52:25 -07:00
Philip Reames 9306bb638f [SCEV] Generalize getSmallConstantTripCount(L) for multiple exit loops
This came up in review for another patch, see https://reviews.llvm.org/D102982#2782407 for full context.

I've reviewed the callers to make sure they can handle multiple exit loops w/non-zero returns.  There's two cases in target cost models where results might change (Hexagon and PowerPC), but the results looked legal and reasonable.  If a target maintainer wishes to back out the effect of the costing change, they should explicitly check for multiple exit loops and handle them as desired.

Differential Revision: https://reviews.llvm.org/D103182
2021-05-26 11:18:25 -07:00
Philip Reames 921d3f7af0 [SCEV] Add a utility for converting from "exit count" to "trip count"
(Mostly as a logical place to put a comment since this is a reoccuring confusion.)
2021-05-26 10:41:49 -07:00
Philip Reames fb14577d0c [SCEV] Extract out a helper for computing trip multiples 2021-05-26 10:15:03 -07:00
Philip Reames 9cc2181ec3 [unroll] Use value domain for symbolic execution based cost model
The current full unroll cost model does a symbolic evaluation of the loop up to a fixed limit. That symbolic evaluation currently simplifies to constants, but we can generalize to arbitrary Values using the InstructionSimplify infrastructure at very low cost.

By itself, this enables some simplifications, but it's mainly useful when combined with the branch simplification over in D102928.

Differential Revision: https://reviews.llvm.org/D102934
2021-05-26 08:41:25 -07:00
Vitaly Buka f44f2e0afc [NFC] Fix 'unused' warning 2021-05-25 12:23:57 -07:00
Nikita Popov 6300c37a46 [SCEV] Cache operands used in BEInfo (NFC)
When memoized values for a SCEV expressions are dropped, we also
drop all BECounts that make use of the SCEV expression. This is done
by iterating over all the ExitNotTaken counts and (recursively)
checking whether they use the SCEV expression. If there are many
exits, this will take a lot of time.

This patch improves the situation by pre-computing a set of all
used operands, so that we can determine whether a certain BEInfo
needs to be invalidated using a simple set lookup. Will still need
to loop over all BEInfos though.

This makes for a mild improvement on non-degenerate cases:
https://llvm-compile-time-tracker.com/compare.php?from=b661a55a253f4a1cf5a0fbcb86e5ba7b9fb1387b&to=be1393f450e594c53f0ad7e62339a6bc831b16f6&stat=instructions

For the degenerate case from https://bugs.llvm.org/show_bug.cgi?id=50384,
for n=128 I'm seeing run time drop from 1.6s to 1.1s.

Differential Revision: https://reviews.llvm.org/D102796
2021-05-25 21:03:33 +02:00
Sanjay Patel ca7eaa0a54 [InstSimplify] allow undef element match in vector select condition value
The semantics of select with undefined/poison condition
are not explicitly stated in the LangRef, but this matches
comments in the code and Alive2 appears to concur:
https://alive2.llvm.org/ce/z/KXytmd

We can find this pattern after demanded elements transforms.

As noted in D101191, fuzzers are finding infinite loops because
we may not account for this pattern in other passes.
2021-05-25 14:25:34 -04:00
Philip Reames aabca2d1da [SCEV] Cleanup doesIVOverflowOnX checks [NFC]
Stylistic changes only.
1) Don't pass a parameter just to do an early exit.
2) Use a name which matches actual behavior.
2021-05-25 10:12:24 -07:00
Philip Reames a47b2d4567 [SCEV] Remove unused parameter from computeBECount [NFC]
All callers pass "false" for the Equality parameter.  Kill the dead code, and update the function block comment.
2021-05-25 09:58:56 -07:00
David Goldblatt 8607a02357 [InstSimplify] Transform X * Y % Y --> 0
simplifyDiv already handles the case X * Y / Y --> X (barring overflow).
This adds the equivalent handling to simplifyRem.

Correctness:
https://alive2.llvm.org/ce/z/J2cUbS
https://alive2.llvm.org/ce/z/us9NUM
https://alive2.llvm.org/ce/z/AvaDGJ
https://alive2.llvm.org/ce/z/kq9ige

Extending the situations in which we apply this transform would not be
correct:
https://alive2.llvm.org/ce/z/Lf9V63
https://alive2.llvm.org/ce/z/6RPQK3
https://alive2.llvm.org/ce/z/p9UdxC
https://alive2.llvm.org/ce/z/A2zlhE
https://alive2.llvm.org/ce/z/vHTtLw
https://alive2.llvm.org/ce/z/lvpH42

Differential Revision: https://reviews.llvm.org/D102864
2021-05-25 10:16:04 -04:00
Sanjay Patel a0e71f1832 [ConstProp] propagate poison from vector reduction element(s) to result
This follows from the underlying logic for binops and min/max.
Although it does not appear that we handle this for min/max
intrinsics currently.
https://alive2.llvm.org/ce/z/Kq9Xnh
2021-05-24 10:34:40 -04:00
Martin Storsjö c5638a71d8 [MinGW] Mark a number of library functions unavailable for mingw targets
These functions were marked unavailable for MSVC targets before,
within an "T.isOSWindows() && !T.isOSCygMing()" block, but these ones
are unavailable on MinGW targets too.

This avoids generating calls to stpcpy for MinGW targets, which has
been happening since 6dbf0cfcf7 (in
some cases).

This fixes https://github.com/mstorsjo/llvm-mingw/issues/201.

Differential Revision: https://reviews.llvm.org/D102946
2021-05-22 23:40:19 +03:00
Serge Pavlov c9c05a91c4 [ConstantFolding] Use APFloat for constant folding. NFC
Replace use of host floating types with operations on APFloat when it is
possible. Use of APFloat makes analysis more convenient and facilitates
constant folding in the case of non-default FP environment.

Differential Revision: https://reviews.llvm.org/D102672
2021-05-22 13:00:20 +07:00
Arthur Eubanks f7788e1bff Revert "[NewPM] Only invalidate modified functions' analyses in CGSCC passes"
This reverts commit d14d84af2f.

Causes unacceptable memory regressions.
2021-05-21 16:38:03 -07:00
Arthur Eubanks a52530dd6a Revert "[NPM] Do not run function simplification pipeline unnecessarily"
This reverts commit 97ab068034.

Depends on D100917, which is to be reverted.
2021-05-21 16:38:02 -07:00
Philip Reames cc5f6ae4b4 Move a definition into cpp from header in advance of other changes [nfc] 2021-05-21 09:18:04 -07:00
Daniil Fukalov e1cb98be2d [TTI] NFC: Change getCostOfKeepingLiveOverCall to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D102831
2021-05-21 15:18:12 +03:00
Daniil Fukalov e8e88c3353 [TTI] NFC: Change getRegUsageForType to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D102541
2021-05-21 15:17:23 +03:00
Joe Ellis 5a476987f7 [InstSimplify] Properly constrain {insert,extract}_subvector intrinsic fold
The previous rule:

   (insert_vector _, (extract_vector X, 0), 0) -> X

is not quite correct. The correct fold should be:

   (insert_vector Y, (extract_vector X, 0), 0) -> X
   where: Y is X, or Y is undef

This commit updates the pattern.

Reviewed By: peterwaller-arm, paulwalker-arm

Differential Revision: https://reviews.llvm.org/D102699
2021-05-21 10:05:03 +00:00
Serge Pavlov c162f086ba [APFloat] convertToDouble/Float can work on shorter types
Previously APFloat::convertToDouble may be called only for APFloats that
were built using double semantics. Other semantics like single precision
were not allowed although corresponding numbers could be converted to
double without loss of precision. The similar restriction applied to
APFloat::convertToFloat.

With this change any APFloat that can be precisely represented by double
can be handled with convertToDouble. Behavior of convertToFloat was
updated similarly. It make the conversion operations more convenient and
adds support for formats like half and bfloat.

Differential Revision: https://reviews.llvm.org/D102671
2021-05-21 11:02:51 +07:00
Nikita Popov b661a55a25 [ScalarEvolution] Remove unused ExitLimit::hasOperand() method (NFC)
We only use BackedgeTakenInfo::hasOperand().
2021-05-19 18:42:14 +02:00
Arthur Eubanks 6b9524a05b [NewPM] Don't mark AA analyses as preserved
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved, they won't be invalidated unless their analyses are.

SCEVAAResults was the one exception to this, it was treated like a
typical analysis result. Make it like the others and don't invalidate
unless SCEV is invalidated.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D102032
2021-05-18 13:49:03 -07:00
Arthur Eubanks cc64ece77d [NFC][OpaquePtr] Avoid using PointerType::getElementType() in VectorUtils.cpp
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D102533
2021-05-17 18:35:44 -07:00
Nikita Popov 7243120198 [CaptureTracking] Simplify reachability check (NFCI)
This code was re-implementing the same-BB case of
isPotentiallyReachable(). Historically, this was done because
CaptureTracking used additional caching for local dominance
queries. Now that it is no longer needed, the code is effectively
the same as isPotentiallyReachable().

The only difference are extra checks for invoke/phis. These are
misleading checks related to dominance in the value availability
sense that are not relevant for control reachability. The invoke
check was correct but redundant in that invokes are always
terminators, so `I` could never come before the invoke. The phi
check is a matter of interpretation (should an earlier phi node be
considered reachable from a later phi node in the same block?)
but ultimately doesn't matter because phis don't capture anyway.
2021-05-16 16:04:10 +02:00
Nikita Popov 656296b1c2 Reapply [CaptureTracking] Do not check domination
Reapply after adjusting the synchronized.m test case, where the
TODO is now resolved. The pointer is only captured on the exception
handling path.

-----

For the CapturesBefore tracker, it is sufficient to check that
I can not reach BeforeHere. This does not necessarily require
that BeforeHere dominates I, it can also occur if the capture
happens on an entirely disjoint path.

This change was previously accepted in D90688, but had to be
reverted due to large compile-time impact in some cases: It
increases the number of reachability queries that are performed.

After recent changes, the compile-time impact is largely mitigated,
so I'm reapplying this patch. The remaining compile-time impact
is largely proportional to changes in code-size.
2021-05-16 15:46:31 +02:00
Nikita Popov 541c2845de Revert "[CaptureTracking] Do not check domination"
This reverts commit 6b8b43e7af.

This causes clang test to fail (CodeGenObjC/synchronized.m).
Revert until I can figure out whether that's an expected change.
2021-05-16 11:04:45 +02:00
Nikita Popov 6b8b43e7af [CaptureTracking] Do not check domination
For the CapturesBefore tracker, it is sufficient to check that
I can not reach BeforeHere. This does not necessarily require
that BeforeHere dominates I, it can also occur if the capture
happens on an entirely disjoint path.

This change was previously accepted in D90688, but had to be
reverted due to large compile-time impact in some cases: It
increases the number of reachability queries that are performed.

After recent changes, the compile-time impact is largely mitigated,
so I'm reapplying this patch. The remaining compile-time impact
is largely proportional to changes in code-size.
2021-05-16 10:49:36 +02:00
Nikita Popov 6e9363c942 [CaptureTracking] Only check reachability for capture candidates
Reachability queries are very expensive, and currently performed
for each instruction we look at, even though most of them will
not lead to a capture and are thus ultimately irrelevant. It is
more efficient to walk a few unnecessary instructions than to
perform unnecessary reachability queries.

Theoretically, this may produce worse results, because the additional
instructions considered may cause us to hit the use count limit
earlier. In practice, this does not appear to be a problem, e.g.
on test-suite O3 we report only one more captured-before with this
change, with no resulting codegen differences.

This makes PointerMayBeCapturedBefore() significantly cheaper in
practice, hopefully allowing it to be used in more places.
2021-05-15 22:57:56 +02:00
Nikita Popov f9e9b0cdb4 [CFG] Move reachable from entry checks into basic block variant
These checks are not specific to the instruction based variant of
isPotentiallyReachable(), they are equally valid for the basic
block based variant. Move them there, to make sure that switching
between the instruction and basic block variants cannot introduce
regressions.
2021-05-15 15:42:02 +02:00
Nikita Popov fb9ed1979a [IR] Add BasicBlock::isEntryBlock() (NFC)
This is a recurring and somewhat awkward pattern. Add a helper
method for it.
2021-05-15 12:41:58 +02:00
Nikita Popov 6418bab6f8 [CFG] Use comesBefore() (NFC)
Use comesBefore() instead of performing an instruction walk. In
line with the previous implementation, instructions are considered
to reach themselves.
2021-05-15 12:14:30 +02:00
Nikita Popov f765e54db2 [CaptureTracking] Clean up same instruction check (NFC)
Check the BeforeHere == I case once in shouldExplore, instead of
handling it in four different places.
2021-05-15 11:58:55 +02:00
Nick Desaulniers 8c72749bd9 [LowerConstantIntrinsics] reuse isManifestLogic from ConstantFolding
GlobalVariables are Constants, yet should not unconditionally be
considered true for __builtin_constant_p.

Via the LangRef
https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic:

    This intrinsic generates no code. If its argument is known to be a
    manifest compile-time constant value, then the intrinsic will be
    converted to a constant true value. Otherwise, it will be converted
    to a constant false value.

    In particular, note that if the argument is a constant expression
    which refers to a global (the address of which _is_ a constant, but
    not manifest during the compile), then the intrinsic evaluates to
    false.

Move isManifestConstant from ConstantFolding to be a method of
Constant so that we can reuse the same logic in
LowerConstantIntrinsics.

pr/41459

Reviewed By: rsmith, george.burgess.iv

Differential Revision: https://reviews.llvm.org/D102367
2021-05-14 15:35:21 -07:00
Nikita Popov c4fb2a1fc2 [MemDep] Use BatchAA in more places (NFCI)
Previously, we already used BatchAA for individual simple pointer
dependency queries. This extends BatchAA usage for the non-local
case, so that only one BatchAA instance is used for all blocks,
instead of one instance per block.

Use of BatchAA is safe as IR cannot be modified during a MemDep
query.
2021-05-14 22:54:40 +02:00
Nikita Popov 5e289cc597 [AA] Support callCapturesBefore() on BatchAA (NFCI)
This is not expected to have any practical compile-time effect,
as the alias() calls inside callCapturesBefore() are rare. This
should still be supported for API completeness, and might be
useful for reachability caching.
2021-05-14 21:48:08 +02:00
Philip Reames 23c93c2555 Discount invariant instructions in full unrolling
This patch updates the cost model for full unrolling to discount the cost of a loop invariant expression on all but one iteration. The reasoning here is that such an expression (as determined by SCEV) will be CSEd or DSEd once the loop is unrolled. Note that SCEVs reasoning will find things which could be invariant, not simply those outside the loop.

Differential Revision: https://reviews.llvm.org/D102506
2021-05-14 11:07:19 -07:00
dfukalov fdae3fc8b3 [GVN] Clobber partially aliased loads.
Use offsets stored in `AliasResult` implemented in D98718.

Updated with fix of issue reported in https://reviews.llvm.org/D95543#2745161

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95543
2021-05-14 11:17:14 +03:00
Nikita Popov 425781bce0 [CaptureTracking] Use isIdentifiedFunctionLocal() (NFC)
These conditions together exactly match isIdentifiedFunctionLocal(),
and this is also what we logically want to check for here.
2021-05-13 23:06:42 +02:00
Nikita Popov dce158c58d [AA] Use isIdentifiedFunctionLocal() (NFC)
This condition is equivalent to isIdentifiedFunctionLocal(),
and this is also what we semantically want to check here.
2021-05-13 23:06:42 +02:00
Joe Ellis 2ed7db0d20 [InstSimplify] Remove redundant {insert,extract}_vector intrinsic chains
This commit removes some redundant {insert,extract}_vector intrinsic
chains by implementing the following patterns as instsimplifies:

   (insert_vector _, (extract_vector X, 0), 0) -> X
   (extract_vector (insert_vector _, X, 0), 0) -> X

Reviewed By: peterwaller-arm

Differential Revision: https://reviews.llvm.org/D101986
2021-05-13 16:09:50 +00:00
Florian Hahn e2759f110b
[SCEV] Apply guards to max with non-unitary steps.
We already apply loop-guards when computing the maximum with unitary
steps. This extends the code to also do so when dealing with non-unitary
steps.

This allows us to infer a tighter maximum in some cases.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D102267
2021-05-13 09:47:29 +01:00
Jordan Rupprecht fec2945998 Revert "[GVN] Clobber partially aliased loads."
This reverts commit 6c57044231.

It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543:

```
$ cat repro.ll
; ModuleID = 'repro.ll'
source_filename = "repro.ll"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%struct.widget = type { i32 }
%struct.baz = type { i32, %struct.snork }
%struct.snork = type { %struct.spam }
%struct.spam = type { i32, i32 }

@global = external local_unnamed_addr global %struct.widget, align 4
@global.1 = external local_unnamed_addr global i8, align 1
@global.2 = external local_unnamed_addr global i32, align 4

define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 {
bb:
  %tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1
  %tmp1 = bitcast %struct.snork* %tmp to i64*
  %tmp2 = load i64, i64* %tmp1, align 4
  %tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1
  %tmp4 = icmp ugt i64 %tmp2, 4294967295
  br label %bb5

bb5:                                              ; preds = %bb14, %bb
  %tmp6 = load i32, i32* %tmp3, align 4
  %tmp7 = icmp ne i32 %tmp6, 0
  %tmp8 = select i1 %tmp7, i1 %tmp4, i1 false
  %tmp9 = zext i1 %tmp8 to i8
  store i8 %tmp9, i8* @global.1, align 1
  %tmp10 = load i32, i32* @global.2, align 4
  switch i32 %tmp10, label %bb11 [
    i32 1, label %bb12
    i32 2, label %bb12
  ]

bb11:                                             ; preds = %bb5
  br label %bb14

bb12:                                             ; preds = %bb5, %bb5
  %tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4
  br label %bb14

bb14:                                             ; preds = %bb12, %bb11
  br label %bb5
}
$ opt -O2 repro.ll -disable-output
opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value *llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst *, unsigned int, llvm::Type *, llvm::Instruction *, const llvm::DataLayout &): Assertion `SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output
...
```
2021-05-11 16:08:53 -07:00
Stanislav Mekhanoshin 22d295f695 [AMDGPU] Constant fold Intrinsic::amdgcn_perm
Differential Revision: https://reviews.llvm.org/D102203
2021-05-10 16:23:11 -07:00
Florian Hahn 93a9a8a8d9
[VecLib] Add support for vector fns from Darwin's libsystem.
This patch adds support for Darwin's libsystem math vector functions to
TLI. Darwin's libsystem provides a range of vector functions for libm
functions.

This initial patch only adds the 2 x double and 4 x float versions,
which are available on both X86 and ARM64. On X86, wider vector versions
are supported as well.

Reviewed By: jroelofs

Differential Revision: https://reviews.llvm.org/D101856
2021-05-10 21:19:58 +01:00
Andy Kaylor 7086025d65 [Dependence Analysis] Enable delinearization of fixed sized arrays
Patch by Artem Radzikhovskyy!

Allow delinearization of fixed sized arrays if we can prove that the GEP indices do not overflow the array dimensions. The checks applied are similar to the ones that are used for delinearization of parametric size arrays. Make sure that the GEP indices are non-negative and that they are smaller than the range of that dimension.

Changes Summary:

- Updated the LIT tests with more exact values, as we are able to delinearize and apply more exact tests
- profitability.ll - now able to delinearize in all cases, no need to use -da-disable-delinearization-checks flag and run the test twice
- loop-interchange-optimization-remarks.ll - in one of the cases we are able to delinearize without using -da-disable-delinearization-checks
- SimpleSIVNoValidityCheckFixedSize.ll - removed unnecessary "-da-disable-delinearization-checks" flag. Now can get the exact answer without it.
- SimpleSIVNoValidityCheckFixedSize.ll and PreliminaryNoValidityCheckFixedSize.ll - made negative tests more explicit, in order to demonstrate the need for "-da-disable-delinearization-checks" flag

Differential Revision: https://reviews.llvm.org/D101486
2021-05-10 10:30:15 -07:00
Nikita Popov d26ca78c18 [SCEV] Handle and/or in applyLoopGuards()
applyLoopGuards() already combines conditions from multiple nested
guards. However, it cannot use multiple conditions on the same guard,
combined using and/or. Add support for this by recursing into either
`and` or `or`, depending on the direction of the branch.

Differential Revision: https://reviews.llvm.org/D101692
2021-05-09 21:34:28 +02:00
Arthur Eubanks 34a8a437bf [NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose
Printing pass manager invocations is fairly verbose and not super
useful.

This allows us to remove DebugLogging from pass managers and PassBuilder
since all logging (aside from analysis managers) goes through
instrumentation now.

This has the downside of never being able to print the top level pass
manager via instrumentation, but that seems like a minor downside.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D101797
2021-05-07 21:51:47 -07:00
Florian Hahn 6c99e63120 [SCEV] By more careful when traversing phis in isImpliedViaMerge.
I think currently isImpliedViaMerge can incorrectly return true for phis
in a loop/cycle, if the found condition involves the previous value of

Consider the case in exit_cond_depends_on_inner_loop.

At some point, we call (modulo simplifications)
isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1).

The existing code tries to prove IncV <= -1 for all incoming values
InvV using the found condition (%call <= -1). At the moment this succeeds,
but only because it does not compare the same runtime value. The found
condition checks the value of the last iteration, but the incoming value
is from the *previous* iteration.

Hence we incorrectly determine that the *previous* value was <= -1,
which may not be true.

I think we need to be more careful when looking at the incoming values
here. In particular, we need to rule out that a found condition refers to
any value that may refer to one of the previous iterations. I'm not sure
there's a reliable way to do so (that also works of irreducible control
flow).

So for now this patch adds an additional requirement that the incoming
value must properly dominate the phi block. This should ensure the
values do not change in a cycle. I am not entirely sure if will catch
all cases and I appreciate a through second look in that regard.

Alternatively we could also unconditionally bail out in this case,
instead of checking the incoming values

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101829
2021-05-07 19:52:29 +01:00
Krzysztof Parzyszek 50cf0a1d1a Allow empty value list in propagateMetadata(Inst, ArrayOf...)
This will allow writing
  propagateMetadata(Inst, collectInterestingValues(...))
without concern about empty lists. In case of an empty list,
Inst is returned without any changes.
2021-05-07 13:20:50 -05:00
Fangrui Song d8aba75a76 Internalize some cl::opt global variables or move them under namespace llvm 2021-05-07 11:15:43 -07:00
Whitney Tsang 1006ac3963 [LoopNest] Consider loop nest with inner loop guard using outer loop
induction variable to be perfect

This patch allow more conditional branches to be considered as loop
guard, and so more loop nests can be considered perfect.

Reviewed By: bmahjour, sidbav

Differential Revision: https://reviews.llvm.org/D94717
2021-05-07 16:04:18 +00:00
Joseph Tremoulet bc302bfbef BasicAA: Recognize inttoptr as isEscapeSource
Pointers escape when converted to integers, so a pointer produced by
converting an integer to a pointer must not be a local non-escaping
object.

Reviewed By: nikic, nlopes, aqjune

Differential Revision: https://reviews.llvm.org/D101541
2021-05-07 07:48:50 -07:00
Peilin Guo 911a541620 [LazyValueInfo] Insert an Overdefined placeholder to prevent infinite recursion
getValueFromCondition() uses a Visited set to record the intermediate value.
However, it uses a postorder way to compute the value first and update the
Visited set later. Thus it will be trapped into an infinite recursion if there
exists IRs that use no dominated by its def as in this example:

  %tmp3 = or i1 undef, %tmp4
  %tmp4 = or i1 undef, %tmp3

To prevent this, we can insert an Overdefined placeholder into the set
before computing the actual value.

Reviewed by: nikic

Differential Revision: https://reviews.llvm.org/D101273
2021-05-07 16:05:50 +08:00
Mircea Trofin 97ab068034 [NPM] Do not run function simplification pipeline unnecessarily
The CGSCC pass manager interplay with the FunctionAnalysisManagerCGSCCProxy is 'special' in the sense that the former will rerun the latter if there are changes to a SCC structure; that being said, some of the functions in the SCC may be unchanged. In that case, the function simplification pipeline will be re-run, which impacts compile time[1].

This patch allows the function simplification pipeline be skipped if it was already run and the function was not modified since.

The behavior is currently disabled by default. This is because, currently, the rerunning of the function simplification pipeline on an unchanged function may still result in changes. The patch simplifies investigating and fixing those cases where repeated function pass runs do actually positively impact code quality, while offering an easy workaround for those impacted negatively by compile time regressions, and not impacting mainline scenarios.

[1] A [[ http://llvm-compile-time-tracker.com/compare.php?from=eb37d3546cd0c6e67798496634c45e501f7806f1&to=ac722d1190dc7bbdd17e977ef7ec95e69eefc91e&stat=instructions | compile time tracker ]] run with the option enabled.

Differential Revision: https://reviews.llvm.org/D98103
2021-05-06 12:24:33 -07:00
Bjorn Pettersson 3ee826594a Make dependency between certain analysis passes transitive (reapply)
LazyBlockFrequenceInfoPass, LazyBranchProbabilityInfoPass and
LoopAccessLegacyAnalysis all cache pointers to their nestled required
analysis passes. One need to use addRequiredTransitive to describe
that the nestled passes can't be freed until those analysis passes
no longer are used themselves.

There is still a bit of a mess considering the getLazyBPIAnalysisUsage
and getLazyBFIAnalysisUsage functions. Those functions are used from
both Transform, CodeGen and Analysis passes. I figure it is OK to
use addRequiredTransitive also when being used from Transform and
CodeGen passes. On the other hand, I figure we must do it when
used from other Analysis passes. So using addRequiredTransitive should
be more correct here. An alternative solution would be to add a
bool option in those functions to let the user tell if it is a
analysis pass or not. Since those lazy passes will be obsolete when
new PM has conquered the world I figure we can leave it like this
right now.

Intention with the patch is to fix PR49950. It at least solves the
problem for the reproducer in PR49950. However, that reproducer
need five passes in a specific order, so there are lots of various
"solutions" that could avoid the crash without actually fixing the
root cause.

This is a reapply of commit 3655f0757f, that was reverted in
33ff3c2049 due to problems with assertions in the polly
lit tests. That problem is supposed to be solved by also adjusting
ScopPass to explicitly preserve LazyBlockFrequencyInfo and
LazyBranchProbabilityInfo (it already preserved
OptimizationRemarkEmitter which depends on those lazy passes).

Differential Revision: https://reviews.llvm.org/D100958
2021-05-05 15:17:55 +02:00
Bjorn Pettersson 33ff3c2049 Revert "Make dependency between certain analysis passes transitive"
This reverts commit 3655f0757f.

It caused assertion failures related to setLastUser in polly builds.
2021-05-04 19:08:41 +02:00
Bjorn Pettersson 3655f0757f Make dependency between certain analysis passes transitive
LazyBlockFrequenceInfoPass, LazyBranchProbabilityInfoPass and
LoopAccessLegacyAnalysis all cache pointers to their nestled required
analysis passes. One need to use addRequiredTransitive to describe
that the nestled passes can't be freed until those analysis passes
no longer are used themselves.

There is still a bit of a mess considering the getLazyBPIAnalysisUsage
and getLazyBFIAnalysisUsage functions. Those functions are used from
both Transform, CodeGen and Analysis passes. I figure it is OK to
use addRequiredTransitive also when being used from Transform and
CodeGen passes. On the other hand, I figure we must do it when
used from other Analysis passes. So using addRequiredTransitive should
be more correct here. An alternative solution would be to add a
bool option in those functions to let the user tell if it is a
analysis pass or not. Since those lazy passes will be obsolete when
new PM has conquered the world I figure we can leave it like this
right now.

Intention with the patch is to fix PR49950. It at least solves the
problem for the reproducer in PR49950. However, that reproducer
need five passes in a specific order, so there are lots of various
"solutions" that could avoid the crash without actually fixing the
root cause.

Differential Revision: https://reviews.llvm.org/D100958
2021-05-04 11:50:08 +02:00
Simon Moll 1db4dbba24 Recommit "[VP,Integer,#2] ExpandVectorPredication pass"
This reverts the revert 02c5ba8679

Fix:

Pass was registered as DUMMY_FUNCTION_PASS causing the newpm-pass
functions to be doubly defined. Triggered in -DLLVM_ENABLE_MODULE=1
builds.

Original commit:

This patch implements expansion of llvm.vp.* intrinsics
(https://llvm.org/docs/LangRef.html#vector-predication-intrinsics).

VP expansion is required for targets that do not implement VP code
generation. Since expansion is controllable with TTI, targets can switch
on the VP intrinsics they do support in their backend offering a smooth
transition strategy for VP code generation (VE, RISC-V V, ARM SVE,
AVX512, ..).

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D78203
2021-05-04 11:47:52 +02:00
Arthur Eubanks d14d84af2f [NewPM] Only invalidate modified functions' analyses in CGSCC passes
Previously, any change in any function in an SCC would cause all
analyses for all functions in the SCC to be invalidated. With this
change, we now manually invalidate analyses for functions we modify,
then let the pass manager know that all function analyses should be
preserved.

So far this only touches the inliner, argpromotion, funcattrs, and
updateCGAndAnalysisManager(), since they are the most used.

Slight compile time improvements:
http://llvm-compile-time-tracker.com/compare.php?from=326da4adcb8def2abdd530299d87ce951c0edec9&to=8942c7669f330082ef159f3c6c57c3c28484f4be&stat=instructions

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D100917
2021-05-03 17:21:44 -07:00
Philip Reames e38ccb729b Recommit "Generalize getInvertibleOperand recurrence handling slightly"
This was reverted because of a reported problem.  It turned out this patch didn't introduce said problem, it just exposed it more widely.  15a4233 fixes the root issue, so this simple a) rebases over that, and b) adds a much more extensive comment explaining why that weakened assert is correct.

Original commit message follows:

Follow up to D99912, specifically the revert, fix, and reapply thereof.

This generalizes the invertible recurrence logic in two ways:
* By allowing mismatching operand numbers of the phi, we can recurse through a pair of phi recurrences whose operand orders have not been canonicalized.
* By allowing recurrences through operand 1, we can invert these odd (but legal) recurrence.

Differential Revision: https://reviews.llvm.org/D100884
2021-05-03 16:40:56 -07:00
Sanjay Patel 15a42339fe [ValueTracking] soften assert for invertible recurrence matching
There's a TODO comment in the code and discussion in D99912
about generalizing this, but I wasn't sure how to implement that,
so just going with a potential minimal fix to avoid crashing.

The test is a reduction beyond useful code (there's no user of
%user...), but it is based on https://llvm.org/PR50191, so this
is asserting on real code.

Differential Revision: https://reviews.llvm.org/D101772
2021-05-03 15:57:40 -04:00
Juneyoung Lee d4d1caafc8 Fix MSan crash after 1977c53b 2021-05-02 13:44:43 +09:00
Arthur Eubanks 07a9df5993 [NFC] Use getParamByValType instead of pointee type
To reduce dependence on pointee types for opaque pointers.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D101706
2021-05-01 21:22:41 -07:00
Juneyoung Lee 7257e6a68a [ValueTracking] ctpop propagates poison
This is a patch that adds ctpop intrinsics to propagatesPoison.

Splitted from D101191
2021-05-02 13:04:37 +09:00
Juneyoung Lee 64e768e816 [ValueTracking] Improve impliesPoison to look into overflow intrinsics
This update supports the following transformation:

```
select(extract(mul_with_overflow(a, _), _), (a == 0), false)
=>
and(extract(mul_with_overflow(a, _), _), (a == 0))
```

which is correct because if `a` was poison the select's condition was
also poison.

This update is splitted from D101423.
2021-05-02 12:03:55 +09:00
Juneyoung Lee 1977c53b2a [InstCombine] Fold overflow bit of [u|s]mul.with.overflow in a poison-safe way
As discussed in D101191, this patch adds a poison-safe folding of overflow bit check:
```
  %Op0 = icmp ne i4 %X, 0
  %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y)
  %Op1 = extractvalue { i4, i1 } %Agg, 1
  %ret = select i1 %Op0, i1 %Op1, i1 false
=>
  %Y.fr = freeze %Y
  %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y.fr)
  %Op1 = extractvalue { i4, i1 } %Agg, 1
  %ret = %Op1
```

https://alive2.llvm.org/ce/z/zgPUGT
https://alive2.llvm.org/ce/z/h2gZ_6

Note that there are cases where inserting freeze is not necessary: e.g. %Y is `noundef`.
In this case, LLVM is already good because `%ret` is already successfully folded into `and`,
triggering the pre-existing optimization in InstSimplify: https://godbolt.org/z/v6qena15K

Differential Revision: https://reviews.llvm.org/D101423
2021-05-02 11:54:12 +09:00
Nikita Popov db9d00c5e7 [LVI] Handle mask not equal zero conditions
If V & Mask != 0, we know that at least one of the bits in Mask
must be set, so the value must be >= the lowest bit in Mask.
2021-05-01 23:08:49 +02:00
Nikita Popov cc58e8918b [SCEV] Simplify backedge count clearing (NFC)
This seems to be a leftover from when the BackedgeTakenInfo
stored multiple exit counts with manual memory management. At
some point this was switchted to a simple vector, and there should
be no need to micro-manage the clearing anymore. We can simply
drop the loop from the map and the the destructor do its job.
2021-05-01 17:50:01 +02:00
Adrian Prantl 02c5ba8679 Revert "[VP,Integer,#2] ExpandVectorPredication pass"
This reverts commit 43bc584dc0.

The commit broke the -DLLVM_ENABLE_MODULES=1 builds.

http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/31603/consoleFull#2136199809a1ca8a51-895e-46c6-af87-ce24fa4cd561
2021-04-30 17:02:28 -07:00
Nikita Popov fe230dc197 [ValueTracking] Slightly clean up programUndefinedIfUndefOrPoison() (NFC)
Use contains() to check set membership, and adjust an oddly
structured loop.
2021-04-30 23:05:41 +02:00
Nikita Popov 2cd7868605 [ValueTracking] Limit scan when checking poison UB (PR50155)
The current code can scan an unlimited number of instructions,
if the containing basic block is very large. The test case from
PR50155 contains a basic block with approximately 100k instructions.

To avoid this, limit the number of instructions we inspect. At
the same time, drop the limit on the number of basic blocks, as
this will be implicitly limited by the number of instructions as
well.
2021-04-30 23:04:49 +02:00
Duncan P. N. Exon Smith 518d955f9d Support: Stop using F_{None,Text,Append} compatibility synonyms, NFC
Stop using the compatibility spellings of `OF_{None,Text,Append}`
left behind by 1f67a3cba9. A follow-up
will remove them.

Differential Revision: https://reviews.llvm.org/D101650
2021-04-30 11:00:03 -07:00
Simon Moll 43bc584dc0 [VP,Integer,#2] ExpandVectorPredication pass
This patch implements expansion of llvm.vp.* intrinsics
(https://llvm.org/docs/LangRef.html#vector-predication-intrinsics).

VP expansion is required for targets that do not implement VP code
generation. Since expansion is controllable with TTI, targets can switch
on the VP intrinsics they do support in their backend offering a smooth
transition strategy for VP code generation (VE, RISC-V V, ARM SVE,
AVX512, ..).

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D78203
2021-04-30 15:47:28 +02:00
Roman Lebedev ba5b015b0d
[InlineCost] CallAnalyzer: use TTI info for extractvalue - they are free (PR50099)
It seems incorrect to use TTI data in some places,
and override it in others. In this case, TTI says
that `extractvalue` are free, yet we bill them.

While this doesn't address https://bugs.llvm.org/show_bug.cgi?id=50099 yet,
it reduces the cost from 55 to 50 while the threshold is 45.

Differential Revision: https://reviews.llvm.org/D101228
2021-04-30 13:55:11 +03:00
Arthur Eubanks a3a798d49d [InlineCost] Remove visitUnaryInstruction()
The simplifyInstruction() in visitUnaryInstruction() does not trigger
for all of check-llvm. Looking at all delegates to UnaryInstruction in
InstVisitor, the only instructions that either don't have a visitor in
CallAnalyzer, or redirect to UnaryInstruction, are VAArgInst and Alloca.
VAArgInst will never get simplified, and visitUnaryInstruction(Alloca)
would always return false anyway.

Reviewed By: mtrofin, lebedev.ri

Differential Revision: https://reviews.llvm.org/D101577
2021-04-29 20:33:30 -07:00
jasonliu 7049fbf960 [XCOFF] Handle the case when personality routine is an alias
Summary:
Personality routine could be an alias to another personality routine.
Fix the situation when we compile the file that contains the personality
routine and the file also have functions that need to refer to the
personality routine.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D101401
2021-04-29 22:03:30 +00:00
Philip Reames a047837b90 Revert "Generalize getInvertibleOperand recurrence handling slightly"
This reverts commit 0c01b37eeb while a problem reported is investigated.
2021-04-29 13:06:26 -07:00
Sanjay Patel 1089158c5a [ConstantFolding] propagate poison through vector reduction intrinsics 2021-04-29 12:54:20 -04:00
Sanjay Patel 71597d40e8 [ConstantFolding] refactor helper for vector reductions; NFC
We should handle other cases (undef/poison), so reduce
the duplication of repeated switches.
2021-04-29 12:09:22 -04:00
Craig Topper 25391cec3a [RISCV] Teach computeKnownBits that vsetvli returns number less than 2^31.
This seems like a reasonable upper bound on VL. WG discussions for
the V spec would probably allow us to use 2^16 as an upper bound
on VLEN, but this is good enough for now.

This allows us to remove sext and zext if user happens to assign
the size_t result into an int and then uses it as a VL intrinsic
argument which is size_t.

Reviewed By: frasercrmck, rogfer01, arcbbb

Differential Revision: https://reviews.llvm.org/D101472
2021-04-29 08:07:59 -07:00
Philip Reames 0c01b37eeb Generalize getInvertibleOperand recurrence handling slightly
Follow up to D99912, specifically the revert, fix, and reapply thereof.

This generalizes the invertible recurrence logic in two ways:
* By allowing mismatching operand numbers of the phi, we can recurse through a pair of phi recurrences whose operand orders have not been canonicalized.
* By allowing recurrences through operand 1, we can invert these odd (but legal) recurrence.

Differential Revision: https://reviews.llvm.org/D100884
2021-04-28 14:38:07 -07:00
Philip Reames 0cc3e10f5e [SCEV] Avoid range intersection idiom in getRangeForUnkownRecurrence [NFC]
Addresses a review comment from D101181
2021-04-28 12:48:17 -07:00
Philip Reames a836de0bde [SCEV] Compute ranges for ashr recurrences
Straight forward extension to the recently added infrastructure which was pioneered with shl. This was originally posted as part of D99687, but split off for ease of review.

(I also decided to exclude the unknown start sign case explicitly for simplicity of understanding.)

Differential Revision: https://reviews.llvm.org/D101181
2021-04-28 12:36:20 -07:00
Florian Hahn 1ed7f8ede5
[LAA] Support pointer phis in loop by analyzing each incoming pointer.
SCEV does not look through non-header PHIs inside the loop. Such phis
can be analyzed by adding separate accesses for each incoming pointer
value.

This results in 2 more loops vectorized in SPEC2000/186.crafty and
avoids regressions when sinking instructions before vectorizing.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D101286
2021-04-28 20:19:40 +01:00
Arthur Eubanks cbce28f07e [ConstFold] Use const-folded operands in more places
Previously we were const folding operands but not passing them.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101394
2021-04-27 14:30:19 -07:00
Nikita Popov e45168c4fa [SCEV] Handle uge/ugt predicates in applyLoopGuards()
These can be handled the same way as ule/ult, just using umax
instead of umin. This is useful in cases where the umax prevents
the upper bound from overflowing.

Differential Revision: https://reviews.llvm.org/D101196
2021-04-27 22:41:05 +02:00
Andy Kaylor 0a82d885a4 [Dependence Analysis] Fix ExactSIV producing wrong analysis
Patch by Artem Radzikhovskyy!

Symptom: ExactSIV test produced incorrect analysis of dependencies see LIT tests
Bug: At the end of the algorithm when determining dependence direction original author forgot to divide intermediate results by gcd and round result toward zero

Although this bug can be fixed with significantly fewer changes I opted to write the code in such a way that reflects the original algorithm that Banerjee proposed, for easier reference in the future. This surprisingly results in shorter code, and fewer quotient and max/min calculations.

Changes Summary:

- fixed findGCD to return valid x and y so that they match the function description where: ax - by = gcd(a,b)
- Fixed ExactSIV test, to produce proper results
- Documented the extension of Banerjee's algorithm that the original code author introduced. Banerjee's original algorithm only tested whether Dst depends on Src, the extension also allows us to test whether Src depends on Dst, in one pass.
- ExactRDIV test worked fine. Since it uses findGCD(), it needed to be updated.Since ExactRDIV test has very few changes from the core algorithm of ExactSIV I modified the test to have consistent format as ExactSIV.
- Updated the LIT tests to be testing for correct values.

Differential Revision: https://reviews.llvm.org/D100331
2021-04-27 12:24:00 -07:00
dfukalov e4c606acaf [TTI] NFC: Change getScalarizationOverhead and getOperandsScalarizationOverhead to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D101283
2021-04-27 08:51:48 +03:00
Hongtao Yu 30bb5be389 [CSSPGO] Unblock optimizations with pseudo probe instrumentation part 2.
As a follow-up to D95982, this patch continues unblocking optimizations that are blocked by pseudu probe instrumention.

The optimizations unblocked are:
		- In-block load propagation.
		- In-block dead store elimination
		- Memory copy optimization that turns stores to consecutive memories into a memset.

These optimizations are local to a block, so they shouldn't affect the profile quality.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D100075
2021-04-26 16:52:33 -07:00
Vineet Kumar 84d16e2055 Implementation for TargetTransformInfo::hasActiveVectorLength()
This patch adds the missing implementation for
TargetTransformInfo::hasActiveVectorLength() without which using
hasActiveVectorLength() causes linker error.

Patch by Vineet Kumar!

Differential Revision: https://reviews.llvm.org/D100941
2021-04-26 21:20:05 +00:00
Nikita Popov a5051f2fa2 [SCEV] Fix applyLoopGuards() chaining for ne predicates
ICMP_NE predicates directly overwrote the rewritten result,
instead of chaining it with previous rewrites, as was done for
ICMP_ULT and ICMP_ULE. This means that some guards were effectively
discarded, depending on their order.
2021-04-24 21:43:46 +02:00
dfukalov 6c57044231 [GVN] Clobber partially aliased loads.
Use offsets stored in `AliasResult` implemented in D98718.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95543
2021-04-24 14:14:20 +03:00
Sander de Smalen f9a50f04ba [TTI] NFC: Change getIntImmCost[Inst|Intrin] to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Differential Revision: https://reviews.llvm.org/D100565
2021-04-23 16:06:36 +01:00
Sander de Smalen 43ace8b5ce [TTI] NFC: Change getScalingFactorCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Differential Revision: https://reviews.llvm.org/D100564
2021-04-23 16:06:36 +01:00
Sander de Smalen 008a072ded [TTI] NFC: Change getMemcpyCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Differential Revision: https://reviews.llvm.org/D100563
2021-04-23 16:06:35 +01:00
Sander de Smalen 9ba07f37f8 [TTI] NFC: Change getGEPCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Differential Revision: https://reviews.llvm.org/D100562
2021-04-23 16:06:35 +01:00
Sander de Smalen e0edfa052f [TTI] NFC: Change getAddressComputationCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Differential Revision: https://reviews.llvm.org/D100561
2021-04-23 16:06:35 +01:00
dfukalov 9ab17a60eb [TTI] NFC: Use InstructionCost to store ScalarizationCost in IntrinsicCostAttributes.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D101151
2021-04-23 18:02:00 +03:00
Philip Reames 424d6cb902 [SCEV] Compute ranges for lshr recurrences
Straight forward extension to the recently added infrastructure which was pioneered with shl.

Differential Revision: https://reviews.llvm.org/D99687
2021-04-22 11:06:31 -07:00
Wenlei He dff8315892 [CSSPGO][llvm-profdata] Support trimming cold context when merging profiles
The change adds support for triming and merging cold context when mergine CSSPGO profiles using llvm-profdata. This is similar to the context profile trimming in llvm-profgen, however the flexibility to trim cold context after profile is generated can be useful.

Differential Revision: https://reviews.llvm.org/D100528
2021-04-22 00:42:37 -07:00
Sanjay Patel 5e6dc5e404 [InstSimplify] generalize ctlz-of-shifted-constant
https://alive2.llvm.org/ce/z/zWL_VQ
2021-04-21 14:23:55 -04:00
Nico Weber ba7a92c01e [Support] Don't include VirtualFileSystem.h in CommandLine.h
CommandLine.h is indirectly included in ~50% of TUs when building
clang, and VirtualFileSystem.h is large.

(Already remarked by jhenderson on D70769.)

No behavior change.

Differential Revision: https://reviews.llvm.org/D100957
2021-04-21 10:19:01 -04:00
Yang Fan 4307446e9f
[SCEV] Fix -Wunused-variable warning (NFC)
GCC warning:
```
/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp: In member function ‘const llvm::SCEV* llvm::ScalarEvolution::getLosslessPtrToIntExpr(const llvm::SCEV*, unsigned int)::SCEVPtrToIntSinkingRewriter::visitUnknown(const llvm::SCEVUnknown*)’:
/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:1152:13: warning: unused variable ‘ExprPtrTy’ [-Wunused-variable]
 1152 |       Type *ExprPtrTy = Expr->getType();
      |             ^~~~~~~~~
```
2021-04-21 16:01:46 +08:00
Nikita Popov de18fa9e52 Revert "[InstSimplify] Bypass no-op `and`-mask, using known bits (PR49543)"
This reverts commit ea1a0d7c9a.

While this is strictly more powerful, it is also strictly slower.
InstSimplify intentionally does not perform many folds that it
is allowed to perform, if doing so requires a KnownBits calculation
that will be repeated in InstCombine.

Maybe it's worthwhile to do this here, but that needs a more
explicitly stated motivation, evaluated in a review.
2021-04-21 09:55:25 +02:00
Philip Reames 4824d876f0 Revert "Allow invokable sub-classes of IntrinsicInst"
This reverts commit d87b9b81cc.

Post commit review raised concerns, reverting while discussion happens.
2021-04-20 15:38:38 -07:00
Philip Reames d87b9b81cc Allow invokable sub-classes of IntrinsicInst
It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked.

This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke.

After this lands and has baked for a couple days, planned cleanups:
    Make GCStatepointInst a IntrinsicInst subclass.
    Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor.
    Do the same in SelectionDAG.
    Do the same in FastISEL.

Differential Revision: https://reviews.llvm.org/D99976
2021-04-20 15:03:49 -07:00
Roman Lebedev ea1a0d7c9a
[InstSimplify] Bypass no-op `and`-mask, using known bits (PR49543)
We already special-cased a few interesting patterns,
but that is strictly less powerful than using KnownBits.

So instead get the known bits for the operand of `and`,
and iff all the unset bits of the `and`-mask are known to be zeros
in the operand, we can omit said `and`.
2021-04-21 00:31:46 +03:00
Philip Reames 6792e26c0d Reapply "Look through invertible recurrences in isKnownNonEqual"
I'd reverted this in commit 3b6acb1797 due to buildbot failures.  This patch contains the fix for said issue.  I'd forgotten to handle the case where two phis in the same block have different operand order.  We canonicalize away from this, but it's still valid IR.  The tests included in this change (as opposed to simply having test output changed), crashed without the fix.

Original commit message follows...

This extends the phi handling in isKnownNonEqual with a special case based on invertible recurrences. If we can prove the recurrence is invertible (which many common ones are), we can recurse through the start operands of the recurrence skipping the phi cycle.

(Side note: Instcombine currently does not push back through these cases. I will implement that in a follow up change w/separate review.)

Differential Revision: https://reviews.llvm.org/D99912
2021-04-20 12:47:59 -07:00
Philip Reames 3b6acb1797 Revert "Look through invertible recurrences in isKnownNonEqual"
This reverts commit be20eae25f.  It appears to have caused a crash on a buildbot (https://lab.llvm.org/buildbot#builders/77/builds/5653).  Reverting while investigating.
2021-04-20 11:47:10 -07:00
Philip Reames 9c1a145aeb Rearrange code to reduce diff for D99687 [nfc]
Adding the switches to reduce diffs.  I'm about to split that into an lshr part and an ashr part, doing the NFC part first makes it easier to maintain both diffs.
2021-04-20 11:40:15 -07:00
Roman Lebedev 7186764884
[NFC][SCEV] Split getLosslessPtrToIntExpr out of getPtrToIntExpr() 2021-04-20 21:29:21 +03:00
Philip Reames be20eae25f Look through invertible recurrences in isKnownNonEqual
This extends the phi handling in isKnownNonEqual with a special case based on invertible recurrences. If we can prove the recurrence is invertible (which many common ones are), we can recurse through the start operands of the recurrence skipping the phi cycle.

(Side note: Instcombine currently does not push back through these cases. I will implement that in a follow up change w/separate review.)

Differential Revision: https://reviews.llvm.org/D99912
2021-04-20 10:52:22 -07:00
Dávid Bolvanský 319c9f6e58 [MemoryBuiltins] Added support for memalign
memalign is older aligned_alloc.
2021-04-20 12:39:54 +02:00
Joe Ellis effacc1599 [AArch64] Constant fold sve_convert_from_svbool(zero) to zero
Co-authored-by: Paul Walker <paul.walker@arm.com>

Differential Revision: https://reviews.llvm.org/D100463
2021-04-20 10:02:49 +00:00
Arthur Eubanks 5e71b9fa93 Explicitly pass type to cast load constant folding result
Previously we would use the type of the pointee to determine what to
cast the result of constant folding a load. To aid with opaque pointer
types, we should explicitly pass the type of the load rather than
looking at pointee types.

ConstantFoldLoadThroughBitcast() converts the const prop'd value to the
proper load type (e.g. [1 x i32] -> i32). Instead of calling this in
every intermediate step like bitcasts, we only call this when we
actually see the global initializer value.

In some existing uses of this API, we don't know the exact type we're
loading from immediately (e.g. first we visit a bitcast, then we visit
the load using the bitcast). In those cases we have to manually call
ConstantFoldLoadThroughBitcast() when simplifying the load to make sure
that we cast to the proper type.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D100718
2021-04-20 00:53:21 -07:00
Sanjay Patel 9d43f6d7ce [LowerConstantIntrinsics] avoid crashing on alloca with unexpected operand type
The test here is reduced from the fuzzer-generated crasher in:
https://llvm.org/PR50023
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=33395

I don't know if this is the best or complete solution, but the
zext of the i42 type appears to match the behavior if we run a
weird type example like this through the IR optimizer with -O1.

Differential Revision: https://reviews.llvm.org/D100766
2021-04-19 13:06:29 -04:00
Roman Lebedev 41c22acc22
[NFC][SCEV] Assert that we don't try to create SCEVPtrToIntExpr of a non-integral pointer
ptr<->int casts are only valid for integral pointes,
defensively assert that we don't try to break that here.
2021-04-19 18:38:38 +03:00
Simon Pilgrim ddcdeae358 [Analysis] ImportedFunctionsInliningStatistics.h - add <memory> and remove unused <string> include. NFCI.
Move <string> include to ImportedFunctionsInliningStatistics.cpp and add missing <memory> include as we have explicit uses of std::unique_ptr in the header.
2021-04-19 16:20:56 +01:00
Cullen Rhodes f0bc2782f2 [TTI] NFC: Remove unused 'OptSize' parameter from shouldMaximizeVectorBandwidth
Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D100377
2021-04-19 11:01:34 +00:00
Roman Lebedev d480f968ad
Revert "[SCEV] Model `ashr exact x, C` as `(abs(x) EXACT/u (1<<C)) * signum(x)`"
As being discussed in https://reviews.llvm.org/D100721,
this modelling is lossy, we can't reconstruct `ash`/`ashr exact`
from it, which means that whenever we actually expand the IR,
we've just pessimized the code..

It would be good to model this pattern, after all it comes up every time
you want to compute a distance between two pointers, but not at this cost.

This reverts commit ec54867df5.
2021-04-18 16:26:45 +03:00
Serge Guelton d6de1e1a71 Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string).
throughout the codebase, this led to inelegant checks ranging from

        if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

to

        if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

Introduce a getValueAsBool that normalize the check, with the following
behavior:

no attributes or attribute set to "false" => return false
attribute set to "true" => return true

Differential Revision: https://reviews.llvm.org/D99299
2021-04-17 08:17:33 +02:00
Thomas Lively 5c729750a6 [WebAssembly] Remove saturating fp-to-int target intrinsics
Use the target-independent @llvm.fptosi and @llvm.fptoui intrinsics instead.
This includes removing the instrinsics for i32x4.trunc_sat_zero_f64x2_{s,u},
which are now represented in IR as a saturating truncation to a v2i32 followed by
a concatenation with a zero vector.

Differential Revision: https://reviews.llvm.org/D100596
2021-04-16 12:11:20 -07:00
Sanjay Patel bb907b26e2 [ValueTracking] don't recursively compute known bits using multiple llvm.assumes
This is an alternative to D99759 to avoid the compile-time explosion seen in:
https://llvm.org/PR49785

Another potential solution would make the exclusion logic stronger to avoid
blowing up, but note that we reduced the complexity of the exclusion mechanism
in D16204 because it was too costly.

So I'm questioning the need for recursion/exclusion entirely - what is the
optimization value vs. cost of recursively computing known bits based on
assumptions?
This was built into the implementation from the start with 60db058,
and we have kept adding code/cost to deal with that capability.

By clearing the query's AssumptionCache inside computeKnownBitsFromAssume(),
this patch retains all existing assume functionality except refining known
bits based on even more assumptions.

We have 1 regression test that shows a difference in optimization power.

Differential Revision: https://reviews.llvm.org/D100573
2021-04-16 08:43:35 -04:00
Mircea Trofin 0d06b14f59 [MLGO] Fix use of AM.invalidate post D100519
The ML inline advisors more aggressively invalidate certain analyses
after each call site inlining, to more accurately capture the problem
state.
2021-04-15 18:45:39 -07:00
Arthur Eubanks c8f0a7c215 [NewPM] Cleanup IR printing instrumentation
Being lazy with printing the banner seems hard to reason with, we should print it
unconditionally first (it could also lead to duplicate banners if we
have multiple functions in -filter-print-funcs).

The printIR() functions were doing too many things. I separated out the
call from PrintPassInstrumentation since we were essentially doing two
completely separate things in printIR() from different callers.

There were multiple ways to generate the name of some IR. That's all
been moved to getIRName(). The printing of the IR name was also
inconsistent, now it's always "IR Dump on $foo" where "$foo" is the
name. For a function, it's the function name. For a loop, it's what's
printed by Loop::print(), which is more detailed. For an SCC, it's the
list of functions in parentheses. For a module it's "[module]", to
differentiate between a possible SCC with a function called "module".

To preserve D74814, we have to check if we're going to print anything at
all first. This is unfortunate, but I would consider this a special
case that shouldn't be handled in the core logic.

Reviewed By: jamieschmeiser

Differential Revision: https://reviews.llvm.org/D100231
2021-04-15 09:50:55 -07:00
dfukalov ce1626f34a [AA] Updates for D95543.
Addressing latter comments in D95543:
- `AliasResult::Result` renamed to `AliasResult::Kind`
- Offset printing added for `PartialAlias` case in `-aa-eval`
- Removed VisitedPhiBBs check from BasicAA'

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D100454
2021-04-15 12:22:03 +03:00
Nikita Popov a1ed025d0e Revert "[SCEV] Don't walk uses of phis without SCEV expression when forgetting"
This reverts commit faf9f11589.

Issues with this patch have been reported in
https://reviews.llvm.org/D100264#2689917 and
https://bugs.llvm.org/show_bug.cgi?id=49967.
2021-04-15 09:43:52 +02:00
William S. Moses d3e2b4c0a2 [SROA][TBAA] Handle shift of regular TBAA nodes
SROA shifts TBAA nodes in a way that may present a problem for !tbaa but not !tbaa.struct nodes.

Differential Revision: https://reviews.llvm.org/D99851
2021-04-14 14:35:20 -04:00
Nikita Popov 0d91075f77 [ValueTracking] Don't require strictly positive for mul nsw recurrence
Just like in the mul nuw case, it's sufficient that the step is
non-zero. If the step is negative, then the values will jump
between positive and negative, "crossing" zero, but the value of
the recurrence is never actually zero.
2021-04-14 19:39:59 +02:00
Nikita Popov 5c0fb026c9 [ValueTracking] Don't require non-zero step for add nuw
It's okay if the step is zero, we'll just stay at the same non-zero
value in that case. The valuable part of this is that the step
doesn't even need to be a constant anymore.
2021-04-14 19:06:18 +02:00
Sander de Smalen 4f42d873c2 [TTI] NFC: Change getArithmeticInstrCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100317
2021-04-14 17:20:36 +01:00
Sander de Smalen d84bd951a8 [TTI] NFC: Change getFPOpCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D100316
2021-04-14 17:20:36 +01:00
Sander de Smalen 1af35e77f4 [TTI] NFC: Change getVectorInstrCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100315
2021-04-14 17:20:35 +01:00
Sander de Smalen 174e8f6c5e [TTI] NFC: Change getShuffleCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100314
2021-04-14 17:20:35 +01:00
Sander de Smalen 14b934f8a6 [TTI] NFC: Change getCFInstrCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D100313
2021-04-14 17:20:34 +01:00
Sander de Smalen 596f669cfb [TTI] NFC: Change getCallInstrCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D100312
2021-04-14 17:20:34 +01:00
Sanjay Patel 7ef2c68a3d [InstSimplify] improve efficiency for detecting non-zero value
Stepping through callstacks in the example from D99759 reveals
this potential compile-time improvement.

The savings come from avoiding ValueTracking's computing known
bits if we have already dealt with special-case patterns.

Further improvements in this direction seem possible.

This makes a degenerate test based on PR49785 about 40x faster
(25 sec -> 0.6 sec), but it does not address the larger question
of how to limit computeKnownBitsFromAssume(). Ie, the original
test there is still infinite-time for all practical purposes.

Differential Revision: https://reviews.llvm.org/D100408
2021-04-14 09:04:15 -04:00
Sanjay Patel 5ae5d25e38 [ValueTracking] match negative-stepping non-zero recurrence
This is pulled out of D100408.

This avoids a regression that would be exposed by making the
calling code from InstSimplify more efficient.
2021-04-14 08:57:53 -04:00
Sanjay Patel 4919365397 [ValueTracking] reduce code duplication; NFC
The start value can't be null for something to be a non-zero
recurrence, so hoist that common check out of the switch.

Subsequent checks may be incomplete or over-specified as noted in:
D100408
2021-04-14 08:32:42 -04:00
Philip Reames 00c8be3f93 fix whitespace type 2021-04-13 19:02:41 -07:00
Nikita Popov faf9f11589 [SCEV] Don't walk uses of phis without SCEV expression when forgetting
I've run into some cases where a large fraction of compile-time is
spent invalidating SCEV. One of the causes is forgetLoop(), which
walks all values that are def-use reachable from the loop header
phis. When invalidating a topmost loop, that might be close to all
values in a function. Additionally, it's fairly common for there to
not actually be anything to invalidate, but we'll still be performing
this walk again and again.

My first thought was that we don't need to continue walking the uses
if the current value doesn't have a SCEV expression. However, this
isn't quite right, because SCEV construction can skip over values
(e.g. for a chain of adds, we might only create a SCEV expression
for the final value).

What this patch does instead is to only walk the (full) def-use chain
of loop phis that have a SCEV expression. If there's no expression
for a phi, then we also don't have any dependent expressions to
invalidate.

Differential Revision: https://reviews.llvm.org/D100264
2021-04-13 20:28:17 +02:00
Sander de Smalen 03f47bdcb1 [TTI] NFC: Change get[Interleaved]MemoryOpCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100205
2021-04-13 14:21:02 +01:00
Sander de Smalen d676b5749d [TTI] NFC: Change getMaskedMemoryOpCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100204
2021-04-13 14:21:01 +01:00
Sander de Smalen db134e2428 [TTI] NFC: Change getCmpSelInstrCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100203
2021-04-13 14:21:01 +01:00
Sander de Smalen 2285dfb73f [TTI] NFC: Change getMinMaxReductionCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100202
2021-04-13 14:21:00 +01:00
Sander de Smalen bd86824d98 [TTI] NFC: Change getArithmeticReductionCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

This patch is practically NFC, with the exception of an AArch64 SVE related
cost-model change, where we can now return an Invalid cost instead of some
bogus number.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100201
2021-04-13 14:20:59 +01:00
Sander de Smalen fd1f8a5462 [TTI] NFC: Change getGatherScatterOpCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100200
2021-04-13 14:20:59 +01:00
Sander de Smalen 92d8421f49 [TTI] NFC: Change getCastInstrCost and getExtractWithExtendCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100199
2021-04-13 14:20:58 +01:00
Gulfem Savrun Yeniceri e96df3e531 [Passes] Add relative lookup table converter pass
Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in:
https://bugs.llvm.org/show_bug.cgi?id=45244

This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly.

Differential Revision: https://reviews.llvm.org/D94355
2021-04-13 01:29:41 +00:00
Yuanfang Chen c5fda0e662 Reland "Revert "[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom""
This reverts commit a3fabc79ae (relands
f4d682d6ce with fix for the compile-time
regression issue).
2021-04-12 14:50:54 -07:00
Nikita Popov a3fabc79ae Revert "[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom"
This reverts commit f4d682d6ce.

This caused a significant compile-time regression:
https://llvm-compile-time-tracker.com/compare.php?from=4b7bad9eaea2233521a94f6b096aaa88dc584e23&to=f4d682d6ce6c5b3a41a0acf297507c82f5c21eef&stat=instructions

Possibly this is due to overeager parsing of target triples.
2021-04-12 22:55:59 +02:00
Arthur Eubanks 269b335bd7 [Inliner] Propagate SROA analysis through invariant group intrinsics
SROA can handle invariant group intrinsics, let the inliner know that
for better heuristics when the intrinsics are present.

This fixes size issues in a couple files when turning on
-fstrict-vtable-pointers in Chrome.

Reviewed By: rnk, mtrofin

Differential Revision: https://reviews.llvm.org/D100249
2021-04-12 10:54:22 -07:00
Hamza Sood 0a92aff721 Replace uses of std::iterator with explicit using
This patch removes all uses of `std::iterator`, which was deprecated in C++17.
While this isn't currently an issue while compiling LLVM, it's useful for those using LLVM as a library.

For some reason there're a few places that were seemingly able to use `std` functions unqualified, which no longer works after this patch. I've updated those places, but I'm not really sure why it worked in the first place.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D67586
2021-04-12 10:47:14 -07:00
Yuanfang Chen f4d682d6ce [InstCombine] when calling conventions are compatible, don't convert the call to undef idiom
D24453 enabled libcalls simplication for ARM PCS. This may cause
caller/callee calling conventions mismatch in some situations such as
LTO. This patch makes instcombine aware that the compatible calling
conventions differences are benign (not emitting undef idom).

Differential Revision: https://reviews.llvm.org/D99773
2021-04-12 09:32:23 -07:00
Roman Lebedev 6d44b3c56d
[NFCI][DomTreeUpdater] applyUpdates(): reserve space for updates first
While, indeed, we may end up pushing less updates that we'd reserve space
for, self-dominating updates aren't often enough for that to matter.
But this should matter for normal updates.
2021-04-11 23:56:22 +03:00
Roman Lebedev 9829f5e6b1
[CVP] @llvm.[us]{min,max}() intrinsics handling
If we can tell that either one of the arguments is taken,
bypass the intrinsic.

Notably, we are indeed fine with non-strict predicate:
* UL: https://alive2.llvm.org/ce/z/69qVW9 https://alive2.llvm.org/ce/z/kNFTKf
      https://alive2.llvm.org/ce/z/AvaPw2 https://alive2.llvm.org/ce/z/oxo53i
* UG: https://alive2.llvm.org/ce/z/wxHeGH https://alive2.llvm.org/ce/z/Lf76qx
* SL: https://alive2.llvm.org/ce/z/hkeTGS https://alive2.llvm.org/ce/z/eR_b-W
* SG: https://alive2.llvm.org/ce/z/wEqRm7 https://alive2.llvm.org/ce/z/FpAsVr

Much like with all other comparison handling in CVP,
while we could sort-of handle two Value's,
at least for plain ICmpInst it does not appear to be worthwhile.

This only fires 78 times on test-suite + dt + rs,
but we don't canonicalize to these yet. (only SCEV produces them)
2021-04-11 00:33:47 +03:00
Nikita Popov 8de2f1ff79 [IVUsers] Check LoopSimplify cache earlier (NFC)
Check the cache before calling isLoopSimplifyForm(). Otherwise we'd
always perform the check for the innermost loop and only skip it
for dominating loops.
2021-04-10 22:58:13 +02:00
Roman Lebedev e8c7f43e2c
[NFC][ConstantRange] Add 'icmp' helper method
"Does the predicate hold between two ranges?"

Not very surprisingly, some places were already doing this check,
without explicitly naming the algorithm, cleanup them all.
2021-04-10 19:38:55 +03:00
Roman Lebedev 7b12c8c59d
Revert "[NFC][ConstantRange] Add 'icmp' helper method"
This reverts commit 17cf2c9423.
2021-04-10 19:37:53 +03:00
Roman Lebedev 17cf2c9423
[NFC][ConstantRange] Add 'icmp' helper method
"Does the predicate hold between two ranges?"

Not very surprisingly, some places were already doing this check,
without explicitly naming the algorithm, cleanup them all.
2021-04-10 19:09:52 +03:00
dfukalov 8f4b7e94a2 [AMDGPU][CostModel] Refine cost model for control-flow instructions.
Added cost estimation for switch instruction, updated costs of branches, fixed
phi cost.
Had to increase `-amdgpu-unroll-threshold-if` default value since conditional
branch cost (size) was corrected to higher value.
Test renamed to "control-flow.ll".

Removed redundant code in `X86TTIImpl::getCFInstrCost()` and
`PPCTTIImpl::getCFInstrCost()`.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D96805
2021-04-10 09:20:24 +03:00
Roman Lebedev 077bff39d4
[Analysis] isDereferenceableAndAlignedPointer(): recurse into select's hands
By doing this within the method itself,
we support traversing multiple levels of selects (TODO: PHI's),
fixing the SROA `std::clamp()` testcase.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47271
Mostly fixes https://bugs.llvm.org/show_bug.cgi?id=49909
2021-04-10 00:56:28 +03:00
Alina Sbirlea f6bff8d157 [MSSA] Rename uses in IDF regardless of new def position in basic block.
When inserting a new def and renaming of uses is asked, always compute
IDF and do the renaming for the blocks with Phis in that IDF.
Resolves PR49859.

Differential Revision: https://reviews.llvm.org/D100163
2021-04-09 12:32:37 -07:00
Momchil Velikov acf3279a03 For non-null pointer checks, do not descend through out-of-bounds GEPs
In LazyValueInfoImpl::isNonNullAtEndOfBlock we populate a set of
pointers, known to be non-null at the end of a block (e.g. because we
did a load through them). We then infer that any pointer, based on an
element of this set is non-null as well ("based" here meaning a
non-null pointer is the underlying object). This is incorrect, even if
the base pointer was non-null, the value of a GEP, that lacks the
inbounds` attribute, may be null.

This issue appeared as miscompilation of the following test case:

int puts(const char *);

typedef struct iter {
  int *val;
} iter_t;

static long distance(iter_t first, iter_t last) {
  long r = 0;
  for (; first.val != last.val; first.val++)
    ++r;
  return r;
}

int main() {
  int arr[2] = {0};
  iter_t i, j;
  i.val = arr;
  j.val = arr + 1;
  if (distance(i, j) >= 2)
    puts("failed");
  else
    puts("passed");
}

This fixes PR49662.

Differential Revision: https://reviews.llvm.org/D99642
2021-04-09 14:09:23 +01:00
dfukalov c1a88e007b [AA][NFC] Convert AliasResult to class containing offset for PartialAlias case.
Add an ability to store `Offset` between partially aliased location. Use this
storage within returned `ResultAlias` instead of caching it in `AAQueryInfo`.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D98718
2021-04-09 13:26:09 +03:00
dfukalov d066079728 [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.
Main reason is preparation to transform AliasResult to class that contains
offset for PartialAlias case.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D98027
2021-04-09 12:54:22 +03:00
Arthur Eubanks c5d1ccbcdf [GVN] Properly invalidate ICF cache when we simplify a value
This fixes a "Cached first special instruction is wrong!" assert.

The assert fires because replacing a value with another can cause an
instruction to no longer be "special" to ICF. In this case,
devirtualization happened, turning an indirect call to a
call to a willreturn function which is no longer special.

Reviewed By: nikic, rnk

Differential Revision: https://reviews.llvm.org/D99977
2021-04-08 14:01:57 -07:00
Stanislav Mekhanoshin 5f0ac1ef78 Set IgnoreLLVMUsed to false in CallGraph::addToCallGraph()
clang++ uses llvm.compiler.used in certain cases to preserve
symbol which is fully inlined. D96087 has resulted in undefined
symbols in such cases. Set it to false by default to preserve
old behavior but keep the option for specific uses where we
want to ignore these (e.g. to detect a potential indirect call
to a function).

Differential Revision: https://reviews.llvm.org/D99897
2021-04-08 11:14:09 -07:00
Max Kazantsev fee330824a [SCEV] Fix false-positive recognition of simple recurrences. PR49856
A value from reachable block may come to a Phi node as its input from
unreachable block. This may confuse matchSimpleRecurrence  which
has no access to DomTree and can falsely recognize something as a recurrency
because of this effect, as the attached test shows.

Patch `ae7b1e` deals with half of this problem, but it only accounts from
the case when an unreachable instruction comes to Phi as an input.

This patch provides a generalization by checking that no Phi block's
predecessor is unreachable (no matter what the input is).

Differential Revision: https://reviews.llvm.org/D99929
Reviewed By: reames
2021-04-07 13:55:17 +07:00
Philip Reames 908215b346 Use AssumeInst in a few more places [nfc]
Follow up to a6d2a8d6f5.  These were found by simply grepping for "::assume", and are the subset of that result which looked cleaner to me using the isa/dyn_cast patterns.
2021-04-06 13:18:53 -07:00
Philip Reames 9ef6aa020b Plumb AssumeInst through operand bundle apis [nfc]
Follow up to a6d2a8d6f5.  This covers all the public interfaces of the bundle related code.  I tried to cleanup the internals where the changes were obvious, but there's definitely more room for improvement.
2021-04-06 12:53:53 -07:00
Philip Reames a6d2a8d6f5 Add a subclass of IntrinsicInst for llvm.assume [nfc]
Add the subclass, update a few places which check for the intrinsic to use idiomatic dyn_cast, and update the public interface of AssumptionCache to use the new class.  A follow up change will do the same for the newer assumption query/bundle mechanisms.
2021-04-06 11:16:22 -07:00
Florian Hahn 4059c1c32d [SimplifyInst] Use correct type for GEPs with vector indices.
The current code does not properly handle vector indices unless they are
the first index.

At the moment LangRef gives the impression that the vector index must be
the one and only index (https://llvm.org/docs/LangRef.html#getelementptr-instruction).

But vector indices can appear at any position and according to the
verifier there may be multiple vector indices. If that's the case, the
number of elements must match.

This patch updates SimplifyGEPInst to properly handle those additional
cases.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D99961
2021-04-06 17:56:10 +01:00
Philip Reames 21d4839948 Move GCRelocateInst and GCResultInst to IntrinsicInst.h [nfc]
These two are part of the IntrinsicInst class hierarchy and it helps to cut down on some redundant includes.
2021-04-06 08:33:15 -07:00
Kerry McLaughlin 7344f3d39a [LoopVectorize] Add strict in-order reduction support for fixed-width vectorization
Previously we could only vectorize FP reductions if fast math was enabled, as this allows us to
reorder FP operations. However, it may still be beneficial to vectorize the loop by moving
the reduction inside the vectorized loop and making sure that the scalar reduction value
be an input to the horizontal reduction, e.g:

  %phi = phi float [ 0.0, %entry ], [ %reduction, %vector_body ]
  %load = load <8 x float>
  %reduction = call float @llvm.vector.reduce.fadd.v8f32(float %phi, <8 x float> %load)

This patch adds a new flag (IsOrdered) to RecurrenceDescriptor and makes use of the changes added
by D75069 as much as possible, which already teaches the vectorizer about in-loop reductions.
For now in-order reduction support is off by default and controlled with the `-enable-strict-reductions` flag.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D98435
2021-04-06 14:45:34 +01:00
Kerry McLaughlin 857b8a73da [LoopVectorize] Change the identity element for FAdd
Changes getRecurrenceIdentity to always return a neutral value of -0.0 for FAdd.

Reviewed By: dmgreen, spatel

Differential Revision: https://reviews.llvm.org/D98963
2021-04-06 12:13:43 +01:00
Simon Pilgrim ddbb58736a [KnownBits] Rename KnownBits::computeForMul to KnownBits::mul. NFCI.
As promised in D98866
2021-04-06 10:11:41 +01:00