Commit Graph

10303 Commits

Author SHA1 Message Date
Mircea Trofin 92ccc6cb17 Reapply "[NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes"
This reverts commit 11b70b9e3a.

The bot failure was due to ArgumentPromotion deleting functions
without deleting their analyses. This was separately fixed in 4b1c807.
2021-03-18 09:44:34 -07:00
Max Kazantsev b3a1500ea8 [SCEV][NFC] API for predicate evaluation
Provides an API that checks whether a predicate is known to be true or
known to be false with one call. The current implementation is naive and just
calls isKnownPredicate twice, but later we can rework this
logic to try to prove both facts with a single check.
2021-03-18 19:21:29 +07:00
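
The "call isKnownPredicate twice" approach can be sketched outside LLVM. This is a minimal illustration, not the SCEV code: `Range`, `Pred`, `isKnown` and `evaluatePredicate` are hypothetical names, and `isKnown` is a toy stand-in for isKnownPredicate that works on integer intervals.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a value known only to lie in [Lo, Hi].
struct Range {
  int Lo;
  int Hi;
};

enum class Pred { LT, GE };

// Returns the predicate that holds exactly when P does not.
Pred inverse(Pred P) { return P == Pred::LT ? Pred::GE : Pred::LT; }

// Toy analogue of isKnownPredicate: true only if P holds for every
// pair of values drawn from the two ranges.
bool isKnown(Pred P, Range A, Range B) {
  assert(A.Lo <= A.Hi && B.Lo <= B.Hi && "ranges must be non-empty");
  switch (P) {
  case Pred::LT:
    return A.Hi < B.Lo;
  case Pred::GE:
    return A.Lo >= B.Hi;
  }
  return false;
}

// Naive three-way evaluation: one check for the predicate, one for its
// inverse. Returns nullopt when neither fact can be proven.
std::optional<bool> evaluatePredicate(Pred P, Range A, Range B) {
  if (isKnown(P, A, B))
    return true;
  if (isKnown(inverse(P), A, B))
    return false;
  return std::nullopt;
}
```

A caller gets "known true", "known false" or "unknown" from one call, at the cost of up to two underlying checks.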
Philip Reames 31764ea295 [LCSSA] Extract a utility for deciding if a new use requires a new lcssa phi [NFC]
(Triggered by a review comment on D98728, but otherwise unrelated.)
2021-03-17 12:14:01 -07:00
David Green e2935dcfc4 [TTI] Add a Mask to getShuffleCost
This adds a Mask ArrayRef to getShuffleCost, so that if an exact mask
can be provided a more accurate cost can be provided by the backend.
For example VREV costs could be returned by the ARM backend. This should
be an NFC until then, laying the groundwork for that to be added.

Differential Revision: https://reviews.llvm.org/D98206
2021-03-17 17:46:26 +00:00
Bardia Mahjour fa9d8ace09 [CGSCC] Print CG node itself instead of its address
Fix the debug output from cgscc
2021-03-17 12:36:55 -04:00
Max Kazantsev a6074b092c [BasicAA] Drop dependency on Loop Info. PR43276
BasicAA stores a reference to LoopInfo inside. This imposes an implicit
requirement of keeping it up to date whenever we modify the IR (in particular,
whenever we modify terminators of blocks that belong to loops). Failing
to do so leads to incorrect state of the LoopInfo.

Because general AA does not require loop info updates and provides no API to
update it properly, the users of AA reasonably assume that there is no need to
update the loop info. This can lead to bugs, as the example in PR43276 shows.

This patch drops dependence of BasicAA on LoopInfo to avoid this problem.

This may potentially pessimize the result of queries to BasicAA.

Differential Revision: https://reviews.llvm.org/D98627
Reviewed By: nikic
2021-03-17 11:43:44 +07:00
Zequan Wu cbd7eabea8 Revert "[ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast()"
That commit caused the Chromium build to crash: https://bugs.chromium.org/p/chromium/issues/detail?id=1188885

This reverts commit edf7004851.
2021-03-16 14:36:21 -07:00
Simonas Kazlauskas 6513995be3 [InstSimplify] Restrict a GEP transform to avoid provenance changes
This is a follow-up to D98588, and fixes the inline `FIXME` about a GEP-related simplification not
preserving the provenance.

https://alive2.llvm.org/ce/z/qbQoAY

Additional tests were added in {rGf125f28afdb59eba29d2491dac0dfc0a7bf1b60b}

Depends on D98672

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D98611
2021-03-16 18:53:05 +02:00
Thomas Preud'homme f12433f127 [MemDepAnalysis] Remove redundant comment.
Exact same comment is found 2 lines above.
2021-03-16 15:51:17 +00:00
Max Kazantsev 5097143f0e [SCEV][NFC] Move check up the stack
One of the callers (and the primary one) of isBasicBlockEntryGuardedByCond is
isKnownPredicateAt, which makes an isKnownPredicate check before it.
It already makes a non-recursive check inside. So, on this execution
path this check is made twice. The only other caller is
isLoopEntryGuardedByCond. Moving the check there should save some
compile time.
2021-03-16 22:09:17 +07:00
Simonas Kazlauskas a977324800 [InstSimplify] Match PtrToInt more directly in a GEP transform (NFC)
In preparation for D98611, the upcoming change will need to apply additional checks to `P` and `V`,
and so this refactor paves the way for adding additional checks in a less awkward way.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D98672
2021-03-16 15:45:19 +02:00
Johannes Doerfert f40a2c3bef [NVPTX] CUDA does provide malloc/free since compute capability 2.X
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#dynamic-global-memory-allocation-and-operations

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D98606
2021-03-15 22:45:56 -05:00
Sanjay Patel 660728acd4 [InstSimplify] ctlz({signbit} >>u x) --> x
The motivating pattern was handled in 0a2d69480d ,
but we should have this for symmetry.

But this really highlights that we could generalize for
any shifted constant if we match this in instcombine.

https://alive2.llvm.org/ce/z/MrmVNt
2021-03-15 12:03:35 -04:00
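
Both this fold and the cttz(1<<x) fold from 0a2d69480d can be checked exhaustively on a fixed width. The sketch below assumes 32-bit values and uses hand-written bit counters; `ctlz32`, `cttz32`, `shiftedSignBit` and `foldsHoldForAllShifts` are illustrative names, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-leading-zeros on 32 bits; returns 32 for a zero input.
unsigned ctlz32(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Bit = UINT32_C(1) << 31; Bit != 0 && (V & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// Portable count-trailing-zeros on 32 bits; returns 32 for a zero input.
unsigned cttz32(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Bit = 1; Bit != 0 && (V & Bit) == 0; Bit <<= 1)
    ++N;
  return N;
}

// Computes {signbit} >>u X. Shift amounts of 32 or more are rejected here;
// this sketch only covers in-range shifts.
uint32_t shiftedSignBit(unsigned X) {
  assert(X < 32 && "shift amount must be less than the bit width");
  return (UINT32_C(1) << 31) >> X;
}

// Checks both folds for every shift amount smaller than the bit width.
bool foldsHoldForAllShifts() {
  for (unsigned X = 0; X < 32; ++X) {
    if (ctlz32(shiftedSignBit(X)) != X) // ctlz({signbit} >>u x) --> x
      return false;
    if (cttz32(UINT32_C(1) << X) != X) // cttz(1 << x) --> x
      return false;
  }
  return true;
}
```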
Nikita Popov 5fb43477dc Revert "[NFCI][ValueTracking] getUnderlyingObject(): gracefully handle cycles"
This reverts commit aa440ba24d.

This has a non-trivial compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=0c5b789c7342ee8384507c3242fc256e23248c4d&to=aa440ba24dc25e4c95f6dcf8ff647024f3b12661&stat=instructions

I don't believe this is the correct way to address the issue in
this case.
2021-03-15 13:12:39 +01:00
Roman Lebedev aa440ba24d [NFCI][ValueTracking] getUnderlyingObject(): gracefully handle cycles
Normally, this function just doesn't bother about cycles,
and hopes that the caller supplied small-enough depth
so that at worst it will take a potentially large,
but limited amount of time. But that obviously doesn't work
if there is no depth limit.

This reapplies 36f1c3db66,
but without asserting: just bail out once a cycle is detected.
2021-03-15 13:51:02 +03:00
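
The "bail out once a cycle is detected" idea can be illustrated with a visited set on a generic pointer chain. This is a sketch under stated assumptions, not the ValueTracking implementation: `Node`, its `Base` field and `underlyingNode` are hypothetical, and treating `MaxDepth == 0` as "no depth limit" is a convention chosen for this example.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical stand-in for an IR value: Base points at the value this one
// is derived from, or is null for a root object.
struct Node {
  const Node *Base = nullptr;
};

// Walks the Base chain. MaxDepth == 0 means "no depth limit", in which case
// only the visited set guarantees termination. On hitting a cycle the walk
// bails out and returns the node it has reached instead of asserting.
const Node *underlyingNode(const Node *N, unsigned MaxDepth = 0) {
  assert(N && "expected a non-null starting node");
  std::unordered_set<const Node *> Visited;
  for (unsigned Depth = 0; MaxDepth == 0 || Depth < MaxDepth; ++Depth) {
    if (!Visited.insert(N).second)
      return N; // Already seen: cycle detected, stop here.
    if (!N->Base)
      return N; // Reached a root object.
    N = N->Base;
  }
  return N; // Depth limit reached.
}
```

The set insertions and lookups are extra work on every step of the walk, which is the kind of overhead the later revert (5fb43477dc) objects to.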
Roman Lebedev f247d2ab9a Revert "[NFCI][ValueTracking] getUnderlyingObject(): assert that no cycles are encountered"
This reverts commit 36f1c3db66.
Seems to make bots unhappy.
2021-03-15 12:00:59 +03:00
Roman Lebedev 36f1c3db66 [NFCI][ValueTracking] getUnderlyingObject(): assert that no cycles are encountered
Jeroen Dobbelaere in
https://lists.llvm.org/pipermail/llvm-dev/2021-March/149206.html
is reporting that this function can end up in an endless loop
when called from SROA w/ full restrict patches.

For now, simply ensure that such problems are caught earlier/easier.
2021-03-15 11:52:31 +03:00
Roman Lebedev 78b8ce40ef Reland [SCEV] Improve modelling for (null) pointer constants
This reverts commit 329aeb5db4,
and relands commit 61f006ac65.

This is a continuation of D89456.

As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, I believe I will need this in the future
to further fix `SCEVAddExpr` operation type handling.

This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and adds constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D98147
2021-03-13 16:05:34 +03:00
Nikita Popov b2f933a6ce [MemorySSA] Don't bail on phi starting access
When calling getClobberingMemoryAccess() with MemoryLocation on a
MemoryPHI starting access, the walker currently immediately bails
and returns the starting access. This makes sense for the API that
does not accept a location (as we wouldn't know what clobber we
should be checking for), but doesn't make sense for the
MemoryLocation-based API. This means that it can't look through
a MemoryPHI if it's the starting access, but can if there is one
more non-clobbering def in between. This patch removes the limitation.

Differential Revision: https://reviews.llvm.org/D98557
2021-03-13 10:53:13 +01:00
Roman Lebedev 329aeb5db4 Temporarily revert "[SCEV] Improve modelling for (null) pointer constants"
This appears to have broken ubsan bot:
https://lab.llvm.org/buildbot/#/builders/85/builds/3062
https://reviews.llvm.org/D98147#2623549

It looks like LSR needs some kind of a change around insertion point handling.
Reverting until I have a fix.

This reverts commit 61f006ac65.
2021-03-13 09:10:28 +03:00
Roman Lebedev 61f006ac65 [SCEV] Improve modelling for (null) pointer constants
This is a continuation of D89456.

As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, I believe I will need this in the future
to further fix `SCEVAddExpr` operation type handling.

This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and adds constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D98147
2021-03-12 22:11:58 +03:00
Stanislav Mekhanoshin b7b99b0799 [AMDGPU] Fix -amdgpu-inline-arg-alloca-cost
Before D94153, this threshold was in pre-scaled units.
After D94153, the inlining threshold multiplier is no longer applied
to this portion of the threshold anymore. Restore the
threshold by applying the multiplier.

Differential Revision: https://reviews.llvm.org/D98362
2021-03-12 10:19:50 -08:00
serge-sans-paille 1ce2b58454 [NFC] Use llvm::raw_string_ostream instead of std::stringstream
That's more efficient and we don't lose any valuable features when doing so.
2021-03-12 18:43:59 +01:00
Bjorn Pettersson 529c8e8dc6 [InstSimplify] Simplify smul.fix and smul.fix.sat
Add simplification of smul.fix and smul.fix.sat according to
  X * 0 -> 0
  X * undef -> 0
  X * (1 << scale) -> X

This includes the commuted patterns and splatted vectors.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D98299
2021-03-12 09:09:58 +01:00
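
The two value-based identities above (X * 0 and X * (1 << scale)) can be checked against a small model of fixed-point multiplication. This is a simplified sketch on 32-bit integers, not the InstSimplify code: `smulFix` is a hypothetical helper, it truncates toward zero as a choice made for this example, and it restricts the scale to below 31 so that `1 << Scale` is a positive value. The undef case is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of a signed fixed-point multiply on 32-bit operands:
// multiply in 64 bits, then drop Scale fractional bits. Results that do not
// fit in 32 bits are out of scope for this sketch.
int32_t smulFix(int32_t A, int32_t B, unsigned Scale) {
  assert(Scale < 31 && "sketch only supports scales below 31");
  int64_t Product = static_cast<int64_t>(A) * static_cast<int64_t>(B);
  return static_cast<int32_t>(Product / (INT64_C(1) << Scale));
}

// The fixed-point representation of 1.0 for a given scale.
int32_t fixedOne(unsigned Scale) {
  assert(Scale < 31 && "sketch only supports scales below 31");
  return static_cast<int32_t>(1) << Scale;
}
```

With a scale of 8, `smulFix(X, 0, 8)` is 0 and `smulFix(X, fixedOne(8), 8)` is X for any X whose product fits. Both results are exact, so the rounding choice does not affect them.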
Bjorn Pettersson 3638bdfbda [ConstantFold] Handle undef/poison when constant folding smul_fix/smul_fix_sat
Do constant folding according to
  poison * C -> poison
  C * poison -> poison
  undef * C -> 0
  C * undef -> 0
for smul_fix and smul_fix_sat intrinsics (for any scale).

Reviewed By: nikic, aqjune, nagisa

Differential Revision: https://reviews.llvm.org/D98410
2021-03-12 09:09:58 +01:00
Johannes Doerfert d22fbccfe2 [FIX] Allow non-constant assume operand bundle operands.
Fixes PR49545

Reviewed By: zequanwu, fhahn, lebedev.ri

Differential Revision: https://reviews.llvm.org/D98444
2021-03-11 23:31:09 -06:00
Mircea Trofin 11b70b9e3a Revert "[NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes"
This reverts commit 5eaeb0fa67.

It appears there are analyses that assume clearing - example:
https://lab.llvm.org/buildbot#builders/36/builds/5964
2021-03-11 18:31:19 -08:00
Mircea Trofin 5eaeb0fa67 [NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes
Check with the analysis result by calling invalidate instead of clear on
the analysis manager.

Differential Revision: https://reviews.llvm.org/D98440
2021-03-11 18:15:28 -08:00
Nikita Popov 403da6a69a Reapply [LICM] Make promotion faster
Relative to the previous implementation, this always uses
aliasesUnknownInst() instead of aliasesPointer() to correctly
handle atomics. The added test case was previously miscompiled.

-----

Even when MemorySSA-based LICM is used, an AST is still populated
for scalar promotion. As the AST has quadratic complexity, a lot
of time is spent in this step despite the existing access count
limit. This patch optimizes the identification of promotable stores.

The idea here is pretty simple: We're only interested in must-alias
mod sets of loop invariant pointers. As such, only populate the AST
with loop-invariant loads and stores (anything else is definitely
not promotable) and then discard any sets which alias with any of
the remaining, definitely non-promotable accesses.

If we promoted something, check whether this has made some other
accesses loop invariant and thus possible promotion candidates.

This is much faster in practice, because we need to perform AA
queries for O(NumPromotable^2 + NumPromotable*NumNonPromotable)
instead of O(NumTotal^2), and NumPromotable tends to be small.
Additionally, promotable accesses have loop invariant pointers,
for which AA is cheaper.

This has a significant positive compile-time impact. We save ~1.8%
geomean on CTMark at O3, with 6% on lencod in particular and 25%
on individual files.

Conceptually, this change is NFC, but may not be so in practice,
because the AST is only an approximation, and can produce
different results depending on the order in which accesses are
added. However, there is at least no impact on the number of promotions
(licm.NumPromoted) in test-suite O3 configuration with this change.

Differential Revision: https://reviews.llvm.org/D89264
2021-03-11 10:50:28 +01:00
Juneyoung Lee 720a828045 Resolve unused variable warning (NFC) 2021-03-11 12:03:03 +09:00
Juneyoung Lee 8652c3e1a3 [InstSimplify] Pass SimplifyQuery to computePointerICmp (NFC) 2021-03-11 11:13:46 +09:00
Ta-Wei Tu 7ff2768be1 Revert "[LoopInterchange] Replace tightly-nesting-ness check with the one from `LoopNest`"
This reverts commit df9158c9a4.
2021-03-11 01:24:43 +08:00
Alex Richardson b26d6758f0 [SLC] Simplify strcpy and friends with non-zero address spaces
The current logic in TargetLibraryInfoImpl::getLibFunc() was only treating
strcpy, etc. with i8* arguments in address space zero as a valid library
function. However, in the CHERI and Morello targets we expect all libc
functions to use address space 200 arguments.

This commit updates isValidProtoForLibFunc() to check that the argument
is a pointer type. This also drops the check for i8* since we should not
be checking the pointee type any more.

Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D95142
2021-03-10 11:17:34 +00:00
William S. Moses 875891a10d [MemoryDependence] Fix invariant group store
Fix bug in MemoryDependence [and thus GVN] for invariant group.

Previously, MemDep didn't verify that the store was storing into the
pointer rather than simply using it.

Differential Revision: https://reviews.llvm.org/D98267
2021-03-09 19:03:39 -05:00
Juneyoung Lee f49354838e Revert "[InstCombine] Add simplification of two logical and/ors"
This reverts commit 07c3b97e18 due to a reported failure in two-stage build.
2021-03-10 05:48:31 +09:00
Philip Reames a25b537bf4 [SCEV] Infer known bits from known sign bits
This was suggested by lebedev.ri over on D96534.  You'll note the lack of tests.  During review, we weren't actually able to find a case which exercises it, but both lebedev.ri and I feel it's a reasonable change, straightforward, and nearly free.

Differential Revision: https://reviews.llvm.org/D97064
2021-03-09 12:37:17 -08:00
Benjamin Kramer 0d96ea0792 [ValueTracking] Move matchSimpleRecurrence out of line
The header only has a forward declaration of PHINode available, and this
function doesn't seem to get much out of inlining.
2021-03-09 00:04:47 +01:00
Sanjay Patel 34d0d644ff [ValueTracking] move/add helper to get inverse min/max; NFC
We will need this functionality to improve min/max folds
in instcombine when we canonicalize to intrinsics.
2021-03-08 17:38:22 -05:00
Sanjay Patel 0a2d69480d [InstSimplify] cttz(1<<x) --> x
https://alive2.llvm.org/ce/z/TDacYu
https://alive2.llvm.org/ce/z/KF84S3
2021-03-08 16:30:14 -05:00
Alina Sbirlea 29482426b5 Revert "[LICM] Make promotion faster"
Revert 3d8f842712
Revision triggers a miscompile sinking a store incorrectly outside a
threading loop. Detected by tsan.
Reverting while investigating.

Differential Revision: https://reviews.llvm.org/D89264
2021-03-08 12:53:03 -08:00
Philip Reames d9a29a6752 constify getUnderlyingObject implementation [nfc] 2021-03-08 11:32:54 -08:00
Ta-Wei Tu df9158c9a4 [LoopInterchange] Replace tightly-nesting-ness check with the one from `LoopNest`
The check `tightlyNested()` in `LoopInterchange` is similar to the one in `LoopNest`.
In fact, the former misses some cases where loop-interchange is not feasible and results in incorrect behaviour.
Replacing it with the much more robust version provided by `LoopNest` reduces code duplication and fixes https://bugs.llvm.org/show_bug.cgi?id=48113.

`LoopInterchange` has a weaker definition of tightly or perfectly nesting-ness than the one implemented in `LoopNest::arePerfectlyNested()`.
Therefore, `tightlyNested()` is instead implemented with `LoopNest::checkLoopsStructure` and additional checks for unsafe instructions.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D97290
2021-03-08 11:36:08 +08:00
Juneyoung Lee 07c3b97e18 [InstCombine] Add simplification of two logical and/ors
This is a patch that adds folding of two logical and/ors that share one variable:

a && (a && b) -> a && b
a && (a & b)  -> a && b
...

This is towards removing the poison-unsafe select optimization (D93065 has more context).

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D96945
2021-03-08 02:38:43 +09:00
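
The two folds quoted above can be checked in a small three-valued model where an i1 is true, false or poison. This is a simplified sketch, not the InstCombine code: `Bit`, `bitAnd`, `logicalAnd` and `checkFolds` are hypothetical names, logical and is modelled as `select A, B, false`, and undef is ignored.

```cpp
#include <cassert>
#include <optional>

// Simplified model of an i1 value: true, false, or poison (nullopt).
using Bit = std::optional<bool>;

// Bitwise and (a & b): poison if either operand is poison.
Bit bitAnd(Bit A, Bit B) {
  if (!A || !B)
    return std::nullopt;
  return *A && *B;
}

// Logical and (a && b), modelled as select A, B, false: B is only
// observed when A is true.
Bit logicalAnd(Bit A, Bit B) {
  if (!A)
    return std::nullopt;
  return *A ? B : Bit(false);
}

// Asserts both folds over all nine combinations of inputs.
void checkFolds() {
  const Bit Values[] = {Bit(false), Bit(true), Bit()};
  for (Bit A : Values)
    for (Bit B : Values) {
      Bit Expected = logicalAnd(A, B);
      assert(logicalAnd(A, logicalAnd(A, B)) == Expected); // a && (a && b)
      assert(logicalAnd(A, bitAnd(A, B)) == Expected);     // a && (a & b)
    }
}
```

In this model `logicalAnd(false, poison)` is false while `bitAnd(false, poison)` is poison, which is why the two forms are not interchangeable in general.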
Juneyoung Lee 2c16c4a43c [ValueTracking] update directlyImpliesPoison to look into select's condition
This is a minor update in directlyImpliesPoison and makes it look into select's
condition.
Split out from https://reviews.llvm.org/D96945
2021-03-07 23:16:44 +09:00
Fangrui Song e6a104465d [ModuleSummaryAnalysis] Avoid duplicate elements in Worklist. NFC 2021-03-06 14:19:22 -08:00
Nikita Popov f278734bf1 [Loads] Restructure getAvailableLoadStore implementation (NFC)
Separate out some conditions with early exits, to make it easier to
support additional cases.
2021-03-06 16:58:11 +01:00
Nikita Popov edf7004851 [ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast()
There seems to be an impedance mismatch between what the type
system considers an aggregate (structs and arrays) and what
constants consider an aggregate (structs, arrays and vectors).

Rather than adjusting the type check, simply drop it entirely,
as getAggregateElement() is well-defined for non-aggregates: It
simply returns null in that case.
2021-03-06 12:17:56 +01:00
Nikita Popov a917fb89dc [LVI] Simplify and generalize handling of clamp patterns
Instead of handling a number of special cases for selects, handle
this generally when inferring ranges from conditions. We already
infer ranges from `x + C pred C2` to `x`, so doing the same for
`x pred C2` to `x + C` is straightforward.
2021-03-06 10:42:41 +01:00
Nikita Popov b42be01788 [LVI] Pass offset by reference (NFC)
Instead of by pointer. This allows us to use offsets that are not
materialized in the IR.
2021-03-06 10:24:44 +01:00
Wei Mi 2357d29335 [SampleFDO] Another fix to prevent repeated indirect call promotion in
sample loader pass.

In https://reviews.llvm.org/rG5fb65c02ca5e91e7e1a00e0efdb8edc899f3e4b9,
to prevent repeated indirect call promotion for the same indirect call
and the same target, we used zero-count value profile to indicate an
indirect call has been promoted for a certain target. We removed
PromotedInsns cache in the same patch. However, there was a problem in
that patch described below, and that problem led me to add PromotedInsns
back as a mitigation in
https://reviews.llvm.org/rG4ffad1fb489f691825d6c7d78e1626de142f26cf.

When we get value profile from metadata by calling getValueProfDataFromInst,
we need to specify the maximum possible number of values we expect to read.
We used MaxNumPromotions in the last patch so the maximum number of value
information extracted from metadata is MaxNumPromotions. If we have many
values including zero-count values when we write the metadata, some of them
will be dropped when we read them because we only read MaxNumPromotions
values. It will allow repeated indirect call promotion again. We need to
make sure if there are values indicating promoted targets, those values need
to be saved in metadata with higher priority than other values.

This patch fixes that problem. We now use -1 instead of 0 to represent the
count of a promoted target, so it is easier to sort the values.
When we prepare to update the metadata in updateIDTMetaData, we sort
the values in descending count order and extract only MaxNumPromotions
values to write into metadata. Since -1 is the max uint64_t number, if we
have at most MaxNumPromotions values with a -1 count, they will
all be kept in metadata. If we have more than MaxNumPromotions values with
a -1 count, we save at most MaxNumPromotions of them. In that
case, logic in doesHistoryAllowICP guarantees that no more
promotion will happen in the sample loader pass for the indirect call, because
it has been promoted enough.

With this change, now we can remove PromotedInsns without problem.

Differential Revision: https://reviews.llvm.org/D97350
2021-03-04 18:44:12 -08:00
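
The sort-then-truncate step described above can be sketched as follows. This is an illustration of the idea only, not the sample loader code: `TargetCount`, `PromotedCount` and `selectForMetadata` are hypothetical names, and the stable sort is a choice made here to keep the example deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one value-profile record: a call target and
// its count.
struct TargetCount {
  uint64_t Target;
  uint64_t Count;
};

// Marker count for "already promoted": -1 converts to the largest uint64_t.
constexpr uint64_t PromotedCount = static_cast<uint64_t>(-1);

// Sorts records by descending count and keeps at most MaxNumPromotions of
// them, so promoted markers are the last entries to be dropped.
std::vector<TargetCount> selectForMetadata(std::vector<TargetCount> Records,
                                           std::size_t MaxNumPromotions) {
  assert(MaxNumPromotions > 0 && "expected room for at least one record");
  std::stable_sort(Records.begin(), Records.end(),
                   [](const TargetCount &L, const TargetCount &R) {
                     return L.Count > R.Count;
                   });
  if (Records.size() > MaxNumPromotions)
    Records.resize(MaxNumPromotions);
  return Records;
}
```

Because the marker compares greater than any real count, promoted targets sort to the front and survive truncation ahead of ordinary records. With a marker of 0 they would sort last and be dropped first.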