Commit Graph

10990 Commits

Author SHA1 Message Date
Florian Hahn 819bca9b90
[SCEV] Use APIntOps::umin to select best max BC count (NFC).
Suggested in D102267, but I missed this in the committed version.
2021-11-12 12:20:01 +00:00
Mircea Trofin f64eee1625 [NFC][InlineAdvisor] Inform advisor when the module is invalidated
This avoids unnecessary re-calculation of module-wide features in the
MLInlineAdvisor. In cases where function passes don't invalidate
functions (and, thus, don't invalidate the module), but we re-process a
CGSCC, we currently refreshed module features unnecessarily. The
overhead of fetching cached results (albeit they weren't themselves
invalidated) was noticeable in certain modules' compilations.

We don't want to just invalidate the advisor object, though, via the
analysis manager, because we'd then need to re-create expensive state
(like the model evaluator in the ML 'development' mode).

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D113644
2021-11-11 10:23:49 -08:00
duanbo.db 53dc525828 [LoopInfo] Fix function getInductionVariable
The way function gets the induction variable is by judging whether
StepInst or IndVar in the phi statement is one of the operands of CMP.
But if the LatchCmpOp0/LatchCmpOp1 is a constant,  the subsequent
comparison may result in null == null, which is meaningless. This patch
fixes the typo.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D112980
2021-11-11 16:22:42 +08:00
Bin Cheng bf76e64854 [BPI] Push exit block rather than exiting ones in getSccExitBlocks
The function BranchProbabilityInfo::SccInfo::getSccExitBlocks is
supposed to collect all exit blocks for SCC rather than all exiting
blocks. This patch fixes the typo.

Reviewed By: ebrevnov

Differential Revision: https://reviews.llvm.org/D113344
2021-11-11 14:22:19 +08:00
Chris Jackson 116dc70cf3 [DebugInfo][LSR] Add more stringent checks on IV selection and salvage
attempts

Prevent the selection of IVs that have a SCEV containing an undef. Also
prevent salvaging attempts for values for which a SCEV could not be
created by ScalarEvolution and have only SCEVUknown.

Reviewed by: Orlando

Differential Revision: https://reviews.llvm.org/D111810
2021-11-09 13:09:37 +00:00
Roman Lebedev d484cc152b
[TTI] Adjust `getReplicationShuffleCost()` interface
It is trivial to produce DemandedSrcElts given DemandedReplicatedElts,
so don't pass the former. Also, it isn't really useful so far
to have the overload taking the Mask, so just inline it.
2021-11-09 14:07:59 +03:00
Michael Liao bf225939bc [InferAddressSpaces] Support assumed addrspaces from addrspace predicates.
- CUDA cannot associate memory space with pointer types. Even though Clang could add extra attributes to specify the address space explicitly on a pointer type, it breaks the portability between Clang and NVCC.
- This change proposes to assume the address space from a pointer from the assumption built upon target-specific address space predicates, such as `__isGlobal` from CUDA. E.g.,

```
  foo(float *p) {
    __builtin_assume(__isGlobal(p));
    // From there, we could assume p is a global pointer instead of a
    // generic one.
  }
```

This makes the code portable without introducing the implementation-specific features.

Note that NVCC starts to support __builtin_assume from version 11.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D112041
2021-11-08 16:51:57 -05:00
Sander de Smalen 2829376bb2 [LV] Use VScaleForTuning to fine-tune the cost per lane.
When targeting a specific CPU with scalable vectorization, the knowledge
of that particular CPU's vscale value can be used to tune the cost-model
and make the cost per lane less pessimistic.

If the target implements 'TTI.getVScaleForTuning()', the cost-per-lane
is calculated as:

  Cost / (VScaleForTuning * VF.KnownMinLanes)

Otherwise, it assumes a value of 1 meaning that the behavior
is unchanged and calculated as:

  Cost / VF.KnownMinLanes

Reviewed By: kmclaughlin, david-arm

Differential Revision: https://reviews.llvm.org/D113209
2021-11-08 16:59:46 +00:00
Nikita Popov a8c318b50e [BasicAA] Use index size instead of pointer size
When accumulating the GEP offset in BasicAA, we should use the
pointer index size rather than the pointer size.

Differential Revision: https://reviews.llvm.org/D112370
2021-11-07 18:56:11 +01:00
Benjamin Kramer 9b8b16457c Put implementation details into anonymous namespaces. NFCI. 2021-11-07 15:18:30 +01:00
Kazu Hirata 843d1eda18 [llvm] Use llvm::reverse (NFC) 2021-11-06 19:31:18 -07:00
Nikita Popov e3cec17b2d [InstSimplify] Remove incorrect icmp of gep fold (PR52429)
As described in https://bugs.llvm.org/show_bug.cgi?id=52429 this
fold is incorrect, because inbounds only guarantees that the
pointers don't wrap in the unsigned space: It is possible that
the sign boundary is crossed by an object.

I'm dropping the fold entirely rather than adjusting it, because
computePointerICmp() fully subsumes it (just with correct predicate
handling).

Differential Revision: https://reviews.llvm.org/D113343
2021-11-06 21:03:21 +01:00
Roman Lebedev a30ec4778a
[TTI][CostModel] `getUserCost()`: recognize replication shuffles and query their cost
This finally creates proper test coverage for replication shuffles,
that are used by LV for conditional loads, and will allow to add
proper costmodel at least for AVX512.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113324
2021-11-06 16:45:15 +03:00
Roman Lebedev f8efc5c0ac
[NFC][TTI] Add/extract `getReplicationShuffleCost()` method, deduplicate it's implementations
Hiding it in `getInterleavedMemoryOpCost()` is problematic for a number of reasons,
including testability and reuse, let's do better.

In a followup `getUserCost()` will be taught to use to to estimate the mask costs,
which will allow for better cost model tests for it.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113313
2021-11-06 16:45:15 +03:00
Kazu Hirata 87e53a0ad8 [llvm] Use make_early_inc_range (NFC) 2021-11-05 19:39:07 -07:00
Philip Reames d24a0e8857 [SCEV] Use constant range of RHS to prove NUW on narrow IV in trip count logic
The basic idea here is that given a zero extended narrow IV, we can prove the inner IV to be NUW if we can prove there's a value the inner IV must take before overflow which must exit the loop.

Differential Revision: https://reviews.llvm.org/D109457
2021-11-05 15:36:47 -07:00
David Green 61225c0818 [ValueTracking][InstCombine] Introduce and use ComputeMinSignedBits
This introduces a new ComputeMinSignedBits method for ValueTracking that
returns the BitWidth - SignBits + 1 from ComputeSignBits, and represents
the minimum bit size for the value as a signed integer.  Similar to the
existing APInt::getMinSignedBits method, this can make some of the
reasoning around ComputeSignBits more natural.

See https://reviews.llvm.org/D112298
2021-11-05 14:41:37 +00:00
Arthur Eubanks 7175886a0f [NewPM] Make eager analysis invalidation per-adaptor
Follow-up change to D111575.
We don't need eager invalidation on every adaptor. Most notably,
adaptors running passes that use very few analyses, or passes that
purely invalidate specific analyses.

Also allow testing of this via a pipeline string
"function<eager-inv>()".

The compile time/memory impact of this is very comparable to D111575.
https://llvm-compile-time-tracker.com/compare.php?from=9a2eec512a29df45c90c2fcb741e9d5c693b1383&to=b9f20bcdea138060967d95a98eab87ce725b22bb&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D113196
2021-11-04 17:16:11 -07:00
Liren Peng 57e093162e [ScalarEvolution] Infer loop max trip count from array accesses
Data references in a loop should not access elements over the
statically allocated size. So we can infer a loop max trip count
from this undefined behavior.

Reviewed By: reames, mkazantsev, nikic

Differential Revision: https://reviews.llvm.org/D109821
2021-11-03 10:40:18 +08:00
Nikita Popov c00e9c6345 [BasicAA] Check known access sizes earlier (NFC)
All heuristics for variable accesses require both access sizes to
be known, so check this once at the start, rather than for each
particular heuristic.
2021-11-02 21:26:26 +01:00
Nikita Popov 0b6ed92c8a [BasicAA] Use early returns (NFC)
Reduce nesting in aliasGEP() a bit by returning early.
2021-11-02 21:17:36 +01:00
Nikita Popov 51e9f33603 [BasicAA] Use saturating multiply on range if nsw
If we know that the var * scale multiplication is nsw, we can use
a saturating multiplication on the range (as a good approximation
of an nsw multiply). This recovers some cases where the fix from
D112611 is unnecessarily strict. (This can be further strengthened
by using a saturating add, but we currently don't track all the
necessary information for that.)

This exposes an issue in our NSW tracking for multiplies. The code
was assuming that (X +nsw Y) *nsw Z results in
(X *nsw Z) +nsw (Y *nsw Z) -- however, it is possible that the
distributed multiplications overflow, even if the non-distributed
one does not. We should discard the nsw flag if the the offset is
non-zero. If we just have (X *nsw Y) *nsw Z then concluding
X *nsw (Y *nsw Z) is fine.

Differential Revision: https://reviews.llvm.org/D112848
2021-11-02 20:27:39 +01:00
Arthur Eubanks e2024d72fa Revert "[NFC] Remove LinkAll*.h"
This reverts commit fe364e5dc7.

Causes breakages, e.g. https://lab.llvm.org/buildbot/#/builders/188/builds/5266
2021-11-02 09:08:09 -07:00
Arthur Eubanks fe364e5dc7 [NFC] Remove LinkAll*.h
These were added to prevent functions from being removed by WPO.

But that doesn't make sense, correct WPO will not remove functions we actually use.

I noticed these because compiling cc1_main.cpp was pulling in random LLVM pass headers.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D112971
2021-11-02 08:43:17 -07:00
Arthur Eubanks 029f1a5344 [LazyCallGraph] Skip blockaddresses
blockaddresses do not participate in the call graph since the only
instructions that use them must all return to someplace within the
current function. And passes cannot retrieve a function address from a
blockaddress.

This was suggested by efriedma in D58260.

Fixes PR50881.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D112178
2021-11-01 13:10:24 -07:00
Nikita Popov 4972d12185 [SCEV] Only add direct loop users (NFC)
It it now sufficient to track only direct addrec users of a loop,
and let the SCEVUsers mechanism track and invalidate transitive users.

Differential Revision: https://reviews.llvm.org/D112875
2021-11-01 18:49:43 +01:00
Max Kazantsev e512c5b166 [SCEV][NFC] Factor out common API for getting unique operands of a SCEV
This function is used at least in 2 places, to it makes sense to make it separate.

Differential Revision: https://reviews.llvm.org/D112516
Reviewed By: reames
2021-11-01 11:36:47 +07:00
Kazu Hirata c8b1ed5fb2 [clang, llvm] Use Optional::getValueOr (NFC) 2021-10-30 19:00:21 -07:00
David Green 2c4a9e830c [ValueTracking] Teach computeConstantRange that the maximum value of a half is 65504
The maximal value of a half is 0x7bff, which is 65504 when converted to
an integer. This patch teaches that to computeConstantRange to compute a
constant range with the correct maximum value.
https://alive2.llvm.org/ce/z/BV_Spb
https://alive2.llvm.org/ce/z/Nwuqvb

The maximum value for a float converted in the same way is 3.4e38, which
requires 129bits of data. I have not added that here as integer types so
larger are rare, compared to integers types larger than 17 bits require
for half floats.

The MVE tests change because instsimplify happens to be run as a part of
the backend, where it doesn't tend to for other backends.

Differential Revision: https://reviews.llvm.org/D112694
2021-10-30 14:27:38 +01:00
Kazu Hirata 972d4133e9 Use {DenseSet,SmallPtrSet}::contains (NFC) 2021-10-29 20:26:07 -07:00
Nikita Popov cdf45f98ca [BasicAA] Extract linear expression multiplication (NFC)
Extract a common method for multiplying a linear expression by a
factor.
2021-10-29 22:41:40 +02:00
Nikita Popov 7cf7378a9d [BasicAA] Don't treat non-inbounds GEP as nsw
The scale multiplication is only guaranteed to be nsw if the GEP
is inbounds (or the multiplication is trivial). Previously we were
only considering explicit muls in GEP indices.
2021-10-29 22:30:44 +02:00
modimo 5caad9b5d3 [InlineAdvisor] Add fallback/format switches and negative remark processing to Replay Inliner
Adds the following switches:

1. --sample-profile-inline-replay-fallback/--cgscc-inline-replay-fallback: controls what the replay advisor does for inline sites that are not present in the replay. Options are:

 1. Original: defers to original advisor
 2. AlwaysInline: inline all sites not in replay
 3. NeverInline: inline no sites not in replay

2. --sample-profile-inline-replay-format/--cgscc-inline-replay-format: controls what format should be generated to match against the replay remarks. Options are:

  1. Line
  2. LineColumn
  3. LineDiscriminator
  4. LineColumnDiscriminator

Adds support for negative inlining decisions. These are denoted by "will not be inlined into" as compared to the positive "inlined into" in the remarks.

All of these together with the previous `--sample-profile-inline-replay-scope/--cgscc-inline-replay-scope` allow tweaking in how to apply replay. In my testing, I'm using:
1. --sample-profile-inline-replay-scope/--cgscc-inline-replay-scope = Function to only replay on a function
2. --sample-profile-inline-replay-fallback/--cgscc-inline-replay-fallback = NeverInline since I'm feeding in only positive remarks to the replay system
3. --sample-profile-inline-replay-format/--cgscc-inline-replay-format = Line since I'm generating the remarks from DWARF information from GCC which can conflict quite heavily in column number compared to Clang

An alternative configuration could be to do Function, AlwaysInline, Line fallback with negative remarks which closer matches the final call-sites. Note that this can lead to unbounded inlining if a negative remark doesn't match/exist for one reason or another.

Updated various tests to cover the new switches and negative remarks

Testing:
ninja check-all

Reviewed By: wenlei, mtrofin

Differential Revision: https://reviews.llvm.org/D112040
2021-10-29 12:32:03 -07:00
Peter Waller 98f08752f7 [InstCombine][ConstantFolding] Make ConstantFoldLoadThroughBitcast TypeSize-aware
The newly added test previously caused the compiler to fail an
assertion. It looks like a strightforward TypeSize upgrade.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D112142
2021-10-28 12:15:15 +00:00
Max Kazantsev 513914e1f3 [SCEV] Invalidate user SCEVs along with operand SCEVs to avoid cache corruption
Following discussion in D110390, it seems that we are suffering from unability
to traverse users of a SCEV being invalidated. The result of that is that ScalarEvolution's
inner caches may store obsolete data about SCEVs even if their operands are
forgotten. It creates problems when we try to verify the contents of those caches.

It's also a frequent situation when messing with cache causes very sneaky and
hard-to-analyze bugs related to corruption of memory when dealing with cached
data. They are lurking there because ScalarEvolution's veirfication is not powerful
enough and misses many problematic cases. I plan to make SCEV's verification
much stricter in follow-ups, and this requires dangling-pointers-free caches.

This patch makes sure that, whenever we forget cached information for a SCEV,
we also forget it for all SCEVs that (transitively) use it.

This may have negative compile time impact. It's a sacrifice we are more
than willing to make to enforce correctness. We can also save some time by
reworking invokers of forgetMemoizedResults (maybe we can forget multiple
SCEVs with single query).

Differential Revision: https://reviews.llvm.org/D111533
Reviewed By: reames
2021-10-28 09:39:24 +07:00
Nikita Popov 665060ea45 [BasicAA] Remove misleading overflow check
GEP decomposition currently checks whether the multiplication of
the linear expression offset and GEP scale overflows. However, if
everything else works correctly, this overflow check is both
unnecessary and dangerously misleading. While it will avoid an
overflow in Scale * Offset in particular, other parts of the
calculation (including those on dynamic values) may still overflow.
The code working on the decomposed GEPs is responsible for ensuring
that it remains correct in the presence of overflow. D112611 fixes
the last issue of that kind that I'm aware of (in fact, the overflow
check was originally introduced to work around precisely that issue).

Differential Revision: https://reviews.llvm.org/D112618
2021-10-27 20:56:03 +02:00
Philip Reames 425cbbc602 [Operator] Add hasPoisonGeneratingFlags [mostly NFC]
This method parallels the dropPoisonGeneratingFlags on Instruction, but is hoisted to operator to handle constant expressions as well.

This is mostly code movement, but I did go ahead and add the inrange constexpr gep case.  This had been discussed previously, but apparently never followed up o.
2021-10-27 11:25:40 -07:00
Nikita Popov fbc0c308d5 [BasicAA] Handle known bits as ranges
BasicAA currently tries to determine that the offset is positive by
checking whether all variable indices are positive based on known
bits, multiplied by a positive scale. However, this is incorrect
if the scale multiplication might overflow. In the modified test
case the original value is positive, but may be negative after a
left shift.

Fix this by converting known bits into a constant range and reusing
the range-based logic, which handles overflow correctly.

Differential Revision: https://reviews.llvm.org/D112611
2021-10-27 14:41:31 +02:00
Nikita Popov 9bc7e543b4 [BasicAA] Make range check more precise
Make the range check more precise by calculating the range of
potentially accessed bytes for both accesses and checking whether
their intersection is empty. In that case there can be no overlap
between the accesses and the result is NoAlias.

This is more powerful than the previous approach, because it can
deal with sign-wrapped ranges. In the test case the original range
is [-1, INT_MAX] but becomes [0, INT_MIN] after applying the offset.
This is a wrapping range, so getSignedMin/getSignedMax will treat
it as a full range. However, the range excludes the elements
[INT_MIN+1, -1], which is enough to prove NoAlias with an access
at offset -1.

Differential Revision: https://reviews.llvm.org/D112486
2021-10-27 12:40:58 +02:00
Max Kazantsev 5961f0308f [SCEV][NFC] Verify intergity of SCEVUsers
Make sure that, for every living SCEV, we have all its direct
operand tracking it as their user.

Differential Revision: https://reviews.llvm.org/D112402
Reviewed By: reames
2021-10-27 09:54:49 +07:00
Nikita Popov 3a995c918e [SCEV] Move SCEVLostPoisonFlags() check into SCEVExpander
Always insert values into ExprValueMap, and instead skip using them
in SCEVExpander if poison-generating flags have been lost. This
ensures that all values that are in ValueExprMap are also in
ExprValueMap, so we can use the latter to invalidate the former.

This change is probably not entirely NFC for the case where
originally the SCEV had no nowrap flags but they were inferred
later, in which case that would now allow reusing the existing
value for expansion.

Differential Revision: https://reviews.llvm.org/D112389
2021-10-25 22:37:20 +02:00
Nikita Popov 0d20ebf686 [BasicAA] Use ranges for more than one index
D109746 made BasicAA use range information to determine the
minimum/maximum GEP offset. However, it was limited to the case of
a single variable index. This patch extends support to multiple
indices by adding all the ranges together.

Differential Revision: https://reviews.llvm.org/D112378
2021-10-25 15:30:50 +02:00
Nikita Popov 75384ecdf8 [InstSimplify] Refactor invariant.group load folding
Currently strip.invariant/launder.invariant are handled by
constructing constant expressions with the intrinsics skipped.
This takes an alternative approach of accumulating the offset
using stripAndAccumulateConstantOffsets(), with a flag to look
through invariant.group intrinsics.

Differential Revision: https://reviews.llvm.org/D112382
2021-10-25 10:56:25 +02:00
Kazu Hirata 3729a5abf4 [SCEV] Fix a warning on an unused lambda capture
This patch fixes:

  llvm/lib/Analysis/ScalarEvolution.cpp:12770:37: error: lambda
  capture 'this' is not used [-Werror,-Wunused-lambda-capture]
2021-10-25 00:45:18 -07:00
Max Kazantsev f8623b0783 [SCEV][NFC] Win some compile time from mass forgetMemoizedResults
Mass forgetMemoizedResults can be done more efficiently than bunch
of individual invocations of helper because we can traverse maps being
updated just once, rather than doing this for each invidivual SCEV.

Should be NFC and supposedly improves compile time.

Differential Revision: https://reviews.llvm.org/D112294
Reviewed By: reames
2021-10-25 14:09:41 +07:00
Max Kazantsev dbab339ea4 [SCEV][NFC] Apply mass forgetMemoizedResults queries where possible
When forgetting multiple SCEVs, rather than doing this one by one, we can
instead use mass updates. We plan to make them more efficient than they
are now, potentially improving compile time.

Differential Revision: https://reviews.llvm.org/D111602
Reviewed By: reames
2021-10-25 13:50:49 +07:00
Max Kazantsev a6096b7f9e [SCEV][NFC] Introduce API for mass forgetMemoizedResults query
This patch changes signature of forgetMemoizedResults to be able to work with
multiple SCEVs. Usage will come in follow-ups. We also plan to optimize it in the
future to work faster than individual invalidation updates. Should not change
behavior in any sense.

Split-off from D111602.

Differential Revision: https://reviews.llvm.org/D112293
Reviewed By: reames
2021-10-25 13:49:31 +07:00
Max Kazantsev 1c18ebb2cc [NFC][SCEV] Do not track users of SCEVConstants
Follow-up from D112295, suggested by Nikita: we can avoid tracking
users of SCEVConstants because dropping their cached info is unlikely
to give any new prospects for fact inference, and it should not introduce
any correctness problems.
2021-10-25 12:30:46 +07:00
Max Kazantsev fea4a48c0b [SCEV][NFC] API for tracking of SCEV users
This patch introduces API that keeps track of SCEVs users of
another SCEVs, required to handle invalidations of users along
with operands that comes in follow-up patches.

Differential Revision: https://reviews.llvm.org/D112295
Reviewed By: reames
2021-10-25 12:14:18 +07:00
Kazu Hirata 4bd46501c3 Use llvm::any_of and llvm::none_of (NFC) 2021-10-24 17:35:33 -07:00