Commit Graph

10854 Commits

Author SHA1 Message Date
Roman Lebedev 6f6e9a867f
[BasicTTIImpl][LoopUnroll] getUnrollingPreferences(): emit ORE remark when advising against unrolling due to a call in a loop
I'm not sure this is the best way to approach this,
but the situation is rather not very detectable unless we explicitly call it out when refusing to advise to unroll.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D107271
2021-08-03 00:57:26 +03:00
Chang-Sun Lin, Jr b58eda39eb [ValueTracking] Fix computeConstantRange to use "may" instead of "always" semantics for llvm.assume
ValueTracking should allow for value ranges that may satisfy
llvm.assume, instead of restricting the ranges only to values that
will always satisfy the condition.

Differential Revision: https://reviews.llvm.org/D107298
2021-08-02 22:20:17 +02:00
Sanjay Patel 7f55557765 [Analysis] improve function signature checking for snprintf
The check for size_t parameter 1 was already here for snprintf_chk,
but it wasn't applied to regular snprintf. This could lead to
mismatching and eventually crashing as shown in:
https://llvm.org/PR50885
2021-07-31 15:17:20 -04:00
Kerry McLaughlin 9d35594993 Reland "[LV] Use lookThroughAnd with logical reductions"
If a reduction Phi has a single user which `AND`s the Phi with a type mask,
`lookThroughAnd` will return the user of the Phi and the narrower type represented
by the mask. Currently this is only used for arithmetic reductions, whereas loops
containing logical reductions will create a reduction intrinsic using the widened
type, for example:

  for.body:
    %phi = phi i32 [ %and, %for.body ], [ 255, %entry ]
    %mask = and i32 %phi, 255
    %gep = getelementptr inbounds i8, i8* %ptr, i32 %iv
    %load = load i8, i8* %gep
    %ext = zext i8 %load to i32
    %and = and i32 %mask, %ext
    ...

^ this will generate an and reduction intrinsic such as the following:
    call i32 @llvm.vector.reduce.and.v8i32(<8 x i32>...)

The same example for an add instruction would create an intrinsic of type i8:
    call i8 @llvm.vector.reduce.add.v8i8(<8 x i8>...)

This patch changes AddReductionVar to call lookThroughAnd for other integer
reductions, allowing loops similar to the example above with reductions such
as and, or & xor to vectorize.

Reviewed By: david-arm, dmgreen

Differential Revision: https://reviews.llvm.org/D105632
2021-07-30 18:04:09 +01:00
Sander de Smalen 84a4caeb84 [InstSimplify] Don't assume parent function when simplifying llvm.vscale.
D106850 introduced a simplification for llvm.vscale by looking at the
surrounding function's vscale_range attributes. The call that's being
simplified may not yet have been inserted into the IR. This happens for
example during function cloning.

This patch fixes the issue by checking if the instruction is in a
parent basic block.
2021-07-29 20:08:08 +01:00
Jun Ma e2fe26e77b [NFC][InstSimplify] Use more intuitive variable names. 2021-07-29 13:55:47 +08:00
Wenlei He 1a8087adaf [ThinLTO] Disallow importing for functions with indir branch to block address
We don't allowing inlining for functions with blockaddress with uses other than strictly callbr. This is because if the blockaddress escapes the function via a global variable, inlining may lead to an invalid cross-function reference.

We check against such cases during inlining, however the check can fail for ThinLTO post-link because CFG simplification can incorrectly removes blocks based on wrong block reachability.

When we import a function with blockaddress taken in a global variable but without importing that variable, we won't go through value mapping to reflect the real address-taken-ness of the cloned blocks. For the imported clone, this leads to blocks reachable from indirect branch through global variable being incorrectly treated as unreachable and removed by SimplifyCFG.

Since inlining for such cases shouldn't be allowed in the first place, I'm marking them as ineligible for importing during pre-link to save the problem of missing address-taken-ness of imported clone as well as bad DCE and inlining.

Differential Revision: https://reviews.llvm.org/D106930
2021-07-28 18:02:48 -07:00
Jun Ma ca0fe3447f [InstSimplify] Simplify llvm.vscale when vscale_range attribute exists
Reduce llvm.vscale to constant based on vscale_range attribute.

Differential Revision: https://reviews.llvm.org/D106850
2021-07-28 21:41:52 +08:00
Mircea Trofin 935dea2cb2 [MLGO] fix silly LLVM_DEBUG misuse 2021-07-27 15:10:28 -07:00
Mircea Trofin eb76ca573d [NFC][MLGO] Debug messages for what inline advisor is selected
We already have an indication (error) if the desired inline advisor
cannot be enabled, but we don't have a positive indication. Added
LLVM_DEBUG messages for the latter.
2021-07-27 15:05:39 -07:00
Anna Thomas 68ffed12b7 [IVDescriptors] Fix bug in checkOrderedReduction
The Exit instruction passed in for checking if it's an ordered reduction need not be
an FPAdd operation. We need to bail out at that point instead of
assuming it is an FPAdd (and hence has two operands). See added testcase.
It crashes without the patch because the Exit instruction is a phi with
exactly one operand.
This latent bug was exposed by 95346ba which added support for
multi-exit loops for vectorization.

Reviewed-By: kmclaughlin
Differential Revision: https://reviews.llvm.org/D106843
2021-07-27 09:31:44 -04:00
Johannes Doerfert 75636868e2 [InstSimplify] Expose generic interface for replaced operand simplification
Users, especially the Attributor, might replace multiple operands at
once. The actual implementation of simplifyWithOpReplaced is able to
handle that just fine, the interface was simply not allowing to replace
more than one operand at a time. This is exposing a more generic
interface without intended changes for existing code.

Differential Revision: https://reviews.llvm.org/D106189
2021-07-27 00:56:12 -05:00
Philip Reames f82f39b9cf [SCEV] Add a comment about invariant in howManyLessThans 2021-07-26 16:39:26 -07:00
Eli Friedman 5c486ce04d [LLVM IR] Allow volatile stores to trap.
Proposed alternative to D105338.

This is ugly, but short-term I think it's the best way forward: first,
let's formalize the hacks into a coherent model. Then we can consider
extensions of that model (we could have different flavors of volatile
with different rules).

Differential Revision: https://reviews.llvm.org/D106309
2021-07-26 10:51:00 -07:00
Florian Hahn 6d753b0751
[LAA] Remove RuntimeCheckingPtrGroup::RtCheck member (NFC).
This patch removes RtCheck from RuntimeCheckingPtrGroup to make it
possible to construct RuntimeCheckingPtrGroup objects without a
RuntimePointerChecking object. This should make it easier to
re-use the code to generate runtime checks, e.g. in D102834.

RtCheck was only used to access the pointer info for a given index.
Instead, the start and end expressions can be passed directly.

For code-gen, we also need to know the address space to use. This can
also be explicitly passed at construction.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D105481
2021-07-26 17:38:10 +01:00
Nikita Popov 33146857e9 [IR] Consider non-willreturn as side effect (PR50511)
This adjusts mayHaveSideEffect() to return true for !willReturn()
instructions. Just like other side-effects, non-willreturn calls
(aka "divergence") cannot be removed and cannot be reordered relative
to other side effects. This fixes a number of bugs where
non-willreturn calls are either incorrectly dropped or moved. In
particular, it also fixes the last open problem in
https://bugs.llvm.org/show_bug.cgi?id=50511.

I performed a cursory review of all current mayHaveSideEffect()
uses, which convinced me that these are indeed the desired default
semantics. Places that do not want to consider non-willreturn as a
sideeffect generally do not want mayHaveSideEffect() semantics at
all. I identified two such cases, which are addressed by D106591
and D106742. Finally, there is a use in SCEV for which we don't
really have an appropriate API right now -- what it wants is
basically "would this be considered forward progress". I've just
spelled out the previous semantics there.

Differential Revision: https://reviews.llvm.org/D106749
2021-07-26 16:35:14 +02:00
Paul Walker 8a8d01d58c [NFC] Change VFShape so it contains an ElementCount rather than seperate VF and IsScalable properties.
Differential Revision: https://reviews.llvm.org/D106750
2021-07-26 12:25:46 +01:00
Philipp Krones 46c0366877 [Inliner] Make the CallPenalty configurable
Tests with multiple benchmarks, like Embench [1], showed that the
CallPenalty magic number has the most influence on inlining decisions
when optimizing for size.

On the other hand, there was no good default value for this parameter.
Some benchmarks profited strongly from a reduced call penalty. On
example is the picojpeg benchmark compiled for RISC-V, which got 6%
smaller with a CallPenalty of 10 instead of 12. Other benchmarks
increased in size, like matmult.

This commit makes the compromise of turning the magic number constant of
CallPenalty into a configurable value. This introduces the flag
`--inline-call-penalty`. With that flag users can fine tune the inliner
to their needs.

The CallPenalty constant was also used for loops. This commit replaces
the CallPenalty constant with a new LoopPenalty constant that is now
used instead.

This is a slimmed down version of https://reviews.llvm.org/D30899

[1]: https://github.com/embench/embench-iot

Differential Revision: https://reviews.llvm.org/D105976
2021-07-26 12:07:49 +01:00
David Sherwood 0aff1798b5 [Analysis] Add simple cost model for strict (in-order) reductions
I have added a new FastMathFlags parameter to getArithmeticReductionCost
to indicate what type of reduction we are performing:

  1. Tree-wise. This is the typical fast-math reduction that involves
  continually splitting a vector up into halves and adding each
  half together until we get a scalar result. This is the default
  behaviour for integers, whereas for floating point we only do this
  if reassociation is allowed.
  2. Ordered. This now allows us to estimate the cost of performing
  a strict vector reduction by treating it as a series of scalar
  operations in lane order. This is the case when FP reassociation
  is not permitted. For scalable vectors this is more difficult
  because at compile time we do not know how many lanes there are,
  and so we use the worst case maximum vscale value.

I have also fixed getTypeBasedIntrinsicInstrCost to pass in the
FastMathFlags, which meant fixing up some X86 tests where we always
assumed the vector.reduce.fadd/mul intrinsics were 'fast'.

New tests have been added here:

  Analysis/CostModel/AArch64/reduce-fadd.ll
  Analysis/CostModel/AArch64/sve-intrinsics.ll
  Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll
  Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll

Differential Revision: https://reviews.llvm.org/D105432
2021-07-26 10:26:06 +01:00
Liqiang Tao 4bdfea2c51 [llvm][Inline] Add interface to return cost-benefit stuff
Return cost-benefit stuff which is computed by cost-benefit analysis.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D105349
2021-07-25 20:18:19 +08:00
Philip Reames ec43def700 Style tweaks for SCEV's computeMaxBECountForLT [NFC] 2021-07-23 17:19:45 -07:00
Philip Reames 4a3dc7dc9a [SCEV] Fix bug involving zero step and non-invariant RHS in trip count logic
Eli pointed out the issue when reviewing D104140. The max trip count logic makes an assumption that the value of IV changes. When the step is zero, the nowrap fact becomes trivial, and thus there's nothing preventing the loop from being nearly infinite. (The "nearly" part is because mustprogress may disallow an infinite loop while still allowing 999999999 iterations before RHS happens to allow an exit.)

This is very difficult to see in practice. You need a means to produce a loop varying RHS in a mustprogress loop which doesn't allow the loop to be infinite. In most cases, LICM or SCEV are smart enough to remove the loop varying expressions.

Differential Revision: https://reviews.llvm.org/D106327
2021-07-23 15:19:23 -07:00
Mircea Trofin 55e12f7080 [NFC][MLGO] Just use the underlying protobuf object for logging
Avoid buffering just to copy the buffered data, in 'development
mode', when logging. Instead, just populate the underlying protobuf.

Differential Revision: https://reviews.llvm.org/D106592
2021-07-23 10:56:48 -07:00
Serge Pavlov 1c64b5dc5e [ConstantFolding] Fold constrained arithmetic intrinsics
Constfold constrained variants of operations fadd, fsub, fmul, fdiv,
frem, fma and fmuladd.

The change also sets up some means to support for removal of unused
constrained intrinsics. They are declared as accessing memory to model
interaction with floating point environment, so they were not removed,
as they have side effect. Now constrained intrinsics that have
"fpexcept.ignore" as exception behavior are removed if they have no uses.
As for intrinsics that have exception behavior other than "fpexcept.ignore",
they can be removed if it is known that they do not raise floating point
exceptions. It happens when doing constant folding, attributes of such
intrinsic are changed so that the intrinsic is not claimed as accessing
memory.

Differential Revision: https://reviews.llvm.org/D102673
2021-07-23 14:39:51 +07:00
Mircea Trofin df0066a1c9 [NFC][MLGO] Fix vector sizing
The bots only build release mode, and the use of `reserve` instead of
`resize`, while not causing invalid memory accesses, is incorrect.
2021-07-22 13:06:00 -07:00
Joseph Huber 754eb1c210 [OpenMP] Change `__kmpc_free_shared` to include the paired allocation size
This patch changes `__kmpc_free_shared` to take an additional argument
corresponding to the associated allocation's size. This makes it easier to
implement the allocator in the runtime.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106496
2021-07-21 20:56:21 -04:00
Jacob Hegna cfc4def85d [NFC] Code cleanups in InlineCost.cpp.
- annotate const functions with "const"
 - replace C-style casts with static_cast

Differential Revision: https://reviews.llvm.org/D105362
2021-07-22 00:03:36 +00:00
Kerry McLaughlin be753b207f Revert "[LV] Use lookThroughAnd with logical reductions"
Reverting patch due to buildbot failures.

This reverts commit e22a599672.
2021-07-21 15:16:00 +01:00
Rosie Sumpter 44c9adb414 [LoopFlatten][LoopInfo] Use Loop to identify latch compare instruction
Make getLatchCmpInst non-static and use it in LoopFlatten as a more
robust way of identifying the compare.

Differential Revision: https://reviews.llvm.org/D106256
2021-07-21 10:14:18 +01:00
Kerry McLaughlin e22a599672 [LV] Use lookThroughAnd with logical reductions
If a reduction Phi has a single user which `AND`s the Phi with a type mask,
`lookThroughAnd` will return the user of the Phi and the narrower type represented
by the mask. Currently this is only used for arithmetic reductions, whereas loops
containing logical reductions will create a reduction intrinsic using the widened
type, for example:

  for.body:
    %phi = phi i32 [ %and, %for.body ], [ 255, %entry ]
    %mask = and i32 %phi, 255
    %gep = getelementptr inbounds i8, i8* %ptr, i32 %iv
    %load = load i8, i8* %gep
    %ext = zext i8 %load to i32
    %and = and i32 %mask, %ext
    ...

^ this will generate an and reduction intrinsic such as the following:
    call i32 @llvm.vector.reduce.and.v8i32(<8 x i32>...)

The same example for an add instruction would create an intrinsic of type i8:
    call i8 @llvm.vector.reduce.add.v8i8(<8 x i8>...)

This patch changes AddReductionVar to call lookThroughAnd for other integer
reductions, allowing loops similar to the example above with reductions such
as and, or & xor to vectorize.

Reviewed By: david-arm, dmgreen

Differential Revision: https://reviews.llvm.org/D105632
2021-07-21 09:56:00 +01:00
Sanjay Patel 13302c06cd [ConstantFolding] avoid crashing on a fake math library call
https://llvm.org/PR50960
2021-07-20 18:25:21 -04:00
Jacob Hegna 1f3e90e128 Fix Threshold overwrite bug in the Oz inlining model features.
Differential Revision: https://reviews.llvm.org/D106336
2021-07-20 18:05:06 +00:00
Eli Friedman de3ea51be4 [ScalarEvolution] Refine computeMaxBECountForLT to be accurate in more cases.
Allow arbitrary strides, and make sure we return the correct result when
the backedge-taken count is zero.

Differential Revision: https://reviews.llvm.org/D106197
2021-07-19 15:43:30 -07:00
Philip Reames 4402d0d4fb [SCEV] Add a clarifying comment in howManyLessThans
Wrap semantics are subtle when combined with multiple exits.  This has caused several rounds of confusion during recent reviews, so try to document the subtly distinction between when wrap flags provide <u and <=u facts.
2021-07-19 15:13:48 -07:00
Arthur Eubanks 6cbb35dd3b [NewPM] Bail out of devirtualization wrapper if the current SCC is invalidated
The specific case that triggered this was when inlining a recursive
internal function into itself caused the recursion to go away, allowing
the inliner to mark the function as dead. The inliner marks the SCC as
invalidated but does not provide a new SCC to continue with.

This matches the implementations of ModuleToPostOrderCGSCCPassAdaptor
and CGSCCPassManager.

Fixes PR50363.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D106306
2021-07-19 15:07:30 -07:00
Mircea Trofin 55e2d2060a [MLGO] Use binary protobufs for improved training performance.
It turns out that during training, the time required to parse the
textual protobuf of a training log is about the same as the time it
takes to compile the module generating that log. Using binary protobufs
instead elides that cost almost completely.

Differential Revision: https://reviews.llvm.org/D106157
2021-07-19 13:59:28 -07:00
Mindong Chen e908e063d1 [LoopUtils] Fix incorrect RT check bounds of loop-invariant mem accesses
This fixes the lower and upper bound calculation of a
RuntimeCheckingPtrGroup when it has more than one loop
invariant pointers. Resolves PR50686.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D104148
2021-07-19 19:38:24 +08:00
Nikita Popov 2b17c24a03 [SCEV] Fix unused variable warning (NFC) 2021-07-18 23:12:22 +02:00
Wenlei He 68fa6f7c7c [CSSPGO][NFC] Allow cl::ZeroOrMore for use-iterative-bfi-inference 2021-07-18 13:22:32 -07:00
Eli Friedman 28a3ad3f86 [ScalarEvolution] Remove uses of PointerType::getElementType. 2021-07-18 13:14:33 -07:00
Kazu Hirata 1993b73755 [Analaysis, CodeGen] Remove getHotSucc (NFC)
These functions seem to be unused for at least 5 years.
2021-07-17 07:31:36 -07:00
Eli Friedman cbba71bfb5 [ScalarEvolution] Fix overflow in computeBECount.
The current implementation of computeBECount doesn't account for the
possibility that adding "Stride - 1" to Delta might overflow. For almost
all loops, it doesn't, but it's not actually proven anywhere.

To deal with this, use a variety of tricks to try to prove that the
addition doesn't overflow.  If the proof is impossible, use an alternate
sequence which never overflows.

Differential Revision: https://reviews.llvm.org/D105216
2021-07-16 16:15:18 -07:00
Eli Friedman 5d5b08761f [DependenceAnalysis] Guard analysis using getPointerBase().
D104806 broke some uses of getMinusSCEV() in DependenceAnalysis:
subtraction with different pointer bases returns a SCEVCouldNotCompute.
Make sure we avoid cases involving such subtractions.

Differential Revision: https://reviews.llvm.org/D106099
2021-07-15 14:57:32 -07:00
Philip Reames a99d420a93 [SCEV] Fix unsound reasoning in howManyLessThans
This is split from D105216, it handles only a subset of the cases in that patch.

Specifically, the issue being fixed is that the code incorrectly assumed that (Start-Stide) < End implied that the backedge was taken at least once. This is not true when e.g. Start = 4, Stride = 2, and End = 3. Note that we often do produce the right backedge taken count despite the flawed reasoning.

The fix chosen here is to use an alternate form of uceil (ceiling of unsigned divide) lowering which is safe when max(RHS,Start) > Start - Stride.  (Note that signedness of both max expression and comparison depend on the signedness of the comparison being analyzed, and that overflow in the Start - Stride expression is allowed.)  Note that this is weaker than proving the backedge is taken because it allows start - stride < end < start.  Some cases which can't be proven safe are sent down the generic path, and we do end up generating less optimal expressions in a few cases.

Credit for coming up with the approach goes entirely to Eli.  I just split it off, tweaked the comments a bit, and did some additional testing.

Differential Revision: https://reviews.llvm.org/D105942
2021-07-15 10:32:47 -07:00
Philip Reames 205ed009a4 [SCEV] Handle zero stride correctly in howManyLessThans
This is split from D105216, but the code is hoisted much earlier into
the path where we can actually get a zero stride flowing through. Some
fairly simple proofs handle the cases which show up in practice. The
only test changes are the cases where we really do need a non-zero
divider to produce the right result.

Recommitting with isLoopInvariant() check.

Differential Revision: https://reviews.llvm.org/D105921
2021-07-13 19:14:01 -07:00
Arthur Eubanks 5738819679 Revert "[SCEV] Handle zero stride correctly in howManyLessThans"
This reverts commit 4df591b5c9.

Causes crashes, see comments on D105921.
2021-07-13 17:53:48 -07:00
Eli Friedman bb8c7a980f [ScalarEvolution] Make isKnownNonZero handle more cases.
Using an unsigned range instead of signed ranges is a bit more precise.

Differential Revision: https://reviews.llvm.org/D105941
2021-07-13 15:36:45 -07:00
Philip Reames 4df591b5c9 [SCEV] Handle zero stride correctly in howManyLessThans
This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result.

Differential Revision: https://reviews.llvm.org/D105921
2021-07-13 13:31:40 -07:00
Philip Reames 087310c71e [SCEV] Strengthen inference of RHS > Start in howManyLessThans
Split off from D105216 to simplify review.  Rewritten with a lambda to be easier to follow.  Comments clarified.

Sorry for no test case, this is tricky to exercise with the current structure of the code.  It's about to be hit more frequently in a follow up patch, and the change itself is simple.
2021-07-13 11:54:07 -07:00
Philip Reames e4b43973fb [ScalarEvolution] Fix overflow when computing max trip counts
This is split from D105216 to reduce patch complexity.  Original code by Eli with very minor modification by me.

The primary point of this patch is to add the getUDivCeilSCEV routine.  I included the two callers with constant arguments as we know those must constant fold even without any of the fancy inference logic.
2021-07-13 10:01:10 -07:00
Nikita Popov 6ac32872ee [Attributes] Replace doesAttrKindHaveArgument() (NFC)
This is now the same as isIntAttrKind(), so use that instead, as
it does not require manual maintenance. The naming is also more
accurate in that both int and type attributes have an argument,
but this method was only targeting int attributes.

I initially wanted to tighten the AttrBuilder assertion, but we
have some in-tree uses that would violate it.
2021-07-12 21:57:26 +02:00
Kazu Hirata 4f94121cce [Analysis] Remove changeCondBranchToUnconditionalTo (NFC)
The last use was removed on Jan 21, 2021 in commit
0895b836d7.
2021-07-10 17:31:43 -07:00
Eli Friedman 882ee7fbd6 Fix buildbot regression from 9c4baf5.
Apparently ScalarEvolution::isImpliedCond tries to truncate a pointer in
some obscure cases. Guard the code with a check for pointers.
2021-07-09 17:54:09 -07:00
Eli Friedman 9c4baf5101 [ScalarEvolution] Strictly enforce pointer/int type rules.
Rules:

1. SCEVUnknown is a pointer if and only if the LLVM IR value is a
   pointer.
2. SCEVPtrToInt is never a pointer.
3. If any other SCEV expression has no pointer operands, the result is
   an integer.
4. If a SCEVAddExpr has exactly one pointer operand, the result is a
   pointer.
5. If a SCEVAddRecExpr's first operand is a pointer, and it has no other
   pointer operands, the result is a pointer.
6. If every operand of a SCEVMinMaxExpr is a pointer, the result is a
   pointer.
7. Otherwise, the SCEV expression is invalid.

I'm not sure how useful rule 6 is in practice.  If we exclude it, we can
guarantee that ScalarEvolution::getPointerBase always returns a
SCEVUnknown, which might be a helpful property. Anyway, I'll leave that
for a followup.

This is basically mop-up at this point; all the changes with significant
functional effects have landed.  Some of the remaining changes could be
split off, but I don't see much point.

Differential Revision: https://reviews.llvm.org/D105510
2021-07-09 17:29:26 -07:00
Nikita Popov 2e3f4694d6 [IR] Add GEPOperator::indices() (NFC)
In order to mirror the GetElementPtrInst::indices() API.

Wanted to use this in the IRForTarget code, and was surprised to
find that it didn't exist yet.
2021-07-09 21:41:20 +02:00
Kevin P. Neal 52900486a1 [FPEnv][InstSimplify] Constrained FP support for NaN
Currently InstructionSimplify.cpp knows how to simplify floating point
instructions that have a NaN operand. It does not know how to handle the
matching constrained FP intrinsic.

This patch teaches it how to simplify so long as the exception handling
is not "fpexcept.strict".

Differential Revision: https://reviews.llvm.org/D103169
2021-07-09 11:26:28 -04:00
Martin Storsjö e479777d3c Revert "[ScalarEvolution] Fix overflow in computeBECount."
This reverts commit 5b350183cd (and
also "[NFC][ScalarEvolution] Cleanup howManyLessThans.",
009436e9c1, to make it apply).

See https://reviews.llvm.org/D105216 for discussion on various
miscompilations caused by that commit.
2021-07-09 14:26:48 +03:00
David Green 38c9a4068d [TTI] Remove IsPairwiseForm from getArithmeticReductionCost
This patch removes the IsPairwiseForm flag from the Reduction Cost TTI
hooks, along with some accompanying code for pattern matching reductions
from trees starting at extract elements. IsPairWise is now assumed to be
false, which was the predominant way that the value was used from both
the Loop and SLP vectorizers. Since the adjustments such as D93860, the
SLP vectorizer has not relied upon this distinction between paiwise and
non-pairwise reductions.

This also removes some code that was detecting reductions trees starting
from extract elements inside the costmodel. This case was
double-counting costs though, adding the individual costs on the
individual instruction _and_ the total cost of the reduction. Removing
it changes the costs in llvm/test/Analysis/CostModel/X86/reduction.ll to
not double count. The cost of reduction intrinsics is still tested
through the various tests in
llvm/test/Analysis/CostModel/X86/reduce-xyz.ll.

Differential Revision: https://reviews.llvm.org/D105484
2021-07-09 11:51:16 +01:00
Bjorn Pettersson 472462c472 [NewPM] Consistently use 'simplifycfg' rather than 'simplify-cfg'
There was an alias between 'simplifycfg' and 'simplify-cfg' in the
PassRegistry. That was the original reason for this patch, which
effectively removes the alias.

This patch also replaces all occurrances of 'simplify-cfg'
by 'simplifycfg'. Reason for choosing that form for the name is
that it matches the DEBUG_TYPE for the pass, and the legacy PM name
and also how it is spelled out in other passes such as
'loop-simplifycfg', and in other options such as
'simplifycfg-merge-cond-stores'.

I for some reason the name should be changed to 'simplify-cfg' in
the future, then I think such a renaming should be more widely done
and not only impacting the PassRegistry.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D105627
2021-07-09 09:47:03 +02:00
Eli Friedman 009436e9c1 [NFC][ScalarEvolution] Cleanup howManyLessThans.
In preparation for D104075. Some NFC cleanup, and some test coverage for
planned changes.
2021-07-08 17:56:26 -07:00
Michael Liao 8c7ff9da90 [Metadata] Decorate methods with 'const'. NFC.
- Minor coding style fix.
2021-07-08 14:11:14 -04:00
Eli Friedman 5b350183cd [ScalarEvolution] Fix overflow in computeBECount.
There are two issues with the current implementation of computeBECount:

1. It doesn't account for the possibility that adding "Stride - 1" to
Delta might overflow. For almost all loops, it doesn't, but it's not
actually proven anywhere.
2. It doesn't account for the possibility that Stride is zero. If Delta
is zero, the backedge is never taken; the value of Stride isn't
relevant. To handle this, we have to make sure that the expression
returned by computeBECount evaluates to zero.

To deal with this, add two new checks:

1. Use a variety of tricks to try to prove that the addition doesn't
overflow.  If the proof is impossible, use an alternate sequence which
never overflows.
2. Use umax(Stride, 1) to handle the possibility that Stride is zero.

Differential Revision: https://reviews.llvm.org/D105216
2021-07-08 10:09:55 -07:00
Eli Friedman f5603aa050 [ScalarEvolution] Make sure getMinusSCEV doesn't negate pointers.
Add a function removePointerBase that returns, essentially, S -
getPointerBase(S).  Use it in getMinusSCEV instead of actually
subtracting pointers.

Differential Revision: https://reviews.llvm.org/D105503
2021-07-07 10:27:10 -07:00
Eli Friedman 7ac1c7bead Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.
As part of making ScalarEvolution's handling of pointers consistent, we
want to forbid multiplying a pointer by -1 (or any other value). This
means we can't blindly subtract pointers.

There are a few ways we could deal with this:
1. We could completely forbid subtracting pointers in getMinusSCEV()
2. We could forbid subracting pointers with different pointer bases
(this patch).
3. We could try to ptrtoint pointer operands.

The option in this patch is more friendly to non-integral pointers: code
that works with normal pointers will also work with non-integral
pointers. And it seems like there are very few places that actually
benefit from the third option.

As a minimal patch, the ScalarEvolution implementation of getMinusSCEV
still ends up subtracting pointers if they have the same base.  This
should eliminate the shared pointer base, but eventually we'll need to
rewrite it to avoid negating the pointer base. I plan to do this as a
separate step to allow measuring the compile-time impact.

This doesn't cause obvious functional changes in most cases; the one
case that is significantly affected is ICmpZero handling in LSR (which
is the source of almost all the test changes).  The resulting changes
seem okay to me, but suggestions welcome.  As an alternative, I tried
explicitly ptrtoint'ing the operands, but the result doesn't seem
obviously better.

I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out
how to repair it to test what it was actually trying to test.

Recommitting with fix to MemoryDepChecker::isDependent.

Differential Revision: https://reviews.llvm.org/D104806
2021-07-06 12:16:05 -07:00
Eli Friedman a6d081b2cb Revert "[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers."
This reverts commit 74d6ce5d5f.

Seeing crashes on buildbots in MemoryDepChecker::isDependent.
2021-07-06 11:17:13 -07:00
Sanjay Patel 4ec7c02197 [InstSimplify] fix bug in poison propagation for FP ops
If any operand of a math op is poison, that takes
precedence over general undef/NaN.

This should not be visible with binary ops because
it requires 2 constant operands to trigger (and if
both operands of a binop are constant, that should
get handled first in ConstantFolding).
2021-07-06 14:06:50 -04:00
Eli Friedman 74d6ce5d5f [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.
As part of making ScalarEvolution's handling of pointers consistent, we
want to forbid multiplying a pointer by -1 (or any other value). This
means we can't blindly subtract pointers.

There are a few ways we could deal with this:
1. We could completely forbid subtracting pointers in getMinusSCEV()
2. We could forbid subracting pointers with different pointer bases
(this patch).
3. We could try to ptrtoint pointer operands.

The option in this patch is more friendly to non-integral pointers: code
that works with normal pointers will also work with non-integral
pointers. And it seems like there are very few places that actually
benefit from the third option.

As a minimal patch, the ScalarEvolution implementation of getMinusSCEV
still ends up subtracting pointers if they have the same base.  This
should eliminate the shared pointer base, but eventually we'll need to
rewrite it to avoid negating the pointer base. I plan to do this as a
separate step to allow measuring the compile-time impact.

This doesn't cause obvious functional changes in most cases; the one
case that is significantly affected is ICmpZero handling in LSR (which
is the source of almost all the test changes).  The resulting changes
seem okay to me, but suggestions welcome.  As an alternative, I tried
explicitly ptrtoint'ing the operands, but the result doesn't seem
obviously better.

I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out
how to repair it to test what it was actually trying to test.

Differential Revision: https://reviews.llvm.org/D104806
2021-07-06 10:54:41 -07:00
Kerry McLaughlin a7512401e5 [LV] Prevent vectorization with unsupported element types.
This patch adds a TTI function, isElementTypeLegalForScalableVector, to query
whether it is possible to vectorize a given element type. This is called by
isLegalToVectorizeInstTypesForScalable to reject scalable vectorization if
any of the instruction types in the loop are unsupported, e.g:

  int foo(__int128_t* ptr, int N)
    #pragma clang loop vectorize_width(4, scalable)
    for (int i=0; i<N; ++i)
      ptr[i] = ptr[i] + 42;

This example currently crashes if we attempt to vectorize since i128 is not a
supported type for scalable vectorization.

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D102253
2021-07-06 13:06:21 +01:00
Sanjay Patel 3d3c0ed932 [InstSimplify] fold extractelement of splat with variable extract index
We already have a fold for variable index with constant vector,
but if we can determine a scalar splat value, then it does not
matter whether that value is constant or not.

We overlooked this fold in D102404 and earlier patches,
but the fixed vector variant is shown in:
https://llvm.org/PR50817

Alive2 agrees on that:
https://alive2.llvm.org/ce/z/HpijPC

The same logic applies to scalable vectors.

Differential Revision: https://reviews.llvm.org/D104867
2021-07-05 08:19:40 -04:00
Paul Walker 287d39dd5a [NFC] Fix a few whitespace issues and typos. 2021-07-04 11:49:58 +01:00
Roman Lebedev fc150cecd7
[SimplifyCFG] simplifyUnreachable(): erase instructions iff they are guaranteed to transfer execution to unreachable
This replaces the current ad-hoc implementation,
by syncing the code from InstCombine's implementation in `InstCombinerImpl::visitUnreachableInst()`,
with one exception that here in SimplifyCFG we are allowed to remove EH instructions.

Effectively, this now allows SimplifyCFG to remove calls (iff they won't throw and will return),
arithmetic/logic operations, etc.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D105374
2021-07-03 10:45:44 +03:00
Jacob Hegna 8cc8caa1b1 [MLGO] Update Oz model url. 2021-07-02 17:29:15 +00:00
Jacob Hegna 99f00635d7 Unpack the CostEstimate feature in ML inlining models.
This change yields an additional 2% size reduction on an internal search
binary, and an additional 0.5% size reduction on fuchsia.

Differential Revision: https://reviews.llvm.org/D104751
2021-07-02 16:57:16 +00:00
Sanjay Patel 9eb613b2de [InstSimplify] do not propagate poison from select arm to icmp user
This is the cause of the miscompile in:
https://llvm.org/PR50944

The problem has likely existed for some time, but it was made visible with:
5af8bacc94 ( D104661 )
handleOtherCmpSelSimplifications() assumed it can convert select of
constants to bool logic ops, but that does not work with poison.
We had a very similar construct in InstCombine, so the fix here
mimics the fix there.

The bug is in instsimplify, but I'm not sure how to reproduce it outside of
instcombine. The reason this is visible in instcombine is because we have a
hack (FIXME) to bypass simplification of a select when it has an icmp user:
955f125899/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (L2632)

So we get to an unusual case where we are trying to simplify an instruction
that has an operand that would have already simplified if we had processed
it in normal order.

Differential Revision: https://reviews.llvm.org/D105298
2021-07-01 17:40:07 -04:00
Florian Hahn dc4299a7f3
[BasicAA] Fix typo ScaleForGDC -> ScaleForGCD. 2021-07-01 09:58:38 +01:00
Florian Hahn e6d22d0174
[BasicAA] Use separate scale variable for GCD.
Use separate variable for adjusted scale used for GCD computations. This
fixes an issue where we incorrectly determined that all indices are
non-negative and returned noalias because of that.

Follow up to 91fa3565da.
2021-06-30 20:04:39 +01:00
Philip Reames 14d8f1546a [SCEV] Fold (0 udiv %x) to 0
We have analogous rules in instsimplify, etc.., but were missing the same in SCEV.  The fold is near trivial, but came up in the context of a larger change.
2021-06-30 08:31:13 -07:00
Jacob Hegna 7b639f5095 [NFC] clang-format on InlineCost.cpp and InlineAdvisor.h. 2021-06-29 18:15:27 +00:00
Florian Hahn 91fa3565da
[BasicAA] Be more careful with modulo ops on VariableGEPIndex.
(V * Scale) % X may not produce the same result for any possible value
of V, e.g. if the multiplication overflows. This means we currently
incorrectly determine NoAlias in some cases.

This patch updates LinearExpression to track whether the expression
has NSW and uses that to adjust the scale used for alias checks.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D99424
2021-06-29 09:22:36 +01:00
Sanjay Patel 7414bbebc2 [Analysis] improve function signature checking for calloc
This would crash later if we thought the parameters were
valid for the standard library call as shown in:
https://llvm.org/PR50846
2021-06-27 08:19:00 -04:00
Eli Friedman 8d5bf0709d [NFC] Prefer ConstantRange::makeExactICmpRegion over makeAllowedICmpRegion
The implementation is identical, but it makes the semantics a bit more
obvious.
2021-06-25 14:43:13 -07:00
Sanjay Patel 1076b6c4f0 [Analysis] use better version of getLibFunc to check for alloc/free calls
There's no reason to use the weaker name-only analysis when we
have a function prototype to check (in fact, we probably should
not even have that name-only function exposed for general use,
but removing it requires auditing all of the callers).

The version of getLibFunc that takes a Function argument also
does some prototype checking to make sure the arguments/return
type match the expected signature of a real library call.

This is NFC-intended because the code in MemoryBuiltins does its
own function signature checking. For now, that means there may
be some redundancy in the checking, but that should not be above
the noise for compile-time. Ideally, we can move the checks to
a single location.

There's still a hole in the logic that allows the example in
https://llvm.org/PR50846 to cause a compiler crash.
2021-06-25 12:14:07 -04:00
Florian Hahn 6478f3fb78
[SCEV] Support single-cond range check idiom in applyLoopGuards.
This patch extends applyLoopGuards to detect a single-cond range check
idiom that InstCombine generates.

It extends applyLoopGuards to detect conditions of the form
(-C1 + X < C2). InstCombine will create this form when combining two
checks of the form (X u< C2 + C1) and (X >=u C1).

In practice, this enables us to correctly compute a tight trip count
bounds for code as in the function below. InstCombine will fold the
minimum iteration check created by LoopRotate with the user check (< 8).

    void unsigned_check(short *pred, unsigned width) {
        if (width < 8) {
            for (int x = 0; x < width; x++)
                pred[x] = pred[x] * pred[x];
        }
    }

As a consequence, LLVM creates dead vector loops for the code above,
e.g. see https://godbolt.org/z/cb8eTcqET

https://alive2.llvm.org/ce/z/SHHW4d

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D104741
2021-06-25 10:24:40 +01:00
Sanjay Patel 50db987d59 [InstSimplify] move extract with undef index fold; NFC
This puts it closer to the other undef query check and
will avoid a potential ordering problem if we allow
folding non-constant-int indexes.
2021-06-24 13:22:10 -04:00
Florian Hahn 121ecb05e7
[SCEV] Generalize MatchBinaryAddToConst to support non-add expressions.
This patch generalizes MatchBinaryAddToConst to support matching
(A + C1), (A + C2), instead of just matching (A + C1), A.

The existing cases can be handled by treating non-add expressions A as
A + 0.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D104634
2021-06-24 12:16:15 +01:00
Carl Ritson ae266e743c [LVI] Remove recursion from getValueForCondition (NFCI)
Convert getValueForCondition to a worklist model instead of using
recursion.

In pathological cases getValueForCondition recurses heavily.
Stack frames are quite expensive on x86-64, and some operating
systems (e.g. Windows) have relatively low stack size limits.
Using a worklist avoids potential failures from stack overflow.

Differential Revision: https://reviews.llvm.org/D104191
2021-06-24 09:58:22 +09:00
Eli Friedman b12192f7cd [ScalarEvolution] Clarify implementation of getPointerBase().
getPointerBase should only be looking through Add and AddRec
expressions; other expressions either aren't pointers, or can't be
looked through.

Technically, this is a functional change. For a multiply or min/max
expression, if they have exactly one pointer operand, and that operand
is the first operand, the behavior here changes. Similarly, if an AddRec
has a pointer-type step, the behavior changes. But that shouldn't be
happening in practice, and we plan to make such expressions illegal.
2021-06-23 12:55:59 -07:00
Eli Friedman fdaf304e0d [NFC][ScalarEvolution] Fix SCEVNAryExpr::getType().
SCEVNAryExpr::getType() could return the wrong type for a SCEVAddExpr.
Remove it, and add getType() methods to the relevant subclasses.

NFC because nothing uses it directly, as far as I know; this is just
future-proofing.
2021-06-23 12:55:59 -07:00
Nikita Popov 00d3f7cc3c [LAA] Make getPointersDiff() API compatible with opaque pointers
Make getPointersDiff() and sortPtrAccesses() compatible with opaque
pointers by explicitly passing in the element type instead of
determining it from the pointer element type.

The SLPVectorizer result is slightly non-optimal in that unnecessary
pointer bitcasts are added.

Differential Revision: https://reviews.llvm.org/D104784
2021-06-23 18:44:34 +02:00
Sanjay Patel 656001e7b2 [ValueTracking] look through bitcast of vector in computeKnownBits
This borrows as much as possible from the SDAG version of the code
(originally added with D27129 and since updated with big endian support).

In IR, we can test more easily for correctness than we did in the
original patch. I'm using the simplest cases that I could find for
InstSimplify: we computeKnownBits on variable shift amounts to see if
they are zero or in range. So shuffle constant elements into a vector,
cast it, and shift it.

The motivating x86 example from https://llvm.org/PR50123 is also here.
We computeKnownBits in the caller code, but we only check if the shift
amount is in range. That could be enhanced to catch the 2nd x86 test -
if the shift amount is known too big, the result is 0.

Alive2 understands the datalayout and agrees that the tests here are
correct - example:
https://alive2.llvm.org/ce/z/KZJFMZ

Differential Revision: https://reviews.llvm.org/D104472
2021-06-23 11:46:46 -04:00
Juneyoung Lee 5af8bacc94 [InstSimplify] Add more poison folding optimizations
This adds more poison folding optimizations to InstSimplify.

Since all binary operators propagate poison, these are fine.

Also, the precondition of `select cond, undef, x` -> `x` is relaxed to allow the case when `x` is undef.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D104661
2021-06-23 20:25:24 +09:00
Florian Hahn adee485adf
[SCEV] Support signed predicates in applyLoopGuards.
This adds handling for signed predicates, similar to how unsigned
predicates are already handled.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D104732
2021-06-23 10:21:05 +01:00
Joseph Huber 2662351e3b [OpenMP] Add new OpenMP globalization functions to library info
Summary:
The changes to globalization introduced in D97680 created two new functions to
push / pop shareably memory on the GPU, __kmpc_alloc_shared and
__kmpc_free_shared. This patch adds these new runtime functions to the
library info so they can be used by the HeapToStack attributor interface. This
optimization replaces malloc / free pairs with stack memory if legal.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D102087
2021-06-22 13:23:05 -04:00
Florian Hahn 6c782e6eb0
[SCEV] Reduce code to handle predicates in applyLoopGuards (NFC).
Hoist out common recurrence check and sink updating the map, to reduce
the code required to support additional predicates.
2021-06-22 15:56:45 +01:00
Nikita Popov e638a290f7 [ConstantFold] Delay fetching pointer element type
Don't do this while stipping pointer casts, instead fetch it at
the end. This improves compatibility with opaque pointers for the
case where the base object is not opaque.
2021-06-22 15:51:00 +02:00
Florian Hahn d17798823c
[SCEV] Retain AddExpr flags when subtracting a foldable constant.
Currently we drop wrapping flags for expressions like (A + C1)<flags> - C2.

But we can retain flags under certain conditions:

* Adding a smaller constant is NUW if the original AddExpr was NUW.

* Adding a constant with the same sign and small magnitude is NSW, if the
  original AddExpr was NSW.

This can improve results after using `SimplifyICmpOperands`, which may
subtract one in order to use stricter predicates, as is the case for
`isKnownPredicate`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D104319
2021-06-22 11:27:51 +01:00
Nikita Popov 04395fd6cb [ConstantFolding] Separate conditions in GEP evaluation (NFC)
Handle to gep p, 0-v case separately, and not as part of the loop
that ensures all indices are constant integers. Those two things
are not really related.
2021-06-22 11:14:47 +02:00
Eli Friedman 8f3d16905d [ScalarEvolution] Ensure backedge-taken counts are not pointers.
A backedge-taken count doesn't refer to memory; returning a pointer type
is nonsense. So make sure we always return an integer.

The obvious way to do this would be to just convert the operands of the
icmp to integers, but that doesn't quite work out at the moment:
isLoopEntryGuardedByCond currently gets confused by ptrtoint operations.
So we perform the ptrtoint conversion late for lt/gt operations.

The test changes are mostly innocuous. The most interesting changes are
more complex SCEV expressions of the form "(-1 * (ptrtoint i8* %ptr to
i64)) + %ptr)". This is expected: we can't fold this to zero because we
need to preserve the pointer base.

The call to isLoopEntryGuardedByCond in howFarToZero is less precise
because of ptrtoint operations; this shows up in the function
pr46786_c26_char in ptrtoint.ll. Fixing it here would require more
complex refactoring.  It should eventually be fixed by future
improvements to isImpliedCond.

See https://bugs.llvm.org/show_bug.cgi?id=46786 for context.

Differential Revision: https://reviews.llvm.org/D103656
2021-06-21 16:24:16 -07:00
Jacob Hegna f86d1f99b3 Remove ML inlining model artifacts.
They are not conducive to being stored in git. Instead, we autogenerate
mock model artifacts for use in tests. Production models can be
specified with the cmake flag LLVM_INLINER_MODEL_PATH.

LLVM_INLINER_MODEL_PATH has two sentinel values:
 - download, which will download the most recent compatible model.
 - autogenerate, which will autogenerate a "fake" model for testing the
 model uptake infrastructure.

Differential Revision: https://reviews.llvm.org/D104251
2021-06-21 17:38:09 +00:00
Eli Friedman 62ed024c74 [NFC][ScalarEvolution] Clean up ExitLimit constructors.
Make all the constructors forward to one constructor.  Remove redundant
assertions.
2021-06-20 17:40:30 -07:00
Juneyoung Lee 09e8c0d5aa [InstSimplify] icmp poison, X -> poison
This adds a simple transformation from icmp with poison constant to poison.
Comparing poison with something else is poison, so this is okay.

https://alive2.llvm.org/ce/z/e8iReb
https://alive2.llvm.org/ce/z/q4MurY
2021-06-20 15:39:07 +09:00
Tomas Matheson 1bcfa84ae9 Allow building for release with EXPENSIVE_CHECKS
D97225 moved LazyCallGraph verify() calls behind EXPENSIVE_CHECKS,
but verity() is defined for debug builds only so this had the unintended
effect of breaking release builds with EXPENSIVE_CHECKS.

Fix by enabling verify() for both debug and EXPENSIVE_CHECKS.

Differential Revision: https://reviews.llvm.org/D104514
2021-06-19 17:02:11 +01:00
Eli Friedman 8a567e5f22 [ScalarEvolution] Fix pointer/int type handling converting select/phi to min/max.
The old version of this code would blindly perform arithmetic without
paying attention to whether the types involved were pointers or
integers.  This could lead to weird expressions like negating a pointer.

Explicitly handle simple cases involving pointers, like "x < y ? x : y".
In all other cases, coerce the operands of the comparison to integer
types.  This avoids the weird cases, while handling most of the
interesting cases.

Differential Revision: https://reviews.llvm.org/D103660
2021-06-17 14:05:12 -07:00
Bjorn Pettersson 4c7f820b2b Update @llvm.powi to handle different int sizes for the exponent
This can be seen as a follow up to commit 0ee439b705,
that changed the second argument of __powidf2, __powisf2 and
__powitf2 in compiler-rt from si_int to int. That was to align with
how those runtimes are defined in libgcc.
One thing that seem to have been missing in that patch was to make
sure that the rest of LLVM also handle that the argument now depends
on the size of int (not using the si_int machine mode for 32-bit).
When using __builtin_powi for a target with 16-bit int clang crashed.
And when emitting libcalls to those rtlib functions, typically when
lowering @llvm.powi), the backend would always prepare the exponent
argument as an i32 which caused miscompiles when the rtlib was
compiled with 16-bit int.

The solution used here is to use an overloaded type for the second
argument in @llvm.powi. This way clang can use the "correct" type
when lowering __builtin_powi, and then later when emitting the libcall
it is assumed that the type used in @llvm.powi matches the rtlib
function.

One thing that needed some extra attention was that when vectorizing
calls several passes did not support that several arguments could
be overloaded in the intrinsics. This patch allows overload of a
scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with
an entry for powi.

Differential Revision: https://reviews.llvm.org/D99439
2021-06-17 09:38:28 +02:00
Joachim Meyer 053dbb939d Use `-cfg-func-name` value as filter for `-view-cfg`, etc.
Currently the value is only used when calling `F->viewCFG()` which is missing out on its potential and usefulness.
So I added the check to the printer passes as well.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D102011
2021-06-16 23:54:51 +02:00
Eli Friedman 27963ccf07 [NFC][ScalarEvolution] Refactor createNodeForSelectOrPHI
In preparation for D103660.
2021-06-16 12:32:32 -07:00
Sanjay Patel ce95200b79 [InstSimplify] propagate poison through FP ops
We already have this fold:
  fadd float poison, 1.0 --> poison
...via ConstantFolding, so this makes the behavior consistent
if the other operand(s) are non-constant.

The fold for undef was added before poison existed as a
value/type in IR.

This came up in D102673 / D103169
because we're trying to sort out the more complicated handling
for constrained math ops.
We should have the handling for the regular instructions done
first, so we can build on that (or diverge as needed).

Differential Revision: https://reviews.llvm.org/D104383
2021-06-16 11:31:58 -04:00
Roman Lebedev a3113df219
[SCEV] PtrToInt on non-integral pointers is allowed
As per (committed without review) @reames's rGac81cb7e6dde9b0890ee1780eae94ab96743569b change,
we are now allowed to produce `ptrtoint` for non-integral pointers.
This will unblock further unbreaking of SCEV regarding int-vs-pointer type confusion.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D104322
2021-06-16 10:24:25 +03:00
Arthur Eubanks 9aa1428174 [InstSimplify] Treat invariant group insts as bitcasts for load operands
We can look through invariant group intrinsics for the purposes of
simplifying the result of a load.

Since intrinsics can't be constants, but we also don't want to
completely rewrite load constant folding, we convert the load operand to
a constant. For GEPs and bitcasts we just treat them as constants. For
invariant group intrinsics, we treat them as a bitcast.

Relanding with a check for self-referential values.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D101103
2021-06-15 12:59:43 -07:00
spupyrev 0a0800c4d1 A post-processing for BFI inference
The current implementation for computing relative block frequencies does
not handle correctly control-flow graphs containing irreducible loops. This
results in suboptimally generated binaries, whose perf can be up to 5%
worse than optimal.

To resolve the problem, we apply a post-processing step, which iteratively
updates block frequencies based on the frequencies of their predesessors.
This corresponds to finding the stationary point of the Markov chain by
an iterative method aka "PageRank computation". The algorithm takes at
most O(|E| * IterativeBFIMaxIterations) steps but typically converges faster.

It is turned on by passing option `use-iterative-bfi-inference`
and applied only for functions containing profile data and irreducible loops.

Tested on SPEC06/17, where it is helping to get correct profile counts for one of
the binaries (403.gcc). In prod binaries, we've seen a speedup of up to 2%-5%
for binaries containing functions with hot irreducible loops.

Reviewed By: hoy, wenlei, davidxl

Differential Revision: https://reviews.llvm.org/D103289
2021-06-11 21:46:04 -07:00
Andrew Litteken f6dea2e732 [IRSim] Strip out the findSimilarity call from the constructor
Both doInitialize and runOnModule were running the entire analysis
due to the actual work being done in the constructor. Strip it out here
and only get the similarity during runOnModule.

Author: lanza
Reviewers: AndrewLitteken, paquette, plofti

Differential Revision: https://reviews.llvm.org/D92524
2021-06-11 18:41:28 -05:00
Andrew Litteken 64720f57be [IRSim] Don't copy the Mapper for createCandidatesFromSuffixTree
Every invocation this was copying the Mapper for no reason. Take a const
ref instead.

Author: lanza
Reviewers: AndrewLitteken, plofti, paquette,

Differential Review: https://reviews.llvm.org/D92532
2021-06-11 16:36:23 -05:00
Simon Pilgrim 5e6bfb661e [Analysis] Pass RecurrenceDescriptor as const reference. NFCI.
We were passing the RecurrenceDescriptor by value to most of the reduction analysis methods, despite it being rather bulky with TrackingVH members (that can be costly to copy). In all these cases we're only using the RecurrenceDescriptor for rather basic purposes (access to types/kinds etc.).

Differential Revision: https://reviews.llvm.org/D104029
2021-06-11 10:24:14 +01:00
Philip Reames 7629b2a09c [LI] Add a cover function for checking if a loop is mustprogress [nfc]
Essentially, the cover function simply combines the loop level check and the function level scope into one call.  This simplifies several callers and is (subjectively) less error prone.
2021-06-10 13:37:32 -07:00
Philip Reames aaaeb4b160 [SCEV] Use mustprogress flag on loops (in addition to function attribute)
This addresses a performance regression reported against 3c6e4191.  That change (correctly) limited a transform based on assumed finiteness to mustprogress loops, but the previous change (38540d7) which introduced the mustprogress check utility only handled function attributes, not the loop metadata form.

It turns out that clang uses the function attribute form for C++, and the loop metadata form for C.  As a result, 3c6e4191 ended up being a large regression in practice for C code as loops weren't being considered mustprogress despite the language semantics.
2021-06-10 13:20:28 -07:00
Philip Reames b6ee5f2b1d Move code for checking loop metadata into Analysis [nfc]
I need the mustprogress loop metadata in ScalarEvolution and it makes sense to keep all the accessors for quering loop metadate together.
2021-06-10 13:01:22 -07:00
Serge Pavlov 8ff36aab69 [ConstantFolding] Enable folding of min/max/copysign for all floats
Previously such folding was enabled for half, float and double values
only. With this change it is allowed for other floating point values
also.

Differential Revision: https://reviews.llvm.org/D103956
2021-06-10 11:57:51 +07:00
Philip Reames b65f30d6fb [SCEV] Minor code motion to simplify a later patch [nfc] 2021-06-09 14:17:06 -07:00
Arthur Eubanks 222cce3828 Revert "[InstSimplify] Treat invariant group insts as bitcasts for load operands"
This reverts commit 26044c6a54.

Breaks on invalid IR (see D101103).
2021-06-09 11:46:10 -07:00
Florian Hahn b76f1f1202
[SCEV] Keep common NUW flags when inlining Add operands.
Currently, NoWrapFlags are dropped if we inline operands of SCEVAddExpr
operands. As a consequence, we always drop flags when building
expressions like `getAddExpr(A, getAddExpr(B, C, NUW), NUW)`.

We should be able to retain NUW flags common among all inlined
SCEVAddExpr and the original flags.

Reviewed By: nikic, mkazantsev

Differential Revision: https://reviews.llvm.org/D103877
2021-06-09 17:13:21 +01:00
Artur Pilipenko 9197bac297 Add an option to hide "cold" blocks from CFG graph
Introduce a new cl::opt to hide "cold" blocks from CFG DOT graphs.
Use BFI to get block relative frequency. Hide the block if the
frequency is below the threshold set by the command line option value.

Reviewed By: davidxl, hoy
Differential Revision: https://reviews.llvm.org/D103640
2021-06-08 11:29:27 -07:00
Caroline Concatto 6fd1604d14 [InstCombine] Add instcombine fold for extractelement + splat for scalable vectors
This patch allows that scalable vector can also use the fold that already
exists for fixed vector, only when the lane index is lower than the minimum
number of elements of the vector.

Differential Revision: https://reviews.llvm.org/D102404
2021-06-08 10:43:38 +01:00
Philip Reames 3c6e419198 [SCEV] Properly guard reasoning about infinite loops being UB on mustprogress
Noticed via code inspection. We changed the semantics of the IR when we added mustprogress, and we appear to have not updated this location.

Differential Revision: https://reviews.llvm.org/D103834
2021-06-07 14:47:36 -07:00
Daniil Suchkov d32cc150fe [BasicAA] Handle PHIs without incoming values gracefully
Fix a bug introduced by f6f6f6375d.
Now for empty PHIs, instead of crashing on assert(hasVal()) in
Optional's internals, we'll return NoAlias, as we did before that patch.

Differential Revision: https://reviews.llvm.org/D103831
2021-06-07 21:39:01 +00:00
Philip Reames 38540d71c7 [SCEV] Compute exit counts for unsigned IVs using mustprogress semantics
The motivation here is simple loops with unsigned induction variables w/non-one steps. A toy example would be:
for (unsigned i = 0; i < N; i += 2) { body; }

Given C/C++ semantics, we do not get the nuw flag on the induction variable. Given that lack, we currently can't compute a bound for this loop. We can do better for many cases, depending on the contents of "body".

The basic intuition behind this patch is as follows:
* A step which evenly divides the iteration space must wrap through the same numbers repeatedly. And thus, we can ignore potential cornercases where we exit after the n-th wrap through uint32_max.
* Per C++ rules, infinite loops without side effects are UB. We already have code in SCEV which relies on this.  In LLVM, this is tied to the mustprogress attribute.

Together, these let us conclude that the trip count of this loop must come before unsigned overflow unless the body would form a well defined infinite loop.

A couple notes for those reading along:
* I reused the loop properties code which is overly conservative for this case. I may follow up in another patch to generalize it for the actual UB rules.
* We could cache the n(s/u)w facts. I left that out because doing a pre-patch which cached existing inference showed a lot of diffs I had trouble fully explaining. I plan to get back to this, but I don't want it on the critical path.

Differential Revision: https://reviews.llvm.org/D103118
2021-06-07 11:24:00 -07:00
Simon Pilgrim 76a1be05fa AssumeBundleQueries.cpp - don't dereference a dyn_cast<> result. NFCI.
Use cast<> instead which will assert that the cast is correct and not just return null - the match() should have already failed if the cast isn't valid anyhow.

Fixes static analysis warning.
2021-06-06 15:25:03 +01:00
Roman Lebedev e350494fb0
[NFC] Promote willNotOverflow() / getStrengthenedNoWrapFlagsFromBinOp() from IndVars into SCEV proper
We might want to use it when creating SCEV proper in createSCEV(),
now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`,
which might have caused us to loose some optimization potential.
2021-06-05 12:17:51 +03:00
Fangrui Song 06e7de795b Fix some -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build 2021-06-04 23:34:43 -07:00
Artur Pilipenko a06e63fa52 NFC. Refactor DOTGraphTraits::isNodeHidden
Restructure handling of cfg-hide-unreachable-paths and
cfg-hide-deoptimize-paths options so as to make it easier
to introduce new types of hidden blocks.
2021-06-03 11:27:06 -07:00
Qunyan Mangus cbde248736 Add getDemandedBits for uses.
Add getDemandedBits method for uses so we can query demanded bits for each use.  This can help getting better use information. For example, for the code below
define i32 @test_use(i32 %a) {
  %1 = and i32 %a, -256
  %2 = or i32 %1, 1
  %3 = trunc i32 %2 to i8 (didn't optimize this to 1 for illustration purpose)
  ... some use of %3
  ret %2
}
if we look at the demanded bit of %2 (which is all 32 bits because of the return), we would conclude that %a is used regardless of how its return is used. However, if we look at each use separately, we will see that the demanded bit of %2 in trunc only uses the lower 8 bits of %a which is redefined, therefore %a's usage depends on how the function return is used.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D97074
2021-06-02 10:07:40 -04:00
Daniil Fukalov 0195e594fe [TTI] NFC: Change getIntImmCodeSizeCost to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D102915
2021-06-02 16:04:11 +03:00
Bjorn Pettersson 9c54ee4378 [SimplifyLibCalls] Take size of int into consideration when emitting ldexp/ldexpf
When rewriting
  powf(2.0, itofp(x)) -> ldexpf(1.0, x)
  exp2(sitofp(x)) -> ldexp(1.0, sext(x))
  exp2(uitofp(x)) -> ldexp(1.0, zext(x))

the wrong type was used for the second argument in the ldexp/ldexpf
libc call, for target architectures with 16 bit "int" type.
The transform incorrectly used a bitcasted function pointer with
a 32-bit argument when emitting the ldexp/ldexpf call for such
targets.

The fault is solved by using the correct function prototype
in the call, by asking TargetLibraryInfo about the size of "int".
TargetLibraryInfo by default derives the size of the int type by
assuming that it is 16 bits for 16-bit architectures, and
32 bits otherwise. If this isn't true for a target it should be
possible to override that default in the TargetLibraryInfo
initializer.

Differential Revision: https://reviews.llvm.org/D99438
2021-06-02 11:40:34 +02:00
Arthur Eubanks 8961293851 [OpaquePtr] Create API to make a copy of a PointerType with some address space
Some existing places use getPointerElementType() to create a copy of a
pointer type with some new address space.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103429
2021-06-01 16:52:32 -07:00
Arthur Eubanks 26044c6a54 [InstSimplify] Treat invariant group insts as bitcasts for load operands
We can look through invariant group intrinsics for the purposes of
simplifying the result of a load.

Since intrinsics can't be constants, but we also don't want to
completely rewrite load constant folding, we convert the load operand to
a constant. For GEPs and bitcasts we just treat them as constants. For
invariant group intrinsics, we treat them as a bitcast.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D101103
2021-06-01 16:33:06 -07:00
Eli Friedman fd229caa01 [polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs.
When we're remapping an AddRec, the AddRec constructed by a partial
rewrite might not make sense.  This triggers an assertion complaining
it's not loop-invariant.

Instead of constructing the partially rewritten AddRec, just skip
straight to calling evaluateAtIteration.

Testcase was automatically reduced using llvm-reduce, so it's a little
messy, but hopefully makes sense.

Differential Revision: https://reviews.llvm.org/D102959
2021-06-01 09:51:05 -07:00
Florian Hahn aa00b1d763
[LV] Try to sink users recursively for first-order recurrences.
Update isFirstOrderRecurrence to  explore all uses of a recurrence phi
and check if we can sink them. If there are multiple users to sink, they
are all mapped to the previous instruction.

Fixes PR44286 (and another PR or two).

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D84951
2021-05-31 19:55:33 +01:00
Daniil Fukalov e853d3b274 [NFC] MemoryDependenceAnalysis cleanup.
1. Removed redundant includes,
2. Removed never defined and used `releaseMemory()`.
3. Fixed member functions names first letter case.
4. Renamed duplicate (in nested struct `NonLocalPointerInfo`) name
   `NonLocalDeps` to `NonLocalDepsMap`.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D102358
2021-05-31 18:07:55 +03:00
Roman Lebedev f7c95c3322
[NFC] ScalarEvolution: apply SSO to the ExprValueMap value
ExprValueMap is a map from SCEV * to a set-vector of (Value *, ConstantInt *) pair,
and while the map itself will likely be big-ish (have many keys),
it is a reasonable assumption that each key will refer to a small-ish
number of pairs.

In particular looking at n=512 case from
https://bugs.llvm.org/show_bug.cgi?id=50384,
the small-size of 4 appears to be the sweet spot,
it results in the least allocations while minimizing memory footprint.
```
$ for i in $(ls heaptrack.opt.*.gz); do echo $i; heaptrack_print $i | tail -n 6; echo ""; done
heaptrack.opt.0-orig.gz
total runtime: 14.32s.
calls to allocation functions: 8222442 (574192/s)
temporary memory allocations: 2419000 (168924/s)
peak heap memory consumption: 190.98MB
peak RSS (including heaptrack overhead): 239.65MB
total memory leaked: 67.58KB

heaptrack.opt.1-n1.gz
total runtime: 13.72s.
calls to allocation functions: 7184188 (523705/s)
temporary memory allocations: 2419017 (176338/s)
peak heap memory consumption: 191.38MB
peak RSS (including heaptrack overhead): 239.64MB
total memory leaked: 67.58KB

heaptrack.opt.2-n2.gz
total runtime: 12.24s.
calls to allocation functions: 6146827 (502355/s)
temporary memory allocations: 2418997 (197695/s)
peak heap memory consumption: 163.31MB
peak RSS (including heaptrack overhead): 211.01MB
total memory leaked: 67.58KB

heaptrack.opt.3-n4.gz
total runtime: 12.28s.
calls to allocation functions: 6068532 (494260/s)
temporary memory allocations: 2418985 (197017/s)
peak heap memory consumption: 155.43MB
peak RSS (including heaptrack overhead): 201.77MB
total memory leaked: 67.58KB

heaptrack.opt.4-n8.gz
total runtime: 12.06s.
calls to allocation functions: 6068042 (503321/s)
temporary memory allocations: 2418992 (200646/s)
peak heap memory consumption: 166.03MB
peak RSS (including heaptrack overhead): 213.55MB
total memory leaked: 67.58KB

heaptrack.opt.5-n16.gz
total runtime: 12.14s.
calls to allocation functions: 6067993 (499958/s)
temporary memory allocations: 2418999 (199307/s)
peak heap memory consumption: 187.24MB
peak RSS (including heaptrack overhead): 233.69MB
total memory leaked: 67.58KB
```

While that test may be an edge worst-case scenario,
https://llvm-compile-time-tracker.com/compare.php?from=dee85d47d9f15fc268f7b18f279dac2774836615&to=98a57e31b1947d5bcdf4a5605ac2ab32b4bd5f63&stat=instructions
agrees that this also results in improvements in the usual situations.
2021-05-31 15:34:03 +03:00
Sanjay Patel 7bb8bfa062 [InstCombine] fix miscompile from vector select substitution
This is similar to the fix in c590a9880d ( PR49832 ), but
we missed handling the pattern for select of bools (no compare
inst).

We can't substitute a vector value because the equality condition
replacement that we are attempting requires that the condition
is true/false for the entire value. Vector select can be partly
true/false.

I added an assert for vector types, so we shouldn't hit this again.
Fixed formatting while auditing the callers.

https://llvm.org/PR50500
2021-05-30 07:11:58 -04:00
Mindong Chen 71acce68da [NFCI] Move DEBUG_TYPE definition below #includes
When you try to define a new DEBUG_TYPE in a header file, DEBUG_TYPE
definition defined around the #includes in files include it could
result in redefinition warnings even compile errors.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D102594
2021-05-30 17:31:01 +08:00
Florian Hahn ec1f6f7e3f
Revert "[LAA] Support pointer phis in loop by analyzing each incoming pointer."
This reverts commit 1ed7f8ede5.

This change can cause loop-distribute to crash in some cases. Revert
until I have more time to wrap up a fix.

See  PR50296, PR5028 and D102266.
2021-05-28 10:33:52 +01:00
Yang Fan f2264ebb08
[ConstantFolding] Fix -Wunused-variable warning (NFC)
GCC warning:
```
/llvm-project/llvm/lib/Analysis/ConstantFolding.cpp: In function ‘llvm::Constant* llvm::ConstantFoldLoadFromConstPtr(llvm::Constant*, llvm::Type*, const llvm::DataLayout&)’:
/llvm-project/llvm/lib/Analysis/ConstantFolding.cpp:713:19: warning: unused variable ‘SimplifiedGEP’ [-Wunused-variable]
  713 |         if (auto *SimplifiedGEP = dyn_cast<GEPOperator>(Simplified)) {
      |                   ^~~~~~~~~~~~~
```
2021-05-28 16:17:12 +08:00
Arthur Eubanks 8086f9d87e [ConstFold] Simplify a load's GEP operand through local aliases
MSVC-style RTTI produces loads through a GEP of a local alias which
itself is a GEP. Currently we aren't able to devirtualize any virtual
calls when MSVC RTTI is enabled.

This patch attempts to simplify a load's GEP operand by calling
SymbolicallyEvaluateGEP() with an option to look through local aliases.

Differential Revision: https://reviews.llvm.org/D101100
2021-05-27 16:04:19 -07:00
Philip Reames ff08c3468f [SCEV] Compute trip multiple for multiple exit loops
This patch implements getSmallConstantTripMultiple(L) correctly for multiple exit loops. The previous implementation was both imprecise, and violated the specified behavior of the method. This was fine in practice, because it turns out the function was both dead in real code, and not tested for the multiple exit case.

Differential Revision: https://reviews.llvm.org/D103189
2021-05-26 11:52:25 -07:00
Philip Reames 9306bb638f [SCEV] Generalize getSmallConstantTripCount(L) for multiple exit loops
This came up in review for another patch, see https://reviews.llvm.org/D102982#2782407 for full context.

I've reviewed the callers to make sure they can handle multiple exit loops w/non-zero returns.  There's two cases in target cost models where results might change (Hexagon and PowerPC), but the results looked legal and reasonable.  If a target maintainer wishes to back out the effect of the costing change, they should explicitly check for multiple exit loops and handle them as desired.

Differential Revision: https://reviews.llvm.org/D103182
2021-05-26 11:18:25 -07:00
Philip Reames 921d3f7af0 [SCEV] Add a utility for converting from "exit count" to "trip count"
(Mostly as a logical place to put a comment since this is a reoccuring confusion.)
2021-05-26 10:41:49 -07:00
Philip Reames fb14577d0c [SCEV] Extract out a helper for computing trip multiples 2021-05-26 10:15:03 -07:00
Philip Reames 9cc2181ec3 [unroll] Use value domain for symbolic execution based cost model
The current full unroll cost model does a symbolic evaluation of the loop up to a fixed limit. That symbolic evaluation currently simplifies to constants, but we can generalize to arbitrary Values using the InstructionSimplify infrastructure at very low cost.

By itself, this enables some simplifications, but it's mainly useful when combined with the branch simplification over in D102928.

Differential Revision: https://reviews.llvm.org/D102934
2021-05-26 08:41:25 -07:00
Vitaly Buka f44f2e0afc [NFC] Fix 'unused' warning 2021-05-25 12:23:57 -07:00
Nikita Popov 6300c37a46 [SCEV] Cache operands used in BEInfo (NFC)
When memoized values for a SCEV expressions are dropped, we also
drop all BECounts that make use of the SCEV expression. This is done
by iterating over all the ExitNotTaken counts and (recursively)
checking whether they use the SCEV expression. If there are many
exits, this will take a lot of time.

This patch improves the situation by pre-computing a set of all
used operands, so that we can determine whether a certain BEInfo
needs to be invalidated using a simple set lookup. Will still need
to loop over all BEInfos though.

This makes for a mild improvement on non-degenerate cases:
https://llvm-compile-time-tracker.com/compare.php?from=b661a55a253f4a1cf5a0fbcb86e5ba7b9fb1387b&to=be1393f450e594c53f0ad7e62339a6bc831b16f6&stat=instructions

For the degenerate case from https://bugs.llvm.org/show_bug.cgi?id=50384,
for n=128 I'm seeing run time drop from 1.6s to 1.1s.

Differential Revision: https://reviews.llvm.org/D102796
2021-05-25 21:03:33 +02:00
Sanjay Patel ca7eaa0a54 [InstSimplify] allow undef element match in vector select condition value
The semantics of select with undefined/poison condition
are not explicitly stated in the LangRef, but this matches
comments in the code and Alive2 appears to concur:
https://alive2.llvm.org/ce/z/KXytmd

We can find this pattern after demanded elements transforms.

As noted in D101191, fuzzers are finding infinite loops because
we may not account for this pattern in other passes.
2021-05-25 14:25:34 -04:00
Philip Reames aabca2d1da [SCEV] Cleanup doesIVOverflowOnX checks [NFC]
Stylistic changes only.
1) Don't pass a parameter just to do an early exit.
2) Use a name which matches actual behavior.
2021-05-25 10:12:24 -07:00
Philip Reames a47b2d4567 [SCEV] Remove unused parameter from computeBECount [NFC]
All callers pass "false" for the Equality parameter.  Kill the dead code, and update the function block comment.
2021-05-25 09:58:56 -07:00
David Goldblatt 8607a02357 [InstSimplify] Transform X * Y % Y --> 0
simplifyDiv already handles the case X * Y / Y --> X (barring overflow).
This adds the equivalent handling to simplifyRem.

Correctness:
https://alive2.llvm.org/ce/z/J2cUbS
https://alive2.llvm.org/ce/z/us9NUM
https://alive2.llvm.org/ce/z/AvaDGJ
https://alive2.llvm.org/ce/z/kq9ige

Extending the situations in which we apply this transform would not be
correct:
https://alive2.llvm.org/ce/z/Lf9V63
https://alive2.llvm.org/ce/z/6RPQK3
https://alive2.llvm.org/ce/z/p9UdxC
https://alive2.llvm.org/ce/z/A2zlhE
https://alive2.llvm.org/ce/z/vHTtLw
https://alive2.llvm.org/ce/z/lvpH42

Differential Revision: https://reviews.llvm.org/D102864
2021-05-25 10:16:04 -04:00
Sanjay Patel a0e71f1832 [ConstProp] propagate poison from vector reduction element(s) to result
This follows from the underlying logic for binops and min/max.
Although it does not appear that we handle this for min/max
intrinsics currently.
https://alive2.llvm.org/ce/z/Kq9Xnh
2021-05-24 10:34:40 -04:00
Martin Storsjö c5638a71d8 [MinGW] Mark a number of library functions unavailable for mingw targets
These functions were marked unavailable for MSVC targets before,
within an "T.isOSWindows() && !T.isOSCygMing()" block, but these ones
are unavailable on MinGW targets too.

This avoids generating calls to stpcpy for MinGW targets, which has
been happening since 6dbf0cfcf7 (in
some cases).

This fixes https://github.com/mstorsjo/llvm-mingw/issues/201.

Differential Revision: https://reviews.llvm.org/D102946
2021-05-22 23:40:19 +03:00
Serge Pavlov c9c05a91c4 [ConstantFolding] Use APFloat for constant folding. NFC
Replace use of host floating types with operations on APFloat when it is
possible. Use of APFloat makes analysis more convenient and facilitates
constant folding in the case of non-default FP environment.

Differential Revision: https://reviews.llvm.org/D102672
2021-05-22 13:00:20 +07:00
Arthur Eubanks f7788e1bff Revert "[NewPM] Only invalidate modified functions' analyses in CGSCC passes"
This reverts commit d14d84af2f.

Causes unacceptable memory regressions.
2021-05-21 16:38:03 -07:00
Arthur Eubanks a52530dd6a Revert "[NPM] Do not run function simplification pipeline unnecessarily"
This reverts commit 97ab068034.

Depends on D100917, which is to be reverted.
2021-05-21 16:38:02 -07:00
Philip Reames cc5f6ae4b4 Move a definition into cpp from header in advance of other changes [nfc] 2021-05-21 09:18:04 -07:00
Daniil Fukalov e1cb98be2d [TTI] NFC: Change getCostOfKeepingLiveOverCall to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D102831
2021-05-21 15:18:12 +03:00
Daniil Fukalov e8e88c3353 [TTI] NFC: Change getRegUsageForType to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D102541
2021-05-21 15:17:23 +03:00
Joe Ellis 5a476987f7 [InstSimplify] Properly constrain {insert,extract}_subvector intrinsic fold
The previous rule:

   (insert_vector _, (extract_vector X, 0), 0) -> X

is not quite correct. The correct fold should be:

   (insert_vector Y, (extract_vector X, 0), 0) -> X
   where: Y is X, or Y is undef

This commit updates the pattern.

Reviewed By: peterwaller-arm, paulwalker-arm

Differential Revision: https://reviews.llvm.org/D102699
2021-05-21 10:05:03 +00:00
Serge Pavlov c162f086ba [APFloat] convertToDouble/Float can work on shorter types
Previously APFloat::convertToDouble may be called only for APFloats that
were built using double semantics. Other semantics like single precision
were not allowed although corresponding numbers could be converted to
double without loss of precision. The similar restriction applied to
APFloat::convertToFloat.

With this change any APFloat that can be precisely represented by double
can be handled with convertToDouble. Behavior of convertToFloat was
updated similarly. It make the conversion operations more convenient and
adds support for formats like half and bfloat.

Differential Revision: https://reviews.llvm.org/D102671
2021-05-21 11:02:51 +07:00
Nikita Popov b661a55a25 [ScalarEvolution] Remove unused ExitLimit::hasOperand() method (NFC)
We only use BackedgeTakenInfo::hasOperand().
2021-05-19 18:42:14 +02:00
Arthur Eubanks 6b9524a05b [NewPM] Don't mark AA analyses as preserved
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved, they won't be invalidated unless their analyses are.

SCEVAAResults was the one exception to this, it was treated like a
typical analysis result. Make it like the others and don't invalidate
unless SCEV is invalidated.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D102032
2021-05-18 13:49:03 -07:00
Arthur Eubanks cc64ece77d [NFC][OpaquePtr] Avoid using PointerType::getElementType() in VectorUtils.cpp
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D102533
2021-05-17 18:35:44 -07:00
Nikita Popov 7243120198 [CaptureTracking] Simplify reachability check (NFCI)
This code was re-implementing the same-BB case of
isPotentiallyReachable(). Historically, this was done because
CaptureTracking used additional caching for local dominance
queries. Now that it is no longer needed, the code is effectively
the same as isPotentiallyReachable().

The only difference are extra checks for invoke/phis. These are
misleading checks related to dominance in the value availability
sense that are not relevant for control reachability. The invoke
check was correct but redundant in that invokes are always
terminators, so `I` could never come before the invoke. The phi
check is a matter of interpretation (should an earlier phi node be
considered reachable from a later phi node in the same block?)
but ultimately doesn't matter because phis don't capture anyway.
2021-05-16 16:04:10 +02:00
Nikita Popov 656296b1c2 Reapply [CaptureTracking] Do not check domination
Reapply after adjusting the synchronized.m test case, where the
TODO is now resolved. The pointer is only captured on the exception
handling path.

-----

For the CapturesBefore tracker, it is sufficient to check that
I can not reach BeforeHere. This does not necessarily require
that BeforeHere dominates I, it can also occur if the capture
happens on an entirely disjoint path.

This change was previously accepted in D90688, but had to be
reverted due to large compile-time impact in some cases: It
increases the number of reachability queries that are performed.

After recent changes, the compile-time impact is largely mitigated,
so I'm reapplying this patch. The remaining compile-time impact
is largely proportional to changes in code-size.
2021-05-16 15:46:31 +02:00
Nikita Popov 541c2845de Revert "[CaptureTracking] Do not check domination"
This reverts commit 6b8b43e7af.

This causes clang test to fail (CodeGenObjC/synchronized.m).
Revert until I can figure out whether that's an expected change.
2021-05-16 11:04:45 +02:00
Nikita Popov 6b8b43e7af [CaptureTracking] Do not check domination
For the CapturesBefore tracker, it is sufficient to check that
I can not reach BeforeHere. This does not necessarily require
that BeforeHere dominates I, it can also occur if the capture
happens on an entirely disjoint path.

This change was previously accepted in D90688, but had to be
reverted due to large compile-time impact in some cases: It
increases the number of reachability queries that are performed.

After recent changes, the compile-time impact is largely mitigated,
so I'm reapplying this patch. The remaining compile-time impact
is largely proportional to changes in code-size.
2021-05-16 10:49:36 +02:00
Nikita Popov 6e9363c942 [CaptureTracking] Only check reachability for capture candidates
Reachability queries are very expensive, and currently performed
for each instruction we look at, even though most of them will
not lead to a capture and are thus ultimately irrelevant. It is
more efficient to walk a few unnecessary instructions than to
perform unnecessary reachability queries.

Theoretically, this may produce worse results, because the additional
instructions considered may cause us to hit the use count limit
earlier. In practice, this does not appear to be a problem, e.g.
on test-suite O3 we report only one more captured-before with this
change, with no resulting codegen differences.

This makes PointerMayBeCapturedBefore() significantly cheaper in
practice, hopefully allowing it to be used in more places.
2021-05-15 22:57:56 +02:00
Nikita Popov f9e9b0cdb4 [CFG] Move reachable from entry checks into basic block variant
These checks are not specific to the instruction based variant of
isPotentiallyReachable(), they are equally valid for the basic
block based variant. Move them there, to make sure that switching
between the instruction and basic block variants cannot introduce
regressions.
2021-05-15 15:42:02 +02:00
Nikita Popov fb9ed1979a [IR] Add BasicBlock::isEntryBlock() (NFC)
This is a recurring and somewhat awkward pattern. Add a helper
method for it.
2021-05-15 12:41:58 +02:00
Nikita Popov 6418bab6f8 [CFG] Use comesBefore() (NFC)
Use comesBefore() instead of performing an instruction walk. In
line with the previous implementation, instructions are considered
to reach themselves.
2021-05-15 12:14:30 +02:00
Nikita Popov f765e54db2 [CaptureTracking] Clean up same instruction check (NFC)
Check the BeforeHere == I case once in shouldExplore, instead of
handling it in four different places.
2021-05-15 11:58:55 +02:00
Nick Desaulniers 8c72749bd9 [LowerConstantIntrinsics] reuse isManifestLogic from ConstantFolding
GlobalVariables are Constants, yet should not unconditionally be
considered true for __builtin_constant_p.

Via the LangRef
https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic:

    This intrinsic generates no code. If its argument is known to be a
    manifest compile-time constant value, then the intrinsic will be
    converted to a constant true value. Otherwise, it will be converted
    to a constant false value.

    In particular, note that if the argument is a constant expression
    which refers to a global (the address of which _is_ a constant, but
    not manifest during the compile), then the intrinsic evaluates to
    false.

Move isManifestConstant from ConstantFolding to be a method of
Constant so that we can reuse the same logic in
LowerConstantIntrinsics.

pr/41459

Reviewed By: rsmith, george.burgess.iv

Differential Revision: https://reviews.llvm.org/D102367
2021-05-14 15:35:21 -07:00
Nikita Popov c4fb2a1fc2 [MemDep] Use BatchAA in more places (NFCI)
Previously, we already used BatchAA for individual simple pointer
dependency queries. This extends BatchAA usage for the non-local
case, so that only one BatchAA instance is used for all blocks,
instead of one instance per block.

Use of BatchAA is safe as IR cannot be modified during a MemDep
query.
2021-05-14 22:54:40 +02:00
Nikita Popov 5e289cc597 [AA] Support callCapturesBefore() on BatchAA (NFCI)
This is not expected to have any practical compile-time effect,
as the alias() calls inside callCapturesBefore() are rare. This
should still be supported for API completeness, and might be
useful for reachability caching.
2021-05-14 21:48:08 +02:00
Philip Reames 23c93c2555 Discount invariant instructions in full unrolling
This patch updates the cost model for full unrolling to discount the cost of a loop invariant expression on all but one iteration. The reasoning here is that such an expression (as determined by SCEV) will be CSEd or DSEd once the loop is unrolled. Note that SCEVs reasoning will find things which could be invariant, not simply those outside the loop.

Differential Revision: https://reviews.llvm.org/D102506
2021-05-14 11:07:19 -07:00
dfukalov fdae3fc8b3 [GVN] Clobber partially aliased loads.
Use offsets stored in `AliasResult` implemented in D98718.

Updated with fix of issue reported in https://reviews.llvm.org/D95543#2745161

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95543
2021-05-14 11:17:14 +03:00
Nikita Popov 425781bce0 [CaptureTracking] Use isIdentifiedFunctionLocal() (NFC)
These conditions together exactly match isIdentifiedFunctionLocal(),
and this is also what we logically want to check for here.
2021-05-13 23:06:42 +02:00
Nikita Popov dce158c58d [AA] Use isIdentifiedFunctionLocal() (NFC)
This condition is equivalent to isIdentifiedFunctionLocal(),
and this is also what we semantically want to check here.
2021-05-13 23:06:42 +02:00
Joe Ellis 2ed7db0d20 [InstSimplify] Remove redundant {insert,extract}_vector intrinsic chains
This commit removes some redundant {insert,extract}_vector intrinsic
chains by implementing the following patterns as instsimplifies:

   (insert_vector _, (extract_vector X, 0), 0) -> X
   (extract_vector (insert_vector _, X, 0), 0) -> X

Reviewed By: peterwaller-arm

Differential Revision: https://reviews.llvm.org/D101986
2021-05-13 16:09:50 +00:00
Florian Hahn e2759f110b
[SCEV] Apply guards to max with non-unitary steps.
We already apply loop-guards when computing the maximum with unitary
steps. This extends the code to also do so when dealing with non-unitary
steps.

This allows us to infer a tighter maximum in some cases.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D102267
2021-05-13 09:47:29 +01:00
Jordan Rupprecht fec2945998 Revert "[GVN] Clobber partially aliased loads."
This reverts commit 6c57044231.

It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543:

```
$ cat repro.ll
; ModuleID = 'repro.ll'
source_filename = "repro.ll"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%struct.widget = type { i32 }
%struct.baz = type { i32, %struct.snork }
%struct.snork = type { %struct.spam }
%struct.spam = type { i32, i32 }

@global = external local_unnamed_addr global %struct.widget, align 4
@global.1 = external local_unnamed_addr global i8, align 1
@global.2 = external local_unnamed_addr global i32, align 4

define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 {
bb:
  %tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1
  %tmp1 = bitcast %struct.snork* %tmp to i64*
  %tmp2 = load i64, i64* %tmp1, align 4
  %tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1
  %tmp4 = icmp ugt i64 %tmp2, 4294967295
  br label %bb5

bb5:                                              ; preds = %bb14, %bb
  %tmp6 = load i32, i32* %tmp3, align 4
  %tmp7 = icmp ne i32 %tmp6, 0
  %tmp8 = select i1 %tmp7, i1 %tmp4, i1 false
  %tmp9 = zext i1 %tmp8 to i8
  store i8 %tmp9, i8* @global.1, align 1
  %tmp10 = load i32, i32* @global.2, align 4
  switch i32 %tmp10, label %bb11 [
    i32 1, label %bb12
    i32 2, label %bb12
  ]

bb11:                                             ; preds = %bb5
  br label %bb14

bb12:                                             ; preds = %bb5, %bb5
  %tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4
  br label %bb14

bb14:                                             ; preds = %bb12, %bb11
  br label %bb5
}
$ opt -O2 repro.ll -disable-output
opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value *llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst *, unsigned int, llvm::Type *, llvm::Instruction *, const llvm::DataLayout &): Assertion `SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output
...
```
2021-05-11 16:08:53 -07:00
Stanislav Mekhanoshin 22d295f695 [AMDGPU] Constant fold Intrinsic::amdgcn_perm
Differential Revision: https://reviews.llvm.org/D102203
2021-05-10 16:23:11 -07:00
Florian Hahn 93a9a8a8d9
[VecLib] Add support for vector fns from Darwin's libsystem.
This patch adds support for Darwin's libsystem math vector functions to
TLI. Darwin's libsystem provides a range of vector functions for libm
functions.

This initial patch only adds the 2 x double and 4 x float versions,
which are available on both X86 and ARM64. On X86, wider vector versions
are supported as well.

Reviewed By: jroelofs

Differential Revision: https://reviews.llvm.org/D101856
2021-05-10 21:19:58 +01:00
Andy Kaylor 7086025d65 [Dependence Analysis] Enable delinearization of fixed sized arrays
Patch by Artem Radzikhovskyy!

Allow delinearization of fixed sized arrays if we can prove that the GEP indices do not overflow the array dimensions. The checks applied are similar to the ones that are used for delinearization of parametric size arrays. Make sure that the GEP indices are non-negative and that they are smaller than the range of that dimension.

Changes Summary:

- Updated the LIT tests with more exact values, as we are able to delinearize and apply more exact tests
- profitability.ll - now able to delinearize in all cases, no need to use -da-disable-delinearization-checks flag and run the test twice
- loop-interchange-optimization-remarks.ll - in one of the cases we are able to delinearize without using -da-disable-delinearization-checks
- SimpleSIVNoValidityCheckFixedSize.ll - removed unnecessary "-da-disable-delinearization-checks" flag. Now can get the exact answer without it.
- SimpleSIVNoValidityCheckFixedSize.ll and PreliminaryNoValidityCheckFixedSize.ll - made negative tests more explicit, in order to demonstrate the need for "-da-disable-delinearization-checks" flag

Differential Revision: https://reviews.llvm.org/D101486
2021-05-10 10:30:15 -07:00
Nikita Popov d26ca78c18 [SCEV] Handle and/or in applyLoopGuards()
applyLoopGuards() already combines conditions from multiple nested
guards. However, it cannot use multiple conditions on the same guard,
combined using and/or. Add support for this by recursing into either
`and` or `or`, depending on the direction of the branch.

Differential Revision: https://reviews.llvm.org/D101692
2021-05-09 21:34:28 +02:00
Arthur Eubanks 34a8a437bf [NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose
Printing pass manager invocations is fairly verbose and not super
useful.

This allows us to remove DebugLogging from pass managers and PassBuilder
since all logging (aside from analysis managers) goes through
instrumentation now.

This has the downside of never being able to print the top level pass
manager via instrumentation, but that seems like a minor downside.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D101797
2021-05-07 21:51:47 -07:00
Florian Hahn 6c99e63120 [SCEV] By more careful when traversing phis in isImpliedViaMerge.
I think currently isImpliedViaMerge can incorrectly return true for phis
in a loop/cycle, if the found condition involves the previous value of

Consider the case in exit_cond_depends_on_inner_loop.

At some point, we call (modulo simplifications)
isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1).

The existing code tries to prove IncV <= -1 for all incoming values
InvV using the found condition (%call <= -1). At the moment this succeeds,
but only because it does not compare the same runtime value. The found
condition checks the value of the last iteration, but the incoming value
is from the *previous* iteration.

Hence we incorrectly determine that the *previous* value was <= -1,
which may not be true.

I think we need to be more careful when looking at the incoming values
here. In particular, we need to rule out that a found condition refers to
any value that may refer to one of the previous iterations. I'm not sure
there's a reliable way to do so (that also works of irreducible control
flow).

So for now this patch adds an additional requirement that the incoming
value must properly dominate the phi block. This should ensure the
values do not change in a cycle. I am not entirely sure if will catch
all cases and I appreciate a through second look in that regard.

Alternatively we could also unconditionally bail out in this case,
instead of checking the incoming values

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101829
2021-05-07 19:52:29 +01:00
Krzysztof Parzyszek 50cf0a1d1a Allow empty value list in propagateMetadata(Inst, ArrayOf...)
This will allow writing
  propagateMetadata(Inst, collectInterestingValues(...))
without concern about empty lists. In case of an empty list,
Inst is returned without any changes.
2021-05-07 13:20:50 -05:00
Fangrui Song d8aba75a76 Internalize some cl::opt global variables or move them under namespace llvm 2021-05-07 11:15:43 -07:00
Whitney Tsang 1006ac3963 [LoopNest] Consider loop nest with inner loop guard using outer loop
induction variable to be perfect

This patch allow more conditional branches to be considered as loop
guard, and so more loop nests can be considered perfect.

Reviewed By: bmahjour, sidbav

Differential Revision: https://reviews.llvm.org/D94717
2021-05-07 16:04:18 +00:00
Joseph Tremoulet bc302bfbef BasicAA: Recognize inttoptr as isEscapeSource
Pointers escape when converted to integers, so a pointer produced by
converting an integer to a pointer must not be a local non-escaping
object.

Reviewed By: nikic, nlopes, aqjune

Differential Revision: https://reviews.llvm.org/D101541
2021-05-07 07:48:50 -07:00
Peilin Guo 911a541620 [LazyValueInfo] Insert an Overdefined placeholder to prevent infinite recursion
getValueFromCondition() uses a Visited set to record the intermediate value.
However, it uses a postorder way to compute the value first and update the
Visited set later. Thus it will be trapped into an infinite recursion if there
exists IRs that use no dominated by its def as in this example:

  %tmp3 = or i1 undef, %tmp4
  %tmp4 = or i1 undef, %tmp3

To prevent this, we can insert an Overdefined placeholder into the set
before computing the actual value.

Reviewed by: nikic

Differential Revision: https://reviews.llvm.org/D101273
2021-05-07 16:05:50 +08:00
Mircea Trofin 97ab068034 [NPM] Do not run function simplification pipeline unnecessarily
The CGSCC pass manager interplay with the FunctionAnalysisManagerCGSCCProxy is 'special' in the sense that the former will rerun the latter if there are changes to a SCC structure; that being said, some of the functions in the SCC may be unchanged. In that case, the function simplification pipeline will be re-run, which impacts compile time[1].

This patch allows the function simplification pipeline be skipped if it was already run and the function was not modified since.

The behavior is currently disabled by default. This is because, currently, the rerunning of the function simplification pipeline on an unchanged function may still result in changes. The patch simplifies investigating and fixing those cases where repeated function pass runs do actually positively impact code quality, while offering an easy workaround for those impacted negatively by compile time regressions, and not impacting mainline scenarios.

[1] A [[ http://llvm-compile-time-tracker.com/compare.php?from=eb37d3546cd0c6e67798496634c45e501f7806f1&to=ac722d1190dc7bbdd17e977ef7ec95e69eefc91e&stat=instructions | compile time tracker ]] run with the option enabled.

Differential Revision: https://reviews.llvm.org/D98103
2021-05-06 12:24:33 -07:00
Bjorn Pettersson 3ee826594a Make dependency between certain analysis passes transitive (reapply)
LazyBlockFrequenceInfoPass, LazyBranchProbabilityInfoPass and
LoopAccessLegacyAnalysis all cache pointers to their nestled required
analysis passes. One need to use addRequiredTransitive to describe
that the nestled passes can't be freed until those analysis passes
no longer are used themselves.

There is still a bit of a mess considering the getLazyBPIAnalysisUsage
and getLazyBFIAnalysisUsage functions. Those functions are used from
both Transform, CodeGen and Analysis passes. I figure it is OK to
use addRequiredTransitive also when being used from Transform and
CodeGen passes. On the other hand, I figure we must do it when
used from other Analysis passes. So using addRequiredTransitive should
be more correct here. An alternative solution would be to add a
bool option in those functions to let the user tell if it is a
analysis pass or not. Since those lazy passes will be obsolete when
new PM has conquered the world I figure we can leave it like this
right now.

Intention with the patch is to fix PR49950. It at least solves the
problem for the reproducer in PR49950. However, that reproducer
need five passes in a specific order, so there are lots of various
"solutions" that could avoid the crash without actually fixing the
root cause.

This is a reapply of commit 3655f0757f, that was reverted in
33ff3c2049 due to problems with assertions in the polly
lit tests. That problem is supposed to be solved by also adjusting
ScopPass to explicitly preserve LazyBlockFrequencyInfo and
LazyBranchProbabilityInfo (it already preserved
OptimizationRemarkEmitter which depends on those lazy passes).

Differential Revision: https://reviews.llvm.org/D100958
2021-05-05 15:17:55 +02:00
Bjorn Pettersson 33ff3c2049 Revert "Make dependency between certain analysis passes transitive"
This reverts commit 3655f0757f.

It caused assertion failures related to setLastUser in polly builds.
2021-05-04 19:08:41 +02:00