Summary: This patch makes code motion checks optional which are dependent on
specific analysis example, dominator tree, post dominator tree and dependence
info. The aim is to make the adoption of CodeMoverUtils easier for clients that
don't use analysis which were strictly required by CodeMoverUtils. This will
also help in diversifying code motion checks using other analysis example MSSA.
Authored By: RithikSharma
Reviewer: Whitney, bmahjour, etiotto
Reviewed By: Whitney
Subscribers: Prazek, hiraditya, george.burgess.iv, asbirlea, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D82566
Summary:
It is reasonably common to want to clone some call with different bundles.
Let's actually provide an interface to do that.
Reviewers: chandlerc, jdoerfert, dblaikie, nickdesaulniers
Reviewed By: nickdesaulniers
Subscribers: llvm-commits, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83248
Summary:
Avoid exposing details about how children are stored. This will enable
subsequent type-erasure changes.
New methods are introduced to cover common access patterns.
Change-Id: Idb5f4b1b9c84e4cc71ddb39bb52a388682f5674f
Reviewers: arsenm, RKSimon, mehdi_amini, courbet
Subscribers: qcolombet, sdardis, wdng, hiraditya, jrtc27, zzheng, atanasyan, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83083
clang w/ old-pm currently would simply crash
when -mllvm -enable-knowledge-retention=true is specified.
Clearly, these two passes had no Old-PM test coverage,
which would have shown the problem - not requiring AssumptionCacheTracker,
but then trying to always get it.
Also, why try to get domtree only if it's cached,
but at the same time marking it as required?
Summary:
This patch changes call graph analysis to recognize callback call sites
and add an artificial 'reference' call record from the broker function
caller to the callback function in the call graph. A presence of such
reference enforces bottom-up traversal order for callback functions in
CG SCC pass manager because callback function logically becomes a callee
of the broker function caller.
Reviewers: jdoerfert, hfinkel, sstefan1, baziotis
Reviewed By: jdoerfert
Subscribers: hiraditya, kuter, sstefan1, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82572
Sometimes SimplifyCFG may decide to perform jump threading. In order
to do it, it follows the following algorithm:
1. Checks if the block is small enough for threading;
2. If yes, inserts a PR Phi relying that the next iteration will remove it
by performing jump threading;
3. The next iteration checks the block again and performs the threading.
This logic has a corner case: inserting the PR Phi increases block's size
by 1. If the block size at first check was max possible, one more Phi will
exceed this size, and we will neither perform threading nor remove the
created Phi node. As result, we will end up with worse IR than before.
This patch fixes this situation by excluding Phis from block size computation.
Excluding Phis from size computation for threading also makes sense by
itself because in case of threadign all those Phis will be removed.
Differential Revision: https://reviews.llvm.org/D81835
Reviewed By: asbirlea, nikic
It's possible for the first loop trip(s) to set the `Changed` Status, and to a
later one to early exit, in which case `Changed` must be return.
Differential Revision: https://reviews.llvm.org/D82753
In https://reviews.llvm.org/D81198, we outlined a number of scenarios
where dropping debug locations is appropriate. Stop issuing an error
when this happens.
Summary:
According to HowToUpdateDebugInfo.rst:
```
Preserving the debug locations of speculated instructions can make
it seem like a condition is true when it's not (or vice versa), which
leads to a confusing single-stepping experience
```
This patch follows the recommendation to drop debug locations on
speculated instructions.
Reviewers: aprantl, davide
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82420
InjectTLIMappings fails to preserve the analysis result of GlobalsAA. Not preserving the analysis might affect benchmark performance. This change fixes this issue.
Patch by: Ryan Santhiraraja <rsanthir@quicinc.com>
Reviewers: fpetrogalli, joerg, fhahn
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D82343
Summary:
As [[ https://bugs.llvm.org/show_bug.cgi?id=45360 | PR45360 ]] reports,
with new cost-model we can sometimes end up being able to expand `udiv`/`urem` instructions.
And that exposes at least one instance of when we do that
regardless of whether or not it is safe to do.
In this particular case, it's `SimplifyIndvar::replaceIVUserWithLoopInvariant()`.
It seems to me, we simply need to check with `isSafeToExpandAt()` first.
The test isn't great. I'm not sure how to make it only run `-indvars`.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=45360 | PR45360 ]].
Reviewers: mkazantsev, reames, helloqirun
Reviewed By: mkazantsev
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82108
This reverts commit 29b2c1ca72.
The patch causes the DT verifier failure like:
DominatorTree is different than a freshly computed one!
Not sure the patch itself it wrong but revert to investigate the failure.
Currently we allow peeling of the loops if there is a exiting latch block
and all other exits are blocks ending with deopt.
Actually we want that exit would end up with deopt unconditionally but
it is not required that exit itself ends with deopt.
Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev
Reviewed By: apilipenko
Subscribers: hiraditya, zzheng, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D81140
When an invoke instruction is converted to a call its
profile metadata is dropped because it has incompatible
format (see commit 16ad6eeb94).
This patch adds an attempt to convert profile data to
format of the call instruction. This used to work well
before the commit dcfa78a4cc.
Reviewers: reames
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82071
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79650
I don't know anything about debug info, but this seems like more work
should be necessary. This constructs a new IRBuilder and reconstructs
the original divides rather than moving the original.
One problem this has is if a div/rem pair are handled, both end up
with the same debugloc. I'm not sure how to fix this, since this uses
a cache when it sees the same input operands again, which will have
the first instance's location attached.
Summary:
llvm::SplitEdge was failing an assertion that the BasicBlock only had
one successor (for BasicBlocks terminated by CallBrInst, we typically
have multiple successors). It was surprising that the earlier call to
SplitCriticalEdge did not handle the critical edge (there was an early
return). Removing that triggered another assertion relating to creating
a BlockAddress for a BasicBlock that did not (yet) have a parent, which
is a simple order of operations issue in llvm::SplitCriticalEdge (a
freshly constructed BasicBlock must be inserted into a Function's basic
block list to have a parent).
Thanks to @nathanchance for the report.
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018
Reviewers: craig.topper, jyknight, void, fhahn, efriedma
Reviewed By: efriedma
Subscribers: eli.friedman, rnk, efriedma, fhahn, hiraditya, llvm-commits, nathanchance, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81607
The invoke instruction can have profile metadata with branch_weights,
which does not make sense for a call instruction and will be
rejected by the verifier.
Differential revision: https://reviews.llvm.org/D81996
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79650
is not necessary one of them.
Summary: Currently LoopUnrollPass already allow loops with multiple
exiting blocks, but it is only allowed when the loop latch is one of the
exiting blocks.
When the loop latch is not an exiting block, then only single exiting
block is supported.
When possible, the single loop latch or the single exiting block
terminator is optimized to an unconditional branch in the unrolled loop.
This patch allows loops with multiple exiting blocks even if the loop
latch is not one of them. However, the optimization of exiting block
terminator to unconditional branch is not done when there exists more
than one exiting block.
Reviewer: dmgreen, Meinersbur, etiotto, fhahn, efriedma, bmahjour
Reviewed By: efriedma
Subscribers: hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D81053
This patch adds a new option to CriticalEdgeSplittingOptions to control
whether loop-simplify form must be preserved. It is them used by GVN to
indicate that loop-simplify form does not have to be preserved.
This fixes a crash exposed by 189efe295b.
If the critical edge we are splitting goes from a block inside a loop to
a block outside the loop, splitting the edge will create a new exit
block. As a result, the new block will branch to the original exit
block, which will add a non-loop predecessor, breaking loop-simplify
form. To preserve loop-simplify form, the predecessor blocks of the
original exit are split, but that does not work for blocks with
indirectbr terminators. If preserving loop-simplify form is requested,
bail out , before making any changes.
Reviewers: reames, hfinkel, davide, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D81582
Change BasicBlock::removePredecessor to optionally return a vector of
instructions which might be dead. Use this in ConstantFoldTerminator to
delete them if they are dead.
Reapply with a bug fix: don't drop the "!KeepOneInputPHIs" argument when
removePredecessor calls PHINode::removeIncomingValue.
Differential Revision: https://reviews.llvm.org/D80206
Change BasicBlock::removePredecessor to optionally return a vector of
instructions which might be dead. Use this in ConstantFoldTerminator to
delete them if they are dead.
Differential Revision: https://reviews.llvm.org/D80206
[ v1 was reverted by c6ec352a6b due to
modpost failing; v2 fixes this. More info:
https://github.com/ClangBuiltLinux/linux/issues/1045#issuecomment-640381783 ]
This makes -fsanitize=kernel-address emit the correct globals
constructors for the kernel. We had to do the following:
* Disable generation of constructors that rely on linker features such
as dead-global elimination.
* Only instrument globals *not* in explicit sections. The kernel uses
sections for special globals, which we should not touch.
* Do not instrument globals that are prefixed with "__" nor that are
aliased by a symbol that is prefixed with "__". For example, modpost
relies on specially named aliases to find globals and checks their
contents. Unfortunately modpost relies on size stored as ELF debug info
and any padding of globals currently causes the debug info to cause size
reported to be *with* redzone which throws modpost off.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203493
Tested:
* With 'clang/test/CodeGen/asan-globals.cpp'.
* With test_kasan.ko, we can see:
BUG: KASAN: global-out-of-bounds in kasan_global_oob+0xb3/0xba [test_kasan]
* allyesconfig, allmodconfig (x86_64)
Reviewed By: glider
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D81390
- Now all SalvageDebugInfo() calls will mark undef if the salvage
attempt fails.
Reviewed by: vsk, Orlando
Differential Revision: https://reviews.llvm.org/D78369
Summary:
This makes -fsanitize=kernel-address emit the correct globals
constructors for the kernel. We had to do the following:
- Disable generation of constructors that rely on linker features such
as dead-global elimination.
- Only emit constructors for globals *not* in explicit sections. The
kernel uses sections for special globals, which we should not touch.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203493
Tested:
1. With 'clang/test/CodeGen/asan-globals.cpp'.
2. With test_kasan.ko, we can see:
BUG: KASAN: global-out-of-bounds in kasan_global_oob+0xb3/0xba [test_kasan]
Reviewers: glider, andreyknvl
Reviewed By: glider
Subscribers: cfe-commits, nickdesaulniers, hiraditya, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D80805
Summary:
In SCEVExpander FactorOutConstant(), when GEP indexing into/over scalable vector,
it is legal for the 'Factor' in a MulExpr to be the size of a scalable vector
instead of a compile-time constant.
Current upstream crash with the test attached.
Reviewers: efriedma, sdesmalen, sanjoy.google, mkazantsev
Reviewed By: efriedma
Subscribers: hiraditya, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80973
As discussed in https://bugs.llvm.org/show_bug.cgi?id=45951 and
D80584, the name 'tmp' is almost always a bad choice, but we have
a legacy of regression tests with that name because it was baked
into utils/update_test_checks.py.
This change makes -instnamer more consistent (already using "arg"
and "bb", the common LLVM shorthand). And it avoids the conflict
in telling users of the FileCheck script to run "-instnamer" to
create a better regression test and having that cause a warn/fail
in update_test_checks.py.
Prevent `invertCondition` from creating the inversion instruction, in
case the given value is an argument which has already been inverted.
Note that this approach has already been taken in case the given value
is an instruction (and not an argument).
Differential Revision: https://reviews.llvm.org/D80399
Summary: The following code from
/llvm/lib/Transforms/Utils/LoopUnrollAndJam.cpp can be used by other
transformations:
while (!MergeBlocks.empty()) {
BasicBlock *BB = *MergeBlocks.begin();
BranchInst *Term = dyn_cast<BranchInst>(BB->getTerminator());
if (Term && Term->isUnconditional() &&
L->contains(Term->getSuccessor(0))) {
BasicBlock *Dest = Term->getSuccessor(0);
BasicBlock *Fold = Dest->getUniquePredecessor();
if (MergeBlockIntoPredecessor(Dest, &DTU, LI)) {
// Don't remove BB and add Fold as they are the same BB
assert(Fold == BB);
(void)Fold;
MergeBlocks.erase(Dest);
} else
MergeBlocks.erase(BB);
} else
MergeBlocks.erase(BB);
}
Hence it should be separated into its own utility function.
Authored By: sidbav
Reviewer: Whitney, Meinersbur, asbirlea, dmgreen, etiotto
Reviewed By: asbirlea
Subscribers: hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80583
Now that all of the statepoint related routines have classes with isa support, let's cleanup.
I'm leaving the (dead) utitilities in tree for a few days so that I can do the same cleanup downstream without breakage.
code motion
Summary: Currently isSafeToMoveBefore uses DFS numbering for determining
the relative position of instruction and insert point which is not
always correct. This PR proposes the use of Dominator Tree depth for the
same. If a node is at a higher level than the insert point then it is
safe to say that we want to move in the forward direction.
Authored By: RithikSharma
Reviewer: Whitney, nikic, bmahjour, etiotto, fhahn
Reviewed By: Whitney
Subscribers: fhahn, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80084
This makes sure to correctly register the loop info of the children
of unroll and jammed loops. It re-uses some code from the unroller for
registering subloops.
Differential Revision: https://reviews.llvm.org/D80619
This fixes the output of the check-debugify option.
Without the patch an example of running the option:
$ opt -check-debugify test.ll -S -o testDebugify.ll
CheckModuleDebugifySkipping module without debugify metadata
After the patch:
$ opt -check-debugify test.ll -S -o testDebugify.ll
CheckModuleDebugify: Skipping module without debugify metadata
Differential Revision: https://reviews.llvm.org/D80553
I think the current code dealing with connecting the unrolled iterations
is a bit more complicated than necessary currently. To connect the
unrolled iterations, we have to update the unrolled latch blocks to
branch to the header of the next unrolled iteration.
We need to do this regardless whether the latch is exiting or not.
Additionally, we try to turn the conditional branch in the exiting block
to an unconditional one. This is an optimization only; alternatively we
could leave the conditional branches in place and rely on other passes
to simplify the conditions.
Logically, this is a separate step from connecting the latches to the
headers, but it is convenient to fold them into the same loop, if the
latch is also exiting. For headers (or other non-latch exiting blocks,
this is done separately).
Hopefully the patch with additional comments makes things a bit clearer.
Reviewers: efriedma, dmgreen, hfinkel, Whitney
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D80544
This intrinsic implements IEEE-754 operation roundToIntegralTiesToEven,
and performs rounding to the nearest integer value, rounding halfway
cases to even. The intrinsic represents the missed case of IEEE-754
rounding operations and now llvm provides full support of the rounding
operations defined by the standard.
Differential Revision: https://reviews.llvm.org/D75670
Summary:
Added a new IRCanonicalizer pass which aims to transform LLVM modules into
a canonical form by reordering and renaming instructions while preserving the
same semantics. The canonicalizer makes it easier to spot semantic differences
when diffing two modules which have undergone different passes.
Presentation: https://www.youtube.com/watch?v=c9WMijSOEUg
Reviewed by: plotfi
Differential Revision: https://reviews.llvm.org/D66029
In case the then-path of an if-region is empty, then merging with the
else-path should be handled with the inverse of the condition (leading
to that path).
Fix PR37662
Differential Revision: https://reviews.llvm.org/D78881
Summary:
Currently, `rewriteLoopExitValues()`'s logic is roughly as following:
> Loop over each incoming value in each PHI node.
> Query whether the SCEV for that incoming value is high-cost.
> Expand the SCEV.
> Perform sanity check (`isValidRewrite()`, D51582)
> Record the info
> Afterwards, see if we can drop the loop given replacements.
> Maybe perform replacements.
The problem is that we interleave SCEV cost checking and expansion.
This is A Problem, because `isHighCostExpansion()` takes special care
to not bill for the expansions that were already expanded, and we can reuse.
While it makes sense in general - if we know that we will expand some SCEV,
all the other SCEV's costs should account for that, which might cause
some of them to become non-high-cost too, and cause chain reaction.
But that isn't what we are doing here. We expand *all* SCEV's, unconditionally.
So every next SCEV's cost will be affected by the already-performed expansions
for previous SCEV's. Even if we are not planning on keeping
some of the expansions we performed.
Worse yet, this current "bonus" depends on the exact PHI node
incoming value processing order. This is completely wrong.
As an example of an issue, see @dmajor's `pr45835.ll` - if we happen to have
a PHI node with two(!) identical high-cost incoming values for the same basic blocks,
we would decide first time around that it is high-cost, expand it,
and immediately decide that it is not high-cost because we have an expansion
that we could reuse (because we expanded it right before, temporarily),
and replace the second incoming value but not the first one;
thus resulting in a broken PHI.
What we instead should do for now, is not perform any expansions
until after we've queried all the costs.
Later, in particular after `isValidRewrite()` is an assertion (D51582)
we could improve upon that, but in a more coherent fashion.
See [[ https://bugs.llvm.org/show_bug.cgi?id=45835 | PR45835 ]]
Reviewers: dmajor, reames, mkazantsev, fhahn, efriedma
Reviewed By: dmajor, mkazantsev
Subscribers: smeenai, nikic, hiraditya, javed.absar, llvm-commits, dmajor
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79787
Hide the method that allows setting probability for particular edge
and introduce a public method that sets probabilities for all
outgoing edges at once.
Setting individual edge probability is error prone. More over it is
difficult to check that the total probability is 1.0 because there is
no easy way to know when the user finished setting all
the probabilities.
Related bug is fixed in BranchProbabilityInfo::calcMetadataWeights().
Changing unreachable branch probabilities to raw(1) and distributing
the rest (oldProbability - raw(1)) over the reachable branches could
introduce total probability inaccuracy bigger than 1/numOfBranches.
Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
Summary:
If an induction variable is frozen and used, SCEV yields imprecise result
because it doesn't say anything about frozen variables.
Due to this reason, performance degradation happened after
https://reviews.llvm.org/D76483 is merged, causing
SCEV yield imprecise result and preventing LSR to optimize a loop.
The suggested solution here is to add a pass which canonicalizes frozen variables
inside a loop. To be specific, it pushes freezes out of the loop by freezing
the initial value and step values instead & dropping nsw/nuw flags from instructions used by freeze.
This solution was also mentioned at https://reviews.llvm.org/D70623 .
Reviewers: spatel, efriedma, lebedev.ri, fhahn, jdoerfert
Reviewed By: fhahn
Subscribers: nikic, mgorny, hiraditya, javed.absar, llvm-commits, sanwou01, nlopes
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77523
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.
This patch was originally committed as b8a3c34eee, but broke the
modules build, as LoopAccessAnalysis was using the Expander.
The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.
Reviewers: sanjoy.google, efriedma, reames
Reviewed By: sanjoy.google
Differential Revision: https://reviews.llvm.org/D71537
After D76797 the dominator tree is no longer used in LVI, so we
can remove it as a pass dependency, and also get rid of the
dominator tree enabling/disabling logic in JumpThreading.
Apart from cleaning up the code, this also clarifies LVI
cache consistency, in that the LVI cache can no longer
depend on whether the DT was or wasn't enabled due to
pending DT updates at any given time.
Differential Revision: https://reviews.llvm.org/D76985
We already check hasNoNaNs and that x is finite and strictly positive.
That only leaves the following special cases (taken from the Linux man
page for pow):
If x is +1, the result is 1.0 (even if y is a NaN).
If the absolute value of x is less than 1, and y is negative infinity, the result is positive infinity.
If the absolute value of x is greater than 1, and y is negative infinity, the result is +0.
If the absolute value of x is less than 1, and y is positive infinity, the result is +0.
If the absolute value of x is greater than 1, and y is positive infinity, the result is positive infinity.
The first case is handled elsewhere, and this transformation preserves
all the others, so there is no need to limit it to hasNoInfs.
Differential Revision: https://reviews.llvm.org/D79409
Summary:
When a loop has multiple backedges, loop simplification attempts to
separate them out into nested loops. This results in incorrect control
flow in the presence of some functions like a GPU barrier. This change
skips the transformation when such "convergent" function calls are
present in the loop body.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D80078
Summary:
When salvaging a dead zext instruction, append a convert operation to
the DIExpressions of the debug uses of the instruction, to prevent the
salvaged value from being sign-extended.
I confirmed that lldb prints out the correct unsigned result for "f" in
the example from PR45923 with this changed applied.
rdar://63246143
Reviewers: aprantl, jmorse, chrisjackson, davide
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80034
Along the lines of D77454 and D79968. Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.
Differential Revision: https://reviews.llvm.org/D80044
This reverts commit 454de99a6f.
The problem was that one of the ctor arguments of CallAnalyzer was left
to be const std::function<>&. A function_ref was passed for it, and then
the ctor stored the value in a function_ref field. So a std::function<>
would be created as a temporary, and not survive past the ctor
invocation, while the field would.
Tested locally by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild
Original Differential Revision: https://reviews.llvm.org/D79917
This is D77454, except for stores. All the infrastructure work was done
for loads, so the remaining changes necessary are relatively small.
Differential Revision: https://reviews.llvm.org/D79968
This has been duplicated since before
2372a193ba, but that commit has it
appearing twice in the space of 10 lines of the same function body. It
could also be hoisted up to the point just after where the last
special-case is considered, but I want to keep the intent of the
original authors.
Committed as obvious without a review.
The "null-pointer-is-valid" attribute needs to be checked by many
pointer-related combines. To make the check more efficient, convert
it from a string into an enum attribute.
In the future, this attribute may be replaced with data layout
properties.
Differential Revision: https://reviews.llvm.org/D78862
Summary:
This change exposes the vector name mangling with LLVM ISA (used as part
of vector-function-abi-variant) as a utility.
This can then be used by front-ends that add this attribute.
Note that all parameters passed in to the function will be mangled with
the "v" token to identify that they are of of vector type. So, it is the
responsibility of the caller to confirm that all parameters in the
vectorized variant is of vector type.
Added unit test to show vector name mangling.
Reviewed-By: fpetrogalli, simoll
Differential Revision: https://reviews.llvm.org/D79867
Summary:
Replacing uses of std::function pointers or refs, or Optional, to
function_ref, since the usage pattern allows that. If the function is
optional, using a default parameter value (nullptr). This led to a few
parameter reshufles, to push all optionals to the end of the parameter
list.
Reviewers: davidxl, dblaikie
Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79917
Summary:
Analyses that are statefull should not be retrieved through a proxy from
an outer IR unit, as these analyses are only invalidated at the end of
the inner IR unit manager.
This patch disallows getting the outer manager and provides an API to
get a cached analysis through the proxy. If the analysis is not
stateless, the call to getCachedResult will assert.
Reviewers: chandlerc
Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72893
Hide the method that allows setting probability for particular
edge and introduce a public method that sets probabilities for
all outgoing edges at once.
Setting individual edge probability is error prone. More over
it is difficult to check that the total probability is 1.0
because there is no easy way to know when the user finished
setting all the probabilities.
Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
We want to add a way to avoid merging identical calls so as to keep the
separate debug-information for those calls. There is also an asan
usecase where having this attribute would be beneficial to avoid
alternative work-arounds.
Here is the link to the feature request:
https://bugs.llvm.org/show_bug.cgi?id=42783.
`nomerge` is different from `noline`. `noinline` prevents function from
inlining at callsites, but `nomerge` prevents multiple identical calls
from being merged into one.
This patch adds `nomerge` to disable the optimization in IR level. A
followup patch will be needed to let backend understands `nomerge` and
avoid tail merge at backend.
Reviewed By: asbirlea, rnk
Differential Revision: https://reviews.llvm.org/D78659
Summary:
this patch fixe crash/asserts found in the test-suite.
the AssumeptionCache cannot be assumed to have all assumes contrary to what i tought.
prevent generation of information for terminators, because this can create broken IR in transfromation where we insert the new terminator before removing the old one.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79458
don't span their entire scope.
The previous commit (6d1c40c171) is an older version of the test.
Reviewed By: aprantl, vsk
Differential Revision: https://reviews.llvm.org/D79573
Summary:
The assume builder was non-deterministic when working on unamed values.
this patch fixes this.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: hiraditya, mgrang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78616
Summary: with this patch the assume salvageKnowledge will not generate assume if all knowledge is already available in an assume with valid context. assume bulider can also in some cases update an existing assume with better information.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78014
Currently LAA's uses of ScalarEvolutionExpander blocks moving the
expander from Analysis to Transforms. Conceptually the expander does not
fit into Analysis (it is only used for code generation) and
runtime-check generation also seems to be better suited as a
transformation utility.
Reviewers: Ayal, anemet
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D78460
Summary:
If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well.
Currently, this only works for allocas, globals, and arguments.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79355
FoldBranchToCommonDest clones instructions to a different basic block,
but handles debug intrinsics in a separate path. Previously, when
cloning debug intrinsics, their operands were not updated to reference
the correct cloned values. As a result, we would emit debug.value
intrinsics with broken operand references which are discarded in later
passes. This leads to incorrect debuginfo that reports incorrect values
for variables.
Fix this by remapping debug intrinsic operands when cloning them.
Fixes https://bugs.llvm.org/show_bug.cgi?id=45667.
Differential Revision: https://reviews.llvm.org/D79602
Splitting critical edges for indirect branches
the SplitIndirectBrCriticalEdges() function may break branch
probabilities if target basic block happens to have unset
a probability for any of its successors. That is because in
such cases the getEdgeProbability(Target) function returns
probability 1/NumOfSuccessors and it is called after Target
was split (thus Target has a single successor). As the result
the correspondent successor of the split block gets
probability 100% but 1/NumOfSuccessors is expected (or better
be left unset).
Reviewers: yamauchi
Differential Revision: https://reviews.llvm.org/D78806
loop nest.
Summary: As discussed in https://reviews.llvm.org/D73129.
Example
Before unroll and jam:
for
A
for
B
for
C
D
E
After unroll and jam (currently):
for
A
A'
for
B
for
C
D
B'
for
C'
D'
E
E'
After unroll and jam (Ideal):
for
A
A'
for
B
B'
for
C
C'
D
D'
E
E'
This is the first patch to change unroll and jam to work in the ideal
way.
This patch change the safety checks needed to make sure is safe to
unroll and jam in the ideal way.
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: Meinersbur
Subscribers: fhahn, hiraditya, zzheng, llvm-commits, anhtuyen, prithayan
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D76132
Summary:
If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well.
Currently, this only works for allocas, globals, and arguments.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79355
We check that C is finite and strictly positive, but there's no need to
check that it's normal too. exp2 should be just as accurate on denormals
as pow is.
Differential Revision: https://reviews.llvm.org/D79413
I don't think there's any good reason not to do this transformation when
the pow has multiple uses.
Differential Revision: https://reviews.llvm.org/D79407
optimizePow does not create any new calls to pow, so it should work
regardless of whether the pow library function is available. This allows
it to optimize the llvm.pow intrinsic on targets with no math library.
Based on a patch by Tim Renouf.
Differential Revision: https://reviews.llvm.org/D68231
When the shufflevector mask operand was converted into special
instruction data, the FunctionComparator was not updated to
account for this. As such, MergeFuncs will happily merge
shufflevectors with different masks.
This fixes https://bugs.llvm.org/show_bug.cgi?id=45773.
Differential Revision: https://reviews.llvm.org/D79261
In D74183 clang started emitting alignment for sret parameters
unconditionally. This caused a 1.5% compile-time regression on
tramp3d-v4. The reason is that we now generate many instance of IR like
%ptrint = ptrtoint %class.GuardLayers* %guards_m to i64
%maskedptr = and i64 %ptrint, 3
%maskcond = icmp eq i64 %maskedptr, 0
tail call void @llvm.assume(i1 %maskcond)
to preserve the alignment information during inlining. Based on IR
analysis, these assumptions also regress optimization. The attached
phase ordering test case illustrates two issues: One are instruction
count based optimization heuristics, which are affected by the four
additional instructions of the assumption. The other is blocking of
SROA due to ptrtoint casts (PR45763).
We already encountered the same problem in Rust, where we (unlike
Clang) generally prefer to emit alignment information absolutely
everywhere it is available. We were only able to do this after
hardcoding -preserve-alignment-assumptions-during-inlining=false,
because we were seeing significant optimization and compile-time
regressions otherwise.
This patch disables -preserve-alignment-assumptions-during-inlining
by default, because we should not be punishing people for adding
more alignment annotations.
Once the assume bundle work shakes out and we can represent (and use)
alignment assumptions using assume bundles, it should be possible to
re-enable this with reduced overhead.
Differential Revision: https://reviews.llvm.org/D76886
This allows forward declarations of PointerCheck, which in turn reduce
the number of times LoopAccessAnalysis needs to be included.
Ultimately this helps with moving runtime check generation to
Transforms/Utils/LoopUtils.h, without having to include it there.
Reviewers: anemet, Ayal
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D78458
There are several different types of cost that TTI tries to provide
explicit information for: throughput, latency, code size along with
a vague 'intersection of code-size cost and execution cost'.
The vectorizer is a keen user of RecipThroughput and there's at least
'getInstructionThroughput' and 'getArithmeticInstrCost' designed to
help with this cost. The latency cost has a single use and a single
implementation. The intersection cost appears to cover most of the
rest of the API.
getUserCost is explicitly called from within TTI when the user has
been explicit in wanting the code size (also only one use) as well
as a few passes which are concerned with a mixture of size and/or
a relative cost. In many cases these costs are closely related, such
as when multiple instructions are required, but one evident diverging
cost in this function is for div/rem.
This patch adds an argument so that the cost required is explicit,
so that we can make the important distinction when necessary.
Differential Revision: https://reviews.llvm.org/D78635
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().
I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.
Differential Revision: https://reviews.llvm.org/D78882
Add llvm.call.preallocated.{setup,arg} instrinsics.
Add "preallocated" operand bundle which takes a token produced by llvm.call.preallocated.setup.
Add "preallocated" parameter attribute, which is like byval but without the copy.
Verifier changes for these IR constructs.
See https://github.com/rnk/llvm-project/blob/call-setup-docs/llvm/docs/CallSetup.md
Subscribers: hiraditya, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74651
Summary:
refactor assume bulider for the next patch.
the assume builder now generate only one assume per attribute kind and per value they are on. to do this it takes the highest. this is desirable because currently, for all attributes the higest value is the most valuable.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78013
We should only skip `lifetime` and `dbg` intrinsics when searching for users.
Other intrinsics are legit users that can't be ignored.
Without this fix, the testcase would result in an invalid IR. `memcpy`
will have a reference to the, now, external value (local to the
extracted loop function).
Fix PR42194
Differential Revision: https://reviews.llvm.org/D78749
Summary:
Teach MachineDebugify how to insert DBG_VALUE instructions. This can
help find bugs causing CodeGen differences when debug info is present.
DBG_VALUE instructions are only emitted when -debugify-level is set to
locations+variables.
There is essentially no attempt made to match up DBG_VALUE register
operands with the local variables they ought to correspond to. I'm not
sure how to improve the situation. In some cases (MachineMemOperand?)
it's possible to find the IR instruction a MachineInstr corresponds to,
but in general this seems to call for "undoing" the work done by ISel.
Reviewers: dsanders, aprantl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78135
Summary:
When an irreducible SCC is converted into a new natural loop, existing
loops included in that SCC now become children of the new loop. The
logic that moves these loops from the parent loop to the new loop
invoked undefined behaviour when it modified the container that it was
iterating over. Fixed this by first extracting all the loops that are
to be removed from the parent.
Fixes bug 45623.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D78544
With clang option -funique-internal-linkage-symbols, symbols with
internal linkage get names with the module hash appended.
Differential Revision: https://reviews.llvm.org/D78243
The recently added Instruction::comesBefore can be used instead of
OrderedInstructions.
Reviewers: rnk, nikic, efriedma
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D78452
Most of the includes in LoopUtils.h are not required in the header and
they can be replaced by forward declarations.
Unfortunately includes of TargetTransformInfo.h and IVDescriptors.h pull
in a bunch of additional things, but there is no easy way to get rid of
them at the moment I think.
When running IPSCCP on a module with many small functions, memory
usage is dominated by PredicateInfo, which is a huge structure
(partially due to some unfortunate nested SmallVector use). However,
most of it is actually only temporary state needed to build
predicate info, and does not need to be retained after initial
construction.
This patch factors out the predicate building logic and state
into a separate PrediceInfoBuilder, with the extra bonus that
it does not need to live in the header anymore.
Differential Revision: https://reviews.llvm.org/D78326
We previously clamped the trailing zero count to 31 bits. And
then clamped the final alignment to MaximumAlignment which is
1 << 29.
This patch simplifies this to just clamp the trailing zero to
29 using MaxAlignmentExponent.
I was looking into changing this function to use Align/MaybeAlign
and noticed this.
Differential Revision: https://reviews.llvm.org/D78418