The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
Merge DIAssignID attachments on stores that are merged and sunk out of
loops. The store may be sunk into multiple exit blocks, and in this case all
the copies of the store get the same DIAssignID.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D133313
The pointsToConstantMemory() method returns true only if the memory pointed to
by the memory location is globally invariant. However, the LLVM memory model
also has the semantic notion of *locally-invariant*: memory that is known to be
invariant for the life of the SSA value representing that pointer. The most
common example of this is a pointer argument that is marked readonly noalias,
which the Rust compiler frequently emits.
It'd be desirable for LLVM to treat locally-invariant memory the same way as
globally-invariant memory when it's safe to do so. This patch implements that,
by introducing the concept of a *ModRefInfo mask*. A ModRefInfo mask is a bound
on the Mod/Ref behavior of an instruction that writes to a memory location,
based on the knowledge that the memory is globally-constant memory (in which
case the mask is NoModRef) or locally-constant memory (in which case the mask
is Ref). ModRefInfo values for an instruction can be combined with the
ModRefInfo mask by simply using the & operator. Where appropriate, this patch
has modified uses of pointsToConstantMemory() to instead examine the mask.
The most notable optimization change I noticed with this patch is that now
redundant loads from readonly noalias pointers can be eliminated across calls,
even when the pointer is captured. Internally, before this patch,
AliasAnalysis was assigning Ref to reads from constant memory; now AA can
assign NoModRef, which is a tighter bound.
Differential Revision: https://reviews.llvm.org/D136659
Followup to D135962 to rename remaining uses of
FunctionModRefBehavior to MemoryEffects. Does not touch API names
yet, but also updates variables names FMRB/MRB to ME, to match the
new type name.
If the single-thread model is used, or the
-licm-force-thread-model-single flag is specified, skip checks
related to thread-safety. This means that store promotion for
conditionally executed stores only requires proof of
dereferenceability and writability, but not of thread-safety. For
example, this enables promotion of stores to (non-constant) globals,
as well as captured allocas.
Fixes https://github.com/llvm/llvm-project/issues/50537.
Differential Revision: https://reviews.llvm.org/D130466
Collect more statistics for scalar promotion. In particular,
keep track of how many promotion candidates there were, and
whether it is a load or a load/store promotion.
Currently, FunctionModRefBehavior tracks whether the function reads
or writes memory (ModRefInfo) and which locations it can access
(argmem, inaccessiblemem and other). This patch changes it to track
ModRef information per-location instead.
To give two examples of why this is useful:
* D117095 highlights a weakness of ModRef modelling in the presence
of operand bundles. For a memcpy call with deopt operand bundle,
we want to say that it can read any memory, but only write argument
memory. This would allow them to be treated like any other calls.
However, we currently can't express this and have to say that it
can read or write any memory.
* D127383 would ideally be modelled as a separate threadid location,
where threadid Refs outside pre-split coroutines can be ignored
(like other accesses to constant memory). The current representation
does not allow modelling this precisely.
The patch as implemented is intended to be NFC, but there are some
obvious opportunities for improvements and simplification. To fully
capitalize on this we would also want to change the way we represent
memory attributes on functions, but that's a larger change, and I
think it makes sense to separate out the FunctionModRefBehavior
refactoring.
Differential Revision: https://reviews.llvm.org/D130896
If there are non-load/store users of the promoted pointer, we
currently abort promotion. However, having such users isn't really
relevant to the transform. We already separately check that a)
there are no instructions that modref the promoted pointer and
b) that a pointer capture disables store promotion.
In the affected @test_captured_in_loop test case we have a readnone
capture of the promoted pointer, which means that load promotion
can be performed (while store promotion cannot).
Differential Revision: https://reviews.llvm.org/D133485
This used a single check to make sure that the object is both
writable and thread-local. Separate them out to make the
deficiencies in the current code more obvious.
Currently, we bail out of scalar promotion if the loop may unwind
and the memory may be visible on unwind. This is because we can't
insert stores of the promoted value on unwind edges.
However, nowadays scalar promotion also has support for only
promoting loads, while leaving stores in place. This kind of
promotion is safe even in the presence of unwinding.
Differential Revision: https://reviews.llvm.org/D133111
This code was relying on a very subtle contract: The expectation
was that for non-allocas, the unwind safety check would already
perform a capture check, so we don't need to perform it later.
This held true when this unwind safety was only handled for allocas
and noalias calls, but became incorrect when byval support was
added.
To avoid this kind of issue, just remove the dependency between the
unwind and thread-safety checks entirely. At worst, this means we
perform a redundant capture check. If this should turn out to be
problematic for compile-time, we can cache that query in a more
explicit way.
* Replace getUserCost with getInstructionCost, covering all cost kinds.
* Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks.
Original Patch by @samparker (Sam Parker)
Differential Revision: https://reviews.llvm.org/D79483
SplitBlockPredecessors currently asserts if one of the predecessor
terminators is a callbr. This limitation was originally necessary,
because just like with indirectbr, it was not possible to replace
successors of a callbr. However, this is no longer the case since
D67252. As the requirement nowadays is that callbr must reference
all blockaddrs directly in the call arguments, and these get
automatically updated when setSuccessor() is called, we no longer
need this limitation.
The only thing we need to do here is use replaceSuccessorWith()
instead of replaceUsesOfWith(), because only the former does the
necessary blockaddr updating magic.
I believe there's other similar limitations that can be removed,
e.g. related to critical edge splitting.
Differential Revision: https://reviews.llvm.org/D129205
This reverts the revert commit ad95255b92.
The updated version also creates a load when the store may not execute.
In those cases, we still need to introduce a load in a function where
there may not have been one before, so this doesn't completely resolve
issue #51248.
Original message:
When only a store is sunk, there is no need to create a load in the
pre-header, as the result of the load will never get used.
The dead load can can introduce UB, if the function is marked as
writeonly.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D123473
This reverts the revert commit 1ddc719680.
This version of the patch sets the initial available value to poison,
which resolves an issue with the SSAUpdater breaking LCSSA form.
When only a store is sunk, there is no need to create a load in the
pre-header, as the result of the load will never get used.
The dead load can can introduce UB, if the function is marked as
writeonly.
Fixes#51248.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D123473
This makes MemorySSA in LoopSink required, and removes the AST-based
implementation, as well as the related support code in LICM.
Differential Revision: https://reviews.llvm.org/D123288
Rather than iterating over users and comparing operands, iterate
over uses and check operand number. Otherwise, we'll end up
promoting a store twice if it has two equal operands.
This can only happen with opaque pointers, as otherwise both
operands differ by a level of indirection, so a bitcast would have
to be involved.
Fixes https://github.com/llvm/llvm-project/issues/54495.
This adds a new option to control AllowSpeculation added in D119965 when
using `-passes=...`.
This allows reproducing #54023 using opt.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D121944
This changes MemorySSA to be constructed in unoptimized form.
MemorySSA::ensureOptimizedUses() can be called to optimize all
uses (once). This should be done by passes where having optimized
uses is beneficial, either because we're going to query all uses
anyway, or because we're doing def-use walks.
This should help reduce the compile-time impact of MemorySSA for
some use cases (the reason why I started looking into this is
D117926), which can avoid optimizing all uses upfront, and instead
only optimize those that are actually queried.
Actually, we have an existing use-case for this, which is EarlyCSE.
Disabling eager use optimization there gives a significant
compile-time improvement, because EarlyCSE will generally only query
clobbers for a subset of all uses (this change is not included in
this patch).
Differential Revision: https://reviews.llvm.org/D121381
LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (https://github.com/llvm/llvm-project/issues/53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249 , LICM would only be run after LoopRotate. Running Loop Rotate prior to LICM prevents a instruction hoist from being speculative, if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This destroys information, resulting in performance losses for discarding this additional information.
This PR modifies LICM to accept a ``speculative'' parameter which allows LICM to be set to perform information-loss speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after loop rotate is performed, preserving this additional information.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D119965
This extract a common isNotVisibleOnUnwind() helper into
AliasAnalysis, which handles allocas, byval arguments and noalias
calls. After D116998 this could also handle sret arguments. We
have similar logic in DSE and MemCpyOpt, which will be switched
to use this helper as well.
The noalias call case is a bit different from the others, because
it also requires that the object is not captured. The caller is
responsible for doing the appropriate check.
Differential Revision: https://reviews.llvm.org/D117000
When determining whether the memory is local to the function (and
we can thus introduce spurious writes without thread-safety issues),
check for a noalias call rather than the hardcoded list of memory
allocation functions. Noalias calls are the more general way to
determine allocation functions, as long as we're only interested
in the property that the returned value is distinct from any other
accessible memory.
Differential Revision: https://reviews.llvm.org/D116728
This reverts change 2c391a5/D87551. As noted in the llvm-dev thread "LICM as canonical form" sent earlier today, introducing this was a major design change made without sufficient cause.
A profile driven LICM is not an unreasonable design, it simply is not what we have. Switching to such a model requires a lot more work than just this patch, and broad aggeement that is the right direction for the optimizer as a whole.
Worth noting is that all the tests included in the reverted changed are probably handled if we allow running unconstrained LICM, and later run LoopSink. As such, we have no public examples which motivate a profit based hoisting approach.
When doing load/store promotion within LICM, if we
cannot prove that it is safe to sink the store we won't
hoist the load, even though we can prove the load could
be dereferenced and moved outside the loop. This patch
implements the load promotion by moving it in the loop
preheader by inserting proper PHI in the loop. The store
is kept as is in the loop. By doing this, we avoid doing
the load from a memory location in each iteration.
Please consider this small example:
loop {
var = *ptr;
if (var) break;
*ptr= var + 1;
}
After this patch, it will be:
var0 = *ptr;
loop {
var1 = phi (var0, var2);
if (var1) break;
var2 = var1 + 1;
*ptr = var2;
}
This addresses some problems from [0].
[0] https://bugs.llvm.org/show_bug.cgi?id=51193
Differential revision: https://reviews.llvm.org/D113289
When doing load/store promotion within LICM, if we
cannot prove that it is safe to sink the store we won't
hoist the load, even though we can prove the load could
be dereferenced and moved outside the loop. This patch
implements the load promotion by moving it in the loop
preheader by inserting proper PHI in the loop. The store
is kept as is in the loop. By doing this, we avoid doing
the load from a memory location in each iteration.
Please consider this small example:
loop {
var = *ptr;
if (var) break;
*ptr= var + 1;
}
After this patch, it will be:
var0 = *ptr;
loop {
var1 = phi (var0, var2);
if (var1) break;
var2 = var1 + 1;
*ptr = var2;
}
This addresses some problems from [0].
[0] https://bugs.llvm.org/show_bug.cgi?id=51193
Differential revision: https://reviews.llvm.org/D113289