As my goal is to remove at least _some_ functions from the static list
in MemoryBuiltins.cpp, these tests either need to run inferattrs or
statically declare these attributes to keep passing. A couple of tests
had alternate cases which are no longer meaningful, e.g.
`malloc-load-removal.ll`.
Differential Revision: https://reviews.llvm.org/D123087
For non-mem-intrinsic and non-lifetime `CallBase`s, the current
`isRemovable` function only checks if the `CallBase` 1. has no uses 2.
will return 3. does not throw:
80fb782336/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp (L1017)
But we should also exclude invokes even in case they don't throw,
because they are terminators and thus cannot be removed. While it
doesn't seem to make much sense for `invoke`s to have an `nounwind`
target, this kind of code can be generated and is also valid bitcode.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D128224
And thread DSE's ephemeral values to EarliestEscapeInfo.
This allows more precise analysis in DSEState::isReadClobber() via BatchAA.
Followup to D123162.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D123342
DSE assumes that this is the case when forming a calloc from a
malloc + memset pair.
For tests, either update the malloc signature or change the
data layout.
Change the DSE calloc handling to assume that it is
inaccessiblememonly, i.e. the defining access is liveOnEntry.
Differential Revision: https://reviews.llvm.org/D117543
Similar to memset, memset_pattern{4,8,16} all will return and do not
unwind. Use fallthrough to include all attributes also set for memset.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D114904
For these "visible on unwind/ret" checks we only care about the
fact that no other code has access to the pointer (unless it
escapes). A noalias call is sufficient for this, it does not
have to be a known allocation function.
This is basically the same change as D116728, but for DSE rather
than LICM.
The intended transform is not legal with an extern global, because
the actual global defined in a different TU might have larger
size. Make it non-extern to show that the desired transform already
works.
If the killing store overwrites the whole object, we know that the
preceding store is dead, regardless of the accessed offset or size.
This case was previously only handled if the size of the dead store
was also known.
This allows us to perform conventional DSE for calls that write to
an argument (but without known size).
Differential Revision: https://reviews.llvm.org/D116267
This fixes a typo in 81d69e1bda.
Of course we should only skip the particular store if it isn't
removable, not bail out of the whole loop. Add a test to cover
this case.
Remove the special casing for intrinsics in MemoryLocation::getForDest()
and handle them through the general attribute based code. On the DSE
side, this means that isRemovable() now needs to handle more than a
hardcoded list of intrinsics. We consider everything apart from
volatile memory intrinsics and lifetime markers to be removable.
This allows us to perform DSE on intrinsics that DSE has not been
specially taught about, using a matrix store as an example here.
There is an interesting test change for invariant.start, but I
believe that optimization is correct. It only looks a bit odd
because the code is immediate UB anyway.
Differential Revision: https://reviews.llvm.org/D116210
Fix handling of alloc-like instructions in isGuaranteedLoopInvariant(). It was not valid when the 'KillingDef' was outside of the loop, while the 'CurrentDef' was inside the loop. In that case, the 'KillingDef' only overwrites the definition from the last iteration of the loop, and not the ones of all iterations. Therefor it does not make the 'CurrentDef' to be dead, and must not remove it.
Fixing issue : https://github.com/llvm/llvm-project/issues/52774
Reviewed by: Florian Hahn
Differential revision: https://reviews.llvm.org/D115965
As reames mentioned on related reviews, we don't need the nocapture
requirement here. First of all, from an API perspective, this is
not something that MemoryLocation::getForDest() should be checking
in the first place, because it does not affect which memory this
particular call can access; it's an orthogonal concern that should
be handled by the caller if necessary.
However, for both of the motivating users in DSE and InstCombine,
we don't need the nocapture requirement, because the capture can
either be purely local to the call (a pointer identity check that
is irrelevant to us), be part of the return value (which we check
is unused), or be written in the dest location, which we have
determined to be dead.
This allows us to remove the special handling for libcalls as well.
Differential Revision: https://reviews.llvm.org/D116148
This is a reapply of a8a51fe5, which was reverted in 1ba99e due to a failing compiler-rt test. That test was a false positive because it was checking asan failures not accounting for the fact the call could be validly optimized out. I hopefully managed to stablize that test in 9b955f. (That's a speculative fix due to disk consumption needed to build compiler-rt tests locally being absurd.)
Original commit message follows..
The majority of this change is sinking logic from instcombine into MemoryLocation such that it can be generically reused. If we have a call with a single analyzable write to an argument, we can treat that as-if it were a store of unknown size.
Merging the code in this was unblocks DSE in the store to dead memory code paths. In theory, it should also enable classic DSE of such calls, but the code appears to not know how to use object sizes to refine unknown access bounds (yet).
In addition, this does make the isAllocRemovable path slightly stronger by reusing the libfunc and additional intrinsics bits which are already in getForDest.
Differential Revision: https://reviews.llvm.org/D115904
The majority of this change is sinking logic from instcombine into MemoryLocation such that it can be generically reused. If we have a call with a single analyzable write to an argument, we can treat that as-if it were a store of unknown size.
Merging the code in this was unblocks DSE in the store to dead memory code paths. In theory, it should also enable classic DSE of such calls, but the code appears to not know how to use object sizes to refine unknown access bounds (yet).
In addition, this does make the isAllocRemovable path slightly stronger by reusing the libfunc and additional intrinsics bits which are already in getForDest.
Differential Revision: https://reviews.llvm.org/D115904
This reverts commit ac60263ad1.
It looks like the test fails on certain non-Darwin system, even though
the triple is explicitly set to macos. Revert while I investigate.
memset_pattern{4,8,16} writes to the first argument. Use getForDest
to return the corresponding MemoryLocation.
Reviewed By: ab
Differential Revision: https://reviews.llvm.org/D114906
strcpy/strcat/strncat access memory starting from the passed in
pointers. Construct memory locations for their args using getAfter.
Discussed in D114872.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D114969
This allows for better optimization of 'stores-of-existing-values' and
possibly helps passes further down the pipeline.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D113712
Using the optimized access enables additional optimizations in cases
where the defining access is a non-aliasing store.
Alternatively we could also walk upwards and skip non-aliasing defs
here, but my experiments so far showed that this will noticeably
increase compile-time for little extra gain compared to just using the
optimized access.
Improvements of dse.NumRedundantStores on MultiSource/CINT2006/CPF2006
on X86 with -O3:
test-suite...-typeset/consumer-typeset.test 1.00 76.00 7500.0%
test-suite.../Benchmarks/Bullet/bullet.test 3.00 12.00 300.0%
test-suite...006/453.povray/453.povray.test 3.00 6.00 100.0%
test-suite...telecomm-gsm/telecomm-gsm.test 1.00 2.00 100.0%
test-suite...ediabench/gsm/toast/toast.test 1.00 2.00 100.0%
test-suite...marks/7zip/7zip-benchmark.test 1.00 2.00 100.0%
test-suite...ications/JM/lencod/lencod.test 7.00 10.00 42.9%
test-suite...6/464.h264ref/464.h264ref.test 6.00 8.00 33.3%
test-suite...ications/JM/ldecod/ldecod.test 6.00 7.00 16.7%
test-suite...006/447.dealII/447.dealII.test 33.00 33.00 0.0%
test-suite...6/471.omnetpp/471.omnetpp.test NaN 1.00 nan%
test-suite...006/450.soplex/450.soplex.test NaN 2.00 nan%
test-suite.../CINT2006/403.gcc/403.gcc.test NaN 7.00 nan%
test-suite...lications/ClamAV/clamscan.test NaN 1.00 nan%
test-suite...CI_Purple/SMG2000/smg2000.test NaN 3.00 nan%
Follow-up to D111727.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D112315
This patch extends the existing out-of-bounds store tests with a case
with a bigger object and multiple inbounds stores, followed by an OOB
store. The OOB store is not used to remove the inbounds stores in this
case at the moment.
This patch adds support to remove stores that write the same value
as earlier memesets.
It uses isOverwrite to check that a memset completely overwrites a later
store. The candidate store must store the same bytewise value as the
byte stored by the memset.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D112321
That's https://reviews.llvm.org/D90328 follow-up.
This change eliminates writes to variables where the value that is being written is already stored in the variable.
This achieves the goal by looping through all memory definitions in the current state and getting defining access from each of them.
When there is defining access where the write instruction is identical to the original instruction it will remove this redundant write.
For example:
void f() {
x = 1;
if foo() {
x = 1;
g();
} else {
h();
}
}
void g();
void h();
The second x=1 will be eliminated since it is rewriting 1 to x. This pass will produce this:
void f() {
x = 1;
if foo() {
g();
} else {
h();
}
}
void g();
void h();
Differential Revision: https://reviews.llvm.org/D111727
This patch adds more complex test cases with redundant stores of an
existing memset, with other stores in between.
It also makes a few of the existing tests more robust.