If there is a freeze %x, we currently replace all other uses of %x
with freeze %x -- as long as they are dominated by the freeze
instruction. This patch extends this behavior to cases where we
did not originally dominate the use by moving the freeze
instruction directly after the definition of the frozen value.
The motivation can be seen in test @combine_and_after_freezing_uses:
Canonicalizing everything to freeze %x allows folds that are based
on value identity (i.e. same operand occurring in two places) to
trigger. This also covers the case from D125248.
Differential Revision: https://reviews.llvm.org/D125321
Currently, two GEPs will only be combined if the result element
type of one is the same as the source element type of the other.
However, this means we may miss folding opportunities where the
second GEP could be rewritten using a different element type. This
is especially relevant for opaque pointers, where constant GEPs
often use i8 element type.
Address this by converting GEP indices to offsets, adding them,
and then converting them back to indices. The first (inner) GEP
is allowed to have variable indices as well, in which case only
the constant suffix is converted into an offset.
This should address the regression reported in
https://reviews.llvm.org/D123300#3467615.
Differential Revision: https://reviews.llvm.org/D124459
We can always replace the undef elements in a vector constant
with regular constants to get rid of the freeze:
https://alive2.llvm.org/ce/z/nfRb4F
The select diffs show that we might do better by adjusting the
logic for a frozen select condition. We may also want to refine
the vector constant replacement to consider forming a splat.
Differential Revision: https://reviews.llvm.org/D123962
The description was ambiguous about the behavior
when boths select arms are constant or both arms
are not constant. I don't think there's any
evidence to support either way, but this matches
the code with a more specified description.
We can extend this to deal with vector constants
with undef/poison elements. Currently, those don't
get folded anywhere.
It actually implements support for seeing through loads, using alias analysis to
refine the result.
This is rather limited, but I didn't want to rely on more than available
analysis at that point (to be gentle with compilation time), and it does seem to
catch common scenario, as showcased by the included tests.
Differential Revision: https://reviews.llvm.org/D122431
By adding a parameter to function FoldOpIntoSelect, we can fold more Ops to Select.
For this example, we tend to fold the division instruction,
so we no longer care whether SelectInst is one use.
This patch slove TODO left in InstCombine/div.ll.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D122967
This patch tries to sink instructions when they are only used in a successor block.
This is a further enhancement patch based on Anna's commit:
D109700, which allows sinking an instruction having multiple uses in a single user.
In this patch, sink instructions with multiple users in a single successor block will be supported.
It could fix a known issue from rust:
https://github.com/rust-lang/rust/issues/51346#issuecomment-394443610
Reviewed By: nikic, reames
Differential Revision: https://reviews.llvm.org/D121585
Prior to this change LLVM would happily elide a call to any allocation
function and a call to any free function operating on the same unused
pointer. This can cause problems in some obscure cases, for example if
the body of operator::new can be inlined but the body of
operator::delete can't, as in this example from jyknight:
#include <stdlib.h>
#include <stdio.h>
int allocs = 0;
void *operator new(size_t n) {
allocs++;
void *mem = malloc(n);
if (!mem) abort();
return mem;
}
__attribute__((noinline)) void operator delete(void *mem) noexcept {
allocs--;
free(mem);
}
void deleteit(int*i) { delete i; }
int main() {
int*i = new int;
deleteit(i);
if (allocs != 0)
printf("MEMORY LEAK! allocs: %d\n", allocs);
}
This patch addresses the issue by introducing the concept of an
allocator function family and uses it to make sure that alloc/free
function pairs are only removed if they're in the same family.
Differential Revision: https://reviews.llvm.org/D117356
extractvalue (any_mul_with_overflow X, -1), 0 --> -X
There are similar other potential transforms that we could do as
noted by the last TODO in the test diffs.
Fixes#54053
Based on the output of include-what-you-use.
This is a big chunk of changes. It is very likely to break downstream code
unless they took a lot of care in avoiding hidden ehader dependencies, something
the LLVM codebase doesn't do that well :-/
I've tried to summarize the biggest change below:
- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h
And the usual count of preprocessed lines:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after: 6189948
200k lines less to process is no that bad ;-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118652
This transform is fundamentally incompatible with opaque pointers.
Usually we would not hit it anyway because the bitcast is folded
away earlier, but due to worklist order it might survive until
here, so make sure we bail out explicitly.
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
This is an alternate version of D115914 that handles/tests all binary opcodes.
I suspect that we don't see these patterns too often because -simplifycfg
would convert the minimal cases into selects rather than leave them in phi form
(note: instcombine has logic holes for combining the select patterns too though,
so that's another potential patch).
We only create a new binop in a predecessor that unconditionally branches to
the final block.
https://alive2.llvm.org/ce/z/C57M2Fhttps://alive2.llvm.org/ce/z/WHwAoU (not safe to speculate an sdiv for example)
https://alive2.llvm.org/ce/z/rdVUvW (but it is ok on this path)
Differential Revision: https://reviews.llvm.org/D117110
Checking for specific function terminating opcodes
means we don't handle other non-hardcoded ones :)
This should probably be generalized to something
similar to the `IsBlockFollowedByDeoptOrUnreachable()`.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D117810
The tests with constant folding that produces poison
could potentially remove the select entirely:
https://alive2.llvm.org/ce/z/e-WUqF
...but this patch just removes the FMF-only limitation on
propagation.
This doesn't require callers to put the pointer operand and the indices
in a container like a vector when calling the function. This is not
really an issue with the existing callers. But when using it from
IRBuilder the inputs are available as separate pointer value and indices
ArrayRef.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D117038
Not all allocation functions are removable if unused. An example of a non-removable allocation would be a direct call to the replaceable global allocation function in C++. An example of a removable one - at least according to historical practice - would be malloc.
If we have a call whose only side effect is a write to a location which is known to be dead, we can sink said call to the users of the call's result value. This is analogous to the recent changes to delete said calls if unused, but framed as a sinking transform instead.
Differential Revision: https://reviews.llvm.org/D116200
In their current form, these folds are fundamentally incompatible
with opaque pointers. We should add a separate set of folds for
the canonicalization of the GEP source type. For now, skip these
folds.
This change may not be entirely NFC, because a number of early
returns will now only early return from this particular fold,
rather than the whole visitGetElementPtr() implementation. This
is also the reason why I'm doing this change, as I don't think
this was intended.
This is a reapply of a8a51fe5, which was reverted in 1ba99e due to a failing compiler-rt test. That test was a false positive because it was checking asan failures not accounting for the fact the call could be validly optimized out. I hopefully managed to stablize that test in 9b955f. (That's a speculative fix due to disk consumption needed to build compiler-rt tests locally being absurd.)
Original commit message follows..
The majority of this change is sinking logic from instcombine into MemoryLocation such that it can be generically reused. If we have a call with a single analyzable write to an argument, we can treat that as-if it were a store of unknown size.
Merging the code in this was unblocks DSE in the store to dead memory code paths. In theory, it should also enable classic DSE of such calls, but the code appears to not know how to use object sizes to refine unknown access bounds (yet).
In addition, this does make the isAllocRemovable path slightly stronger by reusing the libfunc and additional intrinsics bits which are already in getForDest.
Differential Revision: https://reviews.llvm.org/D115904
The majority of this change is sinking logic from instcombine into MemoryLocation such that it can be generically reused. If we have a call with a single analyzable write to an argument, we can treat that as-if it were a store of unknown size.
Merging the code in this was unblocks DSE in the store to dead memory code paths. In theory, it should also enable classic DSE of such calls, but the code appears to not know how to use object sizes to refine unknown access bounds (yet).
In addition, this does make the isAllocRemovable path slightly stronger by reusing the libfunc and additional intrinsics bits which are already in getForDest.
Differential Revision: https://reviews.llvm.org/D115904
This is a slight generalization of D115829. I noticed this while restructuring code for a follow up patch to perform the same optimizations in DSE.
If we have a call whose only visible effect is writing to an alloca, and we're removing the alloca anyways, we don't care if the call also reads from the same alloca. That read will be unobservable and thus doesn't block removal of the call.
Worth noting is that this observation generalizes for non-argument reads. It just happens that case reduces to a readonly call, and is already handled separately.
Differential Revision: https://reviews.llvm.org/D115898
isAllocSiteRemovable tracks whether all uses of an alloca are both non-capturing, and non-reading. If so, we can remove said alloca because nothing can depend on its content or address.
This patch extends this reasoning to allow writes from calls where we can prove the call has no side effect other than writing to said allocation. This is a fairly natural fit for the existing code with one subtle detail - the call can write to multiple locations at once which stores can't.
As a follow up, we can likely sink the intrinsic handling into the generic code by allowing readnone arguments as well. I deliberately left that out to minimize conceptual churn.
Differential Revision: https://reviews.llvm.org/D115829
This is a generalization/extension of the existing and/or
folds noted with TODO comments. Those have a one-use
constraint that is not necessary.
Potential follow-ups are noted by the TODO comments in
the new function. We can also call this function from
other binop visit* functions, but we need to add tests
first.
This solves:
https://llvm.org/PR52543https://alive2.llvm.org/ce/z/NWuCR5
Add a variant of getEquivalentICmp() that produces an optional
offset. This allows us to create an equivalent icmp for all ranges.
Use this in the with.overflow folding code, which was doing this
adjustment separately -- this clarifies that the fold will indeed
always apply.
If the parameter had been annotated as nonnull because of the null
check, we want to remove the attribute, since it may no longer apply and
could result in miscompiles if left. Similarly, we also want to remove
undef-implying attributes, since they may not apply anymore either.
Fixes PR52110.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D111515
This extends the foldOpIntoPhi code used when visiting a freeze user of a phi to allow any non-undef/poison operand as opposed to only non-undef/poison constants. This lets us hoist a freeze in the increment of an IV into the preheader in many cases.
Differential Revision: https://reviews.llvm.org/D111744
If we have an instruction which produces poison only when flags are specified on the instruction, then we know that freezing the operands and dropping flags is equivalent to freezing the result. If we know those flags don't result in any undefined behavior being executed, then there's no point in preserving the flags as we gain no knowledge by having them.
This patch extends the existing propagation logic which sinks freeze to single potential non-poison operands to allow dropping of flags when we know the freeze is the sole use of the instruction with poison flags.
The main value is that we tend to sink freezes towards the phi in IV cycles where the incoming value to the phi is the freeze of an IV increment. This will in turn (in a future patch), let us fold the freeze through the phi into the loop preheader. Motivated by eliminating need for CanonicalizeFreezeInLoops for the clearly profitable cases from onephi.ll test case in the test directory.
Differential Revision: https://reviews.llvm.org/D111675
This patch continues unblocking optimizations that are blocked by pseudo probe instrumentation.
Not exactly like DbgIntrinsics, PseudoProbe intrinsic has other attributes (such as mayread, maywrite, mayhaveSideEffect) that can block optimizations. The issues fixed are:
- Flipped default param of getFirstNonPHIOrDbg API to skip pseudo probes
- Unblocked CSE by avoiding pseudo probe from clobbering memory SSA
- Unblocked induction variable simpliciation
- Allow empty loop deletion by treating probe intrinsic isDroppable
- Some refactoring.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D110847
This splits out the logic from shouldChangeType() that
currently allows 8/16/32-bit transforms even if those
types are not listed as legal in the data layout.
This could be useful as a predicate for vector
insert/extract transforms.
Note that this leaves the subsequent checks in
shouldChangeType() unchanged. We may want to merge
the checks for i1 and/or "ToLegal" into "isDesirable",
but that may alter existing transforms.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.
Differential Revision: https://reviews.llvm.org/D110807
This patch is for fixing potential shufflevector-related bugs like D93818.
As D93818, this patch change shufflevector's default placeholder to poison.
To reduce risk, it was divided into several patches, and this patch is for InstCombineCompares and InstructionCombining.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D110227
InstCombine's worklist can be re-used by other passes like
VectorCombine. Move it to llvm/Transform/Utils and rename it to
InstructionWorklist.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D110181