Previously the options category given to cl::HideUnrelatedOptions was
local to llvm-reduce.cpp and as a result only options declared in that
file were visible in the -help options listing. This was a bit
unfortunate since there were several useful options declared in other
files. This patch addresses that.
Differential Revision: https://reviews.llvm.org/D118682
When exporting textual IR during reduction the ShouldPreserveUseListOrder
parameter of the IR printer should be set to get predictable results.
Differential Revision: https://reviews.llvm.org/D118585
The cleanup was manual, but assisted by "include-what-you-use". It consists in
1. Removing unused forward declaration. No impact expected.
2. Removing unused headers in .cpp files. No impact expected.
3. Removing unused headers in .h files. This removes implicit dependencies and
is generally considered a good thing, but this may break downstream builds.
I've updated llvm, clang, lld, lldb and mlir deps, and included a list of the
modification in the second part of the commit.
4. Replacing header inclusion by forward declaration. This has the same impact
as 3.
Notable changes:
- llvm/Support/TargetParser.h no longer includes llvm/Support/AArch64TargetParser.h nor llvm/Support/ARMTargetParser.h
- llvm/Support/TypeSize.h no longer includes llvm/Support/WithColor.h
- llvm/Support/YAMLTraits.h no longer includes llvm/Support/Regex.h
- llvm/ADT/SmallVector.h no longer includes llvm/Support/MemAlloc.h nor llvm/Support/ErrorHandling.h
You may need to add some of these headers in your compilation units, if needs be.
As an hint to the impact of the cleanup, running
clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Support/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 8000919 lines
after: 7917500 lines
Reduced dependencies also helps incremental rebuilds and is more ccache
friendly, something not shown by the above metric :-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831
This patch adds parallel processing of chunks. When reducing very large
inputs, e.g. functions with 500k basic blocks, processing chunks in
parallel can significantly speed up the reduction.
To allow modifying clones of the original module in parallel, each clone
needs their own LLVMContext object. To achieve this, each job parses the
input module with their own LLVMContext. In case a job successfully
reduced the input, it serializes the result module as bitcode into a
result array.
To ensure parallel reduction produces the same results as serial
reduction, only the first successfully reduced result is used, and
results of other successful jobs are dropped. Processing resumes after
the chunk that was successfully reduced.
The number of threads to use can be configured using the -j option.
It defaults to 1, which means serial processing.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113857
This patch moves the logic to clone and check a new chunk into a new
function, to allow re-use in a follow-up patch that implements parallel
reductions.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D113856
Textual LLVM IR files are much bigger and take longer to write to disk.
To avoid the extra cost incurred by serializing to text, this patch adds
an option to save temporary files as bitcode instead.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D113858
Having a separate counting method runs the risk of a mismatch between
the actual reduction method and the counting method.
Instead, create an Oracle that always returns true for shouldKeep(), run
the reduction, and count how many times shouldKeep() was called. The
module should not be modified if shouldKeep() always returns true.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113537
Metadata operands tend to require special conditions, especially on dbg
intrinsics. We also don't have a zero value for metadata.
Replacing callee operands is a little weird, since calling undef/null
doesn't make sense. It also causes tons of invalid reductions when
reducing calls to intrinsics since only arguments to intrinsics can be
of the metadata type.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113532
Add a new "operands-skip" pass whose goal is to remove instructions in the middle of dependency chains. For instance:
```
%baseptr = alloca i32
%arrayidx = getelementptr i32, i32* %baseptr, i32 %idxprom
store i32 42, i32* %arrayidx
```
might be reducible to
```
%baseptr = alloca i32
%arrayidx = getelementptr ... ; now dead, together with the computation of %idxprom
store i32 42, i32* %baseptr
```
Other passes would either replace `%baseptr` with undef (operands, instructions) or move it to become a function argument (operands-to-args), both of which might fail the interestingness check.
In principle the implementation allows operand replacement with any value or instruction in the function that passes the filter constraints (same type, dominance, "more reduced"), but is limited in this patch to values that are directly or indirectly used to compute the current operand value, motivated by the example above. Additionally, function arguments are added to the candidate set which helps reducing the number of relevant arguments mitigating a concern of too many arguments mentioned in https://reviews.llvm.org/D110274#3025013.
Possible future extensions:
* Instead of requiring the same type, bitcast/trunc/zext could be automatically inserted for some more flexibility.
* If undef is added to the candidate set, "operands-skip"is able to produce any reduction that "operands" can do. Additional candidates might be zero and one, where the "reductive power" classification can prefer one over the other. If undefined behaviour should not be introduced, undef can be removed from the candidate set.
Recommit after resolving conflict with D112651 and reusing
shouldReduceOperand from D113532.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D111818
Add a new "operands-skip" pass whose goal is to remove instructions in the middle of dependency chains. For instance:
```
%baseptr = alloca i32
%arrayidx = getelementptr i32, i32* %baseptr, i32 %idxprom
store i32 42, i32* %arrayidx
```
might be reducible to
```
%baseptr = alloca i32
%arrayidx = getelementptr ... ; now dead, together with the computation of %idxprom
store i32 42, i32* %baseptr
```
Other passes would either replace `%baseptr` with undef (operands, instructions) or move it to become a function argument (operands-to-args), both of which might fail the interestingness check.
In principle the implementation allows operand replacement with any value or instruction in the function that passes the filter constraints (same type, dominance, "more reduced"), but is limited in this patch to values that are directly or indirectly used to compute the current operand value, motivated by the example above. Additionally, function arguments are added to the candidate set which helps reducing the number of relevant arguments mitigating a concern of too many arguments mentioned in https://reviews.llvm.org/D110274#3025013.
Possible future extensions:
* Instead of requiring the same type, bitcast/trunc/zext could be automatically inserted for some more flexibility.
* If undef is added to the candidate set, "operands-skip"is able to produce any reduction that "operands" can do. Additional candidates might be zero and one, where the "reductive power" classification can prefer one over the other. If undefined behaviour should not be introduced, undef can be removed from the candidate set.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D111818
When reducing functions with very large basic blocks (~ almost 1 million
BBs), the majority of time is spent maintaining the order in the std::set
for the basic blocks to keep.
In those cases, DenseSet<> is much more efficient. Use it instead.
Previously, if the basic-blocks delta pass tried to remove a basic block
that was the last basic block in a function that did not have external
or weak linkage, the resulting IR would become invalid. Since removing
the last basic block in a function is effectively identical to removing
the function body itself, we check explicitly for this case and if we
detect it, we run the same logic as in ReduceFunctionBodies.cpp
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D113486
Sometimes if llvm-reduce is interrupted in the middle of a delta pass on
a large file, it can take quite some time for the tool to start actually
doing new work if it is restarted again on the partially-reduced file. A
lot of time ends up being spent testing large chunks when these large
chunks are very unlikely to actually pass the interestingness test. In
cases like this, the tool will complete faster if the starting
granularity is reduced to a finer amount. Thus, we introduce a command
line flag that automatically divides the chunks into smaller subsets a
fixed, user-specified number of times prior to beginning the core loop.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D112651
(Second try. Need to link against CodeGen and MC libs.)
The llvm-reduce tool has been extended to operate on MIR (import, clone and
export). Current limitation is that only a single machine function is
supported. A single reducer pass that operates on machine instructions (while
on SSA-form) has been added. Additional MIR specific reducer passes can be
added later as needed.
Differential Revision: https://reviews.llvm.org/D110527
The llvm-reduce tool has been extended to operate on MIR (import, clone and
export). Current limitation is that only a single machine function is
supported. A single reducer pass that operates on machine instructions (while
on SSA-form) has been added. Additional MIR specific reducer passes can be
added later as needed.
Differential Revision: https://reviews.llvm.org/D110527
The extractBasicBlocksFromModule, extractInstrFromModule, and other
similar functions previously performed very poorly when the number of
such elements in the program to reduce was very high. Previously, we
were creating the set which caches elements to keep by looping through
all elements in the module and adding them to the set. However, since
std::set is an ordered set, this introduces a massive amount of
rebalancing if the order of elements in the program and the order of
their pointers in memory are not the same.
The solution is straightforward: first put all the elements to be kept
in a vector, then use the constructor for std::set which takes a pair of
iterators over a collection. This constructor is optimized to avoid
doing unnecessary work when initializing large sets.
Also in this change, we pass BBsToKeep set to functions
replaceBranchTerminator and removeUninterestingBBsFromSwitch as a const
reference rather than passing it by value. This ought to prevent the
need to copy the collection each time these functions are called, which
is expensive if the collection is large.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D112757
This was checked while counting but not actually when doing the reduction, resulting in crashes.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D112766
Having non-undef constants in a final llvm-reduce output is nicer than
having undefs.
This splits the existing reduce-operands pass into three, one which does
the same as the current pass of reducing to undef, and two more to
reduce to the constant 1 and the constant 0. Do not reduce to undef if
the operand is a ConstantData, and do not reduce 0s to 1s.
Reducing GEP operands very frequently causes invalid IR (since types may
not match up if we index differently into a struct), so don't touch GEPs.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D111765
Instead of setting operands to undef as the "operands" pass does,
convert the operands to a function argument. This avoids having to
introduce undef values into the IR which have some unpredictability
during optimizations.
For instance,
define void @func() {
entry:
%val = add i32 32, 21
store i32 %val, i32* null
ret void
}
is reduced to
define void @func(i32 %val) {
entry:
%val1 = add i32 32, 21
store i32 %val, i32* null
ret void
}
(note that the instruction %val is renamed to %val1 when printing
the IR to avoid ambiguity; ideally %val1 would be removed by dce or the
instruction reduction pass)
Any call to @func is replaced with a call to the function with the
new signature and filled with undef. This is not ideal for IPA passes,
but those out-of-scope for now.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D111503
Use Module& wherever possible.
Since every reduction immediately turns Chunks into an Oracle, directly pass Oracle instead.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D111122
We expose the fact that we rely on unsigned wrapping to iterate through
all indexes. This can be confusing. Rather, keeping it as an
implementation detail through an iterator is less confusing and is less
code.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D110885
When replacing function calls, skip call instructions where the old
function is not the called function, but e.g. the old function is passed
as an argument.
This fixes a crash due to trying to construct invalid IR for the test
case.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D109759
The ReduceMetadata pass before this patch removed metadata on a per-MDNode (or NamedMDNode) basis. Either all references to an MDNode are kept, or all of them are removed. However, MDNodes are uniqued, meaning that references to MDNodes with the same data become references to the same MDNodes. As a consequence, e.g. tbaa references to the same type will all have the same MDNode reference and hence make it impossible to reduce only keeping metadata on those memory access for which they are interesting.
Moreover, MDNodes can also be referenced by some intrinsics or other MDNodes. These references were not considered for removal leading to the possibility that MDNodes are not actually removed even if selected to be removed by the oracle.
This patch changes ReduceMetadata to reduces based on removable metadata references instead. MDNodes without references implicitly dropped anyway. References by intrinsic calls should be removed by ReduceOperands or ReduceInstructions. References in other MDNodes cannot be removed as it would violate the immutability of MDNodes.
Additionally, ReduceMetadata pass before this patch used `setMetadata(I, NULL)` to remove references, where `I` is the index in the array returned by `getAllMetadata`. However, `setMetadata` expects a MDKind (such as `MD_tbaa`) as first argument. `getAllMetadata` does not return those in consecutive order (otherwise it would not need to be a `std::pair` with `first` representing the MDKind).
Reviewed By: aeubanks, swamulism
Differential Revision: https://reviews.llvm.org/D110534
This removes the data layout, target triple, source filename, and module
identifier when possible.
Reviewed By: swamulism
Differential Revision: https://reviews.llvm.org/D108568
This patch allows iterating typed enum via the ADT/Sequence utility.
It also changes the original design to better separate concerns:
- `StrongInt` only deals with safe `intmax_t` operations,
- `SafeIntIterator` presents the iterator and reverse iterator
interface but only deals with safe `StrongInt` internally.
- `iota_range` only deals with `SafeIntIterator` internally.
This design ensures that operations are always valid. In particular,
"Out of bounds" assertions fire when:
- the `value_type` is not representable as an `intmax_t`
- iterator operations make internal computation underflow/overflow
- the internal representation cannot be converted back to `value_type`
Differential Revision: https://reviews.llvm.org/D106279
The argument reduction pass shouldn't remove arguments of
intrinsics, because the resulting module is ill-formed, and so
inherently uninteresting.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D103129
The parseInputFile function returns an empty unique_ptr to signal an
error, like when the input file doesn't exist, or is malformed. In this
case, the tool should exit immediately rather than segfault by
dereferencing the unique_ptr later.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D102891
This introduces a flag that aborts if we ever reduce to IR that fails
the verifier.
Reviewed By: swamulism, arichardson
Differential Revision: https://reviews.llvm.org/D101279
Much like with ReduceFunctionBodies delta pass,
we need to remove comdat and set linkage to external,
else verifier will complain, and our deltas are invalid.
The limitation of the current pass that it skips initializer-less GV's
seems arbitrary, in all the reduced cases i (personally) looked at,
the globals weren't needed, yet they were kept.
So let's do two things:
1. allow reducing initializer-less globals
2. before reducing globals, reduce their initializers, much like we do function bodies
ee6e25e439 changed
the delta pass to skip intrinsics, which means we may end up being
left with declarations of intrinsics, that aren't otherwise referenced
in the module. This is obviously unwanted, do drop them.
No longer rely on an external tool to build the llvm component layout.
Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.
These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.
Differential Revision: https://reviews.llvm.org/D90848
This patch adds a reduction of 'special' globals that lead to further
reductions (e.g. alias or regular globals reduction) being less efficient
because there are special constraints on values referenced in those
special globals. For example, values in @llvm.used and
@llvm.compiler.used need to be named, so replacing all uses of an
alias/global with undef or a different unnamed constant results in
invalid IR.
More details:
https://llvm.org/docs/LangRef.html#intrinsic-global-variables
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D90302
This patch adds a new reduction pass that tries to remove aliases.
It runs early, as most of those likely can be removed up-front in
practice.
This substantially improves llvm-reduce for IR generated by the swift
compiler, which can generate a lot of aliases which lead to lots of
invalid reductions.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D90260
These don't really have function bodies to try to eliminate. This also
has a good chance of just producing invalid IR since intrinsics can
have special operand constraints (e.g. metadata arguments aren't valid
for an arbitrary call). This was wasting quite a bit of time producing
and failing on invalid IR when replacing dbg.values with undefs.
Currently replaceBranchTerminator/removeUninterestingBBsFromSwitch
always creates `ret void` instructions if no successor is in the chunk.
This results in invalid IR for functions with non-void return types,
which makes those reductions unfeasible. Instead, create `ret ty undef`
for functions with non-void return types.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D86849
althought the interstingness test should usually fail when the module is invalid
this changes reduces the frequency at which llvm-reduce generate invalid IR.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D86404
Some reduction passes may create invalid IR. I am not aware of any use
case where we would like to proceed reducing invalid IR. Various utils
used here, including CloneModule, assume the module to clone is valid
and crash otherwise.
Ideally, no reduction pass would create invalid IR, but some currently
do. ReduceInstructions can be fixed relatively easily (D86210), but
others are harder. For example, ReduceBasicBlocks may remove result in
invalid PHI nodes.
For now, skip the chunks. If we get to the point where all reduction
passes result in valid IR, we may want to turn this into an assertion.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D86212
Removing terminators will result in invalid IR, making further
reductions pointless. I do not think there is any valid use case where
we actually want to create invalid IR as part of a reduction.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D86210
This helps with both debugging llvm-reduce and sometimes getting usefull result even if llvm-reduce crashes
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D85996
It is not enough to replace all uses of users of the function with undef,
the users, we only drop instruction users, so they may stick around.
Let's try different approach - first drop bodies for all the functions
we will drop, which should take care of blockaddress issue the previous
rewrite was dealing with; then, after dropping *all* such bodies,
replace remaining uses with undef (thus all the uses are either
outside of functions, or are in kept functions)
and then finally drop functions.
This seems to work, and passes the *existing* test coverage,
but it is possible that a new issue will be discovered later :)
A new (previously crashing) test added.
Much like with function reduction, there may be remaining unhandled uses
of function, in particular in blockaddress. And in constants we can't
RAUW it with undef, because undef is not a function.
Instead, let's try to pretent that in the remaining cases, the new
signature didn't change, by bitcasting it.
A new (previously crashing) test case added.
There may be other users of a function other than CallInsts,
but what's more important, we can't actually replace function pointer
with undef, because for constants, that would not preserve the type
and RAUW would assert.
In particular, that affects blockaddress, however it proves to be
prohibitively complex to come up with a good test involving blockaddress:
we'd need to both ensure that the function body survives until
this pass, and is not interesting in this pass.
We can happily turn function definitions into declarations,
thus obscuring their argument from being elided by this pass.
I don't believe there is a good reason to just ignore declarations.
likely even proper llvm intrinsics ones,
at worst the input becomes uninteresting.
The other question here is that all these transforms are all-or-nothing.
In some cases, should we be treating each use separately?
The main blocker here seemed to be that llvm::CloneFunctionInto()
does `&OldFunc->front()`, which inserts a nullptr into a densemap,
which is not happy about it and asserts.
replaceFunctionCalls() is very non-exhaustive, it only handles
CallInst's. Which means, by the time we drop old function,
there may still be uses of it lurking around.
Let's instead whack-a-mole them by all by replacing with undef.
I'm not sure this is the best handling, especially for calls, but IMO
poorly reduced input is much better than crashing reduction tool.
A (previously-crashing!) test added.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46819
Terminator may have returned value, so we need to replace uses,
and in general handle invoke as a branch inst.
I'm not sure this is the best handling, but IMO poorly reduced
input is much better than crashing reduction tool.
A (previously-crashing!) test added.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46818
ReduceFunctions could do it, but it also replaces *all* calls with undef,
so if any of undef replacements makes reduction uninteresting,
it won't work.
ReduceBasicBlocks also could do it, but well, it may take many guesses
for all the blocks of a function to happen to be out-of-chunk,
which is not a very efficient way to go about it.
So let's just do this first.
Summary:
If there was a single target to begin with, because a single target
can only occupy a single chunk, we couldn't increase granularity.
and would immediately give up.
Likewise, if we had multiple targets, if by the end we'd end up with
a single target, we wouldn't finish reducing it, it would always
end up being "interesting"
Reviewers: dblaikie, nickdesaulniers, diegotf
Reviewed By: dblaikie
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D84318
Newly-added test previously crashed.
While it is up for debate whether or not instruction reduction
should be indiscriminate in instruction dropping (there you can
just ensure that the test case is still -verify'ies), here
if we drop terminator, CloneFunctionInto() will immediately crash.
So let's not do that :)
The function extractArgumentsFromModule() was passing a one-based index to,
but replaceFunctionCalls() was expecting a zero-based argument index. This
resulted in assertion errors when reducing function call arguments with
different types. Additionally, the
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D84099
Summary:
This handles all three places where attributes could currently be - `GlobalVariable`, `Function` and `CallBase`.
For last two, it correctly handles all three possible attribute locations (return value, arguments and function itself)
There was a previous attempt at it D73853,
which was committed in rGfc62b36a000681c01e993242b583c5ec4ab48a3c,
but then reverted all the way back in rGb12176d2aafa0ccb2585aa218fc3b454ba84f2a9
due to some (osx?) test failures.
Reviewers: nickdesaulniers, dblaikie, diegotf, george.burgess.iv, jdoerfert, Tyker, arsenm
Reviewed By: nickdesaulniers
Subscribers: wdng, MaskRay, arsenm, llvm-commits, mgorny
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83351
Summary:
I think, this results in much more understandable/readable flow.
At least the original logic was perhaps the most hard thing for me to grasp when taking an initial look on the delta passes.
Reviewers: nickdesaulniers, dblaikie, diegotf, george.burgess.iv
Reviewed By: nickdesaulniers
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83287
Summary:
This would have been marginally useful to me during/for rG7ea46aee3670981827c04df89b2c3a1cbdc7561b.
With ongoing migration to representing assumes via operand bundles on the assume, this will be gradually more useful.
Reviewers: nickdesaulniers, diegotf, dblaikie, george.burgess.iv, jdoerfert, Tyker
Reviewed By: nickdesaulniers
Subscribers: hiraditya, mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83177
As it can be seen in newly-added (previously-crashing) test-case,
there can be a situation where multiple GV's are used in instr,
and we would schedule the same instruction to be deleted several times,
crashing when trying to delete it the second time.
We could either store WeakVH (done here), or use something set-like.
I think using WeakVH is prevalent in these cases elsewhere.
As it can be seen in newly-added (previously-crashing) test-case,
there can be a situation where multiple arguments are used in instr,
and we would schedule the same instruction to be deleted several times,
crashing when trying to delete it the second time.
We could either store WeakVH (done here), or use something set-like.
I think using WeakVH is prevalent in these cases elsewhere.
Summary:
The output from llvm-reduce still has significantly more attributes than
bugpoint does. Teach llvm-reduce to remove attributes.
Reviewers: diegotf, dblaikie, george.burgess.iv
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73853
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.