This should cut down on the visual noise when reducing. Still keep output when we run a pass or when we successfully reduce.
Notably, this also suppresses redirecting the test output to stdout/stderr.
Reviewed By: regehr
Differential Revision: https://reviews.llvm.org/D131922
There are two different senses in which a block can be "address-taken".
There can be a BlockAddress involved, which means we need to map the
IR-level value to some specific block of machine code. Or there can be
constructs inside a function which involve using the address of a basic
block to implement certain kinds of control flow.
Mixing these together causes a problem: if target-specific passes mark
random blocks "address-taken" and we have a BlockAddress, we can't
actually tell which MachineBasicBlock corresponds to the BlockAddress.
So split this into two separate bits: one for BlockAddress, and one for
the machine-specific bits.
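A rough sketch of the resulting split on MachineBasicBlock; the names
follow the patch's intent, so treat them as illustrative:
```
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/IR/BasicBlock.h"
using namespace llvm;

void markAddressTaken(MachineBasicBlock &MBB, BasicBlock *BB) {
  // A target-specific construct takes the block's address; no IR-level
  // BlockAddress is involved.
  MBB.setMachineBlockAddressTaken();

  // An IR-level BlockAddress must map to this machine block; record
  // exactly which IR block so the mapping stays unambiguous.
  if (BB)
    MBB.setAddressTakenIRBlock(BB);
}
```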
Discovered while trying to sort out related stuff on D102817.
Differential Revision: https://reviews.llvm.org/D124697
so that we can reduce away incidental parts of the CFG in cases where
the full simplifyCFG pass makes the test case uninteresting.
Differential Revision: https://reviews.llvm.org/D131920
out of chunk ones. The non-default second argument to
removePredecessor() is necessary to avoid creating invalid IR on
examples like the one in the provided test case.
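A minimal sketch of the call in question, using BasicBlock's
removePredecessor() signature:
```
#include "llvm/IR/BasicBlock.h"
using namespace llvm;

void dropEdge(BasicBlock *Pred, BasicBlock *Succ) {
  // KeepOneInputPHIs=true stops removePredecessor() from folding away
  // single-entry PHIs; folding them can invalidate other users of the
  // PHI's value.
  Succ->removePredecessor(Pred, /*KeepOneInputPHIs=*/true);
}
```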
Differential Revision: https://reviews.llvm.org/D131843
This was done by adding --abort-on-invalid-reduction to remove-function-bodies-used-in-globals.ll and fixing the fallout.
Aliases must have a GlobalValue or ConstantExpr aliasee and the aliasee must be a definition if it's a GlobalValue.
Don't RAUW a function with null if there's an alias pointing to it, and similarly don't delete the body of such a function.
Don't delete the entire body of a function when reducing blocks, preserve at least one block.
Also make debugging these sorts of things easier by dumping the module when --abort-on-invalid-reduction triggers.
Reviewed By: regehr
Differential Revision: https://reviews.llvm.org/D131505
Try to insert an implicit_def to replace the instruction's value,
replacing the original instruction's def with a dead register. If all
defs are dead, delete the instruction entirely.
This is pretty similar to the instruction reduction, but leaves the
new defs in the same place as the original instruction. This could
possibly replace it. I'm not sure if we should directly delete the
instructions here, or leave dead ones behind.
Further work could also replace physical register defs.
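A hedged sketch of the strategy (not the exact pass code):
```
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/TargetInstrInfo.h"
#include "llvm/CodeGen/TargetOpcodes.h"
using namespace llvm;

// Insert an IMPLICIT_DEF at MI's position to take over each virtual
// register def, leaving MI itself defining a fresh, dead register.
void replaceDefsWithImplicitDef(MachineInstr &MI,
                                const TargetInstrInfo &TII) {
  MachineBasicBlock &MBB = *MI.getParent();
  MachineRegisterInfo &MRI = MBB.getParent()->getRegInfo();
  for (MachineOperand &MO : MI.defs()) {
    if (!MO.getReg().isVirtual())
      continue;
    Register OldReg = MO.getReg();
    // The new IMPLICIT_DEF sits where the original instruction was.
    BuildMI(MBB, MI, MI.getDebugLoc(),
            TII.get(TargetOpcode::IMPLICIT_DEF), OldReg);
    MO.setReg(MRI.cloneVirtualRegister(OldReg)); // now a dead def
    MO.setIsDead(true);
  }
}
```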
I have a register allocator failure that only reproduces with IPRA
enabled, and requires the specific regmask if I want to only run the
one relevant pass. The printed custom regmask is enormous and I would
like to reduce it.
This reduces each individual bit in the mask, but it would probably be
better to start at register units and clear all aliasing bits at a
time. This would require stricter verification that all aliasing bits
are set in regmasks (although I would prefer to switch regmasks to use
register units in the first place).
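For context, a regmask operand is an array of uint32_t with one bit per
physical register, where a set bit means the register is preserved
across the call; reducing individual bits amounts to flipping entries
like this (sketch):
```
#include <cstdint>

// Mark PhysReg as preserved by the call (set bit = not clobbered).
void preserveReg(uint32_t *Mask, unsigned PhysReg) {
  Mask[PhysReg / 32] |= uint32_t(1) << (PhysReg % 32);
}
```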
Integer vectors were previously ignored when reducing operands. When
6b8bd0f72 introduced support for reducing floating-point
scalars/vectors, the vector case was written to only handle
floating-point values. It would crash when creating an invalid
ConstantFP from the integer element type.
Instead of reinstating the old integer vector behaviour, we might as
well reduce integer vectors to all-one splats.
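A sketch of the type dispatch, using the standard Constant APIs (both
helpers produce splats when given a vector type):
```
#include "llvm/IR/Constants.h"
#include "llvm/IR/Type.h"
using namespace llvm;

// Pick a reduced constant without assuming a floating-point element
// type, which is what caused the crash described above.
Constant *reducedOne(Type *Ty) {
  if (Ty->isFPOrFPVectorTy())
    return ConstantFP::get(Ty, 1.0);
  if (Ty->isIntOrIntVectorTy())
    return ConstantInt::get(Ty, 1);
  return nullptr;
}
```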
A couple of existing tests have also been renamed from "remove" to
"reduce" to better reflect the deltas they test.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D129629
Fixes this error:
/var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/
llvm-project/llvm/tools/llvm-reduce/TestRunner.cpp:20:7:
error: field 'TM' will be initialized after field 'ToolName'
[-Werror,-Wreorder-ctor]
TM(std::move(TM)), ToolName(ToolName) {
^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
Program(std::move(Program)) TM(std::move(TM))
1 error generated.
https://lab.llvm.org/buildbot/#/builders/77/builds/19154
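The general shape of the fix, as a generic illustration (not the actual
TestRunner fields):
```
#include <string>
#include <utility>

// Members are initialized in declaration order, not init-list order,
// so -Wreorder-ctor fires unless the list matches the declarations.
struct Runner {
  std::string ToolName; // declared first => initialized first
  std::string Program;

  Runner(std::string ToolName, std::string Program)
      : ToolName(std::move(ToolName)), Program(std::move(Program)) {}
};
```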
Adds support for reading and writing LTO bitcode files.
- Emit a summary if the original bitcode file had a summary
- Use split LTO units if the original bitcode file used them.
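A hedged sketch of the write side using the standard bitcode-writer
API; the actual patch may plumb this differently:
```
#include "llvm/Analysis/ModuleSummaryAnalysis.h"
#include "llvm/Bitcode/BitcodeWriter.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

// Re-emit a summary only when the input had one, so (Thin)LTO inputs
// remain valid (Thin)LTO inputs after reduction.
void writeReduced(Module &M, raw_ostream &OS, bool InputHadSummary) {
  if (InputHadSummary) {
    ModuleSummaryIndex Index =
        buildModuleSummaryIndex(M, /*GetBFICallback=*/nullptr,
                                /*PSI=*/nullptr);
    WriteBitcodeToFile(M, OS, /*ShouldPreserveUseListOrder=*/false,
                       &Index);
  } else {
    WriteBitcodeToFile(M, OS);
  }
}
```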
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D127168
The intention is that these should never have undef operands. It turns
out the restriction the verifier enforces is too lax. The verifier
enforces that registers without a register class cannot be undef, but
it's valid to use a register with a register class and type. The
verifier needs to change to be based on the opcode.
MIR support is totally unusable for AMDGPU without this, since the set
of reserved registers is set from fields here.
Add a clone method to MachineFunctionInfo. This is a subtle variant of
the copy constructor that is required if there are any MIR constructs
that use pointers. Specifically, at minimum fields that reference
MachineBasicBlocks or the MachineFunction need to be adjusted to the
values in the new function.
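A sketch of what a target override might look like; MyTargetFunctionInfo
and its LandingBlock field are made up for illustration:
```
#include "llvm/ADT/DenseMap.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/Support/Allocator.h"
using namespace llvm;

// Hypothetical target info with a cached block pointer that must be
// remapped when the function is cloned.
struct MyTargetFunctionInfo final : MachineFunctionInfo {
  MachineBasicBlock *LandingBlock = nullptr; // made-up example field

  MachineFunctionInfo *
  clone(BumpPtrAllocator &Allocator, MachineFunction &DestMF,
        const DenseMap<MachineBasicBlock *, MachineBasicBlock *>
            &Src2DstMBB) const override {
    auto *Clone = DestMF.cloneInfo<MyTargetFunctionInfo>(*this);
    // Point at the corresponding block in the destination function.
    if (Clone->LandingBlock)
      Clone->LandingBlock = Src2DstMBB.lookup(Clone->LandingBlock);
    return Clone;
  }
};
```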
Use the query that doesn't assert if TracksLiveness isn't set, since
this information needs to always be available. We also need to start printing liveins
regardless of TracksLiveness.
I'm a bit confused by what's actually stored for the allocation
hints. The MIR parser only handles the "simple" case where there's a
single hint. I don't really understand the assertion in
clearSimpleHint, or under what circumstances there are multiple hint
registers.
This reverts commit 3988bd1398.
Did not build on this bot:
https://lab.llvm.org/buildbot#builders/215/builds/6372
/usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to
‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock*>&, const std::pair<long unsigned int, std::nullptr_t>&)’
177 | { return bool(_M_comp(*__it, __val)); }
One could reuse this functor instead of rolling your own version.
There were a couple of other cases where the code was similar, but not
quite the same, such as having an assertion in the lambda or other
extra constructs. I've not touched any of those, as changing them might
alter the behavior in some way.
As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal
Chris Lattner
> LLVM intentionally has a “yes, you can apply common sense judgement to
> things” policy when it comes to code review. If you are doing mechanical
> patches (e.g. adopting less_first) that apply to the entire monorepo,
> then you don’t need everyone in the monorepo to sign off on it. Having
> some +1 validation from someone is useful, but you don’t need everyone
> whose code you touch to weigh in.
Differential Revision: https://reviews.llvm.org/D126068
When the single branch target of a block has been removed try updating
it to target a block that is kept (by scanning forward in the sequence)
instead of replacing the branch with a return instruction. Doing so
reduces the risk of breaking loop structures, meaning that when the
loop is 'interesting' these reductions should eliminate more blocks.
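A sketch of the forward scan, assuming a BBsToKeep set like the one
used elsewhere in the pass:
```
#include "llvm/ADT/DenseSet.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Function.h"
using namespace llvm;

// Walk forward from the removed target looking for a kept block to
// branch to; only if none exists fall back to emitting a return.
static BasicBlock *findKeptTarget(BasicBlock *Removed,
                                  const DenseSet<BasicBlock *> &BBsToKeep) {
  Function *F = Removed->getParent();
  for (auto It = Removed->getIterator(), E = F->end(); It != E; ++It)
    if (BBsToKeep.contains(&*It))
      return &*It;
  return nullptr;
}
```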
Differential Revision: https://reviews.llvm.org/D125766
This had the surprising behavior of using whatever instruction
happened to be first in the block as an anchor point to stick random
implicit defs on. Use a real implicit_def instead.
Many MIR reductions benefit from or require increasing the instruction
count. For example, unlike in the IR, you may need to insert a new
instruction to represent an undef. The current instruction reduction
pass works around this by sticking implicit defs on whatever
instruction happens to be first in the entry block.
Other strategies I've applied manually include breaking instructions
with multiple defs into separate instructions, or breaking large
register defs into multiple subregister defs.
Make up a simple scoring system based on what I generally try to get
rid of first when manually reducing. Counts implicit defs as free
since reduction passes will be introducing them, although they
probably should count for something. It also might make more sense to
have a comparison between the two functions, rather than having to compute a
contextless number. This isn't particularly well tested since overall
the MIR support isn't in a place where it is useful on the kinds of
testcases I want to throw at it.
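A toy version of the kind of scoring described, with implicit defs and
debug pseudos counted as free:
```
#include "llvm/CodeGen/MachineFunction.h"
using namespace llvm;

// Count instructions, treating IMPLICIT_DEF and debug pseudos as free
// since the reduction passes deliberately introduce them.
static unsigned scoreMIR(const MachineFunction &MF) {
  unsigned Score = 0;
  for (const MachineBasicBlock &MBB : MF)
    for (const MachineInstr &MI : MBB)
      if (!MI.isImplicitDef() && !MI.isDebugInstr())
        ++Score;
  return Score;
}
```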
There were two problems with directly copying the MMOs from the old
function. The MMOs are owned by the function's Allocator, so need to
be reallocated anyways (surprisingly I didn't notice breakage on
this). Second, the PseudoSourceValues are also allocated per function
and need to be reallocated.
The current testcase I'm trying to reduce only reproduces with IPRA
enabled and requires handling multiple functions.
The only real difference vs. the IR is the extra indirection to find
the underlying MachineFunction, so treat the ReduceWorkItem as the
module instead of the function.
The ugliest piece of this is really the ugliness of
MachineModuleInfo. It not only tracks actual module state, but has a
number of transient fields used for isel and/or the asm printer. These
shouldn't do any harm for the use here, though they should be
separated out.
Just clone all the virtual registers instead of looking for def
operands. This preserves the register values used, simplifying the
rest of the code. This avoids needing to expose the register map to
target code.
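A sketch of the up-front cloning; because fresh virtual registers are
numbered sequentially, the clone's numbering lines up with the
source's:
```
#include "llvm/CodeGen/MachineRegisterInfo.h"
using namespace llvm;

void cloneVirtRegs(const MachineRegisterInfo &SrcMRI,
                   MachineRegisterInfo &DstMRI) {
  for (unsigned I = 0, E = SrcMRI.getNumVirtRegs(); I != E; ++I) {
    Register Reg = Register::index2VirtReg(I);
    // Numbers are handed out in creation order, so NewReg == Reg.
    Register NewReg =
        DstMRI.createIncompleteVirtualRegister(SrcMRI.getVRegName(Reg));
    DstMRI.setRegClassOrRegBank(NewReg, SrcMRI.getRegClassOrRegBank(Reg));
    DstMRI.setType(NewReg, SrcMRI.getType(Reg));
  }
}
```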
Previously the specific values used for fixed frame indexes were in
reverse order in the cloned function from the original, and a map was
used to adjust all frame indexes to the potentially new values. Insert
the fixed objects in reverse to avoid this. This simplifies other
code, since now we don't need to track down all frame indexes. This
will allow targets that store frame indexes in MachineFunctionInfo to
simply copy the values.
Note this isn't directly observable in the test since the resulting
MIR print/parse can shuffle the IDs around (in particular the final
serialization implicitly strips out dead objects).
Removing these is extremely unhelpful and just adds extra hassle. All
it really finds out is whether your test script uses -mtriple or
not. You can't meaningfully delete these fields, since the resulting
module just defaults to the host.
getSuccProbability was private for some reason, saying to go through
MachineBranchProbabilityInfo. There doesn't seem to be much point to
that, as that wrapper directly calls this.
Like other areas, some of these fields aren't handled by the MIR
printer/parser so aren't tested.
This didn't work at all before, and would assert on any frame
index. Also copy the other fields, which I believe should cover
everything. There are a few that are untested since MIR serialization
is apparently still missing them (isStatepointSpillSlot,
ObjectSSPLayout, and ObjectSExt/ObjectZExt).
Previously the options category given to cl::HideUnrelatedOptions was
local to llvm-reduce.cpp and as a result only options declared in that
file were visible in the -help options listing. This was a bit
unfortunate since there were several useful options declared in other
files. This patch addresses that.
Differential Revision: https://reviews.llvm.org/D118682
When exporting textual IR during reduction the ShouldPreserveUseListOrder
parameter of the IR printer should be set to get predictable results.
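A minimal sketch, using Module::print's usual parameters:
```
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

void printReduced(const Module &M, raw_ostream &OS) {
  // Emit uselistorder directives so the textual round trip is
  // deterministic rather than dependent on incidental use-list state.
  M.print(OS, /*AAW=*/nullptr, /*ShouldPreserveUseListOrder=*/true);
}
```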
Differential Revision: https://reviews.llvm.org/D118585
The cleanup was manual, but assisted by "include-what-you-use". It consists of:
1. Removing unused forward declaration. No impact expected.
2. Removing unused headers in .cpp files. No impact expected.
3. Removing unused headers in .h files. This removes implicit dependencies and
is generally considered a good thing, but this may break downstream builds.
I've updated llvm, clang, lld, lldb and mlir deps, and included a list of the
modification in the second part of the commit.
4. Replacing header inclusion by forward declaration. This has the same impact
as 3.
Notable changes:
- llvm/Support/TargetParser.h no longer includes llvm/Support/AArch64TargetParser.h nor llvm/Support/ARMTargetParser.h
- llvm/Support/TypeSize.h no longer includes llvm/Support/WithColor.h
- llvm/Support/YAMLTraits.h no longer includes llvm/Support/Regex.h
- llvm/ADT/SmallVector.h no longer includes llvm/Support/MemAlloc.h nor llvm/Support/ErrorHandling.h
You may need to add some of these headers to your compilation units, if need be.
As a hint at the impact of the cleanup, running
clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Support/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 8000919 lines
after: 7917500 lines
Reduced dependencies also help incremental rebuilds and are more ccache
friendly, something not shown by the above metric :-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831
This patch adds parallel processing of chunks. When reducing very large
inputs, e.g. functions with 500k basic blocks, processing chunks in
parallel can significantly speed up the reduction.
To allow modifying clones of the original module in parallel, each clone
needs its own LLVMContext object. To achieve this, each job parses the
input module with its own LLVMContext. In case a job successfully
reduced the input, it serializes the result module as bitcode into a
result array.
To ensure parallel reduction produces the same results as serial
reduction, only the first successfully reduced result is used, and
results of other successful jobs are dropped. Processing resumes after
the chunk that was successfully reduced.
The number of threads to use can be configured using the -j option.
It defaults to 1, which means serial processing.
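A sketch of the per-job isolation described above (runJob and its
plumbing are invented for illustration):
```
#include "llvm/ADT/SmallString.h"
#include "llvm/Bitcode/BitcodeReader.h"
#include "llvm/Bitcode/BitcodeWriter.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/MemoryBufferRef.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

// Each job parses the input in a private LLVMContext, so clones can be
// mutated concurrently, and returns its result as context-free bitcode.
SmallString<0> runJob(StringRef InputBitcode) {
  LLVMContext Ctx;
  std::unique_ptr<Module> M = cantFail(
      parseBitcodeFile(MemoryBufferRef(InputBitcode, "<chunk>"), Ctx));

  // ... apply this job's chunk reduction and run the oracle here ...

  SmallString<0> Result;
  raw_svector_ostream OS(Result);
  WriteBitcodeToFile(*M, OS);
  return Result;
}
```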
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113857
This patch moves the logic to clone and check a new chunk into a new
function, to allow re-use in a follow-up patch that implements parallel
reductions.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D113856
Textual LLVM IR files are much bigger and take longer to write to disk.
To avoid the extra cost incurred by serializing to text, this patch adds
an option to save temporary files as bitcode instead.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D113858
Having a separate counting method runs the risk of a mismatch between
the actual reduction method and the counting method.
Instead, create an Oracle that always returns true for shouldKeep(), run
the reduction, and count how many times shouldKeep() was called. The
module should not be modified if shouldKeep() always returns true.
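The idea in miniature:
```
// A counting oracle: saying "keep" to every query leaves the module
// unchanged, and the number of queries is exactly the number of
// reduction opportunities the pass would see.
struct CountingOracle {
  int Count = 0;
  bool shouldKeep() {
    ++Count;
    return true;
  }
};
```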
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113537
Metadata operands tend to require special conditions, especially on dbg
intrinsics. We also don't have a zero value for metadata.
Replacing callee operands is a little weird, since calling undef/null
doesn't make sense. It also causes tons of invalid reductions when
reducing calls to intrinsics since only arguments to intrinsics can be
of the metadata type.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113532
Add a new "operands-skip" pass whose goal is to remove instructions in the middle of dependency chains. For instance:
```
%baseptr = alloca i32
%arrayidx = getelementptr i32, i32* %baseptr, i32 %idxprom
store i32 42, i32* %arrayidx
```
might be reducible to
```
%baseptr = alloca i32
%arrayidx = getelementptr ... ; now dead, together with the computation of %idxprom
store i32 42, i32* %baseptr
```
Other passes would either replace `%baseptr` with undef (operands, instructions) or move it to become a function argument (operands-to-args), both of which might fail the interestingness check.
In principle the implementation allows operand replacement with any value or instruction in the function that passes the filter constraints (same type, dominance, "more reduced"), but is limited in this patch to values that are directly or indirectly used to compute the current operand value, motivated by the example above. Additionally, function arguments are added to the candidate set, which helps reduce the number of relevant arguments, mitigating a concern of too many arguments mentioned in https://reviews.llvm.org/D110274#3025013.
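A hedged sketch of the filter constraints (same type plus availability
at the use):
```
#include "llvm/IR/Argument.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// A candidate may replace the operand only if it has the same type and
// its definition dominates (is available at) the use.
static bool isValidCandidate(const Value *Candidate, const Use &U,
                             const DominatorTree &DT) {
  if (Candidate->getType() != U.get()->getType())
    return false;
  if (const auto *I = dyn_cast<Instruction>(Candidate))
    return DT.dominates(I, U);
  return isa<Argument>(Candidate) || isa<Constant>(Candidate);
}
```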
Possible future extensions:
* Instead of requiring the same type, bitcast/trunc/zext could be automatically inserted for some more flexibility.
* If undef is added to the candidate set, "operands-skip" is able to produce any reduction that "operands" can do. Additional candidates might be zero and one, where the "reductive power" classification can prefer one over the other. If undefined behaviour should not be introduced, undef can be removed from the candidate set.
Recommit after resolving conflict with D112651 and reusing
shouldReduceOperand from D113532.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D111818
Add a new "operands-skip" pass whose goal is to remove instructions in the middle of dependency chains. For instance:
```
%baseptr = alloca i32
%arrayidx = getelementptr i32, i32* %baseptr, i32 %idxprom
store i32 42, i32* %arrayidx
```
might be reducible to
```
%baseptr = alloca i32
%arrayidx = getelementptr ... ; now dead, together with the computation of %idxprom
store i32 42, i32* %baseptr
```
Other passes would either replace `%baseptr` with undef (operands, instructions) or move it to become a function argument (operands-to-args), both of which might fail the interestingness check.
In principle the implementation allows operand replacement with any value or instruction in the function that passes the filter constraints (same type, dominance, "more reduced"), but is limited in this patch to values that are directly or indirectly used to compute the current operand value, motivated by the example above. Additionally, function arguments are added to the candidate set which helps reducing the number of relevant arguments mitigating a concern of too many arguments mentioned in https://reviews.llvm.org/D110274#3025013.
Possible future extensions:
* Instead of requiring the same type, bitcast/trunc/zext could be automatically inserted for some more flexibility.
* If undef is added to the candidate set, "operands-skip"is able to produce any reduction that "operands" can do. Additional candidates might be zero and one, where the "reductive power" classification can prefer one over the other. If undefined behaviour should not be introduced, undef can be removed from the candidate set.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D111818
When reducing functions with a very large number of basic blocks
(almost 1 million), the majority of time is spent maintaining the order
in the std::set of basic blocks to keep.
In those cases, DenseSet<> is much more efficient. Use it instead.
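A minimal sketch of the swap:
```
#include "llvm/ADT/DenseSet.h"
#include "llvm/IR/BasicBlock.h"
using namespace llvm;

// Pointer keys only need membership tests, not ordering; a flat hash
// table avoids std::set's per-node allocation and tree rebalancing.
static bool keepBlock(const DenseSet<BasicBlock *> &BBsToKeep,
                      BasicBlock *BB) {
  return BBsToKeep.contains(BB);
}
```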
Previously, if the basic-blocks delta pass tried to remove a basic block
that was the last basic block in a function that did not have external
or weak linkage, the resulting IR would become invalid. Since removing
the last basic block in a function is effectively identical to removing
the function body itself, we check explicitly for this case and if we
detect it, we run the same logic as in ReduceFunctionBodies.cpp
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D113486
Sometimes if llvm-reduce is interrupted in the middle of a delta pass on
a large file, it can take quite some time for the tool to start actually
doing new work if it is restarted again on the partially-reduced file. A
lot of time ends up being spent testing large chunks when these large
chunks are very unlikely to actually pass the interestingness test. In
cases like this, the tool will complete faster if the starting
granularity is reduced to a finer amount. Thus, we introduce a command
line flag that automatically divides the chunks into smaller subsets a
fixed, user-specified number of times prior to beginning the core loop.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D112651
(Second try. Need to link against CodeGen and MC libs.)
The llvm-reduce tool has been extended to operate on MIR (import, clone and
export). Current limitation is that only a single machine function is
supported. A single reducer pass that operates on machine instructions (while
on SSA-form) has been added. Additional MIR specific reducer passes can be
added later as needed.
Differential Revision: https://reviews.llvm.org/D110527
The extractBasicBlocksFromModule, extractInstrFromModule, and other
similar functions previously performed very poorly when the number of
such elements in the program to reduce was very high. Previously, we
were creating the set which caches elements to keep by looping through
all elements in the module and adding them to the set. However, since
std::set is an ordered set, this introduces a massive amount of
rebalancing if the order of elements in the program and the order of
their pointers in memory are not the same.
The solution is straightforward: first put all the elements to be kept
in a vector, then use the constructor for std::set which takes a pair of
iterators over a collection. This constructor is optimized to avoid
doing unnecessary work when initializing large sets.
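The pattern in question:
```
#include <set>
#include <vector>

// Collect the elements to keep in a vector first, then construct the
// set from the iterator pair in one shot instead of inserting (and
// rebalancing) one element at a time.
template <typename T> std::set<T> buildKeepSet(const std::vector<T> &V) {
  return std::set<T>(V.begin(), V.end());
}
```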
Also in this change, we pass the BBsToKeep set to the functions
replaceBranchTerminator and removeUninterestingBBsFromSwitch as a const
reference rather than passing it by value. This ought to prevent the
need to copy the collection each time these functions are called, which
is expensive if the collection is large.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D112757