Try to insert an implicit_def to replace the instruction's value,
replacing the original instruction's def with a dead register. If all
defs are dead, delete the instruction entirely.
This is pretty similar to the instruction reduction, but leaves the
new defs in the same place as the original instruction, and could
possibly replace that reduction. I'm not sure whether we should
directly delete the instructions here, or leave dead ones behind.
Further work could also extend this to replace physical register defs.
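Roughly, the mechanism looks like this (a sketch, not the literal patch; names like MI, MRI and TII are the usual ones in a reduction pass and assumed from context):
```cpp
// Sketch: re-define each virtual def with an implicit_def at the same
// point, and give the original instruction a fresh, dead def instead.
MachineBasicBlock &MBB = *MI.getParent();
MachineRegisterInfo &MRI = MBB.getParent()->getRegInfo();
for (MachineOperand &MO : MI.defs()) {
  if (!MO.getReg().isVirtual())
    continue;
  Register OldReg = MO.getReg();
  // Users of the value now see an implicit_def in the original position.
  BuildMI(MBB, MI.getIterator(), MI.getDebugLoc(),
          TII.get(TargetOpcode::IMPLICIT_DEF), OldReg);
  // Replace the original instruction's def with a dead register.
  Register DeadReg = MRI.createVirtualRegister(MRI.getRegClass(OldReg));
  MO.setReg(DeadReg);
  MO.setIsDead(true);
}
```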
I have a register allocator failure that only reproduces with IPRA
enabled, and requires the specific regmask if I want to only run the
one relevant pass. The printed custom regmask is enormous and I would
like to reduce it.
This reduces each individual bit in the mask, but it would probably be
better to start at register units and clear all aliasing bits at a
time. That would require stricter verification that all aliasing bits
are set in regmasks (although I would prefer to switch regmasks to use
register units in the first place).
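For reference, a regmask is an array of uint32_t with one bit per physical register, where a set bit means the register is preserved across the call. Testing a single register's bit looks like this sketch, mirroring the logic of MachineOperand::clobbersPhysReg:
```cpp
#include <cstdint>

// Bit set => preserved across the call; bit clear => clobbered.
// Reducing "each individual bit" means flipping these one at a time.
bool isPreserved(const uint32_t *RegMask, unsigned PhysReg) {
  return RegMask[PhysReg / 32] & (1u << (PhysReg % 32));
}
```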
Integer vectors were previously ignored when reducing operands. When
6b8bd0f72 introduced support for reducing floating-point
scalars/vectors, the vector case was written to only handle
floating-point values. It would crash when creating an invalid
ConstantFP from the integer element type.
Instead of reinstating the old integer vector behaviour, we might as
well reduce integer vectors to all-one splats.
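A sketch of what that reduction amounts to; Constant::getAllOnesValue already handles vector types, so no per-element ConstantFP/ConstantInt construction is needed (Ty and NewValue are illustrative names):
```cpp
// Sketch: for an integer or integer-vector type, reduce to all-ones.
// For vectors this produces a splat rather than an invalid ConstantFP.
if (Ty->isIntOrIntVectorTy())
  NewValue = Constant::getAllOnesValue(Ty);
```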
A couple of existing tests have also been renamed from "remove" to
"reduce" to better reflect the deltas they test.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D129629
Adds support for reading and writing LTO bitcode files.
- Emit a summary if the original bitcode file had a summary
- Use split LTO units if the original bitcode file used them.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D127168
The intention is that these should never have undef operands. It turns
out the restriction the verifier enforces is too lax. The verifier
enforces that registers without a register class cannot be undef, but
it's valid to use a register with a register class and type. The
verifier needs to change to be based on the opcode.
I'm a bit confused by what's actually stored for the allocation
hints. The MIR parser only handles the "simple" case where there's a
single hint. I don't really understand the assertion in
clearSimpleHint, or under what circumstances there are multiple hint
registers.
This reverts commit 3988bd1398.
Did not build on this bot:
https://lab.llvm.org/buildbot#builders/215/builds/6372
/usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to
‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock*>&, const std::pair<long unsigned int, std::nullptr_t>&)’
177 | { return bool(_M_comp(*__it, __val)); }
One could reuse this functor instead of rolling one's own version.
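For illustration, the mechanical change looks like this (Vec is a hypothetical vector of pairs):
```cpp
#include "llvm/ADT/STLExtras.h" // llvm::less_first, llvm::sort

// Before: a hand-rolled comparator on the pair's first element.
llvm::sort(Vec, [](const auto &A, const auto &B) {
  return A.first < B.first;
});
// After: reuse the functor.
llvm::sort(Vec, llvm::less_first());
```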
There were a couple of other cases where the code was similar, but not
quite the same: for example, the lambda might contain an assertion or
some other construct. I've left those untouched, as changing them might
alter the behavior in some way.
As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal
Chris Lattner
> LLVM intentionally has a “yes, you can apply common sense judgement to
> things” policy when it comes to code review. If you are doing mechanical
> patches (e.g. adopting less_first) that apply to the entire monorepo,
> then you don’t need everyone in the monorepo to sign off on it. Having
> some +1 validation from someone is useful, but you don’t need everyone
> whose code you touch to weigh in.
Differential Revision: https://reviews.llvm.org/D126068
When the single branch target of a block has been removed, try updating
it to target a block that is kept (by scanning forward in the sequence)
instead of replacing the branch with a return instruction. Doing so
reduces the risk of breaking loop structures, meaning that when the loop
is 'interesting' these reductions should eliminate more blocks.
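A rough sketch of the scan, assuming an unconditional branch and an illustrative BlocksToKeep set (this is not the literal patch):
```cpp
// Sketch: when BI's single target is being removed, retarget it to the
// next block in layout order that survives the reduction.
if (auto *BI = dyn_cast<BranchInst>(BB.getTerminator())) {
  if (BI->isUnconditional()) {
    for (BasicBlock &Candidate :
         make_range(std::next(BB.getIterator()), BB.getParent()->end())) {
      if (BlocksToKeep.count(&Candidate)) {
        BI->setSuccessor(0, &Candidate);
        break;
      }
    }
  }
}
```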
Differential Revision: https://reviews.llvm.org/D125766
This had the surprising behavior of using whatever instruction
happened to be first in the block as an anchor point to stick random
implicit defs on. Use a real implicit_def instead.
The current testcase I'm trying to reduce only reproduces with IPRA
enabled and requires handling multiple functions.
The only real difference vs. the IR is the extra indirection to find
the underlying MachineFunction, so treat the ReduceWorkItem as the
module instead of the function.
The ugliest piece of this is MachineModuleInfo. It not only tracks
actual module state, but also has a number of transient fields used for
isel and/or the asm printer. These shouldn't do any harm for the use
here, though they should be separated out.
Removing these is extremely unhelpful and just adds extra hassle; all
it really finds out is whether your test script passes -mtriple or
not. You can't meaningfully delete these fields, since the resulting
module just defaults to the host.
Previously the options category given to cl::HideUnrelatedOptions was
local to llvm-reduce.cpp and as a result only options declared in that
file were visible in the -help options listing. This was a bit
unfortunate since there were several useful options declared in other
files. This patch addresses that.
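The fix amounts to sharing one category across files; a minimal sketch, with the category name illustrative:
```cpp
#include "llvm/Support/CommandLine.h"
using namespace llvm;

// Declared once and referenced by every file that defines options.
cl::OptionCategory LLVMReduceOptions("llvm-reduce options");

static cl::opt<bool> SomeFlag("some-flag", cl::desc("..."),
                              cl::cat(LLVMReduceOptions));

int main(int argc, char **argv) {
  // Hide everything not in our category from the -help listing.
  cl::HideUnrelatedOptions(LLVMReduceOptions);
  cl::ParseCommandLineOptions(argc, argv);
}
```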
Differential Revision: https://reviews.llvm.org/D118682
This patch adds parallel processing of chunks. When reducing very large
inputs, e.g. functions with 500k basic blocks, processing chunks in
parallel can significantly speed up the reduction.
To allow modifying clones of the original module in parallel, each clone
needs its own LLVMContext object. To achieve this, each job parses the
input module with its own LLVMContext. If a job successfully reduces
the input, it serializes the result module as bitcode into a result
array.
To ensure parallel reduction produces the same results as serial
reduction, only the first successfully reduced result is used, and
results of other successful jobs are dropped. Processing resumes after
the chunk that was successfully reduced.
The number of threads to use can be configured using the -j option.
It defaults to 1, which means serial processing.
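A sketch of the per-job isolation described above (names are illustrative; the actual patch wires this into the chunk loop):
```cpp
#include "llvm/Bitcode/BitcodeReader.h"
#include "llvm/Bitcode/BitcodeWriter.h"
using namespace llvm;

// Each job re-parses the input under a private LLVMContext so clones
// can be mutated in parallel, and hands back bitcode rather than a
// Module tied to that context.
SmallString<0> runJob(StringRef InputBitcode) {
  LLVMContext Ctx; // private per-job context
  Expected<std::unique_ptr<Module>> M =
      parseBitcodeFile(MemoryBufferRef(InputBitcode, "input"), Ctx);
  if (!M)
    report_fatal_error(Twine(toString(M.takeError())));
  // ... apply the chunk's reduction and run the interestingness test ...
  SmallString<0> Result;
  raw_svector_ostream OS(Result);
  WriteBitcodeToFile(**M, OS); // serialize the (possibly reduced) module
  return Result;
}
```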
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113857
This patch moves the logic to clone and check a new chunk into a new
function, to allow re-use in a follow-up patch that implements parallel
reductions.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D113856
Textual LLVM IR files are much bigger and take longer to write to disk.
To avoid the extra cost incurred by serializing to text, this patch adds
an option to save temporary files as bitcode instead.
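The two serialization paths, roughly, with UseBitcode standing in for the new option:
```cpp
// Sketch: write the temporary as bitcode when requested, textual IR
// otherwise. Bitcode is far smaller and faster to emit.
if (UseBitcode)
  WriteBitcodeToFile(M, OS);
else
  M.print(OS, /*AssemblyAnnotationWriter=*/nullptr);
```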
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D113858
Having a separate counting method runs the risk of a mismatch between
the actual reduction method and the counting method.
Instead, create an Oracle that always returns true for shouldKeep(), run
the reduction, and count how many times shouldKeep() was called. The
module should not be modified if shouldKeep() always returns true.
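Conceptually, the counting Oracle is just this (a sketch, not the actual Oracle class):
```cpp
// Keep everything, but remember how many times the pass asked.
struct CountingOracle {
  int Count = 0;
  bool shouldKeep() {
    ++Count;     // one query per potential reduction target
    return true; // keeping everything must leave the module unchanged
  }
};
```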
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113537
Metadata operands tend to require special conditions, especially on dbg
intrinsics. We also don't have a zero value for metadata.
Replacing callee operands is a little weird, since calling undef/null
doesn't make sense. It also causes tons of invalid reductions when
reducing calls to intrinsics, since only arguments to intrinsics can be
of the metadata type.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113532
Add a new "operands-skip" pass whose goal is to remove instructions in the middle of dependency chains. For instance:
```
%baseptr = alloca i32
%arrayidx = getelementptr i32, i32* %baseptr, i32 %idxprom
store i32 42, i32* %arrayidx
```
might be reducible to
```
%baseptr = alloca i32
%arrayidx = getelementptr ... ; now dead, together with the computation of %idxprom
store i32 42, i32* %baseptr
```
Other passes would either replace `%baseptr` with undef (operands, instructions) or move it to become a function argument (operands-to-args), both of which might fail the interestingness check.
In principle the implementation allows operand replacement with any value or instruction in the function that passes the filter constraints (same type, dominance, "more reduced"), but in this patch it is limited to values that are directly or indirectly used to compute the current operand value, motivated by the example above. Additionally, function arguments are added to the candidate set, which helps reduce the number of relevant arguments, mitigating the concern about too many arguments mentioned in https://reviews.llvm.org/D110274#3025013.
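The filter constraints amount to roughly the following check (a sketch; the dominator tree and the "more reduced" classification are assumed context):
```cpp
#include "llvm/IR/Dominators.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Candidate New may replace operand Op if the types match and New's
// definition dominates the use; arguments trivially dominate every use.
static bool isCandidate(Value *New, Use &Op, DominatorTree &DT) {
  if (New->getType() != Op->getType())
    return false;
  if (auto *I = dyn_cast<Instruction>(New))
    return DT.dominates(I, Op);
  return isa<Argument>(New);
}
```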
Possible future extensions:
* Instead of requiring the same type, bitcast/trunc/zext could be automatically inserted for some more flexibility.
* If undef is added to the candidate set, "operands-skip" is able to produce any reduction that "operands" can do. Additional candidates might be zero and one, where the "reductive power" classification can prefer one over the other. If undefined behaviour should not be introduced, undef can be removed from the candidate set.
Recommit after resolving conflict with D112651 and reusing
shouldReduceOperand from D113532.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D111818