This patch refactors BAT to be testable as a library, so we
can have open-source tests on it. This further fixes an issue with
basic blocks that lack a valid input offset, making BAT omit those
when writing translation tables.
Test Plan: new testcases added, new testing tool added (llvm-bat-dump)
Differential Revision: https://reviews.llvm.org/D129382
AllowStripped has not been used in BOLT.
This option is replaced by actively detecting stripped binary.
Test Plan:
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D130036
Determine stripped status of a binary based on .symtab
Test Plan:
```
ninja check-bolt
```
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D130034
`LastSymbol` handling in `discoverFileObjects` assumes a non-zero number of
symbols in an object file. It's not the case for broken_dynsym.test added in
D130073, and potentially other stripped binaries.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D130544
Strip tools cause a few symbols in .dynsym to have bad section index.
This update safely keeps such broken symbols intact.
Test Plan:
```
ninja check-bolt
```
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D130073
We previously support split jump table, where some jump table entries
target different fragments of same function. In this fix, we provide
support for another type of intra-indirect transfer: landing pad.
When C++ exception handling is used, compiler emits .gcc_except_table
that describes the location of catch block (landing pad) for specific
range that potentially invokes a throw(). Normally landing pads reside
in the function, but with -fsplit-machine-functions, landing pads can
be moved to another fragment. The intuition is, landing pads are rarely
executed, so compiler can move them to .cold section.
This update will mark all fragments that have landing pad to another
fragment as non-simple, and later propagate non-simple to all related
fragments.
This update also includes one manual test case: split-landing-pad.s
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D128561
There are two assumptions regarding jump table:
(a) It is accessed by only one fragment, say, Parent
(b) All entries target instructions in Parent
For (a), BOLT stores jump table entries as relative offset to Parent.
For (b), BOLT treats jump table entries target somewhere out of Parent
as INVALID_OFFSET, including fragment of same split function.
In this update, we extend (a) and (b) to include fragment of same split
functinon. For (a), we store jump table entries in absolute offset
instead. In addition, jump table will store all fragments that access
it. A fragment uses this information to only create label for jump table
entries that target to that fragment.
For (b), using absolute offset allows jump table entries to target
fragments of same split function, i.e., extend support for split jump
table. This can be done using relocation (fragment start/size) and
fragment detection heuristics (e.g., using symbol name pattern for
non-stripped binaries).
For jump table targets that can only be reached by one fragment, we
mark them as local label; otherwise, they would be the secondary
function entry to the target fragment.
Test Plan
```
ninja check-bolt
```
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D128474
The gold linker veneers are written between functions without symbols,
so we to handle it specially in BOLT.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D129260
This reverts commit 425dda76e9.
This commit is currently causing BOLT to crash in one of our
binaries and needs a bit more checking to make sure it is safe
to land.
The gold linker veneers are written between functions without symbols,
so we to handle it specially in BOLT.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D128082
Add functionality to allow splitting code with C++ exceptions in shared
libraries and PIEs. To overcome a limitation in exception ranges format,
for functions with fragments spanning multiple sections, add trampoline
landing pads in the same section as the corresponding throwing range.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D127936
Allow cold fragment to get new address.
Our previous assumption is that a fragment (.cold) is only reached
through the main fragment of same function. In addition, .cold fragment
must be reached through either (a) direct transfer, or (b) split jump
table. For (a), we perform a simple fix-up. For (b), we currently mark
all relevant fragments as non-simple. Therefore, there is no need to
get new address for .cold fragment.
This is not always the case, as function entry can be rarely executed,
and is placed in .text.cold segment. Essentially we cannot tell which
the source-level function entry is based on hot and cold segments,
so we must treat each fragment a function on its own. Therfore, we
remove the assertion that a function entry cannot be cold fragment.
Test Plan:
```
ninja check-bolt
```
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D128111
Resolve a crash related to split functions
Due to split function optimization, a function can be divided to two
fragments, and both fragments can access same jump table. This
violates the assumption that a jump table can only have one parent
function, which causes a crash during instrumentation.
We want to support the case: different functions cannot access same
jump tables, but different fragments of same function can!
As all fragments are from same function, we point JT::Parent to one
specific fragment. Right now it is the first disassembled fragment, but
we can point it to the function's main fragment later.
Functions are disassembled sequentially. Previously, at the end of
processing a function, JT::OffsetEntries is cleared, so other fragment
can no longer reuse JT::OffsetEntries. To extend the support for split
function, we only clear JT::OffsetEntries after all functions are
disassembled.
Let say A.hot and A.cold access JT of three targets {X, Y, Z}, where
X and Y are in A.hot, and Z is in A.cold. Suppose that A.hot is
disassembled first, JT::OffsetEntries = {X',Y',INVALID_OFFSET}. When
A.cold is disassembled, it cannot reuse JT::OffsetEntries above due to
different fragment start. A simple solution:
A.hot = {X',Y',INVALID_OFFSET}
A.cold = {INVALID_OFFSET, INVALID_OFFSET, INVALID_OFFSET}
We update the assertion to allow different fragments of same function
to get the same JumpTable object.
Potential improvements:
A.hot = {X',Y',INVALID_OFFSET}
A.cold = {INVALID_OFFSET, INVALID_OFFSET, Z'}
The main issue is A.hot and A.cold have separate CFGs, thus jump table
targets are still constrained within fragment bounds.
Future improvements:
A.hot = {X, Y, Z}
A.cold = {X, Y, Z}
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D127924
Supress failed to analyze relocations warning for R_AARCH64_LD_PREL_LO19
relocation. This relocation is mostly used to get value stored in CI and
we don't process it since we are caluclating target address using the
instruction value in evaluateMemOperandTarget().
Differential Revision: https://reviews.llvm.org/D127413
Mark fragments related to split jump table as non-simple.
A function could be splitted into hot and cold fragments. A split jump table is
challenging for correctly reconstructing control flow graphs, so it was marked
as ignored. This update marks those fragments as non-simple, allowing them
to be printed and partial control flow graph construction.
Test Plan:
```
llvm-lit -a tools/bolt/test/X86/split-func-icf.s
```
This test has two functions (main, main2), each has a jump table target to the
same cold portion main2.cold.1(*2). We try to print out only this cold portion.
If it is ignored, it cannot be printed. If it is non-simple, it can be printed. We
verify that it can be printed.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D127464
Use color coding to distinguish nodes:
- Entry nodes have bold border
- Scalar (non-loopy) code is milk white
- Outer loops are light yellow
- Innermost loops are light blue
`-print-loops` needs to be enabled to provide BinaryLoopInfo.
Examples:
{F23170673}
{F23170680}
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D126248
Fix BOLT's constant island mapping when a constant island marked by $d
spans multiple functions. Currently, because BOLT only marks the
constant island in the first function where $d is located, if the next
function contains data at its start, BOLT will miss the data and try
to disassemble it. This patch adds code to explicitly go through all
symbols between $d and $x markers and mark their respective offsets as
data, which stops BOLT from trying to disassemble data. It also adds
MarkerType enum and refactors related functions.
Reviewed By: yota9, rafauler
Differential Revision: https://reviews.llvm.org/D126177
Addresses the warnings emitted by Apple Clang 13.1.6 (Xcode 13.3.1).
Tip @tschuett issue #55404.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D125733
Added implementation to support DWARF5 in monolithic mode.
Next step DWARF5 split dwarf support.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D121876
BOLT expects PC-relative relocations in data sections to reference code
and the relocated data to form a jump table. However, there are cases
where PC-relative addressing is used for data-to-data references
(e.g. clang-15 can generate such code). BOLT should recognize and ignore
such relocations. Otherwise, they will be considered relocations not
claimed by any jump table and cause a failure in the strict mode.
Reviewed By: yota9, Amir
Differential Revision: https://reviews.llvm.org/D123650
The bfd linker adds the symbol versioning string to the symbol name in symtab.
Skip the versioning part in order to find the registered PLT function.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D122039
Read static relocs on the same address, as dynamic in order to update
constant island data address properly.
Differential Revision: https://reviews.llvm.org/D122100
The BinaryEmitter uses opts::AlignText value to align the hot text
section. Also check that the opts::AlignText is at least
equal opts::AlignFunctions for the same reason, as described in D121392.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D121728
The aarch64 uses the trampolines located in .iplt section, which
contains plt-like trampolines on the value stored in .got. In this case
we don't have JUMP_SLOT relocation, but we have a symbol that belongs to
ifunc trampoline, so use it and set set plt symbol for such functions.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D120850
Address fuzzer crash on malformed input:
```
BOLT-ERROR: cannot get section contents for .dynsym: The end of the file was unexpectedly encountered.
```
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D121068
This patch enables PLT analysis for aarch64. It is used by the static
relocations in order to provide final symbol address of PLT entry for some
instructions like ADRP.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D118088
PC-relative memory operand could reference a different object from
the one located at the target address, e.g. when a negative offset
is used. Check relocations for the real referenced object.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D120379
Further improve error handling in BOLT by reporting `RewriteInstance` errors in
a library and fuzzer-friendly way instead of exiting.
Follow-up to D119658
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D120224
Refactor createBinaryContext and RewriteInstance/MachORewriteInstance
constructors to report an error in a library and fuzzer-friendly way instead of
returning a nullptr or exiting.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D119658
This patch changes patchELFAllocatableRelaSections from going through
old relocations sections and update the relocation offsets to emitting
the relocations stored in binary sections. This is needed in case we
would like to remove and add dynamic relocations during BOLT work and it
is used by golang support pass. Note: Currently we emit relocations in
the old sections, so the total number of them should be equal or less
of old number.
Testing: No special tests are neeeded, since this patch does not fix
anything or add new functionality (it only prepares to add). Every
PIC-compiled test binary will use this code and thus become a test.
But just in case the aarch64 dynamic relocations tests were added.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D117612
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:
llvm/DebugInfo/DWARF/DWARFContext.h no longer includes:
- "llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h"
- "llvm/DebugInfo/DWARF/DWARFCompileUnit.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAranges.h"
- "llvm/DebugInfo/DWARF/DWARFDebugFrame.h"
- "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
- "llvm/DebugInfo/DWARF/DWARFDebugMacro.h"
- "llvm/DebugInfo/DWARF/DWARFGdbIndex.h"
- "llvm/DebugInfo/DWARF/DWARFSection.h"
- "llvm/DebugInfo/DWARF/DWARFTypeUnit.h"
- "llvm/DebugInfo/DWARF/DWARFUnitIndex.h"
Plus llvm/Support/Errc.h not included by a bunch of llvm/DebugInfo/DWARF/DWARF*.h files
Preprocessed lines to build llvm on my setup:
after: 1065629059
before: 1066621848
Which is a great diff!
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119723
This patch adds unit testing support for BOLT. In order to do this we will need at least do this changes on the code level:
* Make createMCPlusBuilder accessible externally
* Remove positional InputFilename argument to bolt utlity sources
And prepare the cmake and lit for the new tests.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: maksfb, Amir
Differential Revision: https://reviews.llvm.org/D118271
Summary:
Move the annotation to avoid dynamic memory allocations.
Improves the CPU time of instrumenting a large binary by 1% (+-0.8%, p-value 0.01)
Test Plan: NFC
Reviewers: maksfb
FBD30091656