Commit Graph

112 Commits

Author SHA1 Message Date
Amir Ayupov 556efdba85 [BOLT][NFC] Extend debug logging in analyzeJumpTable
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D131918
2022-08-15 20:34:40 -07:00
Kazu Hirata 2febc32c9c Use llvm::erase_if (NFC) 2022-08-13 12:55:48 -07:00
Fangrui Song 53113515cd [BOLT] Use Optional::emplace to avoid move assignment. NFC 2022-08-12 12:51:50 -07:00
Thorsten Schütt 0c9258612b [bolt] silence unused variables warnings 2022-08-06 20:52:45 +02:00
Rafael Auler 19eb908e61 [BOLT] Remove always true if statement
Got a warning from GCC when building this.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D131092
2022-08-03 13:11:33 -07:00
David Blaikie 7651522b78 Fold assert-used variable into assert
Fixes #56724
2022-08-01 21:57:11 +00:00
Amir Ayupov 468d4f6d18 Revert "[BOLT] Ignore functions accessing false positive jump tables"
This diff uncovers an ASAN leak in getOrCreateJumpTable:
```
Indirect leak of 264 byte(s) in 1 object(s) allocated from:
    #1 0x4f6e48c in llvm::bolt::BinaryContext::getOrCreateJumpTable ...
```
The removal of an assertion needs to be accompanied by proper deallocation of
a `JumpTable` object for which `analyzeJumpTable` was unsuccessful.

This reverts commit 52cd00cabf.
2022-07-30 10:39:46 -07:00
Kazu Hirata b498a8991e [bolt] Remove redundaunt control-flow statements (NFC)
Identified with readability-redundant-control-flow.
2022-07-30 10:35:49 -07:00
Huan Nguyen 52cd00cabf [BOLT] Ignore functions accessing false positive jump tables
Disassembly and branch target analysis are not decoupled, so any
analysis that depends on disassembly may not operate properly.

In specific, analyzeJumpTable uses instruction bounds check property.
A jump table was analyzed twice: (a) during disassembly, and (b) after
disassembly, so there are potentially some mismatched results.

In this update, functions that access JTs which fail the second check
will be marked as ignored.

Test Plan:
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D130431
2022-07-28 23:22:17 -07:00
Fabian Parzefall 8477bc6761 [BOLT] Add function layout class
This patch adds a dedicated class to keep track of each function's
layout. It also lays the groundwork for splitting functions into
multiple fragments (as opposed to a strict hot/cold split).

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D129518
2022-07-16 17:23:24 -07:00
Huan Nguyen ae563c9146 [BOLT] Support split landing pad
We previously support split jump table, where some jump table entries
target different fragments of same function. In this fix, we provide
support for another type of intra-indirect transfer: landing pad.

When C++ exception handling is used, compiler emits .gcc_except_table
that describes the location of catch block (landing pad) for specific
range that potentially invokes a throw(). Normally landing pads reside
in the function, but with -fsplit-machine-functions, landing pads can
be moved to another fragment. The intuition is, landing pads are rarely
executed, so compiler can move them to .cold section.

This update will mark all fragments that have landing pad to another
fragment as non-simple, and later propagate non-simple to all related
fragments.

This update also includes one manual test case: split-landing-pad.s

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128561
2022-07-14 18:10:22 -07:00
Fabian Parzefall d55dfeaf32 [BOLT] Replace uses of layout with basic block list
As we are moving towards support for multiple fragments, loops that
iterate over all basic blocks of a function, but do not depend on the
order of basic blocks in the final layout, should iterate over binary
functions directly, rather than the layout.

Eventually, all loops using the layout list should either iterate over
the function, or be aware of multiple layouts. This patch replaces
references to binary function's block layout with the binary function
itself where only little code changes are necessary.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D129585
2022-07-14 13:07:05 -07:00
Huan Nguyen 05523dc32d [BOLT] Support multiple parents for split jump table
There are two assumptions regarding jump table:
(a) It is accessed by only one fragment, say, Parent
(b) All entries target instructions in Parent

For (a), BOLT stores jump table entries as relative offset to Parent.
For (b), BOLT treats jump table entries target somewhere out of Parent
as INVALID_OFFSET, including fragment of same split function.

In this update, we extend (a) and (b) to include fragment of same split
functinon. For (a), we store jump table entries in absolute offset
instead. In addition, jump table will store all fragments that access
it. A fragment uses this information to only create label for jump table
entries that target to that fragment.

For (b), using absolute offset allows jump table entries to target
fragments of same split function, i.e., extend support for split jump
table. This can be done using relocation (fragment start/size) and
fragment detection heuristics (e.g., using symbol name pattern for
non-stripped binaries).

For jump table targets that can only be reached by one fragment, we
mark them as local label; otherwise, they would be the secondary
function entry to the target fragment.

Test Plan
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128474
2022-07-13 23:37:31 -07:00
Vladislav Khmelevsky 35efe1d806 [BOLT][AArch64] Handle gold linker veneers
The gold linker veneers are written between functions without symbols,
so we to handle it specially in BOLT.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D129260
2022-07-13 14:47:22 +03:00
Denis Revunov 7564167885 [BOLT][AArch64] Use all supported CPU features on AArch64
Since we now have +all feature for AArch64 disassembler, we can use it
in BOLT and allow it to disassemble all ARM instructions supported by LLVM.

Reviewed by: rafauler

Differential Revision: https://reviews.llvm.org/D129139
2022-07-12 03:56:04 -04:00
Rafael Auler 42465efd17 [BOLT] Increase coverage of shrink wrapping [1/5]
Change how function score is calculated and provide more
detailed statistics when reporting back frame optimizer and shrink
wrapping results. In this new statistics, we provide dynamic coverage
numbers. The main metric for shrink wrapping is the number of executed
stores that were saved because of shrink wrapping (push instructions
that were either entirely moved away from the hot block or converted
to a stack adjustment instruction). There is still a number of reduced
load instructions (pop) that we are not counting at the moment. Also
update alloc combiner to report dynamic numbers, as well as frame
optimizer.

For debugging purposes, we also include a list of top 10 functions
optimized by shrink wrapping. These changes are aimed at better
understanding the impact of shrink wrapping in a given binary.

We also remove an assertion in dataflow analysis to do not choke on
empty functions (which makes no sense).

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126111
2022-07-11 17:29:22 -07:00
spupyrev 228970f612 Revert "Rebase: [Facebook] Revert "[BOLT] Update dynamic relocations from section relocations""
This reverts commit 76029cc53e.
2022-07-11 09:50:47 -07:00
spupyrev eecd41aa09 Revert "Rebase: [Facebook] [MC] Introduce NeverAlign fragment type"
This reverts commit 6d0528636a.
2022-07-11 09:50:47 -07:00
Maksim Panchenko 76029cc53e Rebase: [Facebook] Revert "[BOLT] Update dynamic relocations from section relocations"
Summary:
This reverts commit 729d29e167.

Needed as a workaround for T112872562.

Manual rebase conflict history:
https://phabricator.intern.facebook.com/D35230076
https://phabricator.intern.facebook.com/D35681740

Test Plan: sandcastle

Reviewers: #llvm-bolt

Subscribers: spupyrev

Differential Revision: https://phabricator.intern.facebook.com/D37098481
2022-07-11 09:31:52 -07:00
Rafael Auler 6d0528636a Rebase: [Facebook] [MC] Introduce NeverAlign fragment type
Summary:
Introduce NeverAlign fragment type.

The intended usage of this fragment is to insert it before a pair of
macro-op fusion eligible instructions. NeverAlign fragment ensures that
the next fragment (first instruction in the pair) does not end at a
given alignment boundary by emitting a minimal size nop if necessary.

In effect, it ensures that a pair of macro-fusible instructions is not
split by a given alignment boundary, which is a precondition for
macro-op fusion in modern Intel Cores (64B = cache line size, see Intel
Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode
Pipeline: Macro-Fusion).

This patch introduces functionality used by BOLT when emitting code with
MacroFusion alignment already in place.

The use case is different from BoundaryAlign and instruction bundling:
- BoundaryAlign can be extended to perform the desired alignment for the
first instruction in the macro-op fusion pair (D101817). However, this
approach has higher overhead due to reliance on relaxation as
BoundaryAlign requires in the general case - see
https://reviews.llvm.org/D97982#2710638.
- Instruction bundling: the intent of NeverAlign fragment is to prevent
the first instruction in a pair ending at a given alignment boundary, by
inserting at most one minimum size nop. It's OK if either instruction
crosses the cache line. Padding both instructions using bundles to not
cross the alignment boundary would result in excessive padding. There's
no straightforward way to request instruction bundling to avoid a given
end alignment for the first instruction in the bundle.

LLVM: https://reviews.llvm.org/D97982

Manual rebase conflict history:
https://phabricator.intern.facebook.com/D30142613

Test Plan: sandcastle

Reviewers: #llvm-bolt

Subscribers: phabricatorlinter

Differential Revision: https://phabricator.intern.facebook.com/D31361547
2022-07-11 09:31:52 -07:00
Alexander Yermolovich e159abdb04 [BOLT][DWARF] Support mix mode DWARF
Added support for mixing monolithic DWARF5 with legacy DWARF, and monolithic legacy and DWARF5 split dwarf.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D128232
2022-06-30 16:53:15 -07:00
Amir Ayupov 798e92c6c4 [BOLT] Respect shouldPrint in dump-dot-all
Don't dump dot CFG graph for functions that should not be printed.

Reviewed By: rafauler, maksfb

Differential Revision: https://reviews.llvm.org/D128699
2022-06-29 17:01:17 -07:00
Rafael Auler fc2d96c334 Revert "[BOLT][AArch64] Handle gold linker veneers"
This reverts commit 425dda76e9.

This commit is currently causing BOLT to crash in one of our
binaries and needs a bit more checking to make sure it is safe
to land.
2022-06-28 19:23:28 -07:00
Vladislav Khmelevsky 425dda76e9 [BOLT][AArch64] Handle gold linker veneers
The gold linker veneers are written between functions without symbols,
so we to handle it specially in BOLT.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D128082
2022-06-28 16:14:05 +03:00
Amir Ayupov 0d477f63b0 [BOLT][NFC] Add aliases for ICP flags
- `indirect-call-promotion` -> `icp`
- `indirect-call-promotion-mispredict-threshold` -> `icp-mp-threshold`
- `indirect-call-promotion-use-mispredicts` -> `icp-use-mp`
- `indirect-call-promotion-topn` -> `icp-topn`
- `indirect-call-promotion-calls-topn` -> `icp-calls-topn`
- `indirect-call-promotion-jump-tables-topn` -> `icp-jt-topn`
- `icp-jump-table-targets` -> `icp-jt-targets`

This also fixes an inconsistency in ICP flag names that some start with
`indirect-call-promotion` while others start with `icp`.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128375
2022-06-27 10:29:26 -07:00
Amir Ayupov c4302e4fc2 [BOLT][NFC] Use llvm::less_first
Follow the case of https://reviews.llvm.org/D126068 and simplify call sites
with `llvm::less_first`.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128242
2022-06-27 10:27:17 -07:00
Amir Ayupov d2c8769936 [BOLT][NFC] Use range-based STL wrappers
Replace `std::` algorithms taking begin/end iterators with `llvm::` counterparts
accepting ranges.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D128154
2022-06-23 22:16:27 -07:00
Maksim Panchenko f263a66ba0 [BOLT] Split functions with exceptions in shared objects and PIEs
Add functionality to allow splitting code with C++ exceptions in shared
libraries and PIEs. To overcome a limitation in exception ranges format,
for functions with fragments spanning multiple sections, add trampoline
landing pads in the same section as the corresponding throwing range.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127936
2022-06-19 16:48:48 -07:00
Huan Nguyen 28b1dcb122 [BOLT] Allow function fragments to point to one jump table
Resolve a crash related to split functions

Due to split function optimization, a function can be divided to two

fragments, and both fragments can access same jump table. This
violates 
the assumption that a jump table can only have one parent
function, 
which causes a crash during instrumentation.

We want to support the case: different functions cannot access same
jump tables, but different fragments of same function can!

As all fragments are from same function, we point JT::Parent to one
specific fragment. Right now it is the first disassembled fragment, but
we can point it to the function's main fragment later.

Functions are disassembled sequentially. Previously, at the end of
processing a function, JT::OffsetEntries is cleared, so other fragment
can no longer reuse JT::OffsetEntries. To extend the support for split
function, we only clear JT::OffsetEntries after all functions are
disassembled.

Let say A.hot and A.cold access JT of three targets {X, Y, Z}, where
X and Y are in A.hot, and Z is in A.cold. Suppose that A.hot is
disassembled first, JT::OffsetEntries = {X',Y',INVALID_OFFSET}. When
A.cold is disassembled, it cannot reuse JT::OffsetEntries above due to
different fragment start. A simple solution:
A.hot  = {X',Y',INVALID_OFFSET}
A.cold = {INVALID_OFFSET, INVALID_OFFSET, INVALID_OFFSET}

We update the assertion to allow different fragments of same function
to get the same JumpTable object.

Potential improvements:
A.hot  = {X',Y',INVALID_OFFSET}
A.cold = {INVALID_OFFSET, INVALID_OFFSET, Z'}
The main issue is A.hot and A.cold have separate CFGs, thus jump table
targets are still constrained within fragment bounds.

Future improvements:
A.hot  = {X, Y, Z}
A.cold = {X, Y, Z}

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127924
2022-06-17 16:22:30 -07:00
Maksim Panchenko 8228c70358 [BOLT][NFCI] Refactor interface for adding basic blocks
Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127935
2022-06-16 11:51:57 -07:00
Amir Ayupov 02d510b416 [BOLT][NFC] Pass Function to BC.printInstructions in BinaryBasicBlock::dump
BC::printInstruction(s) has many uses of Function ptr if it's available:
# printing CFI instructions (unconditional)
# printing debug line information (-print-debug-info)
# printing instruction relocations (-print-relocations)

Enable these uses by passing Function ptr from the primary printing entry point:
BinaryBasicBlock::dump.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D126916
2022-06-13 14:26:51 -07:00
Vladislav Khmelevsky 6e26ffa064 [BOLT][AARCH64] Skip R_AARCH64_LD_PREL_LO19 relocation
Supress failed to analyze relocations warning for R_AARCH64_LD_PREL_LO19
relocation. This relocation is mostly used to get value stored in CI and
we don't process it since we are caluclating target address using the
instruction value in evaluateMemOperandTarget().

Differential Revision: https://reviews.llvm.org/D127413
2022-06-13 15:40:06 +03:00
Amir Ayupov 7dee646b28 [BOLT][NFC] Move printDebugInfo out of BC::printInstruction
Simplify `BinaryContext::printInstruction`.

Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D127561
2022-06-11 11:58:36 -07:00
Fangrui Song adf4142f76 [MC] De-capitalize SwitchSection. NFC
Add SwitchSection to return switchSection. The API will be removed soon.
2022-06-10 22:50:55 -07:00
Huan Nguyen 82095bd5ed [BOLT] Mark fragments related to split jump table as non-simple
Mark fragments related to split jump table as non-simple.

A function could be splitted into hot and cold fragments. A split jump table is
challenging for correctly reconstructing control flow graphs, so it was marked
as ignored. This update marks those fragments as non-simple, allowing them
to be printed and partial control flow graph construction.

Test Plan:
```
llvm-lit -a tools/bolt/test/X86/split-func-icf.s
```
This test has two functions (main, main2), each has a jump table target to the
same cold portion main2.cold.1(*2). We try to print out only this cold portion.
If it is ignored, it cannot be printed. If it is non-simple, it can be printed. We
verify that it can be printed.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127464
2022-06-10 15:49:32 -07:00
Denis Revunov 0b7e8baf83 [BOLT][AArch64] Handle data at the beginning of a function when disassembling and building CFG.
This patch adds getFirstInstructionOffset method for BinaryFunction
which is used to properly handle cases where data is at zero offset in
a function. The main change is that we add basic block at first
instruction offset when disassembling, which prevents assertion
failures in buildCFG.

Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D127111
2022-06-09 15:26:32 -07:00
Maksim Panchenko 1817642684 [BOLT] Add support for GOTPCRELX relocations
The linker can convert instructions with GOTPCRELX relocations into a
form that uses an absolute addressing with an immediate. BOLT needs to
recognize such conversions and symbolize the immediates.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126747
2022-06-09 13:37:04 -07:00
Alexander Yermolovich 1c6dc43de9 [BOLT]DWARF] Eagerly write out loclists
Taking advantage of us being able to re-write .debug_info to reduce memory
footprint loclists. Writing out loc-list as they are added, similar to how
we handle ranges.

Collected on clang-14
trunk
4:41.20 real,   389.50 user,    59.50 sys,      0 amem, 38412532 mmem
4:30.08 real,   376.10 user,    63.75 sys,      0 amem, 38477844 mmem
4:25.58 real,   373.76 user,    54.71 sys,      0 amem, 38439660 mmem
diff
4:34.66 real,   392.83 user,    57.73 sys,      0 amem, 38382560 mmem
4:35.96 real,   377.70 user,    58.62 sys,      0 amem, 38255840 mmem
4:27.61 real,    390.18 user,    57.02 sys,      0 amem, 38223224 mmem

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D126999
2022-06-08 16:52:59 -07:00
Fangrui Song b92436efcb [bolt] Remove unneeded cl::ZeroOrMore for cl::opt options 2022-06-05 13:29:49 -07:00
Maksim Panchenko 986e5dedf2 [BOLT][NFC] Fix braces in BinaryEmitter
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126844
2022-06-02 12:45:25 -07:00
Amir Ayupov 6333e5dde9 [BOLT][NFC] Use colors in CFG dumps
Use color coding to distinguish nodes:
- Entry nodes have bold border
- Scalar (non-loopy) code is milk white
- Outer loops are light yellow
- Innermost loops are light blue

`-print-loops` needs to be enabled to provide BinaryLoopInfo.
Examples:
{F23170673}
{F23170680}

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126248
2022-06-02 00:27:12 -07:00
Amir Ayupov cc23c64ff1 [BOLT][NFC] Print block instructions in dumpGraph as part of node label
Reuse the option `-dot-tooltip-code` to put block instructions into the label.
This way, the instructions are displayed by default when used with dot viewer.

When the .dot file is used with dot2html, instructions are hidden by default,
and are shown by clicking on a node.

{F23169510}

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126237
2022-06-01 23:41:41 -07:00
Maksim Panchenko 0426100ff4 [BOLT][NFC] Remove unused variable
Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126808
2022-06-01 13:43:10 -07:00
Maksim Panchenko e290133c76 [BOLT] Add new class for symbolizing X86 instructions
Summary:
While disassembling instructions, we need to replace certain immediate
operands with symbols. This symbolizing process relies on reading
relocations against instructions. However, some X86 instructions can
have multiple immediate operands and up to two relocations against
them. Thus, correctly matching a relocation to an operand is not
always possible without knowing the operand offset within the
instruction.

Luckily, LLVM provides an interface for passing the required info from
the disassembler via a virtual MCSymbolizer class. Creating a
target-specific version allows a precise matching of relocations to
operands.

This diff adds X86MCSymbolizer class that performs X86-specific
symbolizing (currently limited to non-branch instructions).

Reviewers: yota9, Amir, ayermolo, rafauler, zr33

Differential Revision: https://reviews.llvm.org/D120928
2022-05-31 17:48:19 -07:00
Denis Revunov 8579db96e8 [BOLT] [AArch64] Handle constant islands spanning multiple functions
Fix BOLT's constant island mapping when a constant island marked by $d
spans multiple functions. Currently, because BOLT only marks the
constant island in the first function where $d is located, if the next
function contains data at its start, BOLT will miss the data and try
to disassemble it. This patch adds code to explicitly go through all
symbols between $d and $x markers and mark their respective offsets as
data, which stops BOLT from trying to disassemble data. It also adds
MarkerType enum and refactors related functions.

Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D126177
2022-05-31 13:51:35 -07:00
Amir Ayupov f7581a3969 [BOLT][NFC] Use ListSeparator in BinaryFunction print methods
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126243
2022-05-24 18:29:24 -07:00
Amir Ayupov 69f87b6c29 [BOLT][NFC] Customize endline character for printInstruction(s)
This would be used in `BF::dumpGraph` to dump left-justified text.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126232
2022-05-24 18:26:12 -07:00
Amir Ayupov 5d8247d4c7 [BOLT][NFC] Use for_each to simplify printLoopInfo
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126242
2022-05-24 18:05:43 -07:00
Amir Ayupov c907d6e0e9 [BOLT][NFC] Suppress unused variable warnings
Addresses the warnings emitted by Apple Clang 13.1.6 (Xcode 13.3.1).
Tip @tschuett issue #55404.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125733
2022-05-17 14:30:23 -07:00
Amir Ayupov a7b69dbdd1 [BOLT][NFC] Move BinaryDominatorTree out of BinaryLoop header
Split up the BinaryLoop header and move BinaryDominatorTree into its own header,
preparing it for a standalone use.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125664
2022-05-17 14:20:11 -07:00