Commit Graph

153 Commits

Author SHA1 Message Date
Maksim Panchenko bcc4c90954 [BOLT] Fix instruction encoding validation
Always use non-symbolizing disassembler for instruction encoding
validation as symbols will be treated as undefined/zeros be the encoder
and causing byte sequence mismatches.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D136118
2022-10-18 13:50:00 -07:00
Maksim Panchenko 4d3a0cade2 [BOLT] Section-handling refactoring/overhaul
Simplify the logic of handling sections in BOLT. This change brings more
direct and predictable mapping of BinarySection instances to sections in
the input and output files.

* Only sections from the input binary will have a non-null SectionRef.
  When a new section is created as a copy of the input section,
  its SectionRef is reset to null.

* RewriteInstance::getOutputSectionName() is removed as the section name
  in the output file is now defined by BinarySection::getOutputName().

* Querying BinaryContext for sections by name uses their original name.
  E.g., getUniqueSectionByName(".rodata") will return the original
  section even if the new .rodata section was created.

* Input file sections (with relocations applied) are emitted via MC with
  ".bolt.org" prefix. However, their name in the output binary is
  unchanged unless a new section with the same name is created.

* New sections are emitted internally with ".bolt.new" prefix if there's
  a name conflict with an input file section. Their original name is
  preserved in the output file.

* Section header string table is properly populated with section names
  that are actually used. Previously we used to include discarded
  section names as well.

* Fix the problem when dynamic relocations were propagated to a new
  section with a name that matched a section in the input binary.
  E.g., the new .rodata with jump tables had dynamic relocations from
  the original .rodata.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D135494
2022-10-13 23:10:39 -07:00
Rafael Auler 4f158995b9 [BOLT] Add pass to fix ambiguous memory references
This adds a round of checks to memory references, looking for
incorrect references to jump table objects. Fix them by replacing the
jump table reference with another object reference + offset.

This solves bugs related to regular data references in code
accidentally being bound to a jump table, and this reference being
updated to a new (incorrect) location because we moved this jump
table.

Fixes #55004

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D134098
2022-10-12 18:39:50 -07:00
Rafael Auler 8d1fc45dc3 [BOLT][NFC] Refactor creation of symbol+addend references
Put code that creates references to symbol+addend behind MCPlusBuilder.
Will use this later in validate memory references pass.

Reviewed By: #bolt, maksfb, yota9

Differential Revision: https://reviews.llvm.org/D134097
2022-10-12 18:39:26 -07:00
Maksim Panchenko 5fca9c5763 [BOLT] Change order of new sections
While the order of new sections in the output binary was deterministic
in the past (i.e. there was no run-to-run variation), it wasn't always
rational as we used size to define the precedence of allocatable
sections within "code" or "data" groups (probably unintentionally).
Fix that by defining stricter section-ordering rules.

Other than the order of sections, this should be NFC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D135235
2022-10-07 11:20:42 -07:00
Maksim Panchenko c683e281cd [BOLT] Properly set _end symbol
To properly set the "_end" symbol, we need to track the last allocatable
address. Simply emitting "_end" at the end of some section is not
sufficient since the order of section allocation is unknown during the
emission step.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D135121
2022-10-07 11:19:14 -07:00
serge-sans-paille 61cff9079c [BOLT] Support building bolt when LLVM_LINK_LLVM_DYLIB is ON
This does *not* link with libLLVM, but with static archives instead. Not
super-great, but at least the build works, which is probably better than
failing.

Related to #57551

Differential Revision: https://reviews.llvm.org/D134434
2022-09-23 07:59:30 +02:00
Kazu Hirata c9696322bd [BOLT] Use x.empty() instead of llvm::empty(x) (NFC)
I'm planning to deprecate and eventually remove llvm::empty.

Note that no use of llvm::empty requires the ability of llvm::empty to
determine the emptiness from begin/end only.
2022-09-18 11:01:56 -07:00
Maksim Panchenko 1d5393526c [BOLT] Change base class of ExecutableFileMemoryManager
When we derive EFMM from SectionMemoryManager, it brings into EFMM extra
functionality, such as the registry of exception handling sections,
page permission management, etc. Such functionality is of no use to
llvm-bolt and can even be detrimental (see
https://github.com/llvm/llvm-project/issues/56726).

Change the base class of ExecutableFileMemoryManager to MemoryManager,
avoid registering EH sections, and skip memory finalization.

Fixes #56726

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D133994
2022-09-16 13:39:12 -07:00
Maksim Panchenko 9742c25b98 [BOLT] Fix empty function emission in non-relocation mode
In non-relocation mode, every function is emitted in its own section. If
a function is empty, RuntimeDyld will still allocate 1-byte section
for the function and initialize it with zero. As a result, we will
overwrite the first byte of the original function contents with zero.
Such scenario can happen when the input function had only NOP
instructions which BOLT removes by default. Even though such functions
likely cause undefined behavior, it's better to preserve their contents.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D133978
2022-09-16 13:38:32 -07:00
Amir Ayupov e002523b65 [BOLT] Verify externally referenced blocks against jump table targets
For functions with references to internal offsets from data, verify externally
referenced blocks against the set of jump table targets. Mark the function
as non-simple if there are any unclaimed data to code references.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D132495
2022-09-16 11:44:33 -07:00
revunov.denis@huawei.com 553c238952 [BOLT] Preserve original LSDA type encoding
In non-pie binaries BOLT unconditionally converted type encoding
from indirect to absptr, which broke std exceptions since pointers
to their typeinfo were only assigned at runtime in .data section.
In this patch we preserve original encoding so that indirect
remains indirect and can be resolved at runtime, and absolute remains absolute.

Reviewed By: rafauler, maksfb

Differential Revision: https://reviews.llvm.org/D132484
2022-09-14 16:33:47 +00:00
Fabian Parzefall 3ac46f377a [BOLT] Emit LSDA call sites for all fragments
For exception handling, LSDA call sites have to be emitted for each
fragment individually. With this patch, call sites and respective LSDA
symbols are generated and associated with each fragment of their
function, such that they can be used by the emitter.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D132052
2022-09-08 17:10:29 -07:00
Kazu Hirata a0c7ca8ad4 [BOLT] Use range-based for loops (NFC)
LLVM Coding Standards discourage for_each unless callable objects
already exist.
2022-09-03 11:17:32 -07:00
Amir Ayupov f119a2483d [BOLT][NFC] Use llvm::any_of
Replace the imperative pattern of the following kind
```
bool IsTrue = false;
for (Element : Range) {
  if (Condition(Element)) {
    IsTrue = true;
    break;
  }
}
```
with functional style `llvm::any_of`:
```
bool IsTrue = llvm::any_of(Range, [&](Element) {
  return Condition(Element);
});
```

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132276
2022-08-27 21:36:15 -07:00
Denis Revunov 6040415ef9 [BOLT][AArch64] Handle references to the middle of Constant Islands
Fix BinaryContext::handleAddressRef to properly detect references to
other function's Constant islands.

Revieved By: rafauler, yota9

Differential Revision: https://reviews.llvm.org/D132376
2022-08-25 04:32:35 -04:00
Fabian Parzefall 9b6e7861ae [BOLT] Track fragment info for all split fragments
To generate all symbols correctly, it is necessary to record the address
of each fragment. This patch moves the address info for the main and
cold fragments from BinaryFunction to FunctionFragment, where this data
is recorded for all fragments.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132051
2022-08-24 18:07:09 -07:00
Fabian Parzefall 07f63b0ac5 [BOLT] Allocate FunctionFragment on heap
This changes `FunctionFragment` from being used as a temporary proxy
object to access basic block ranges to a heap-allocated object that can
store fragment-specific information.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132050
2022-08-24 18:06:08 -07:00
Fabian Parzefall d5c03def24 [BOLT] Towards FunctionLayout const-correctness
A const-qualified reference to function layout allows accessing
non-const qualified basic blocks on a const-qualified function. This
patch adds or removes const-qualifiers where necessary to indicate where
basic blocks are used in a non-const manner.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132049
2022-08-24 16:32:33 -07:00
Fabian Parzefall f24c299e7d Revert "[BOLT] Towards FunctionLayout const-correctness"
This reverts commit 587d265342.
2022-08-24 10:51:38 -07:00
Fabian Parzefall 5065134aa0 Revert "[BOLT] Allocate FunctionFragment on heap"
This reverts commit 101344af1a.
2022-08-24 10:51:36 -07:00
Fabian Parzefall 6304e38281 Revert "[BOLT] Track fragment info for all split fragments"
This reverts commit 7e254818e4.
2022-08-24 10:51:19 -07:00
Fabian Parzefall 7e254818e4 [BOLT] Track fragment info for all split fragments
To generate all symbols correctly, it is necessary to record the address
of each fragment. This patch moves the address info for the main and
cold fragments from BinaryFunction to FunctionFragment, where this data
is recorded for all fragments.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132051
2022-08-24 10:17:17 -07:00
Fabian Parzefall 101344af1a [BOLT] Allocate FunctionFragment on heap
This changes `FunctionFragment` from being used as a temporary proxy
object to access basic block ranges to a heap-allocated object that can
store fragment-specific information.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132050
2022-08-24 10:17:17 -07:00
Fabian Parzefall 587d265342 [BOLT] Towards FunctionLayout const-correctness
A const-qualified reference to function layout allows accessing
non-const qualified basic blocks on a const-qualified function. This
patch adds or removes const-qualifiers where necessary to indicate where
basic blocks are used in a non-const manner.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132049
2022-08-24 10:17:17 -07:00
Amir Ayupov 37cbbea674 [BOLT][NFC] Move out handleAArch64IndirectCall
Move the large lambda out of BinaryFunction::disassemble, reducing its size from
255 to 233 LoC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132104
2022-08-23 17:37:01 -07:00
Amir Ayupov c844850bdf [BOLT][NFC] Move out handleIndirectBranch
Move the large lambda out of BinaryFunction::disassemble, reducing its size from
295 to 255 LoC.

Differential Revision: https://reviews.llvm.org/D132101
2022-08-23 17:36:51 -07:00
Amir Ayupov ec1fbf229e [BOLT][NFC] Move out handleExternalReference
Move the large lambda out of BinaryFunction::disassemble, reducing its size from
338 to 295 LoC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132100
2022-08-23 17:36:41 -07:00
Amir Ayupov 6cd475f8ca [BOLT][NFC] Move out handlePCRelOperand
Move the large lambda out of BinaryFunction::disassemble, reducing its size from
377 to 338 LoC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132099
2022-08-23 17:36:29 -07:00
Kazu Hirata 258531b7ac Remove redundant initialization of Optional (NFC) 2022-08-20 21:18:28 -07:00
Fabian Parzefall 0f74d191d1 [BOLT] Generate sections for multiple fragments
This patch adds support to generate any number of sections that are
assigned to fragments of functions that are split more than two-way.
With this, a function's *nth* split fragment goes into section
`.text.cold.n`.

This also changes `FunctionLayout::erase` to make sure, that there are
no empty fragments at the end of the function. This sometimes happens
when blocks are erased from the function. To avoid creating symbols
pointing to these fragments, they need to be removed.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D130521
2022-08-18 21:55:06 -07:00
Fabian Parzefall a191ea7d59 [BOLT] Make exception handling fragment aware
This adds basic fragment awareness in the exception handling passes and
generates the necessary symbols for fragments.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D130520
2022-08-18 21:55:06 -07:00
Fabian Parzefall 275e075cbe [BOLT] Support passing fragments to code emission
This changes code emission such that it can emit specific function
fragments instead of scanning all basic blocks of a function and just
emitting those that are hot or cold.

To implement this, `FunctionLayout` explicitly distinguishes the "main"
fragment (i.e. the one that contains the entry block and is associated
with the original symbol) from "split" fragments. Additionally,
`BinaryFunction` receives support for multiple cold symbols - one for
each split fragment.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D130052
2022-08-18 21:55:06 -07:00
Amir Ayupov 129dfc8a9a Revert "[BOLT][NFC] Simplify addRelocation"
This reverts commit 29f2301322.

This change breaks one of the internal tests.
2022-08-18 17:26:26 -07:00
Denis Revunov d0e29e87cd [BOLT][AArch64] Ignore functions with islandsInfo during VeneerEliminarion and ICF
Differential Revision: https://reviews.llvm.org/D131881

Reviewed By: yota9
2022-08-18 11:08:47 -04:00
Amir Ayupov 055f9f6d08 [BOLT][NFC] Simplify debug logging in case of JT heuristic failure
Move logging into LLVM_DEBUG scope.
Remove redundant printing of jump table parents:

Old logging:
```
failed to analyze jump table in function _ZN12_GLOBAL__N_116InitHeaderSearch23Ad
dDefaultCIncludePathsERKN4llvm6TripleERKN5clang19HeaderSearchOptionsE/1(*2)
PIC Jump table JUMP_TABLE/_ZN12_GLOBAL__N_116InitHeaderSearch23AddDefaultCInclud
ePathsERKN4llvm6TripleERKN5clang19HeaderSearchOptionsE/1.1 for function _ZN12_GL
OBAL__N_116InitHeaderSearch23AddDefaultCIncludePathsERKN4llvm6TripleERKN5clang19
HeaderSearchOptionsE/1(*2) at 0x65996e0 with a total count of 0:
  0x9dc

next jump table at 0x659a810 belongs to function _ZN5clang5Lexer40LexDependencyD
irectiveTokenWhileSkippingERNS_5TokenE
PIC Jump table JUMP_TABLE/_ZN5clang5Lexer40LexDependencyDirectiveTokenWhileSkipp
ingERNS_5TokenE.0 for function _ZN5clang5Lexer40LexDependencyDirectiveTokenWhile
SkippingERNS_5TokenE at 0x659a810 with a total count of 0:

jump table heuristic failure
```

New logging:
```
failed to analyze PIC Jump table JUMP_TABLE/_ZN12_GLOBAL__N_116InitHeaderSearch2
3AddDefaultCIncludePathsERKN4llvm6TripleERKN5clang19HeaderSearchOptionsE/1.1 for
function _ZN12_GLOBAL__N_116InitHeaderSearch23AddDefaultCIncludePathsERKN4llvm6T
ripleERKN5clang19HeaderSearchOptionsE/1(*2) at 0x65996e0 with a total count of 0:
  absolute offset: 0x52ac58c

next PIC Jump table JUMP_TABLE/_ZN5clang5Lexer40LexDependencyDirectiveTokenWhile
SkippingERNS_5TokenE.0 for function _ZN5clang5Lexer40LexDependencyDirectiveToken
WhileSkippingERNS_5TokenE at 0x659a810 with a total count of 0:

jump table heuristic failure
```

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D131243
2022-08-17 17:35:16 -07:00
Amir Ayupov cdef841fe7 [BOLT][NFC] Simplify scanExternalRefs
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132013
2022-08-17 17:33:59 -07:00
Fabian Parzefall fd159c2316 [BOLT] Fix ignored LP at fragment start
If the first block of a fragment is also a landing pad, the landing pad
is not used if an exception is thrown. This is because the landing pad
is at the same start address that the corresponding LSDA describes. In
that case, the offset in the call site records to refer to that landing
pad is zero, and a zero offset is interpreted by the personality
function as "no handler" and ignored.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D132053
2022-08-17 16:34:44 -07:00
Amir Ayupov 29f2301322 [BOLT][NFC] Simplify addRelocation
Move the implementation out of the header file.
Simplify the method.
Add debug logging.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D131811
2022-08-17 16:07:28 -07:00
Fabian Parzefall aed75748de [BOLT] Remove old layout from function layout
To track whether a function's new layout is different from its old
layout when updating it, the old layout would be kept around in memory
indefinitely (if the new layout is different). This was used only for
debugging/logging purposes. This patch forces the caller of function
layout's update method to copy the old layout into a temporary if they
need it by removing the old layout fields.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D131413
2022-08-17 15:06:17 -07:00
Fabian Parzefall 0f8412c19c [BOLT] Add main fragment to function layout
Functions that do not contain any code still have to be emitted. This
occurs on AArch64 where functions can consist only of a constant island.
To support fragment semantics in code emission, this commits adds a
guaranteed main fragment to function layout. This fragment might be
empty, but allows us omit checks whether the function is empty in most
places.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D130051
2022-08-17 14:51:31 -07:00
Amir Ayupov 556efdba85 [BOLT][NFC] Extend debug logging in analyzeJumpTable
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D131918
2022-08-15 20:34:40 -07:00
Kazu Hirata 2febc32c9c Use llvm::erase_if (NFC) 2022-08-13 12:55:48 -07:00
Fangrui Song 53113515cd [BOLT] Use Optional::emplace to avoid move assignment. NFC 2022-08-12 12:51:50 -07:00
Thorsten Schütt 0c9258612b [bolt] silence unused variables warnings 2022-08-06 20:52:45 +02:00
Rafael Auler 19eb908e61 [BOLT] Remove always true if statement
Got a warning from GCC when building this.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D131092
2022-08-03 13:11:33 -07:00
David Blaikie 7651522b78 Fold assert-used variable into assert
Fixes #56724
2022-08-01 21:57:11 +00:00
Amir Ayupov 468d4f6d18 Revert "[BOLT] Ignore functions accessing false positive jump tables"
This diff uncovers an ASAN leak in getOrCreateJumpTable:
```
Indirect leak of 264 byte(s) in 1 object(s) allocated from:
    #1 0x4f6e48c in llvm::bolt::BinaryContext::getOrCreateJumpTable ...
```
The removal of an assertion needs to be accompanied by proper deallocation of
a `JumpTable` object for which `analyzeJumpTable` was unsuccessful.

This reverts commit 52cd00cabf.
2022-07-30 10:39:46 -07:00
Kazu Hirata b498a8991e [bolt] Remove redundaunt control-flow statements (NFC)
Identified with readability-redundant-control-flow.
2022-07-30 10:35:49 -07:00
Huan Nguyen 52cd00cabf [BOLT] Ignore functions accessing false positive jump tables
Disassembly and branch target analysis are not decoupled, so any
analysis that depends on disassembly may not operate properly.

In specific, analyzeJumpTable uses instruction bounds check property.
A jump table was analyzed twice: (a) during disassembly, and (b) after
disassembly, so there are potentially some mismatched results.

In this update, functions that access JTs which fail the second check
will be marked as ignored.

Test Plan:
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D130431
2022-07-28 23:22:17 -07:00