Commit Graph

256 Commits

Author SHA1 Message Date
Amir Ayupov 8cb7a873ab [BOLT][NFC] Add MCPlus::primeOperands iterator_range
Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D125397
2022-05-11 09:34:51 -07:00
Amir Ayupov c2d40f1dfb [BOLT] Add icp-inline option
Add an option to only peel ICP targets that can be subsequently inlined.
Yet there's no guarantee that they will be inlined.

The mode is independent from the heuristic used to choose ICP targets: by exec
count, mispredictions, or memory profile.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124900
2022-05-11 03:21:24 -07:00
Alexander Yermolovich 3abb68a626 [BOLT][DWARF] Fix assert for split dwarf.
Fixing a small bug where it would assert if CU does not modify .debug_addr section.

Differential Revision: https://reviews.llvm.org/D125181
2022-05-08 19:18:17 -07:00
Alexander Yermolovich ba1ac98c62 [BOLT][DWARF] Add version 5 split dwarf support
Added support for DWARF5 Split Dwarf.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D122988
2022-05-05 14:59:05 -07:00
Rahman Lavaee 733dc3e50b [BOLT] Report per-section hotness in bolt-heatmap.
This patch adds a new feature to bolt heatmap to print the hotness of each section in terms of the percentage of samples within that section.

Sample output generated for the clang binary:

Section Name, Begin Address, End Address, Percentage Hotness
.text, 0x1a7b9b0, 0x20a2cc0, 1.4709
.init, 0x20a2cc0, 0x20a2ce1, 0.0001
.fini, 0x20a2ce4, 0x20a2cf2, 0.0000
.text.unlikely, 0x20a2d00, 0x431990c, 0.3061
.text.hot, 0x4319910, 0x4bc6927, 97.2197
.text.startup, 0x4bc6930, 0x4c10c89, 0.0058
.plt, 0x4c10c90, 0x4c12010, 0.9974

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124412
2022-05-05 11:37:46 -07:00
Amir Ayupov f8d2d8b587 [BOLT][NFC] Move getInliningInfo out of Inliner class
`getInliningInfo` is useful in other passes that need to check inlining
eligibility for some function. Move the declaration and InliningInfo definition
out of Inliner class. Prepare for subsequent use in ICP.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124899
2022-05-04 14:08:06 -07:00
Amir Ayupov 2ad1c7540e [BOLT][NFC] Minor cleanup in ICP getCallTargets and canPromoteCallsite
Minor refactoring. NFC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124898
2022-05-04 14:06:53 -07:00
Amir Ayupov 68c7299f16 [BOLT][NFC] Fix MCPlusBuilder::getAliases caching behavior
Caching behavior of `getAliases` causes a failure in unit tests where two
MCPlusBuilder objects are created corresponding to AArch64 and X86:
the alias cache is created for AArch64 but then used for X86.

https://lab.llvm.org/staging/#/builders/211/builds/126

The issue only affects unit tests as we only construct one MCPlusBuilder
for ELF binary.

Resolve the issue by moving alias bitvectors to MCPlusBuilder object.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D124942
2022-05-04 12:53:26 -07:00
Amir Ayupov 60957a5a08 [BOLT] Fix ICPJumpTablesTopN option use
Fix non-sensical `opts::ICPJumpTablesTopN != 0 ? opts::ICPTopN : opts::ICPTopN`.
Refactor/simplify another similar assignment.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124880
2022-05-03 19:34:10 -07:00
Amir Ayupov c3d5372093 [BOLT][NFC] Make ICP options naming uniform
Rename `opts::IndirectCallPromotion*` to `opts::ICP*`, making option naming
uniform and easier to follow.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124879
2022-05-03 19:32:45 -07:00
Amir Ayupov d0b1c98c96 [BOLT][NFC] ICP: simplify findTargetsIndex
Unnest lambda and use `llvm::is_contained`.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124877
2022-05-03 19:31:20 -07:00
Amir Ayupov ec02227bf7 [BOLT][NFC] Refactor ICP::findCallTargetSymbols
Reduce nesting making it easier to read.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124876
2022-05-03 19:29:22 -07:00
Paul Kirth 625e0e611b [BOLT] [NFC] Remove unused variable
This patch fixes a warning from -Wunused-but-set-variable
MismatchedBranches are counted, but are never reported.
Since evaluateProfileData() should already identify and report
these cases, we can safely remove the unused variable.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124588
2022-05-03 15:15:56 +00:00
Amir Ayupov 64421e191b [BOLT][NFC] Reduce Target/{AArch64,X86} dependencies
We don't actually depend on entire X86/AArch64 components that pull in CodeGen,
SelectionDAG etc., just the Desc part with opcode and other definitions.

Note that it doesn't decouple BOLT from these components - we still pull in X86
and AArch64 from top-level llvm-bolt dependencies as we use assembler and
disassembler. It's difficult to reduce these as this requires non-trivial
changes to X86/AArch64 components themselves (e.g. moving out AsmPrinter).

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124206
2022-04-29 20:37:53 -07:00
Paul Kirth a0b8ab1ba3 [BOLT][NFC] Fix warning for unqualified call to std::move
Fixes warning from RetpolineInsertion.cpp:171:44:
warning: unqualified call to std::move [-Wunqualified-std-cast-call]

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D124482
2022-04-26 23:18:20 +00:00
Rahman Lavaee e59e580116 [BOLT] Refactor DataAggregator::printLBRHeatMap.
This also fixes some logs that were impacted by D123067.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D124281
2022-04-25 11:39:44 -07:00
Alexander Yermolovich 014cd37f51 [BOLT][DWARF] Implement monolithic DWARF5
Added implementation to support DWARF5 in monolithic mode.
Next step DWARF5 split dwarf support.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D121876
2022-04-21 16:02:23 -07:00
Alexey Moksyakov 48e894a536 [BOLT] Add R_AARCH64_PREL16/32/64 relocations support
Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D122294
2022-04-21 13:52:47 +03:00
Vladislav Khmelevsky 63686af1e1 [BOLT] Fix build with GCC 7.3.0
The gcc 7.3.0 version raises "could not covert" error without std::move
used explicitly.

Differential Revision: https://reviews.llvm.org/D124009
2022-04-21 13:47:58 +03:00
Maksim Panchenko 76981fbcf6 [BOLT] Add fuzzy function name matching for LLVM LTO
LLVM with LTO can generate function names in the form
func.llvm.<number>, where <number> could vary based on the compilation
environment. As a result, if a profiled binary originated from a
different build than a corresponding binary used for BOLT optimization,
then profiles for such LTO functions will be ignored.

To fix the problem, use "fuzzy" matching with "func.llvm.*" form.

Reviewed By: yota9, Amir

Differential Revision: https://reviews.llvm.org/D124117
2022-04-20 17:00:21 -07:00
Alexander Yermolovich 7d6716786f [BOLT][DWARF] Handle Error returned by visitLocationList
Looks like implementation in llvm changed, and now we need to process error
being returned.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D124133
2022-04-20 16:40:46 -07:00
Amir Ayupov 4f277f28ab [BOLT] Check if LLVM_REVISION is defined
Handle the case where LLVM_REVISION is undefined (due to LLVM_APPEND_VC_REV=OFF
or otherwise) by setting "<unknown>" value as before D123549.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D123852
2022-04-15 06:33:14 -07:00
Amir Ayupov 2a9386726b [BOLT][NFC] Use LLVM_REVISION instead of BOLT_VERSION_STRING
Remove duplicate version string identification

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D123549
2022-04-14 19:16:35 -07:00
Maksim Panchenko 77b75ca53f [BOLT][perf2bolt] Fix base address calculation for shared objects
When processing profile data for shared object or PIE, perf2bolt needs
to calculate base address of the binary based on the map info reported
by the perf tool. When the mapping data provided is for the second
(or any other than the first) segment and the segment's file offset
does not match its memory offset, perf2bolt uses wrong assumption
about the binary base address.

Add a function to calculate binary base address using the reported
memory mapping and use the returned base for further address
adjustments.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D123755
2022-04-14 10:29:53 -07:00
Vladislav Khmelevsky 2f98c5febc [BOLT] Update skipRelocation for aarch64
The ld might relax ADRP+ADD or ADRP+LDR sequences to the ADR+NOP, add
the new case to the skipRelocation for aarch64.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D123334
2022-04-13 22:54:06 +03:00
Maksim Panchenko 36cb736665 [BOLT] Ignore PC-relative relocations from data to data
BOLT expects PC-relative relocations in data sections to reference code
and the relocated data to form a jump table. However, there are cases
where PC-relative addressing is used for data-to-data references
(e.g. clang-15 can generate such code). BOLT should recognize and ignore
such relocations. Otherwise, they will be considered relocations not
claimed by any jump table and cause a failure in the strict mode.

Reviewed By: yota9, Amir

Differential Revision: https://reviews.llvm.org/D123650
2022-04-13 11:13:51 -07:00
Amir Ayupov bad3798113 [BOLT] Fix data race in shortenInstructions
Address ThreadSanitizer warning

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D121338
2022-04-13 11:10:36 -07:00
Rahman Lavaee 0c13d97e2b Allow building heatmaps from basic sampled events with `-nl`.
I find that this is useful for finding event hotspots.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D123067
2022-04-11 15:04:44 -07:00
Amir Ayupov 9b02dc631d [BOLT] Check MCContext errors
Abort on emission errors to prevent a malformed binary being written.
Example:
```
<unknown>:0: error: Undefined temporary symbol .Ltmp26310
<unknown>:0: error: Undefined temporary symbol .Ltmp26311
<unknown>:0: error: Undefined temporary symbol .Ltmp26312
<unknown>:0: error: Undefined temporary symbol .Ltmp26313
<unknown>:0: error: Undefined temporary symbol .Ltmp26314
<unknown>:0: error: Undefined temporary symbol .Ltmp26315
BOLT-ERROR: Emission failed.
```

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D123263
2022-04-08 21:08:39 -07:00
Argyrios Kyrtzidis 330268ba34 [Support/Hash functions] Change the `final()` and `result()` of the hashing functions to return an array of bytes
Returning `std::array<uint8_t, N>` is better ergonomics for the hashing functions usage, instead of a `StringRef`:

* When returning `StringRef`, client code is "jumping through hoops" to do string manipulations instead of dealing with fixed array of bytes directly, which is more natural
* Returning `std::array<uint8_t, N>` avoids the need for the hasher classes to keep a field just for the purpose of wrapping it and returning it as a `StringRef`

As part of this patch also:

* Introduce `TruncatedBLAKE3` which is useful for using BLAKE3 as the hasher type for `HashBuilder` with non-default hash sizes.
* Make `MD5Result` inherit from `std::array<uint8_t, 16>` which improves & simplifies its API.

Differential Revision: https://reviews.llvm.org/D123100
2022-04-05 21:38:06 -07:00
Amir Ayupov f99398fe0e [BOLT][NFC] Move isADD64rr and isADDri out of MCPlusBuilder class
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D123077
2022-04-05 14:32:07 -07:00
Vladislav Khmelevsky 2e51a32219 [BOLT] Check for !isTailCall in isUnconditionalBranch
Add !isTailCall in isUnconditionalBranch check in order to sync the x86
and aarch64 and fix the fixDoubleJumps pass on aarch64.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D122929
2022-04-05 23:39:34 +03:00
Vladislav Khmelevsky 4956e0e197 [BOLT] Fix plt relocations symbol match
The bfd linker adds the symbol versioning string to the symbol name in symtab.
Skip the versioning part in order to find the registered PLT function.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D122039
2022-04-05 15:57:26 +03:00
Amir Ayupov 686406a006 [BOLT][NFC] Use X86 mnemonic checks
Remove switches in X86MCPlusBuilder.cpp, use mnemonic checks instead

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D122853
2022-04-04 14:05:46 -07:00
Vladislav Khmelevsky 3b1314f4de [BOLT] AArch64: Read all static relocations
Read static relocs on the same address, as dynamic in order to update
constant island data address properly.

Differential Revision: https://reviews.llvm.org/D122100
2022-04-03 19:03:35 +03:00
Vladislav Khmelevsky 4c14519ecb [BOLT] LongJmp: Check for shouldEmit
Check that the function will be emitted in the final binary. Preserving
old function address is needed in case it is PLT trampiline, that is
currently not moved by the BOLT.

Differential Revision: https://reviews.llvm.org/D122098
2022-03-31 22:33:09 +03:00
Vladislav Khmelevsky fed958c6cc [BOLT] AArch64: Emit text objects
BOLT treats aarch64 objects located in text as empty functions with
contant islands. Emit them with at least 8-byte alignment to the new
text section.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D122097
2022-03-31 22:28:50 +03:00
Amir Ayupov c31af7cfe3 [MC][BOLT] Add setter for AllowAtInName
Use the setter in BOLT to allow printing names with variant kind in the name
(e.g. "func@PLT").
Fixes BOLT buildbot tests that broke after D122516:
https://lab.llvm.org/buildbot/#/builders/215/builds/3595

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D122694
2022-03-30 13:04:28 -07:00
Vladislav Khmelevsky af9bdcfc46 [BOLT] Align constant islands to 8 bytes
AArch64 requires CI to be aligned to 8 bytes due to access instructions
restrictions. E.g. the ldr with imm, where imm must be aligned to 8 bytes.

Differential Revision: https://reviews.llvm.org/D122065
2022-03-27 22:30:42 +03:00
spupyrev 4609f60ebc [BOLT] Avoid pointless loop rotation
It seems the earlier implementation does not follow the description
in LoopRotationPass.h: It rotates loops even if they are already laid out
correctly. The diff adjusts the behaviour.

Given that the impact of LoopInversionPass is minor, this change won't
yield significant perf differences. Tested on clang-10: there seems to be a
0.1%-0.3% cpu win and a small reduction of branch misses.

**Before:**
BOLT-INFO: 120 Functions were reordered by LoopInversionPass

**After:**
BOLT-INFO: 79 Functions were reordered by LoopInversionPass

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D121921
2022-03-22 12:42:42 -07:00
Vladislav Khmelevsky 5be5d0f56e [BOLT] LongJmp speedup refactoring
Run tentativeLayoutRelocMode twice only if UseOldText option was passed.
Refactor BF loop to break on condtition met.

Differential Revision: https://reviews.llvm.org/D121825
2022-03-18 16:16:47 +03:00
Amir Ayupov 42e8e00189 [BOLT][NFC] Use X86 mnemonic tables
Remove tables from X86MCPlusBuilder, make use of llvm::X86 mnemonic tables.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121573
2022-03-18 01:52:11 -07:00
Amir Ayupov dc1cf838a5 [BOLT] Strip redundant AdSize override prefix
Since LLVM MC now preserves redundant AdSize override prefix (0x67), remove it
in BOLT explicitly (-x86-strip-redundant-adsize, on by default).

Test Plan:
`bin/llvm-lit -a bolt/test/X86/addr32.s`

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120975
2022-03-16 09:38:17 -07:00
Amir Ayupov 698127df51 [BOLT][NFC] Move isMOVSX64rm32 out of MCPlusBuilder
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121669
2022-03-16 08:18:56 -07:00
Vladislav Khmelevsky 62a289d85c [BOLT] LongJmp: Fix hot text section alignment
The BinaryEmitter uses opts::AlignText value to align the hot text
section. Also check that the opts::AlignText is at least
equal opts::AlignFunctions for the same reason, as described in D121392.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D121728
2022-03-16 15:57:46 +03:00
Maksim Panchenko 57f03db195 [BOLT][NFC] Remove unused function
Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D121729
2022-03-15 12:39:14 -07:00
Vladislav Khmelevsky 8ab69baad5 [BOLT] Set cold sections alignment explicitly
The cold text section alignment is set using the maximum alignment value
passed to the emitCodeAlignment. In order to calculate tentetive layout
right we will set the minimum alignment of such sections to the maximum
possible function alignment explicitly.

Differential Revision: https://reviews.llvm.org/D121392
2022-03-15 22:12:17 +03:00
Amir Ayupov 5790441c45 [BOLT][NFC] Use getShortOpcodeArith in X86MCPlusBuilder
Unify `llvm::X86::getRelaxedOpcodeArith` and `getShortArithOpcode` in
X86MCPlusBuilder.cpp.

Addresses https://lists.llvm.org/pipermail/llvm-dev/2022-January/154526.html

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121404
2022-03-12 09:07:28 -08:00
Elvina Yakubova db65429db5 [BOLT] Divide RegularPageSize for X86 and AArch64 cases
For AArch64 in some cases/some distributions ld uses 64K alignment of LOAD segments by default.

Reviewed By: yota9, maksfb

Differential Revision: https://reviews.llvm.org/D119267
2022-03-10 23:09:50 +03:00
Vladislav Khmelevsky 04b87cf0e7 [BOLT] LongJmp: Use per-function alignment values
The per-function alignment values must be used in order to create
tentative layout.

Differential Revision: https://reviews.llvm.org/D121298
2022-03-10 19:48:48 +03:00
Amir Ayupov d1638cb0b5 [BOLT][NFC] Fix print-cfg data race
Addresses ThreadSanitizer warning

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121337
2022-03-09 20:28:06 -08:00
Amir Ayupov d16bbc5340 [BOLT][NFC] Check errors from Obj.dynamicEntries
Addresses fuzzer crash

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121336
2022-03-09 20:24:44 -08:00
Vladislav Khmelevsky 8bdbcfe7d8 [BOLT] Handle ifuncs trampolines for aarch64
The aarch64 uses the trampolines located in .iplt section, which
contains plt-like trampolines on the value stored in .got. In this case
we don't have JUMP_SLOT relocation, but we have a symbol that belongs to
ifunc trampoline, so use it and set set plt symbol for such functions.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D120850
2022-03-09 01:38:19 +03:00
Amir Ayupov 1e016c3bd5 [BOLT][NFC] Handle "dynamic section sizes should match"
Address fuzzer crash on malformed input

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121070
2022-03-08 13:03:05 -08:00
Amir Ayupov 687e4af1c0 [BOLT] CMOVConversion pass
Convert simple hammocks into cmov based on misprediction rate.

Test Plan:
- Assembly test: `cmov-conversion.s`
- Testing on a binary:
  # Bootstrap clang with `-x86-cmov-converter-force-all` and `-Wl,--emit-relocs`
  (Release build)
  # Collect perf.data:

    - `clang++ <opts> bolt/lib/Core/BinaryFunction.cpp -E > bf.cpp`
    - `perf record -e cycles:u -j any,u -- clang-15 bf.cpp -O2 -std=c++14 -c -o bf.o`
  # Optimize clang-15 with and w/o -cmov-conversion:
    - `llvm-bolt clang-15 -p perf.data -o clang-15.bolt`
    - `llvm-bolt clang-15 -p perf.data -cmov-conversion -o clang-15.bolt.cmovconv`
  # Run perf experiment:
    - test: `clang-15.bolt.cmovconv`,
    - control: `clang-15.bolt`,
    - workload (clang options): `bf.cpp -O2 -std=c++14 -c -o bf.o`
Results:
```
  task-clock [delta: -360.21 ± 356.75, delta(%): -1.7760 ± 1.7589, p-value: 0.047951, balance: -6]
  instructions  [delta: 44061118 ± 13246382, delta(%): 0.0690 ± 0.0207, p-value: 0.000001, balance: 50]
  icache-misses [delta: -5534468 ± 2779620, delta(%): -0.4331 ± 0.2175, p-value: 0.028014, balance: -28]
  branch-misses [delta: -1624270 ± 1113244, delta(%): -0.3456 ± 0.2368, p-value: 0.030300, balance: -22]
```

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120177
2022-03-08 10:44:31 -08:00
Amir Ayupov ced5472e09 [BOLT][NFC] Check section contents before registering it
Address fuzzer crash on malformed input:
```
BOLT-ERROR: cannot get section contents for .dynsym: The end of the file was unexpectedly encountered.
```

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121068
2022-03-08 09:13:01 -08:00
Maksim Panchenko fada230920 [BOLT][NFC] Return MCRegister::NoRegister from MCPlusBuilder::getNoRegister()
Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D120863
2022-03-03 13:25:13 -08:00
Vladislav Khmelevsky 00b6efc830 [BOLT] Enable PLT analysis for aarch64
This patch enables PLT analysis for aarch64. It is used by the static
relocations in order to provide final symbol address of PLT entry for some
instructions like ADRP.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D118088
2022-03-02 22:14:48 +03:00
Amir Ayupov 08dcbed92f [BOLT] Fix X86MCPlusBuilder::replaceRegWithImm
Reassigning the operand didn't update the operand type which resulted in an
assertion (`Assertion `isReg() && "This is not a register operand!"' failed.`)
Reset the instruction instead.

Test Plan:
```
ninja check-bolt
...
PASS: BOLT-Unit :: Core/./CoreTests/X86/MCPlusBuilderTester.ReplaceRegWithImm/0 (90 of 136)
```

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120263
2022-02-28 19:24:46 -08:00
Alexander Yermolovich a44fe31977 [BOLT][DWARF] Fix how DW_AT_high_pc [DW_FORM_udata] is handled
We were not handling correctly conversion from DW_AT_high_pc into DW_AT_ranges,
when size of DW_AT_high_pc is not 4/8 bytes.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D120528
2022-02-25 10:32:05 -08:00
Maksim Panchenko 4101aa130a [BOLT] Support PC-relative relocations with addends
PC-relative memory operand could reference a different object from
the one located at the target address, e.g. when a negative offset
is used. Check relocations for the real referenced object.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120379
2022-02-23 22:54:42 -08:00
Amir Ayupov af6e66f44c [BOLT][NFC] Report errors from RewriteInstance `discoverStorage` and `run`
Further improve error handling in BOLT by reporting `RewriteInstance` errors in
a library and fuzzer-friendly way instead of exiting.

Follow-up to D119658

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120224
2022-02-23 20:42:39 -08:00
Alexander Yermolovich 210bb04e23 [BOLT][DWARF] Remove patchLowHigh unused function.
Cleanup after removing caching mechanims for ranges/abbrevs.

Reviewed By: rafauler, yota9

Differential Revision: https://reviews.llvm.org/D120174
2022-02-22 13:28:15 -08:00
Amir Ayupov d44f99c748 [BOLT] Added fuzzer target (llvm-bolt-fuzzer)
This adds a target that would consume random binary as an
input ELF file.
TBD: add structured input support (ELF).

Build:
```
cmake /path/to/llvm-project/llvm -GNinja \
-DLLVM_TARGETS_TO_BUILD="X86;AArch64" \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_ASSERTIONS=1 \
-DCMAKE_C_COMPILER=<sanitizer-capable clang> \
-DCMAKE_CXX_COMPILER=<sanitizer-capable clang++> \
-DLLVM_ENABLE_PROJECTS="bolt"  \
-DLLVM_USE_SANITIZER=Address \
-DLLVM_USE_SANITIZE_COVERAGE=On
ninja llvm-bolt-fuzzer
```

Test Plan: ninja llvm-bolt-fuzzer

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D120016
2022-02-20 17:24:16 -08:00
Amir Ayupov 36ada32727 [BOLT][NFC] Fix data race in ShrinkWrapping stats
Fix data race reported by ThreadSanitizer in clang.test:
```
ThreadSanitizer: data race /data/llvm-project/bolt/lib/Passes/ShrinkWrapping.cpp:1359:28
in llvm::bolt::ShrinkWrapping::moveSaveRestores()
```

The issue is with incrementing global counters from multiple threads.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D120218
2022-02-20 17:21:58 -08:00
Amir Ayupov 32d2473a5d [BOLT][NFC] Report errors from createBinaryContext and RewriteInstance ctor
Refactor createBinaryContext and RewriteInstance/MachORewriteInstance
constructors to report an error in a library and fuzzer-friendly way instead of
returning a nullptr or exiting.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D119658
2022-02-17 00:50:52 -08:00
Vladislav Khmelevsky 729d29e167 [BOLT] Update dynamic relocations from section relocations
This patch changes patchELFAllocatableRelaSections from going through
old relocations sections and update the relocation offsets to emitting
the relocations stored in binary sections. This is needed in case we
would like to remove and add dynamic relocations during BOLT work and it
is used by golang support pass. Note: Currently we emit relocations in
the old sections, so the total number of them should be equal or less
of old number.

Testing: No special tests are neeeded, since this patch does not fix
anything or add new functionality (it only prepares to add). Every
PIC-compiled test binary will use this code and thus become a test.
But just in case the aarch64 dynamic relocations tests were added.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D117612
2022-02-16 18:40:54 +03:00
Shao-Ce SUN 2aed07e96c [NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`
Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846
2022-02-16 13:10:09 +08:00
Shao-Ce SUN 9cc49c1951 Revert "[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`"
This reverts commit fe25c06cc5.
2022-02-16 11:57:49 +08:00
Shao-Ce SUN fe25c06cc5 [NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`
For ten years, it seems that `MCRegisterInfo` is not used by any target.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846
2022-02-16 11:47:17 +08:00
Alexander Yermolovich bd1ebe9d04 [BOLT][DWARF] Add ability to insert new entries in to DIE
Added ability to append new entries to DIE. This is useful to standadize DWARF4
Split Dwarf, and simplify implementation of DWARF5.
Multiple DIEs can share an abbrev. So currently limitation is that only unique
Attributes can be added.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D119577
2022-02-15 18:07:19 -08:00
serge-sans-paille 290e482342 Cleanup LLVMDWARFDebugInfo
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:

llvm/DebugInfo/DWARF/DWARFContext.h no longer includes:
- "llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h"
- "llvm/DebugInfo/DWARF/DWARFCompileUnit.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAranges.h"
- "llvm/DebugInfo/DWARF/DWARFDebugFrame.h"
- "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
- "llvm/DebugInfo/DWARF/DWARFDebugMacro.h"
- "llvm/DebugInfo/DWARF/DWARFGdbIndex.h"
- "llvm/DebugInfo/DWARF/DWARFSection.h"
- "llvm/DebugInfo/DWARF/DWARFTypeUnit.h"
- "llvm/DebugInfo/DWARF/DWARFUnitIndex.h"

Plus llvm/Support/Errc.h not included by a bunch of llvm/DebugInfo/DWARF/DWARF*.h files

Preprocessed lines to build llvm on my setup:
after: 1065629059
before: 1066621848

Which is a great diff!

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119723
2022-02-15 09:16:03 +01:00
Maksim Panchenko 5a343994c3 [BOLT] Make order of jump table successors deterministic
When a jump table is recovered in postProcessIndirectBranches(),
successors for the containing basic block are added in random order.
Make the order deterministic.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D119672
2022-02-14 10:37:20 -08:00
Maksim Panchenko 641e92d46b [BOLT] Skip warning message if no functions were ignored
Reviewed By: yota9, Amir

Differential Revision: https://reviews.llvm.org/D119673
2022-02-14 10:31:43 -08:00
serge-sans-paille 57f7c7d90e Add missing MC includes in bolt/
Changes needed after ef736a1c39 that removes some implicit
dependencies from MrCV headers.
2022-02-09 08:28:34 -05:00
Alexander Yermolovich 0d9921daad [BOLT][DWARF] Remove caching of ranges/abbrevs
Removing caching of ranges/abbrevs to simplify the code.
Before we were doing it to get around a gdb limitation.
FBD34015613

Reviewed By: Amir, maksfb

Differential Revision: https://reviews.llvm.org/D119276
2022-02-08 16:37:40 -08:00
Vladislav Khmelevsky 19fb5a210d [BOLT] Add aarch64 support for peephole passes
Enable peephole optimizations for aarch64.
Also small code refactoring - add PeepholeOpts under Peepholes class.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D118732
2022-02-08 03:04:40 +03:00
Vladislav Khmelevsky 5c2ae5f454 [BOLT] Refactor heatmap to be standalone tool
Separate heatmap from bolt and build it as standalone tool.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D118946
2022-02-07 22:00:44 +03:00
Amir Ayupov 194b164eb5 [BOLT][NFC] Fix compiler warnings
Summary:
- variable 'TotalSize' set but not used
- variable 'TotalCallsTopN' set but not used
- use of bitwise '|' with boolean operands

Reviewed By: maksfb

FBD33911129
2022-02-04 15:57:33 -08:00
Amir Ayupov 167b623a6a [BOLT][NFC] Use isInt<> instead of range checks
Summary: Reuse LLVM isInt check

Reviewed By: maksfb

FBD33945182
2022-02-02 20:32:05 -08:00
Alexander Yermolovich 9f3f9d19c7 [BOLT][DWARF] Handle shared abbrev section
We can have a scenario where multiple CUs share an abbrev table.
We modify or don't modify one CU, which leads to other CUs having invalid abbrev section.
Example that caused it.
All of CUs shared the same abbrev table. First CU just had compile_unit and sub_program.
It was not modified. Next CU had DW_TAG_lexical_block with
DW_AT_low_pc/DW_AT_high_pc converted to DW_AT_low_pc/DW_AT_ranges.
We used unmodified abbrev section for first and subsequent CUs.
So when parsing subsequent CUs debug info was corrupted.

In this patch we will now duplicate all sections that are modified and are different.
This also means that if .debug_types is present and it shares Abbrev table, and
they usually are, we now can have two Abbrev tables. One for CU that was modified,
and unmodified one for TU.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D118517
2022-01-31 11:10:23 -08:00
Vladislav Khmelevsky e900f0584e [BOLT] Fix AARCH64 registers aliasing
The aarch64 platform has special registers like X0_X1_X2_X3_X4_X5_X6_X7.
Using the downwards propagation this register will become a super
register for all X0..X7 and its super registers which is not right. This
patch replaces the downwards propagation with caching all the aliases using MCRegAliasIterator.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D117394
2022-01-28 01:24:35 +03:00
Alexander Yermolovich 612f0f4568 [BOLT][DWARF] Fix gdb index section
Since we now re-write .debug_info the DWARF CU Offsets can change.
Just like for .debug_aranges the GDB Index will need to be updated.

Reviewed By: Amir, maksfb

Differential Revision: https://reviews.llvm.org/D118273
2022-01-27 12:07:58 -08:00
Vladislav Khmelevsky 20e9d4caf0 [BOLT] Prepare BOLT for unit-testing
This patch adds unit testing support for BOLT. In order to do this we will need at least do this changes on the code level:
* Make createMCPlusBuilder accessible externally
* Remove positional InputFilename argument to bolt utlity sources
And prepare the cmake and lit for the new tests.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Reviewed By: maksfb, Amir

Differential Revision: https://reviews.llvm.org/D118271
2022-01-27 00:22:13 +03:00
Amir Ayupov f8c7fb499b [BOLT][NFC] Reduce includes with include-what-you-use
Summary: Removed redundant includes with IWYU

Test Plan: ninja bolt

Reviewers: maksfb

FBD32043568
2022-01-21 12:05:47 -08:00
Amir Ayupov 5a654b0113 [BOLT] Make ICP target selection (more) deterministic
Summary: Break ties by selecting targets with lower addresses.

Reviewers: maksfb

FBD33677001
2022-01-21 12:03:43 -08:00
Amir Ayupov f18fcdabda [BOLT][NFC] Expand auto types pt.2
Summary: Expand autos where it may lead to differences in the BOLT binary.

Test Plan: NFC

Reviewers: maksfb

Reviewed By: maks

FBD27673231
2022-01-21 12:02:57 -08:00
Vladislav Khmelevsky bb8e7ebaad [BOLT] Remove unreachable uncond branch after return
This patch fixes the removal of unreachable uncondtional branch located
after return instruction.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D117677
2022-01-19 22:06:26 +03:00
Amir Ayupov a9cd49d50e [BOLT][NFC] Move Offset annotation to Group 1
Summary:
Move the annotation to avoid dynamic memory allocations.
Improves the CPU time of instrumenting a large binary by 1% (+-0.8%, p-value 0.01)

Test Plan: NFC

Reviewers: maksfb

FBD30091656
2022-01-18 13:24:50 -08:00
Vladislav Khmelevsky ad4e26833f updateDWARFObjectAddressRanges: nullify low pc
In case the case the DW_AT_ranges tag already exists for the object the
low pc values won't be updated and will be incorrect in
after-bolt binaries.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D117216
2022-01-18 22:37:29 +03:00
Alexander Yermolovich ea6c8b013e [BOLT][DWARF] Reduce overhead for sized dealloc
This is a follow up to Fix size mismatch error with jemalloc.
4243b6582c
Although that fix works it increased memory footprint.
With this patch we go back to original memory footprint.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D117341
2022-01-14 17:32:48 -08:00
Amir Ayupov 29fe14c78d [BOLT][NFC] Remove redundant dependent template type
Summary:
Reduce code size by removing redundant dependent template type
from RewriteInstance methods.

Code size savings (via bloaty on llvm-bolt Debug build):
```
symbol,vmsize,filesize -> vmsize,filesize (delta vmsize,filesize)
updateELFSymbolTable         57096,59600 -> 56656,59048 (440,552)
updateELFSymbolTable::lambda 35957,55277 -> 35949,54485   (8,792)
getOutputSections            20592,21440 -> 20372,21156 (220,284)
getOutputSections::lambda      1792,5300 ->   1792,5372   (0,-72)

total delta (668,1556)
```

Reviewed By: maksfb

FBD33589393
2022-01-14 15:47:15 -08:00
Amir Ayupov c34adaa3ca [BOLT][CMAKE] Use IN_LIST check
Summary:
Address @smeenai feedback https://reviews.llvm.org/D117061#inline-1122106:
>CMake has if(IN_LIST) now, which you can use instead of the string(FIND)

IN_LIST is available since CMake 3.3 released in 2015.

Reviewed By: smeenai

FBD33590959
2022-01-14 15:47:14 -08:00
Vladislav Khmelevsky fb3b86fedc [BOLT][DWARF] Fix high pc patching
The DW_FORM_addr form of highPC address is written in absolute addres,
the data form is written in offset-from-low pc format.

Due to the large test binary the test is prepared separately in
https://github.com/rafaelauler/bolt-tests/pull/8

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D117217
2022-01-15 01:05:16 +03:00
Amir Ayupov d914486a9a [BOLT][NFC] Refactor reset-release to move assignment
Summary:
Follow the clang-tidy suggestion to replace reset-release with move assignment.

Move assignment's effect for unique_ptr:
> Effects: Transfers ownership from `u` to `*this` as if by calling `reset(u.release())`
followed by an assignment from `std::forward<D>(u.get_deleter())`.
2022-01-13 22:47:15 -08:00
Amir Ayupov 18bc405a09 [BOLT][NFC] Remove uses of `std::vector<bool>`
Summary:
LLVM Programmer’s Manual strongly discourages the use of `std::vector<bool>`
and suggests `llvm::BitVector` as a possible replacement.
2022-01-13 22:46:34 -08:00
Amir Ayupov b1a107db56 [BOLT][NFC] Format braced initializer lists
Summary:
Use assignment (`=`) with braced initializer lists when constructing
aggregate temporaries in expressions.

https://llvm.org/docs/CodingStandards.html#braced-initializer-lists

(cherry picked from FBD33515669)
2022-01-10 12:45:55 -08:00
Maksim Panchenko 8aab58ba65 [BOLT][NFC] Refactor AArch64MCPlusBuilder
Summary: Selectively apply clang-format to the code in AArch64MCPlusBuilder.cpp.

(cherry picked from FBD33495653)
2022-01-08 18:17:31 -08:00
Maksim Panchenko 82278a8f29 [BOLT][NFC] Refactor X86MCPlusBuilder
Summary:
Selectively apply clang-format and other minor refactoring to the code
in X86MCPlusBuilder.cpp

(cherry picked from FBD33495550)
2022-01-08 17:48:33 -08:00
Amir Ayupov 799cbbb797 [BOLT][NFC] Reuse X86BaseInfo interfaces for macrofusion checks
Summary:
Remove X86MCPlusBuilder code that duplicates checks in X86BaseInfo.
Remove isINC and isDEC as redundant.

The new code of `X86MCPlusBuilder::isMacroOpFusionPair` is functionally
equivalent to `X86AsmBackend::isMacroFused`. However, as the method is
declared/defined in X86AsmBackend.cpp and not exported in a header file,
there's no way to use it in BOLT without changes in LLVM code.

(cherry picked from FBD33440373)
2022-01-05 15:58:01 -08:00
Amir Ayupov 1d3c150748 [BOLT] Remove ineligible macro-fusion patterns
Summary:
Remove patterns ineligible for macro-fusion:
- First instruction has a memory destination

This is a temporary commit to align BOLT with LLVM MC interfaces.
(cherry picked from FBD33479340)
2022-01-07 09:40:04 -08:00
Maksim Panchenko 330c8e42ab [BOLT][NFC] Refactor command line options in BinaryPassManager
Summary:
Reformat code and put options in lexicographical order.

Comparing to clang-format output, manual formatting looks cleaner to me.

(cherry picked from FBD33481692)
2022-01-07 11:36:22 -08:00
Alexander Yermolovich e579f5c6e7 [BOLT][DWARF] Fix race conditions for debug fission in non-deterministic mode
Summary: Adding mutexes to avoid runtime race conditions.

(cherry picked from FBD33439854)
2022-01-05 15:27:21 -08:00
Maksim Panchenko bc9032c7fa [BOLT][NFC] Use uniform DEBUG_TYPE for MCPlus builders
(cherry picked from FBD33435121)
2022-01-05 12:02:54 -08:00
Maksim Panchenko df288e8487 [BOLT][NFC] Refactor if statements in RewriteInstance
(cherry picked from FBD33341796)
2021-12-28 13:46:45 -08:00
Alexander Yermolovich 6b89327deb [BOLT][DWARF] Handling more data formats for DW_AT_high_pc
Summary:
Adding support for DW_FORM_data_2, DW_FORM_data_1, DW_FORM_udata.
With new .debug_info code only need to modify the check.

(cherry picked from FBD33302731)
2021-12-23 14:49:14 -08:00
Alexander Yermolovich 9bf7a73787 [BOLT][DWARF] Change convertToRanges to not use indirect
Summary:
Now that we are re-writing .debug_info we are not longer restricted to have same size patches.
Simplifying logic to use direct forms.

(cherry picked from FBD32971159)
2021-12-07 17:35:12 -08:00
Amir Ayupov 6bb26fcb20 [BOLT] removeAllSuccessors: handle multiple edges between basic blocks
Summary:
If `addUnknownControlFlow` in `BinaryFunction::postProcessIndirectBranches`
is invoked with a basic block that has multiple edges to the same successor,
it leads to an assertion in `BinaryBasicBlock::removePredecessor`.

For basic blocks with multiple edges to the same successor, the default
behavior of removePredecessor is to remove all occurrences of the
predecessor block in its predecessor list (Multiple=true).

Example:
```A -> B (two edges)

A->removeAllSuccessors()
  for each successor of block A: // B twice
  // this removes both occurrences of A in B's predecessors list
  B->removePredecessor(A);
  // this invocation triggers an assert as A is no longer in B's
  // predecessor list
  B->removePredecessor(A);
```
This issue is not fixed by NormalizeCFG as `removeAllSuccessor` is called
earlier (from `buildCFG` -> `postProcessIndirectBranches`).

Solve this issue by collecting the successors into a set (`SmallPtrSet`) first,
before invoking `SuccessorBB->removePredecessor(this)`.

GitHub issue: https://github.com/facebookincubator/BOLT/issues/187

(cherry picked from FBD30796979)
2021-09-07 16:58:19 -07:00
Alexander Yermolovich 1c2f4bbe99 [BOLT] Rewrite of .debug_info section
Summary:
Changed the behavior of how we handle .debug_info section.
Instead of patching it will now rewrite it.
With this approach we are no longer constrained to having new values
 of the same size.
It handles re-writing by treating .debug_info as raw data.
It copies chunks of data between patches, with new data written in
 between.

(cherry picked from FBD32519952)
2021-11-15 17:19:24 -08:00
Amir Ayupov 883bf0e83d [BOLT][NFC] Fix braces usage in the rest of the codebase
Summary:
Refactor remaining bolt sources to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).

(cherry picked from FBD33345885)
2021-12-28 18:43:53 -08:00
Amir Ayupov def464aaae [BOLT][NFC] Fix braces usage in Profile
Summary:
Refactor bolt/*/Profile to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).

(cherry picked from FBD33345741)
2021-12-28 18:29:54 -08:00
Amir Ayupov 89ceb77997 [BOLT][NFC] Fix braces usage in Target
Summary:
Refactor bolt/lib/Target to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).

(cherry picked from FBD33345353)
2021-12-28 17:52:08 -08:00
Amir Ayupov 3b01fbebeb [BOLT] Fix debug logging in IndirectCallPromotion
Summary:
Access elements of a value pair in HotTargetMap debug logging/loop over
HotTargetMap key-value.

(cherry picked from FBD33344656)
2021-12-28 16:37:53 -08:00
Amir Ayupov f92ab6af35 [BOLT][NFC] Fix braces usage in Passes
Summary:
Refactor bolt/*/Passes to follow the braces rule for if/else/loop from
[LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html).

(cherry picked from FBD33344642)
2021-12-28 16:36:17 -08:00
Maksim Panchenko ee0e9ccb52 [BOLTRewrite][NFC] Fix braces usages
Summary:
Refactor bolt/*/Rewrite to follow the braces rule for if/else/loop from
LLVM Coding Standards.

(cherry picked from FBD33305364)
2021-12-23 12:38:33 -08:00
Vladislav Khmelevsky 2d84e344d9 [PR][BOLT] Check for end iterator in LongJmp stub lookup
Summary:
The lower_bound might return the end iterator, the ignoring of which will
cause memory corruption.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

(cherry picked from FBD33307803)
2021-11-28 02:56:30 +03:00
Maksim Panchenko e1eeef5b90 [BOLT][RFC] Use new LLVM license for ADRRelaxationPass
Summary: Fixes facebookincubator/BOLT#271

(cherry picked from FBD33299273)
2021-12-23 10:49:37 -08:00
Rafael Auler b392ec696b Re-enable Windows build and fix issues
Summary:
Fix missing string header file inclusion and link_fdata find
problem in lit tests. Change root-level tests to require
linux. Re-enable Windows in our root CMakeLists.txt.

(cherry picked from FBD33296290)
2021-12-23 05:59:35 -08:00
Rafael Auler 3652483c8e [BOLTCore] [NFC] Fix braces usages according to LLVM
Summary:
Fix according to Coding Standards doc, section Don't Use
Braces on Simple Single-Statement Bodies of if/else/loop Statements.
This set of changes applies to lib Core only.

(cherry picked from FBD33240028)
2021-12-20 11:07:46 -08:00
Maksim Panchenko 2f09f445b2 [BOLT][NFC] Fix file-description comments
Summary: Fix comments at the start of source files.

(cherry picked from FBD33274597)
2021-12-21 10:21:41 -08:00
Maksim Panchenko 226c973280 [BOLT][NFC] Remove another unused function
Summary: Remove DataReader::getBranchRange().

(cherry picked from FBD32810933)
2021-12-02 13:41:59 -08:00
Maksim Panchenko ccb99dd126 [BOLT] Fix profile and tests for nop-removal pass
Summary:
Since nops are now removed in a separate pass, the profile is consumed
on a CFG with nops. If previously a profile was generated without nops,
the offsets in the profile could be different if branches included nops
either as a source or a destination.

This diff adjust offsets to make the profile reading backwards
compatible.

(cherry picked from FBD33231254)
2021-12-18 17:05:00 -08:00
Vladislav Khmelevsky 08f56926c2 [BOLT] Move disassemble optimizations to optimization passes
Summary:
The patch moves the shortenInstructions and nop remove to separate binary
passes. As a result when llvm-bolt optimizations stage will begin the
instructions of the binary functions will be absolutely the same as it
was in the binary. This is needed for the golang support by llvm-bolt.
Some of the tests must be changed, since bb alignment nops might create
unreachable BBs in original functions.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

(cherry picked from FBD32896517)
2021-12-18 17:03:35 -08:00
Rafael Auler 46e93fb427 Fix frameopt crash when processing POPF
Summary: POPF instruction was triggering an assertion in our analysis.

(cherry picked from FBD33141809)
2021-12-15 13:29:46 -08:00
Elvina Yakubova 4a4045f740 [PR] Fix update-debug-sections for AArch64
Summary:
This patch adds AArch64 relocations handling in case updating of
debug sections is enabled

Elvina Yakubova,
Advanced Software Technology Lab, Huawei

(cherry picked from FBD33077609)
2021-12-08 16:53:38 +03:00
Maksim Panchenko 40c2e0fafe [BOLT][NFC] Reformat with clang-format
Summary: Selectively apply clang-format to BOLT code base.

(cherry picked from FBD33119052)
2021-12-14 16:52:51 -08:00
Amir Ayupov 6aa735ceaf [BOLT] Split functions: support fragments with multiple parents
Summary:
Gracefully handle binaries with split functions where two fragments are folded
into one, resulting in a fragment with two parent functions.

This behavior is expected in GCC8+ with -O2 optimization level, where both
function splitting and ICF are enabled by default.

On the BOLT side, the changes are:
- BinaryFunction: allow multiple parent fragments:
  - `ParentFragment` --> `ParentFragments`,
  - `setParentFragment` --> `addParentFragment`.
- BinaryContext:
  - `populateJumpTables`: mark fragments to be skipped later,
  - `registerFragment`: add a name heuristic check, return false if it failed,
  - `processInterproceduralReferences`: check if `registerFragment`
succeeded, otherwise issue a warning,
  - `skipMarkedFragments`: move out fragment traversal and skipping from
  `populateJumpTables` into a separate function.

This change fixes an issue where unrelated functions might be registered
as fragments:

```
BOLT-WARNING: interprocedural reference between unrelated fragments:
bad_gs/1(*2) and amd_decode_mce.cold.27/1(*2)
```

(Linux kernel binary)

(cherry picked from FBD32786688)
2021-12-01 21:14:56 -08:00
Maksim Panchenko 69706eafab [BOLT] Refactor BinaryBasicBlock to use ADT
Summary:
Refactor members of BinaryBasicBlock. Replace some std containers with
ADT equivalents. The size of BinaryBasicBlock on x86-64 Linux is reduced
from 232 bytes to 192 bytes.

(cherry picked from FBD33081850)
2021-12-09 11:53:12 -08:00
Maksim Panchenko ebe51c4d23 [BOLT] Use more ADT data structures for BinaryFunction
Summary:
Switched members of BinaryFunction to ADT where it was possible and
made sense. As a result, the size of BinaryFunction on x86-64 Linux
reduced from 1624 bytes to 1448.

(cherry picked from FBD32981555)
2021-12-08 22:59:09 -08:00
Maksim Panchenko a73b1b7289 [BOLT][NFC] Clear HFSort copyright/license
Summary:
Remove the copyright/license message for the code originated from
Facebook.

(cherry picked from FBD32998404)
2021-12-09 12:24:16 -08:00
Alexander Yermolovich 1417f607bd [BOLT][DWARF] Fix for abbrev check in DWP case
Summary:
For DWP case the AbbreviationsOffset is the offset of the abbrev
 contribution in the DWP file, so can be none zero.

(cherry picked from FBD32961240)
2021-12-08 12:04:45 -08:00
Maksim Panchenko b73c87bc4f [BOLT][DWARF] Force allocation of debug_line in RuntimeDyld
Summary:
Currently, RuntimeDyld will not allocate a section without relocations
even if such a section is marked allocatable and defines symbols.

When we emit .debug_line for compile units with unchanged code, we
output original (input) data, without relocations. If all units are
emitted in this way, we will have no relocations in the emitted
.debug_line. RuntimeDyld will not allocate the section and as a result
we will write an empty .debug_line section.

To workaround the issue, always emit a relocation of RELOC_NONE type
when emitting raw contents to debug_line.

(cherry picked from FBD32909869)
2021-12-06 23:32:40 -08:00
Maksim Panchenko cbf530bf41 [BOLT] Add pass to normalize CFG
Summary:
Some optimizations may remove all instructions in a basic block.

The pass will cleanup the CFG afterwards by removing empty basic
blocks and merging duplicate CFG edges.

The normalized CFG is printed under '-print-normalized' option.

(cherry picked from FBD32774360)
2021-12-01 13:57:50 -08:00
Amir Ayupov b69d487a62 [BOLT][NFC] Remove unused MCPlusBuilder::isEnter
Summary: Remove unused code identified via coverage report.

(cherry picked from FBD32818608)
2021-12-02 17:23:58 -08:00
Amir Ayupov 8e632eae56 [BOLT][NFC] Remove unused MCPlusBuilder::createIndirectCall method
Summary: Remove unused code identified via coverage report.

(cherry picked from FBD32818329)
2021-12-02 17:19:33 -08:00
Amir Ayupov 02145d20ab [BOLT] Tail duplication: disable const/copy propagation by default as a workaround
Summary:
Disable const/copy propagation as a bug workaround.
Also add the debug logging in aggressive duplication.

(cherry picked from FBD32774744)
2021-12-01 14:05:05 -08:00
Maksim Panchenko 4f91538f57 [BOLT][NFC] Remove misleading debug message
Summary:
The debug message for the last fall-through block was printed under the
reverse condition, i.e. when the block was not a fall-through. Remove
the debug message. If we'll need such information, we can add a pass
with more analysis, i.e. checking the last instruction, if the block is
reachable, etc.

(cherry picked from FBD32670816)
2021-11-25 13:14:16 -08:00
Amir Ayupov eb9f4eb6ab [BOLT][NFC] Better diagnostics for unsupported relocation types
Summary: Print the relocation name instead of just the number.

(cherry picked from FBD32704832)
2021-11-29 13:20:03 -08:00
Amir Ayupov 76cd07f9e4 [BOLT] Tail Duplication: fix jump table check
Summary: The intent is clearly to check the current basic block.

(cherry picked from FBD32658103)
2021-11-24 15:39:24 -08:00
Amir Ayupov 7261655d2c [BOLT] Tail Duplication: skip unreachable blocks
Summary:
TailDuplication::isInCacheLine makes the assumption that the block
has a valid layout index, which is not the case for unreachable blocks.
Add a check for a valid layout index.

(cherry picked from FBD32659755)
2021-11-24 16:13:42 -08:00
Rafael Auler a23726bb33 [BOLT] Fix crash when trying to resolve external symbols for runtime libs
Summary:
As pointed out by Vladislav in issue 217, if our RTDyld-based
linker fails to locate a symbol, it will crash with segfault. Fix that.

(cherry picked from FBD32481543)
2021-11-16 16:47:02 -08:00
Amir Ayupov d474dbdfcb [BOLT][NFC] Use function names passed in -funcs-no-regex as-is
Summary:
Currently there are two issues rendering the use of bughunter/BOLT on a binary
with a large number of functions (100k) impossible:
1) `selectFunctionsToProcess` has O(binary_fn * force_fn) run-time, which is up
to quadratic with the number of functions in the binary.
2) It unnecessarily treats supplied function names as regexes.

This diff proposes the following changes to address the issue:
1. Add two options that treat function names as is, not as regexes, matching
bughunter usage model: `-funcs-no-regex`/`-funcs-file-no-regex`.
These options are complementary to `-funcs`/`-funcs-file` and `-skip-funcs`/
`-skip-funcs-file`. `funcs` takes precedence over `funcs-no-regex`.
2. Use string set to speed up function eligibility checking with
`-funcs-file-no-regex` to O(binary_fn * log force_fn).

(cherry picked from FBD28917225)
2021-06-04 18:49:29 -07:00
Vladislav Khmelevsky a944a487ae [PR] Fix ShrinkWrapping pop order
Summary:
The push and pop instructions might have wrong reorder due to this
error. Thanks rafaelauler for the provided test case.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

(cherry picked from FBD32478348)
2021-11-14 02:23:20 +03:00
Alexander Yermolovich 68b0003ee3 [BOLT][DWARF] Fix for Unsupported Debug section: debug_line.dwo warning
Summary: Probably copy and paste mistake or something.

(cherry picked from FBD32625751)
2021-11-23 11:52:25 -08:00
Rafael Auler 2ccea6eac3 Fix shared build
Summary:
We were not tracking -DBUILD_SHARED_LIBS=ON and we introduced
some commits that break that by not specifying the correct dependencies.
Fix that.

(cherry picked from FBD32453377)
2021-11-15 20:01:48 -08:00
Rafael Auler ae585be11c [BOLT] Fix Windows build
Summary:
Make BOLT build in VisualStudio compiler and run without
crashing on a simple test. Other tests are not running.

(cherry picked from FBD32378736)
2021-11-11 18:14:53 -08:00
Amir Ayupov a82502d4a8 [BOLT][NFC] AsmDump: disable printing of empty profile data
Summary:
Moved the FDATA printing under the condition of non-empty profile data.
The change reduces the assembly dumps.

(cherry picked from FBD32262675)
2021-11-08 14:14:35 -08:00
Amir Ayupov 9ab0662211 [BOLT] TailDuplication: skip non-simple functions
Summary:
Replace erroneous check for function eligibility from `Function.isIgnored()`
to `shouldOptimize(Function)`. This prevents non-simple functions from being
processed.

(cherry picked from FBD32301958)
2021-11-09 17:13:00 -08:00
Maksim Panchenko 933df2a460 [BOLT][NFC] Remove references to internal tasks
(cherry picked from FBD32272387)
2021-11-08 19:54:05 -08:00
Maksim Panchenko 45f94abcd9 [BOLT][DWARF] Fix rare problem while rewriting debug_abbrev after LTO
Summary:
With LTO, it's possible for multiple DWARF compile units to share the
same abbreviation section set, i.e. to have the same abbrev_offset.
When units sharing the same abbrev set are located next to each other
and neither of them is being processed (i.e. contain processed
functions), it can trigger a bug in BOLT. When this happened,
the abbrev set is considered empty. Additionally, different units
may patch abbrev section differently.

The fix is to not rely on the next unit offset when detecting
abbreviation set boundaries and to delay writing abbrev section
until all units are processed.

(cherry picked from FBD31985046)
2021-10-27 20:28:17 -07:00