This patch adds a `CLANG_BOLT_INSTRUMENT` option that applies BOLT instrumentation to Clang, performs a bootstrap build with the resulting Clang, merges the resulting fdata files into a single profile file, and uses it to perform BOLT optimization on the original Clang binary.

The projects and targets used for bootstrap/profile collection are configurable via `CLANG_BOLT_INSTRUMENT_PROJECTS` and `CLANG_BOLT_INSTRUMENT_TARGETS`. The defaults are "llvm" and "count" respectively, which results in a profile with ~5.3B dynamically executed instructions.

The intended use of the functionality is through the BOLT CMake cache file, similar to the PGO 2-stage build:

```
cmake <llvm-project>/llvm -C <llvm-project>/clang/cmake/caches/BOLT.cmake
ninja clang++-bolt # pulls clang-bolt
```

Stats with a recent checkout (clang-16), pre-built BOLT and Clang, 72vCPU/224G:

| Step | Time |
|---|---|
| CMake configure with host Clang + BOLT.cmake | 1m6.592s |
| Instrumenting Clang with BOLT | 2m50.508s |
| CMake configure `llvm` with instrumented Clang | 5m46.364s (~5x slowdown) |
| CMake build `not` with instrumented Clang | 0m6.456s |
| Merging fdata files | 0m9.439s |
| Optimizing Clang with BOLT | 0m39.201s |

Building Clang:

```
cmake ../llvm-project/llvm -DCMAKE_C_COMPILER=... -DCMAKE_CXX_COMPILER=... -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS=clang -DLLVM_TARGETS_TO_BUILD=Native -GNinja
```

| | Release | BOLT-optimized |
|---|---|---|
| cmake | 0m24.016s | 0m22.333s |
| ninja clang | 5m55.692s | 4m35.122s |

I know it's not rigorous, but it shows a ballpark figure.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D132975
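To put the ballpark figure into numbers, the Release and BOLT-optimized timings reported above can be converted to seconds and compared. The following is a minimal sketch and not part of the patch: the helper names `to_seconds` and `speedup` are hypothetical, and the only inputs are the four timings quoted in the commit message.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style duration such as '5m55.692s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized duration: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def speedup(release: str, bolt: str) -> float:
    """Ratio of Release wall time to BOLT-optimized wall time."""
    return to_seconds(release) / to_seconds(bolt)


# Timings copied from the commit message: (Release, BOLT-optimized).
timings = {
    "cmake": ("0m24.016s", "0m22.333s"),
    "ninja clang": ("5m55.692s", "4m35.122s"),
}

for step, (release, bolt) in timings.items():
    ratio = speedup(release, bolt)
    saved = (1 - 1 / ratio) * 100  # percentage of wall time saved
    print(f"{step}: {ratio:.2f}x faster, {saved:.1f}% less wall time")
```

For the `ninja clang` step this works out to roughly a 1.29x speedup, or about 23% less wall time. These are single measurements, so they should be read as indicative only.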
- cxx
- CMakeLists.txt
- README.txt
- lit.cfg
- lit.site.cfg.in
- order-files.lit.cfg
- order-files.lit.site.cfg.in
- perf-helper.py

README.txt
==========================
Performance Training Data
==========================

This directory contains simple source files for use as training data for
generating PGO data and linker order files for clang.