=============================================================
How To Build Clang and LLVM with Profile-Guided Optimizations
=============================================================

Introduction
============

PGO (Profile-Guided Optimization) allows your compiler to better optimize code
for how it actually runs. Users report that applying this to Clang and LLVM can
decrease overall compile time by 20%.

This guide walks you through how to build Clang with PGO, though it also applies
to other subprojects, such as LLD.


Using the script
================

We have a script at ``utils/collect_and_build_with_pgo.py``. This script is
tested on a few Linux flavors, and requires a checkout of LLVM, Clang, and
compiler-rt. Despite the name, it performs four clean builds of Clang, so it
can take a while to run to completion. Please see the script's ``--help`` for
more information on how to run it, and the different options available to you.
If you want to get the most out of PGO for a particular use-case (e.g. compiling
a specific large piece of software), please do read the section below on
'benchmark' selection.

Because the script is only tested on a few Linux distros, patches that add
support for other platforms are, as always, highly appreciated. :)

This script also supports a ``--dry-run`` option, which causes it to print
important commands instead of running them.
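
As a rough sketch of a first run, the two options mentioned above can be tried
like this. It assumes your current directory is the LLVM source directory that
contains ``utils/`` and that ``python3`` is the interpreter you use; any further
arguments the script needs are not shown here and should be taken from its help
output.

.. code-block:: bash

  # Assumption: run from the directory that contains utils/.
  # List the supported options before doing anything else.
  python3 utils/collect_and_build_with_pgo.py --help

  # Print the important commands instead of running them. The script may
  # require additional arguments; check --help for what it expects.
  python3 utils/collect_and_build_with_pgo.py --dry-run

Reading the ``--dry-run`` output first is a cheap way to see what the four
builds will do before committing the time to run them.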


Selecting 'benchmarks'
======================

PGO does best when the profiles gathered represent how the user plans to use the
compiler. Notably, highly accurate profiles of llc building x86_64 code aren't
incredibly helpful if you're going to be targeting ARM.

By default, the script above does two things to get solid coverage. It:

- runs all of Clang and LLVM's lit tests, and
- uses the instrumented Clang to build Clang, LLVM, and all of the other
  LLVM subprojects available to it.

Together, these should give you:

- solid coverage of building C++,
- good coverage of building C,
- great coverage of running optimizations,
- great coverage of the backend for your host's architecture, and
- some coverage of other architectures (if backends for them are enabled in
  your build).

Altogether, this should cover a diverse set of uses for Clang and LLVM. If you
have very specific needs (e.g. your compiler is meant to compile a large browser
for four different platforms, or similar), you may want to do something else.
This is configurable in the script itself.


Building Clang with PGO
=======================

If you prefer to not use the script, this section briefly goes over how to
build Clang/LLVM with PGO.

First, you should have at least LLVM, Clang, and compiler-rt checked out
locally.

Next, at a high level, you're going to need to do the following:

1. Build a standard Release Clang and the relevant libclang_rt.profile library
2. Build Clang using the Clang you built above, but with instrumentation
3. Use the instrumented Clang to generate profiles, which consists of two steps:

   - Running the instrumented Clang/LLVM/lld/etc. on tasks that represent how
     users will use said tools.
   - Using a tool to convert the "raw" profiles generated above into a single,
     final PGO profile.

4. Build a final release Clang (along with whatever other binaries you need)
   using the profile collected from your benchmark

In more detailed steps:

1. Configure a Clang build as you normally would. It's highly recommended that
   you use the Release configuration for this, since it will be used to build
   another Clang. Because you need Clang and supporting libraries, you'll want
   to build the ``all`` target (e.g. ``ninja all`` or ``make -j4 all``).

2. Configure a Clang build as above, but add the following CMake args:

   - ``-DLLVM_BUILD_INSTRUMENTED=IR`` - This causes us to build everything
     with instrumentation.
   - ``-DLLVM_BUILD_RUNTIME=No`` - A few projects have bad interactions when
     built with profiling, and aren't necessary to build. This flag turns them
     off.
   - ``-DCMAKE_C_COMPILER=/path/to/stage1/clang`` - Use the Clang we built in
     step 1.
   - ``-DCMAKE_CXX_COMPILER=/path/to/stage1/clang++`` - Same as above.

   In this build directory, you simply need to build the ``clang`` target (and
   whatever supporting tooling your benchmark requires).

3. As mentioned above, this has two steps: gathering profile data, and then
   massaging it into a useful form:

   a. Build your benchmark using the Clang generated in step 2. The 'standard'
      benchmark recommended is to run ``check-clang`` and ``check-llvm`` in your
      instrumented Clang's build directory, and to do a full build of Clang/LLVM
      using your instrumented Clang. So, create yet another build directory,
      with the following CMake arguments:

      - ``-DCMAKE_C_COMPILER=/path/to/stage2/clang`` - Use the Clang we built in
        step 2.
      - ``-DCMAKE_CXX_COMPILER=/path/to/stage2/clang++`` - Same as above.

      If your users are fans of debug info, you may want to consider using
      ``-DCMAKE_BUILD_TYPE=RelWithDebInfo`` instead of
      ``-DCMAKE_BUILD_TYPE=Release``. This will grant better coverage of
      debug info pieces of clang, but will take longer to complete and will
      result in a much larger build directory.

      It's recommended to build the ``all`` target with your instrumented Clang,
      since more coverage is often better.

   b. You should now have a few ``*.profraw`` files in
      ``/path/to/stage2/profiles/``. You need to merge these using
      ``llvm-profdata``, even if you only have one, since the merge also
      transforms profraw into actual profile data. This can be done with
      ``/path/to/stage1/llvm-profdata merge
      -output=/path/to/output/profdata.prof /path/to/stage2/profiles/*.profraw``.

4. Now, build your final, PGO-optimized Clang. To do this, you'll want to pass
   the following additional arguments to CMake.

   - ``-DLLVM_PROFDATA_FILE=/path/to/output/profdata.prof`` - Use the PGO
     profile from the previous step.
   - ``-DCMAKE_C_COMPILER=/path/to/stage1/clang`` - Use the Clang we built in
     step 1.
   - ``-DCMAKE_CXX_COMPILER=/path/to/stage1/clang++`` - Same as above.

   From here, you can build whatever targets you need.

   .. note::

     You may see warnings about a mismatched profile in the build output. These
     are generally harmless. To silence them, you can add
     ``-DCMAKE_C_FLAGS='-Wno-backend-plugin'
     -DCMAKE_CXX_FLAGS='-Wno-backend-plugin'`` to your CMake invocation.
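
The four steps above can be strung together roughly as follows. This is only a
sketch under several assumptions that are not part of the steps themselves: the
shell variables are hypothetical placeholders, Ninja is the generator, each
build directory keeps its binaries under ``bin/``, and Clang and compiler-rt are
enabled through ``LLVM_ENABLE_PROJECTS`` (your checkout may need a different
option for this).

.. code-block:: bash

  set -eu

  # Hypothetical locations; substitute your own.
  SRC=/path/to/llvm              # directory with LLVM's top-level CMakeLists.txt
  STAGE1=/path/to/stage1         # step 1: plain Release build
  STAGE2=/path/to/stage2         # step 2: instrumented build
  BENCH=/path/to/benchmark       # step 3a: build done by the instrumented Clang
  FINAL=/path/to/final           # step 4: PGO-optimized build
  PROFDATA=/path/to/output/profdata.prof

  # Assumption: this is how your checkout enables Clang and compiler-rt.
  PROJECTS='-DLLVM_ENABLE_PROJECTS=clang;compiler-rt'

  # Step 1: a standard Release build of everything.
  cmake -G Ninja -S "$SRC" -B "$STAGE1" -DCMAKE_BUILD_TYPE=Release "$PROJECTS"
  ninja -C "$STAGE1" all

  # Step 2: an instrumented Clang, compiled with the step 1 Clang.
  cmake -G Ninja -S "$SRC" -B "$STAGE2" -DCMAKE_BUILD_TYPE=Release "$PROJECTS" \
    -DLLVM_BUILD_INSTRUMENTED=IR \
    -DLLVM_BUILD_RUNTIME=No \
    -DCMAKE_C_COMPILER="$STAGE1/bin/clang" \
    -DCMAKE_CXX_COMPILER="$STAGE1/bin/clang++"
  ninja -C "$STAGE2" clang

  # Step 3a: run the 'standard' benchmark: the lit tests in the instrumented
  # build directory, then a full build using the instrumented Clang.
  ninja -C "$STAGE2" check-clang check-llvm
  cmake -G Ninja -S "$SRC" -B "$BENCH" -DCMAKE_BUILD_TYPE=Release "$PROJECTS" \
    -DCMAKE_C_COMPILER="$STAGE2/bin/clang" \
    -DCMAKE_CXX_COMPILER="$STAGE2/bin/clang++"
  ninja -C "$BENCH" all

  # Step 3b: merge the raw profiles into a single profile.
  "$STAGE1/bin/llvm-profdata" merge -output="$PROFDATA" \
    "$STAGE2"/profiles/*.profraw

  # Step 4: the final build, compiled with the step 1 Clang plus the profile.
  cmake -G Ninja -S "$SRC" -B "$FINAL" -DCMAKE_BUILD_TYPE=Release "$PROJECTS" \
    -DLLVM_PROFDATA_FILE="$PROFDATA" \
    -DCMAKE_C_COMPILER="$STAGE1/bin/clang" \
    -DCMAKE_CXX_COMPILER="$STAGE1/bin/clang++"
  ninja -C "$FINAL" clang

Keeping each stage in its own build directory mirrors the steps above and makes
it easy to rerun only the benchmark and merge when you change what you profile.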


Congrats! You now have a Clang built with profile-guided optimizations, and you
can delete all but the final build directory if you'd like.

If this worked well for you and you plan on doing it often, there's a slight
optimization that can be made: LLVM and Clang have a tool called tblgen that's
built and run during the build process. While it's potentially nice to build
this for coverage as part of step 3, none of your other builds should benefit
from building it. You can pass the CMake options
``-DCLANG_TABLEGEN=/path/to/stage1/bin/clang-tblgen
-DLLVM_TABLEGEN=/path/to/stage1/bin/llvm-tblgen`` to steps 2 and onward to avoid
these useless rebuilds.
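
For instance, the configure command for one of the later steps might gain the
two options like this. The ``SRC``, ``BUILD``, and ``STAGE1`` variables are
hypothetical placeholders, and the step's other arguments still have to be
added to the same invocation.

.. code-block:: bash

  # Hypothetical locations; substitute your own.
  SRC=/path/to/llvm          # directory with LLVM's top-level CMakeLists.txt
  BUILD=/path/to/stage2      # build directory of the step being configured
  STAGE1=/path/to/stage1     # step 1 build directory

  # Reuse the tblgen binaries from step 1 instead of rebuilding them.
  # Append the other CMake arguments that the step itself calls for.
  cmake -G Ninja -S "$SRC" -B "$BUILD" \
    -DCLANG_TABLEGEN="$STAGE1/bin/clang-tblgen" \
    -DLLVM_TABLEGEN="$STAGE1/bin/llvm-tblgen"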