============
CMake Primer
============

.. contents::
   :local:

.. warning::
   Disclaimer: This documentation is written by LLVM project contributors, *not*
   anyone affiliated with the CMake project. This document may contain
   inaccurate terminology, phrasing, or technical details. It is provided with
   the best intentions.


Introduction
============

The LLVM project and many of the core projects built on LLVM build using CMake.
This document aims to provide a brief overview of CMake for developers modifying
LLVM projects or building their own projects on top of LLVM.

The official CMake language reference is available in the cmake-language
manpage and in the `cmake-language online documentation
<https://cmake.org/cmake/help/v3.4/manual/cmake-language.7.html>`_.

10,000 ft View
==============

CMake is a tool that reads script files in its own language that describe how a
software project builds. As CMake evaluates the scripts, it constructs an
internal representation of the software project. Once the scripts have been
fully processed, if there are no errors, CMake will generate build files to
actually build the project. CMake supports generating build files for a variety
of command line build tools as well as for popular IDEs.

When a user runs CMake, it performs a variety of checks similar to how autoconf
worked historically. During the checks and the evaluation of the build
description scripts, CMake caches values into the CMakeCache. This is useful
because it allows the build system to skip long-running checks during
incremental development. CMake caching also has some drawbacks, but those will
be discussed later.

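
As a rough illustration of a cached check, the sketch below uses the
``CheckIncludeFile`` module that ships with CMake. The project name
``CacheDemo`` and the result variable ``HAVE_UNISTD_H`` are placeholders chosen
for this example and are not taken from the LLVM build.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.15)
   project(CacheDemo C)

   include(CheckIncludeFile)

   # The result is stored in the CMakeCache, so the compile test only runs
   # the first time this build directory is configured.
   check_include_file(unistd.h HAVE_UNISTD_H)

   if(HAVE_UNISTD_H)
     message(STATUS "unistd.h is available")
   endif()

Re-running CMake in the same build directory should reuse the cached result
rather than repeating the check.
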
Scripting Overview
==================

CMake's scripting language has a very simple grammar. Every language construct
is a command that matches the pattern ``name(args)``. Commands come in three
primary types: language-defined (commands implemented in C++ in CMake), defined
functions, and defined macros. The CMake distribution also contains a suite of
CMake modules that contain definitions for useful functionality.

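
The three kinds of commands can be sketched in a few lines. The names
``say_hello`` and ``say_goodbye`` are hypothetical and exist only for this
illustration. The script can be run with ``cmake -P``.

.. code-block:: cmake

   # A language-defined command, implemented inside CMake itself.
   message("built-in command")

   # A command defined as a function.
   function(say_hello name)
     message("Hello, ${name}")
   endfunction()

   # A command defined as a macro.
   macro(say_goodbye name)
     message("Goodbye, ${name}")
   endmacro()

   # All three are invoked with the same name(args) syntax.
   say_hello(world)
   say_goodbye(world)
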
The example below is the full CMake build for building a C++ "Hello World"
program. The example uses only CMake language-defined commands.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.15)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)

The CMake language provides control flow constructs in the form of ``foreach``
loops and ``if`` blocks. To make the example above more complicated, you could
add an ``if`` block to define "APPLE" when targeting Apple platforms:

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.15)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)
   if(APPLE)
     target_compile_definitions(HelloWorld PUBLIC APPLE)
   endif()

Variables, Types, and Scope
===========================

Dereferencing
-------------

In CMake, variables are "stringly" typed. All variables are represented as
strings throughout evaluation. Wrapping a variable name in ``${}`` dereferences
it and results in a literal substitution of the value for the name. CMake's
documentation refers to this as "variable evaluation". Dereferences are
performed *before* the command being called receives the arguments. This means
dereferencing a list results in multiple separate arguments being passed to the
command.

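
A small sketch of that behavior, using a hypothetical helper named
``count_args`` that reports how many arguments it received. Quoting the
dereference keeps the list together as a single argument. The script can be
run with ``cmake -P``.

.. code-block:: cmake

   # Hypothetical helper: ARGC holds the number of arguments passed in.
   function(count_args)
     message("${ARGC} argument(s)")
   endfunction()

   set(my_list a b c)

   count_args(${my_list})    # prints: 3 argument(s)
   count_args("${my_list}")  # prints: 1 argument(s)
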
Variable dereferences can be nested and used to model complex data. For
example:

.. code-block:: cmake

   set(var_name var1)
   set(${var_name} foo) # same as "set(var1 foo)"
   set(${${var_name}}_var bar) # same as "set(foo_var bar)"

Dereferencing an unset variable results in an empty expansion. It is a common
pattern in CMake to set a variable conditionally, knowing that it will also be
dereferenced in code paths where it isn't set. There are examples of this
throughout the LLVM CMake build system.

An example of variable empty expansion is:

.. code-block:: cmake

   if(APPLE)
     set(extra_sources Apple.cpp)
   endif()
   add_executable(HelloWorld HelloWorld.cpp ${extra_sources})

In this example the ``extra_sources`` variable is only defined if you're
targeting an Apple platform. For all other targets ``extra_sources`` will be
evaluated as empty before ``add_executable`` is given its arguments.

Lists
-----

In CMake, lists are semi-colon delimited strings, and it is strongly advised
that you avoid using semi-colons inside list elements; it doesn't go smoothly.
A few examples of defining lists:

.. code-block:: cmake

   # Each of these creates a list with members a, b, c, and d
   set(my_list a b c d)
   set(my_list "a;b;c;d")

   # Creates a string "a b c d"
   set(my_string "a b c d")

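
Because a list is just a string, it is usually inspected and modified through
CMake's ``list`` command. The following is a short sketch with arbitrary
variable names; it can be run with ``cmake -P``.

.. code-block:: cmake

   set(my_list a b c d)

   list(LENGTH my_list len)   # len is "4"
   list(GET my_list 0 first)  # first is "a"
   list(APPEND my_list e)     # my_list is now "a;b;c;d;e"

   # Quoting the dereference shows the underlying semi-colon delimited string.
   message("${len} ${first} ${my_list}")
   # prints: 4 a a;b;c;d;e
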
Lists of Lists
--------------

One of the more complicated patterns in CMake is lists of lists. Because a list
cannot contain an element with a semi-colon, to construct a list of lists you
make a list of variable names that refer to other lists. For example:

.. code-block:: cmake

   set(list_of_lists a b c)
   set(a 1 2 3)
   set(b 4 5 6)
   set(c 7 8 9)

With this layout you can iterate through the list of lists printing each value
with the following code:

.. code-block:: cmake

   foreach(list_name IN LISTS list_of_lists)
     foreach(value IN LISTS ${list_name})
       message(${value})
     endforeach()
   endforeach()

You'll notice that the inner foreach loop's list is doubly dereferenced. This is
because the first dereference turns ``list_name`` into the name of the sub-list
(a, b, or c in the example), then the second dereference, performed by
``IN LISTS``, gets the value of the list.

This pattern is used throughout CMake. The most common example is the compiler
flags options, which CMake refers to using the following variable expansions:
``CMAKE_${LANGUAGE}_FLAGS`` and ``CMAKE_${LANGUAGE}_FLAGS_${CMAKE_BUILD_TYPE}``.

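
The sketch below shows how such computed variable names resolve. The flag
values and the helper variables ``lang`` and ``config`` are made up for this
illustration; a real build gets its flags from the toolchain and the user. The
script can be run with ``cmake -P``.

.. code-block:: cmake

   # Illustrative values only.
   set(CMAKE_CXX_FLAGS "-Wall")
   set(CMAKE_CXX_FLAGS_RELEASE "-O2")

   set(lang CXX)
   set(config RELEASE)

   # The inner dereferences are resolved first, which produces the names
   # CMAKE_CXX_FLAGS and CMAKE_CXX_FLAGS_RELEASE.
   message("${CMAKE_${lang}_FLAGS} ${CMAKE_${lang}_FLAGS_${config}}")
   # prints: -Wall -O2
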
Other Types
-----------

Variables that are cached or specified on the command line can have types
associated with them. The variable's type is used by CMake's UI tool to display
the right input field. A variable's type generally doesn't impact evaluation;
however, CMake does have special handling for some types, such as PATH.
You can read more about the special handling in `CMake's set documentation
<https://cmake.org/cmake/help/v3.5/command/set.html#set-cache-entry>`_.

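
A minimal sketch of typed cache entries as they might appear in a
``CMakeLists.txt``. The ``MYPROJ_`` variable names are hypothetical and chosen
only for this example.

.. code-block:: cmake

   # PATH entries name a directory; CMake's GUI offers a chooser for them.
   set(MYPROJ_TOOL_DIR "" CACHE PATH "Directory containing helper tools")

   # BOOL entries are displayed as a checkbox.
   set(MYPROJ_ENABLE_EXTRAS OFF CACHE BOOL "Build the optional extras")

   if(MYPROJ_ENABLE_EXTRAS)
     message(STATUS "Extras enabled, using tools from '${MYPROJ_TOOL_DIR}'")
   endif()

A user could then override both values when configuring, for example with
``cmake -DMYPROJ_ENABLE_EXTRAS:BOOL=ON -DMYPROJ_TOOL_DIR:PATH=/opt/tools``
followed by the source directory.
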
Scope
-----

CMake inherently has directory-based scoping. Setting a variable in a
CMakeLists file will set the variable for that file and all subdirectories.
Variables set in a CMake module that is included in a CMakeLists file will be
set in the scope they are included from, and all subdirectories.

When a variable that is already set is set again in a subdirectory, it overrides
the value in that scope and any deeper subdirectories.

The CMake ``set`` command provides two scope-related options. ``PARENT_SCOPE``
sets a variable into the parent scope, and not the current scope. The ``CACHE``
option sets the variable in the CMakeCache, which results in it being set in all
scopes. The ``CACHE`` option will not set a variable that already exists in the
cache unless the ``FORCE`` option is specified.

In addition to directory-based scope, CMake functions also have their own scope.
This means variables set inside functions do not bleed into the parent scope.
This is not true of macros, and it is for this reason LLVM prefers functions
over macros whenever reasonable.

.. note::
  Unlike C-based languages, CMake's loop and control flow blocks do not have
  their own scopes.

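
Function scope and ``PARENT_SCOPE`` can be sketched as follows. The function
names are hypothetical, and the script can be run with ``cmake -P``.

.. code-block:: cmake

   function(set_locally)
     # Only visible inside this function.
     set(result "local")
   endfunction()

   function(set_in_parent)
     # Sets the variable in the caller's scope, not in this function.
     set(result "from function" PARENT_SCOPE)
   endfunction()

   set_locally()
   message("result='${result}'")  # prints: result=''

   set_in_parent()
   message("result='${result}'")  # prints: result='from function'
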
Control Flow
============

CMake features the same basic control flow constructs you would expect in any
scripting language, but there are a few quirks because, as with everything in
CMake, control flow constructs are commands.

If, ElseIf, Else
----------------

.. note::
  For the full documentation on the CMake if command, see the
  `if command reference <https://cmake.org/cmake/help/v3.4/command/if.html>`_.
  That resource is far more complete.

In general CMake if blocks work the way you'd expect:

.. code-block:: cmake

  if(<condition>)
    message("do stuff")
  elseif(<condition>)
    message("do other stuff")
  else()
    message("do other other stuff")
  endif()

For developers coming from a C background, the single most important thing to
know about CMake's if blocks is that they do not have their own scope. Variables
set inside conditional blocks persist after the ``endif()``.

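
A short sketch of that behavior, with an arbitrary variable name. It can be run
with ``cmake -P``.

.. code-block:: cmake

   if(TRUE)
     set(inside_if "still set")
   endif()

   # The variable set inside the block is visible after endif().
   message("${inside_if}")  # prints: still set
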
Loops
-----

The most common form of the CMake ``foreach`` block is:

.. code-block:: cmake

  foreach(var ...)
    message("do stuff")
  endforeach()

The variable argument portion of the ``foreach`` block can contain dereferenced
lists, values to iterate, or a mix of both:

.. code-block:: cmake

  foreach(var foo bar baz)
    message(${var})
  endforeach()
  # prints:
  #  foo
  #  bar
  #  baz

  set(my_list 1 2 3)
  foreach(var ${my_list})
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3

  foreach(var ${my_list} out_of_bounds)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3
  #  out_of_bounds

There is also a more modern CMake foreach syntax. The code below is equivalent
to the code above:

.. code-block:: cmake

  foreach(var IN ITEMS foo bar baz)
    message(${var})
  endforeach()
  # prints:
  #  foo
  #  bar
  #  baz

  set(my_list 1 2 3)
  foreach(var IN LISTS my_list)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3

  foreach(var IN LISTS my_list ITEMS out_of_bounds)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3
  #  out_of_bounds

Similar to the conditional statements, these generally behave how you would
expect, and they do not have their own scope.

CMake also supports ``while`` loops, although they are not widely used in LLVM.

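
For completeness, a minimal ``while`` loop might look like the sketch below.
The counter name ``i`` is arbitrary, and the script can be run with
``cmake -P``.

.. code-block:: cmake

   set(i 0)
   # The condition uses the same syntax as if(); "i" is evaluated as a
   # variable name here.
   while(i LESS 3)
     message("${i}")
     math(EXPR i "${i} + 1")
   endwhile()
   # prints:
   #  0
   #  1
   #  2
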
Modules, Functions and Macros
=============================

Modules
-------

Modules are CMake's vehicle for enabling code reuse. CMake modules are just
CMake script files. They can contain code to execute on include as well as
definitions for commands.

In CMake, macros and functions are universally referred to as commands, and they
are the primary method of defining code that can be called multiple times.

In LLVM we have several CMake modules that are included as part of our
distribution for developers who don't build our project from source. Those
modules are the fundamental pieces needed to build LLVM-based projects with
CMake. We also rely on modules as a way of organizing the build system's
functionality for maintainability and re-use within LLVM projects.

Argument Handling
-----------------

When defining a CMake command, handling arguments is very useful. The examples
in this section will all use the CMake ``function`` block, but this all applies
to the ``macro`` block as well.

CMake commands can have named arguments that are required at every call site. In
addition, all commands will implicitly accept a variable number of extra
arguments (in C parlance, all commands are varargs functions). When a command is
invoked with extra arguments (beyond the named ones) CMake will store the full
list of arguments (both named and unnamed) in a list named ``ARGV``, and the
sublist of unnamed arguments in ``ARGN``. Below is a trivial example of
providing a wrapper function for CMake's built-in command ``add_dependencies``.

.. code-block:: cmake

   function(add_deps target)
     add_dependencies(${target} ${ARGN})
   endfunction()

This example defines a new function named ``add_deps`` which takes a required
first argument, and just calls another command, passing through the first
argument and all trailing arguments.

CMake provides a module ``CMakeParseArguments`` which provides an implementation
of advanced argument parsing. We use this all over LLVM, and it is recommended
for any function that has complex argument-based behaviors or optional
arguments. CMake's official documentation for the module is in the
``cmake-modules`` manpage, and is also available at the
`cmake-modules online documentation
<https://cmake.org/cmake/help/v3.4/module/CMakeParseArguments.html>`_.

.. note::
  As of CMake 3.5 the cmake_parse_arguments command has become a native command
  and the CMakeParseArguments module is empty and only left around for
  compatibility.

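
A minimal sketch of ``cmake_parse_arguments`` in use. The function
``add_my_tool``, its keywords, and the ``ARG`` prefix are hypothetical and are
not taken from the LLVM build system. The script can be run with ``cmake -P``.

.. code-block:: cmake

   # Harmless on CMake 3.5 and later, where the command is native.
   include(CMakeParseArguments)

   function(add_my_tool name)
     # One option (VERBOSE), one single-value keyword (OUTPUT), and one
     # multi-value keyword (SOURCES). Results are stored in ARG_<keyword>.
     cmake_parse_arguments(ARG "VERBOSE" "OUTPUT" "SOURCES" ${ARGN})

     message("name=${name}")
     message("verbose=${ARG_VERBOSE}")
     message("output=${ARG_OUTPUT}")
     message("sources=${ARG_SOURCES}")
   endfunction()

   add_my_tool(demo VERBOSE OUTPUT out.txt SOURCES a.cpp b.cpp)
   # prints:
   #  name=demo
   #  verbose=TRUE
   #  output=out.txt
   #  sources=a.cpp;b.cpp
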
Functions Vs Macros
-------------------

Functions and macros look very similar in how they are used, but there is one
fundamental difference between the two. Functions have their own scope, and
macros don't. This means variables set in macros will bleed out into the calling
scope. That makes macros suitable for defining very small bits of functionality
only.

The other difference between CMake functions and macros is how arguments are
passed. Arguments to macros are not set as variables; instead, dereferences of
the parameters are resolved across the macro body before it executes. This can
result in some unexpected behavior when a parameter is used without being
dereferenced. For example:

.. code-block:: cmake

   macro(print_list my_list)
     foreach(var IN LISTS my_list)
       message("${var}")
     endforeach()
   endmacro()

   set(my_list a b c d)
   set(my_list_of_numbers 1 2 3 4)
   print_list(my_list_of_numbers)
   # prints:
   # a
   # b
   # c
   # d

Inside the macro, ``my_list`` is not a variable, so ``IN LISTS my_list`` reads
the caller's ``my_list`` instead of the argument. Generally speaking this issue
is uncommon because it requires using non-dereferenced variables with names that
overlap in the parent scope, but it is important to be aware of because it can
lead to subtle bugs.

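
One way to sidestep the problem, sketched below under the assumption that the
caller passes the *name* of a list, is to dereference the macro parameter. The
macro name ``print_list_by_name`` is hypothetical. The script can be run with
``cmake -P``.

.. code-block:: cmake

   macro(print_list_by_name list_name)
     # ${list_name} is replaced with the argument text before the body runs,
     # so IN LISTS receives the name the caller passed in.
     foreach(var IN LISTS ${list_name})
       message("${var}")
     endforeach()
   endmacro()

   set(my_list a b c d)
   set(my_list_of_numbers 1 2 3 4)
   print_list_by_name(my_list_of_numbers)
   # prints:
   #  1
   #  2
   #  3
   #  4
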
LLVM Project Wrappers
=====================

LLVM projects provide lots of wrappers around critical CMake built-in commands.
We use these wrappers to provide consistent behaviors across LLVM components
and to reduce code duplication.

We generally (but not always) follow the convention that commands prefaced with
``llvm_`` are intended to be used only as building blocks for other commands.
Wrapper commands that are intended for direct use are generally named with the
project in the middle of the command name (e.g. ``add_llvm_executable`` is the
wrapper for ``add_executable``). The LLVM ``add_*`` wrapper functions are all
defined in ``AddLLVM.cmake``, which is installed as part of the LLVM
distribution. It can be included and used by any LLVM sub-project that requires
LLVM.

.. note::

   Not all LLVM projects require LLVM for all use cases. For example compiler-rt
   can be built without LLVM, and the compiler-rt sanitizer libraries are used
   with GCC.

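
As a rough sketch, an out-of-tree project might pull in ``AddLLVM.cmake`` and
use one of the wrappers as shown below. The project name, target name, and
source file are hypothetical, and the example assumes an installed LLVM whose
package configuration provides ``LLVM_CMAKE_DIR`` and ``LLVM_INCLUDE_DIRS``.
Treat it as a starting point rather than a definitive recipe.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.20)
   project(MyTool CXX)

   # Locate an installed LLVM and make its CMake modules visible.
   find_package(LLVM REQUIRED CONFIG)
   list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
   include(AddLLVM)

   include_directories(${LLVM_INCLUDE_DIRS})

   # LLVM components the wrapper should link against.
   set(LLVM_LINK_COMPONENTS Support)

   # Wrapper around add_executable that applies LLVM's common settings.
   add_llvm_executable(my-tool MyTool.cpp)
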
Useful Built-in Commands
========================

CMake has a bunch of useful built-in commands. This document isn't going to
go into details about them because the CMake project has excellent
documentation. A few particularly useful commands are:

* `add_custom_command <https://cmake.org/cmake/help/v3.4/command/add_custom_command.html>`_
* `add_custom_target <https://cmake.org/cmake/help/v3.4/command/add_custom_target.html>`_
* `file <https://cmake.org/cmake/help/v3.4/command/file.html>`_
* `list <https://cmake.org/cmake/help/v3.4/command/list.html>`_
* `math <https://cmake.org/cmake/help/v3.4/command/math.html>`_
* `string <https://cmake.org/cmake/help/v3.4/command/string.html>`_

The full documentation for CMake commands is in the ``cmake-commands`` manpage
and available on `CMake's website <https://cmake.org/cmake/help/v3.4/manual/cmake-commands.7.html>`_.
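
To give a flavor of a few of these commands, here is a short sketch with
arbitrary variable names. It can be run with ``cmake -P``.

.. code-block:: cmake

   string(TOUPPER "release" config)      # config is "RELEASE"
   math(EXPR total "6 * 7")              # total is "42"
   list(APPEND flags "-Wall" "-Wextra")  # flags is "-Wall;-Wextra"

   message("${config} ${total} ${flags}")
   # prints: RELEASE 42 -Wall;-Wextra
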