==========================
UndefinedBehaviorSanitizer
==========================

.. contents::
   :local:

Introduction
============

UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior detector.
UBSan modifies the program at compile time to catch various kinds of undefined
behavior during program execution, for example:

* Using a misaligned or null pointer
* Signed integer overflow
* Conversion to, from, or between floating-point types which would
  overflow the destination

See the full list of available :ref:`checks <ubsan-checks>` below.

UBSan has an optional run-time library which provides better error reporting.
The checks have a small runtime cost and no impact on address space layout or ABI.

How to build
============

Build LLVM/Clang with `CMake <https://llvm.org/docs/CMake.html>`_.

Usage
=====

Use ``clang++`` to compile and link your program with the
``-fsanitize=undefined`` flag. Make sure to use ``clang++`` (not ``ld``) as the
linker, so that your executable is linked with the proper UBSan runtime
libraries. You can use ``clang`` instead of ``clang++`` if you're compiling and
linking C code.

.. code-block:: console

  % cat test.cc
  int main(int argc, char **argv) {
    int k = 0x7fffffff;
    k += argc;
    return 0;
  }
  % clang++ -fsanitize=undefined test.cc
  % ./a.out
  test.cc:3:5: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'

You can enable only a subset of :ref:`checks <ubsan-checks>` offered by UBSan,
and define the desired behavior for each kind of check:

* ``-fsanitize=...``: print a verbose error report and continue execution (default);
* ``-fno-sanitize-recover=...``: print a verbose error report and exit the program;
* ``-fsanitize-trap=...``: execute a trap instruction (doesn't require UBSan run-time support).

Note that the ``trap`` / ``recover`` options do not enable the corresponding
sanitizer, and in general need to be accompanied by a suitable ``-fsanitize=``
flag.

For example, if you compile and link your program as:

.. code-block:: console

  % clang++ -fsanitize=signed-integer-overflow,null,alignment -fno-sanitize-recover=null -fsanitize-trap=alignment

the program will continue execution after signed integer overflows, exit after
the first invalid use of a null pointer, and trap after the first use of a
misaligned pointer.

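
As a rough illustration of the command line above, the following sketch has one
function per enabled check. The function names are hypothetical and are not
part of UBSan. Each function is well defined for the inputs used in
``check_examples``; a report (or trap) would only occur for the inputs
described in the comments.

```cpp
#include <cassert>
#include <climits>

// signed-integer-overflow: reported when value == INT_MAX. With the flags
// above, the program prints a report and continues.
int increment(int value) { return value + 1; }

// null: reported when p is a null pointer. With -fno-sanitize-recover=null,
// the program exits after the first report.
int read_int(const int *p) { return *p; }

// alignment: reported when p is not suitably aligned for int. With
// -fsanitize-trap=alignment, the program traps instead of printing a report.
int load_int(const void *p) { return *static_cast<const int *>(p); }

// Calls that stay within defined behavior, so none of the checks fire.
void check_examples() {
  int x = 41;
  assert(increment(x) == 42);
  assert(read_int(&x) == 41);
  assert(load_int(&x) == 41);
  assert(increment(INT_MAX - 1) == INT_MAX);
}
```

Calling ``increment(INT_MAX)``, ``read_int(nullptr)``, or ``load_int`` with an
odd address would exercise the three behaviors in turn.
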

.. _ubsan-checks:

Available checks
================

Available checks are:

  -  ``-fsanitize=alignment``: Use of a misaligned pointer or creation
     of a misaligned reference. Also sanitizes assume_aligned-like attributes.
  -  ``-fsanitize=bool``: Load of a ``bool`` value which is neither
     ``true`` nor ``false``.
  -  ``-fsanitize=builtin``: Passing invalid values to compiler builtins.
  -  ``-fsanitize=bounds``: Out of bounds array indexing, in cases
     where the array bound can be statically determined. The check includes
     ``-fsanitize=array-bounds`` and ``-fsanitize=local-bounds``. Note that
     ``-fsanitize=local-bounds`` is not included in ``-fsanitize=undefined``.
  -  ``-fsanitize=enum``: Load of a value of an enumerated type which
     is not in the range of representable values for that enumerated
     type.
  -  ``-fsanitize=float-cast-overflow``: Conversion to, from, or
     between floating-point types which would overflow the
     destination. Because the range of representable values for all
     floating-point types supported by Clang is [-inf, +inf], the only
     cases detected are conversions from floating point to integer types.
  -  ``-fsanitize=float-divide-by-zero``: Floating point division by
     zero. This is undefined per the C and C++ standards, but is defined
     by Clang (and by ISO/IEC/IEEE 60559 / IEEE 754) as producing either an
     infinity or NaN value, so is not included in ``-fsanitize=undefined``.
  -  ``-fsanitize=function``: Indirect call of a function through a
     function pointer of the wrong type (Darwin/Linux, C++ and x86/x86_64
     only).
  -  ``-fsanitize=implicit-unsigned-integer-truncation``,
     ``-fsanitize=implicit-signed-integer-truncation``: Implicit conversion from
     an integer of larger bit width to smaller bit width, if that results in
     data loss. That is, if the demoted value, after casting back to the
     original width, is not equal to the original value before the downcast.
     ``-fsanitize=implicit-unsigned-integer-truncation`` handles conversions
     between two ``unsigned`` types, while
     ``-fsanitize=implicit-signed-integer-truncation`` handles the rest of the
     conversions, where either one or both of the types are signed.
     Issues caught by these sanitizers are not undefined behavior,
     but are often unintentional.
  -  ``-fsanitize=implicit-integer-sign-change``: Implicit conversion between
     integer types, if that changes the sign of the value. That is, if the
     original value was negative and the new value is positive (or zero),
     or the original value was positive, and the new value is negative.
     Issues caught by this sanitizer are not undefined behavior,
     but are often unintentional.
  -  ``-fsanitize=integer-divide-by-zero``: Integer division by zero.
  -  ``-fsanitize=nonnull-attribute``: Passing a null pointer as a function
     parameter which is declared to never be null.
  -  ``-fsanitize=null``: Use of a null pointer or creation of a null
     reference.
  -  ``-fsanitize=nullability-arg``: Passing null as a function parameter
     which is annotated with ``_Nonnull``.
  -  ``-fsanitize=nullability-assign``: Assigning null to an lvalue which
     is annotated with ``_Nonnull``.
  -  ``-fsanitize=nullability-return``: Returning null from a function with
     a return type annotated with ``_Nonnull``.
  -  ``-fsanitize=objc-cast``: Invalid implicit cast of an ObjC object pointer
     to an incompatible type. This is often unintentional, but is not undefined
     behavior, therefore the check is not a part of the ``undefined`` group.
     Currently only supported on Darwin.
  -  ``-fsanitize=object-size``: An attempt to potentially use bytes which
     the optimizer can determine are not part of the object being accessed.
     This will also detect some types of undefined behavior that may not
     directly access memory, but are provably incorrect given the size of
     the objects involved, such as invalid downcasts and calling methods on
     invalid pointers. These checks are made in terms of
     ``__builtin_object_size``, and consequently may be able to detect more
     problems at higher optimization levels.
  -  ``-fsanitize=pointer-overflow``: Performing pointer arithmetic which
     overflows, or where either the old or new pointer value is a null pointer
     (or in C, when they both are).
  -  ``-fsanitize=return``: In C++, reaching the end of a
     value-returning function without returning a value.
  -  ``-fsanitize=returns-nonnull-attribute``: Returning a null pointer
     from a function which is declared to never return null.
  -  ``-fsanitize=shift``: Shift operators where the amount shifted is
     greater than or equal to the promoted bit-width of the left hand side
     or less than zero, or where the left hand side is negative. For a
     signed left shift, also checks for signed overflow in C, and for
     unsigned overflow in C++. You can use ``-fsanitize=shift-base`` or
     ``-fsanitize=shift-exponent`` to check only the left-hand side or
     the right-hand side of the shift operation, respectively.
  -  ``-fsanitize=unsigned-shift-base``: Check that an unsigned left-hand side of
     a left shift operation doesn't overflow.
  -  ``-fsanitize=signed-integer-overflow``: Signed integer overflow, where the
     result of a signed integer computation cannot be represented in its type.
     This includes all the checks covered by ``-ftrapv``, as well as checks for
     signed division overflow (``INT_MIN/-1``), but not checks for
     lossy implicit conversions performed before the computation. Those
     conversions are handled by the ``-fsanitize=implicit-conversion`` group
     of checks.
  -  ``-fsanitize=unreachable``: If control flow reaches an unreachable
     program point.
  -  ``-fsanitize=unsigned-integer-overflow``: Unsigned integer overflow, where
     the result of an unsigned integer computation cannot be represented in its
     type. Unlike signed integer overflow, this is not undefined behavior, but
     it is often unintentional. This sanitizer does not check for lossy implicit
     conversions performed before such a computation
     (see ``-fsanitize=implicit-conversion``).
  -  ``-fsanitize=vla-bound``: A variable-length array whose bound
     does not evaluate to a positive value.
  -  ``-fsanitize=vptr``: Use of an object whose vptr indicates that it is of
     the wrong dynamic type, or that its lifetime has not begun or has ended.
     Incompatible with ``-fno-rtti``. Link must be performed by ``clang++``, not
     ``clang``, to make sure C++-specific parts of the runtime library and C++
     standard libraries are present.

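
To make a few of the checks listed above more concrete, here is a small sketch
with one function per check. The function and variable names are hypothetical.
The comments describe, based on the list above, which inputs would be reported
when the corresponding check is enabled; the calls in ``check_examples`` use
only well-defined inputs.

```cpp
#include <cassert>

// shift: reported when amount is negative or not smaller than the bit width
// of int (shift-exponent), or when value is negative (shift-base).
int shift_left(int value, int amount) { return value << amount; }

// integer-divide-by-zero: reported when divisor == 0.
int divide(int dividend, int divisor) { return dividend / divisor; }

// bounds: the array bound (4) is statically known, so an index outside
// 0..3 is reported.
static const int kTable[4] = {10, 20, 30, 40};
int lookup(int index) { return kTable[index]; }

// float-cast-overflow: reported when value is too large or too small to be
// represented as an int.
int to_int(double value) { return static_cast<int>(value); }

// Calls that stay within defined behavior, so none of the checks fire.
void check_examples() {
  assert(shift_left(1, 4) == 16);
  assert(divide(7, 2) == 3);
  assert(lookup(2) == 30);
  assert(to_int(3.9) == 3);
}
```

Calls such as ``shift_left(1, 40)``, ``divide(1, 0)``, ``lookup(4)``, or
``to_int(1e30)`` are the kind of input these checks are meant to catch.
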

You can also use the following check groups:

  -  ``-fsanitize=undefined``: All of the checks listed above other than
     ``float-divide-by-zero``, ``unsigned-integer-overflow``,
     ``implicit-conversion``, ``local-bounds`` and the ``nullability-*`` group
     of checks.
  -  ``-fsanitize=undefined-trap``: Deprecated alias of
     ``-fsanitize=undefined``.
  -  ``-fsanitize=implicit-integer-truncation``: Catches lossy integral
     conversions. Enables ``implicit-signed-integer-truncation`` and
     ``implicit-unsigned-integer-truncation``.
  -  ``-fsanitize=implicit-integer-arithmetic-value-change``: Catches implicit
     conversions that change the arithmetic value of the integer. Enables
     ``implicit-signed-integer-truncation`` and ``implicit-integer-sign-change``.
  -  ``-fsanitize=implicit-conversion``: Checks for suspicious
     behavior of implicit conversions. Enables
     ``implicit-unsigned-integer-truncation``,
     ``implicit-signed-integer-truncation``, and
     ``implicit-integer-sign-change``.
  -  ``-fsanitize=integer``: Checks for undefined or suspicious integer
     behavior (e.g. unsigned integer overflow).
     Enables ``signed-integer-overflow``, ``unsigned-integer-overflow``,
     ``shift``, ``integer-divide-by-zero``,
     ``implicit-unsigned-integer-truncation``,
     ``implicit-signed-integer-truncation``, and
     ``implicit-integer-sign-change``.
  -  ``-fsanitize=nullability``: Enables ``nullability-arg``,
     ``nullability-assign``, and ``nullability-return``. While violating
     nullability does not have undefined behavior, it is often unintentional,
     so UBSan offers to catch it.

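
Several of the checks in the ``implicit-conversion`` and ``integer`` groups
flag code whose behavior is defined but easy to get wrong. The sketch below
(hypothetical function names) shows such cases. Because the results are
defined, the asserted values hold with or without the sanitizer; when the
corresponding check is enabled, the wrapping calls would additionally be
reported.

```cpp
#include <cassert>
#include <climits>

// implicit-unsigned-integer-truncation: reported when value > UCHAR_MAX,
// because the upper bits are silently dropped.
unsigned char low_byte(unsigned value) { return value; }

// implicit-signed-integer-truncation: reported when value does not fit in
// a signed char.
signed char narrow(int value) { return value; }

// implicit-integer-sign-change: reported when value is negative, because the
// result is a (large) non-negative number.
unsigned to_unsigned(int value) { return value; }

// unsigned-integer-overflow: reported when value == UINT_MAX; the result
// wraps around to 0.
unsigned next(unsigned value) { return value + 1u; }

void check_examples() {
  assert(low_byte(0x1234u) == 0x34);     // upper byte dropped
  assert(narrow(-5) == -5);              // fits, nothing to report
  assert(to_unsigned(-1) == UINT_MAX);   // sign change
  assert(next(UINT_MAX) == 0u);          // wraparound
}
```
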

Volatile
--------

The ``null``, ``alignment``, ``object-size``, ``local-bounds``, and ``vptr``
checks do not apply to pointers to types with the ``volatile`` qualifier.

Minimal Runtime
===============

There is a minimal UBSan runtime available suitable for use in production
environments. This runtime has a small attack surface. It only provides very
basic issue logging and deduplication, and does not support
``-fsanitize=function`` and ``-fsanitize=vptr`` checking.

To use the minimal runtime, add ``-fsanitize-minimal-runtime`` to the clang
command line options. For example, if you're used to compiling with
``-fsanitize=undefined``, you could enable the minimal runtime with
``-fsanitize=undefined -fsanitize-minimal-runtime``.

Stack traces and report symbolization
=====================================

If you want UBSan to print a symbolized stack trace for each error report, you
will need to:

#. Compile with ``-g`` and ``-fno-omit-frame-pointer`` to get proper debug
   information in your binary.
#. Run your program with the environment variable
   ``UBSAN_OPTIONS=print_stacktrace=1``.
#. Make sure the ``llvm-symbolizer`` binary is in ``PATH``.

Logging
=======

The default log file for diagnostics is "stderr". To log diagnostics to another
file, you can set ``UBSAN_OPTIONS=log_path=...``.

Silencing Unsigned Integer Overflow
===================================

To silence reports from unsigned integer overflow, you can set
``UBSAN_OPTIONS=silence_unsigned_overflow=1``.  This feature, combined with
``-fsanitize-recover=unsigned-integer-overflow``, is particularly useful for
providing fuzzing signal without blowing up logs.

Issue Suppression
=================

UndefinedBehaviorSanitizer is not expected to produce false positives.
If you see one, look again; most likely it is a true positive!

Disabling Instrumentation with ``__attribute__((no_sanitize("undefined")))``
----------------------------------------------------------------------------

You can disable UBSan checks for particular functions with
``__attribute__((no_sanitize("undefined")))``. You can use all values of the
``-fsanitize=`` flag in this attribute, e.g. if your function deliberately
contains possible signed integer overflow, you can use
``__attribute__((no_sanitize("signed-integer-overflow")))``.

This attribute may not be supported by other compilers, so consider using it
together with ``#if defined(__clang__)``.

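
One way to follow this advice is to hide the attribute behind a macro that
expands to nothing on other compilers. The sketch below is only an
illustration: the macro name ``NO_SANITIZE_UNSIGNED_OVERFLOW`` and the
``fnv1a`` function are hypothetical, and the example uses the
``unsigned-integer-overflow`` check because wraparound in a hash function is
intentional and well defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper macro: expands to the attribute only under Clang.
#if defined(__clang__)
#define NO_SANITIZE_UNSIGNED_OVERFLOW \
  __attribute__((no_sanitize("unsigned-integer-overflow")))
#else
#define NO_SANITIZE_UNSIGNED_OVERFLOW
#endif

// 32-bit FNV-1a hash. The multiplication wraps around on purpose, so the
// function opts out of the unsigned-integer-overflow check.
NO_SANITIZE_UNSIGNED_OVERFLOW
std::uint32_t fnv1a(const char *text) {
  std::uint32_t hash = 2166136261u;      // FNV offset basis
  for (; *text != '\0'; ++text) {
    hash ^= static_cast<unsigned char>(*text);
    hash *= 16777619u;                   // FNV prime; wraps modulo 2^32
  }
  return hash;
}

void check_examples() {
  assert(fnv1a("") == 2166136261u);
  assert(fnv1a("a") == 0xE40C292Cu);
}
```

Other functions in the same translation unit remain instrumented; only the
annotated function is exempt from the named check.
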

Suppressing Errors in Recompiled Code (Ignorelist)
--------------------------------------------------

UndefinedBehaviorSanitizer supports ``src`` and ``fun`` entity types in
:doc:`SanitizerSpecialCaseList`, that can be used to suppress error reports
in the specified source files or functions.

Runtime suppressions
--------------------

Sometimes you can suppress UBSan error reports for specific files, functions,
or libraries without recompiling the code. You need to pass the path to a
suppression file in the ``UBSAN_OPTIONS`` environment variable.

.. code-block:: bash

    UBSAN_OPTIONS=suppressions=MyUBSan.supp

You need to specify a :ref:`check <ubsan-checks>` you are suppressing and the
bug location. For example:

.. code-block:: bash

  signed-integer-overflow:file-with-known-overflow.cpp
  alignment:function_doing_unaligned_access
  vptr:shared_object_with_vptr_failures.so

There are several limitations:

* Your binary may need enough debug info and/or a symbol table, so
  that the runtime can figure out the source file or function name to match
  against the suppression.
* It is only possible to suppress recoverable checks. For the example above,
  you can additionally pass
  ``-fsanitize-recover=signed-integer-overflow,alignment,vptr``, although
  most UBSan checks are recoverable by default.
* Check groups (like ``undefined``) can't be used in a suppressions file; only
  fine-grained checks are supported.

Supported Platforms
===================

UndefinedBehaviorSanitizer is supported on the following operating systems:

* Android
* Linux
* NetBSD
* FreeBSD
* OpenBSD
* macOS
* Windows

The runtime library is relatively portable and platform independent. If the OS
you need is not listed above, UndefinedBehaviorSanitizer may already work for
it, or could be made to work with a minor porting effort.

Current Status
==============

UndefinedBehaviorSanitizer is available on selected platforms starting from LLVM
3.3. The test suite is integrated into the CMake build and can be run with
the ``check-ubsan`` command.

Additional Configuration
========================

UndefinedBehaviorSanitizer adds static check data for each check unless it is
in trap mode. This check data includes the full file name. The option
``-fsanitize-undefined-strip-path-components=N`` can be used to trim this
information. If ``N`` is positive, file information emitted by
UndefinedBehaviorSanitizer will drop the first ``N`` components from the file
path. If ``N`` is negative, only the last ``|N|`` components will be kept.

Example
-------

For a file called ``/code/library/file.cpp``, here is what would be emitted:

* Default (No flag, or ``-fsanitize-undefined-strip-path-components=0``): ``/code/library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=1``: ``code/library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=2``: ``library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=-1``: ``file.cpp``
* ``-fsanitize-undefined-strip-path-components=-2``: ``library/file.cpp``

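
The rule can be modeled with a short helper that reproduces the table above.
This is only a sketch of the documented behavior, not Clang's implementation:
the names ``split_components`` and ``strip_path_components`` are hypothetical,
only ``/``-separated paths are handled, the leading ``/`` is counted as one
component (as the examples imply), and keeping at least the file name when
``N`` exceeds the number of components is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits a '/'-separated path. A leading '/' counts as its own component.
std::vector<std::string> split_components(const std::string &path) {
  std::vector<std::string> parts;
  std::size_t pos = 0;
  if (!path.empty() && path[0] == '/') {
    parts.push_back("/");
    pos = 1;
  }
  while (pos < path.size()) {
    std::size_t end = path.find('/', pos);
    if (end == std::string::npos) end = path.size();
    if (end > pos) parts.push_back(path.substr(pos, end - pos));
    pos = end + 1;
  }
  return parts;
}

// Hypothetical model of -fsanitize-undefined-strip-path-components=n.
std::string strip_path_components(const std::string &path, int n) {
  const std::vector<std::string> parts = split_components(path);
  const std::size_t count = parts.size();
  if (n == 0 || count == 0) return path;

  std::size_t first = 0;  // index of the first component that is kept
  if (n > 0) {
    first = static_cast<std::size_t>(n);
    if (first >= count) first = count - 1;  // assumption: keep the file name
  } else {
    const std::size_t keep =
        static_cast<std::size_t>(-static_cast<long long>(n));
    if (keep >= count) return path;
    first = count - keep;
  }
  if (first == 0) return path;

  std::string result = parts[first];
  for (std::size_t i = first + 1; i < count; ++i) result += "/" + parts[i];
  return result;
}

void check_examples() {
  const std::string file = "/code/library/file.cpp";
  assert(strip_path_components(file, 0) == "/code/library/file.cpp");
  assert(strip_path_components(file, 1) == "code/library/file.cpp");
  assert(strip_path_components(file, 2) == "library/file.cpp");
  assert(strip_path_components(file, -1) == "file.cpp");
  assert(strip_path_components(file, -2) == "library/file.cpp");
}
```
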

More Information
================

* From LLVM project blog:
  `What Every C Programmer Should Know About Undefined Behavior
  <http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>`_
* From John Regehr's *Embedded in Academia* blog:
  `A Guide to Undefined Behavior in C and C++
  <https://blog.regehr.org/archives/213>`_