forked from OSchip/llvm-project
				
			
		
			
				
	
	
		
			275 lines
		
	
	
		
			9.6 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
			
		
		
	
	
			275 lines
		
	
	
		
			9.6 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
================================
 | 
						|
Fuzzing LLVM libraries and tools
 | 
						|
================================
 | 
						|
 | 
						|
.. contents::
 | 
						|
   :local:
 | 
						|
   :depth: 2
 | 
						|
 | 
						|
Introduction
 | 
						|
============
 | 
						|
 | 
						|
The LLVM tree includes a number of fuzzers for various components. These are
 | 
						|
built on top of :doc:`LibFuzzer <LibFuzzer>`.
 | 
						|
 | 
						|
 | 
						|
Available Fuzzers
 | 
						|
=================
 | 
						|
 | 
						|
clang-fuzzer
 | 
						|
------------
 | 
						|
 | 
						|
A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
 | 
						|
bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
 | 
						|
tracker`__.
 | 
						|
 | 
						|
__ https://llvm.org/pr23057
 | 
						|
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer
 | 
						|
 | 
						|
clang-proto-fuzzer
 | 
						|
------------------
 | 
						|
 | 
						|
A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
 | 
						|
class that describes a subset of the C++ language.
 | 
						|
 | 
						|
This fuzzer accepts clang command line options after `ignore_remaining_args=1`.
 | 
						|
For example, the following command will fuzz clang with a higher optimization
 | 
						|
level:
 | 
						|
 | 
						|
.. code-block:: shell
 | 
						|
 | 
						|
   % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3
 | 
						|
 | 
						|
clang-format-fuzzer
 | 
						|
-------------------
 | 
						|
 | 
						|
A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
 | 
						|
bugs this fuzzer has reported are `on bugzilla`__
 | 
						|
and `on OSS Fuzz's tracker`__.
 | 
						|
 | 
						|
.. _clang-format: https://clang.llvm.org/docs/ClangFormat.html
 | 
						|
__ https://llvm.org/pr23052
 | 
						|
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer
 | 
						|
 | 
						|
llvm-as-fuzzer
 | 
						|
--------------
 | 
						|
 | 
						|
A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
 | 
						|
Some of the bugs this fuzzer has reported are `on bugzilla`__.
 | 
						|
 | 
						|
__ https://llvm.org/pr24639
 | 
						|
 | 
						|
llvm-dwarfdump-fuzzer
 | 
						|
---------------------
 | 
						|
 | 
						|
A |generic fuzzer| that interprets inputs as object files and runs
 | 
						|
:doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
 | 
						|
this fuzzer has reported are `on OSS Fuzz's tracker`__
 | 
						|
 | 
						|
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer
 | 
						|
 | 
						|
llvm-demangle-fuzzer
 | 
						|
---------------------
 | 
						|
 | 
						|
A |generic fuzzer| for the Itanium demangler used in various LLVM tools. We've
 | 
						|
fuzzed __cxa_demangle to death, why not fuzz LLVM's implementation of the same
 | 
						|
function!
 | 
						|
 | 
						|
llvm-isel-fuzzer
 | 
						|
----------------
 | 
						|
 | 
						|
A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.
 | 
						|
 | 
						|
This fuzzer accepts flags after `ignore_remaining_args=1`. The flags match
 | 
						|
those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
 | 
						|
the following command would fuzz AArch64 with :doc:`GlobalISel`:
 | 
						|
 | 
						|
.. code-block:: shell
 | 
						|
 | 
						|
   % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0
 | 
						|
 | 
						|
Some flags can also be specified in the binary name itself in order to support
 | 
						|
OSS Fuzz, which has trouble with required arguments. To do this, you can copy
 | 
						|
or move ``llvm-isel-fuzzer`` to ``llvm-isel-fuzzer--x-y-z``, separating options
 | 
						|
from the binary name using "--". The valid options are architecture names
 | 
						|
(``aarch64``, ``x86_64``), optimization levels (``O0``, ``O2``), or specific
 | 
						|
keywords, like ``gisel`` for enabling global instruction selection. In this
 | 
						|
mode, the same example could be run like so:
 | 
						|
 | 
						|
.. code-block:: shell
 | 
						|
 | 
						|
   % bin/llvm-isel-fuzzer--aarch64-O0-gisel <corpus-dir>
 | 
						|
 | 
						|
llvm-opt-fuzzer
 | 
						|
---------------
 | 
						|
 | 
						|
A |LLVM IR fuzzer| aimed at finding bugs in optimization passes.
 | 
						|
 | 
						|
It receives optimzation pipeline and runs it for each fuzzer input.
 | 
						|
 | 
						|
Interface of this fuzzer almost directly mirrors ``llvm-isel-fuzzer``. Both
 | 
						|
``mtriple`` and ``passes`` arguments are required. Passes are specified in a
 | 
						|
format suitable for the new pass manager.
 | 
						|
 | 
						|
.. code-block:: shell
 | 
						|
 | 
						|
   % bin/llvm-opt-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple x86_64 -passes instcombine
 | 
						|
 | 
						|
Similarly to the ``llvm-isel-fuzzer`` arguments in some predefined configurations
 | 
						|
might be embedded directly into the binary file name:
 | 
						|
 | 
						|
.. code-block:: shell
 | 
						|
 | 
						|
   % bin/llvm-opt-fuzzer--x86_64-instcombine <corpus-dir>
 | 
						|
 | 
						|
llvm-mc-assemble-fuzzer
 | 
						|
-----------------------
 | 
						|
 | 
						|
A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
 | 
						|
target specific assembly.
 | 
						|
 | 
						|
Note that this fuzzer has an unusual command line interface which is not fully
 | 
						|
compatible with all of libFuzzer's features. Fuzzer arguments must be passed
 | 
						|
after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
 | 
						|
example, to fuzz the AArch64 assembler you might use the following command:
 | 
						|
 | 
						|
.. code-block:: console
 | 
						|
 | 
						|
  llvm-mc-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4
 | 
						|
 | 
						|
This scheme will likely change in the future.
 | 
						|
 | 
						|
llvm-mc-disassemble-fuzzer
 | 
						|
--------------------------
 | 
						|
 | 
						|
A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
 | 
						|
as assembled binary data.
 | 
						|
 | 
						|
Note that this fuzzer has an unusual command line interface which is not fully
 | 
						|
compatible with all of libFuzzer's features. See the notes above about
 | 
						|
``llvm-mc-assemble-fuzzer`` for details.
 | 
						|
 | 
						|
 | 
						|
.. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
 | 
						|
.. |protobuf fuzzer|
 | 
						|
   replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
 | 
						|
.. |LLVM IR fuzzer|
 | 
						|
   replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`
 | 
						|
 | 
						|
 | 
						|
Mutators and Input Generators
 | 
						|
=============================
 | 
						|
 | 
						|
The inputs for a fuzz target are generated via random mutations of a
 | 
						|
:ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
 | 
						|
mutations that a fuzzer in LLVM might want.
 | 
						|
 | 
						|
.. _fuzzing-llvm-generic:
 | 
						|
 | 
						|
Generic Random Fuzzing
 | 
						|
----------------------
 | 
						|
 | 
						|
The most basic form of input mutation is to use the built in mutators of
 | 
						|
LibFuzzer. These simply treat the input corpus as a bag of bits and make random
 | 
						|
mutations. This type of fuzzer is good for stressing the surface layers of a
 | 
						|
program, and is good at testing things like lexers, parsers, or binary
 | 
						|
protocols.
 | 
						|
 | 
						|
Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
 | 
						|
`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
 | 
						|
`llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_.
 | 
						|
 | 
						|
.. _fuzzing-llvm-protobuf:
 | 
						|
 | 
						|
Structured Fuzzing using ``libprotobuf-mutator``
 | 
						|
------------------------------------------------
 | 
						|
 | 
						|
We can use libprotobuf-mutator_ in order to perform structured fuzzing and
 | 
						|
stress deeper layers of programs. This works by defining a protobuf class that
 | 
						|
translates arbitrary data into structurally interesting input. Specifically, we
 | 
						|
use this to work with a subset of the C++ language and perform mutations that
 | 
						|
produce valid C++ programs in order to exercise parts of clang that are more
 | 
						|
interesting than parser error handling.
 | 
						|
 | 
						|
To build this kind of fuzzer you need `protobuf`_ and its dependencies
 | 
						|
installed, and you need to specify some extra flags when configuring the build
 | 
						|
with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
 | 
						|
adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
 | 
						|
:ref:`building-fuzzers`.
 | 
						|
 | 
						|
The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
 | 
						|
`clang-proto-fuzzer`_.
 | 
						|
 | 
						|
.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
 | 
						|
.. _protobuf: https://github.com/google/protobuf
 | 
						|
 | 
						|
.. _fuzzing-llvm-ir:
 | 
						|
 | 
						|
Structured Fuzzing of LLVM IR
 | 
						|
-----------------------------
 | 
						|
 | 
						|
We also use a more direct form of structured fuzzing for fuzzers that take
 | 
						|
:doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
 | 
						|
library, which was `discussed at EuroLLVM 2017`_.
 | 
						|
 | 
						|
The ``FuzzMutate`` library is used to structurally fuzz backends in
 | 
						|
`llvm-isel-fuzzer`_.
 | 
						|
 | 
						|
.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg
 | 
						|
 | 
						|
 | 
						|
Building and Running
 | 
						|
====================
 | 
						|
 | 
						|
.. _building-fuzzers:
 | 
						|
 | 
						|
Configuring LLVM to Build Fuzzers
 | 
						|
---------------------------------
 | 
						|
 | 
						|
Fuzzers will be built and linked to libFuzzer by default as long as you build
 | 
						|
LLVM with sanitizer coverage enabled. You would typically also enable at least
 | 
						|
one sanitizer to find bugs faster. The most common way to build the fuzzers is
 | 
						|
by adding the following two flags to your CMake invocation:
 | 
						|
``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.
 | 
						|
 | 
						|
.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
 | 
						|
          with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
 | 
						|
          to avoid building the sanitizers themselves with sanitizers enabled.
 | 
						|
 | 
						|
Continuously Running and Finding Bugs
 | 
						|
-------------------------------------
 | 
						|
 | 
						|
There used to be a public buildbot running LLVM fuzzers continuously, and while
 | 
						|
this did find issues, it didn't have a very good way to report problems in an
 | 
						|
actionable way. Because of this, we're moving towards using `OSS Fuzz`_ more
 | 
						|
instead.
 | 
						|
 | 
						|
You can browse the `LLVM project issue list`_ for the bugs found by
 | 
						|
`LLVM on OSS Fuzz`_. These are also mailed to the `llvm-bugs mailing
 | 
						|
list`_.
 | 
						|
 | 
						|
.. _OSS Fuzz: https://github.com/google/oss-fuzz
 | 
						|
.. _LLVM project issue list:
 | 
						|
   https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
 | 
						|
.. _LLVM on OSS Fuzz:
 | 
						|
   https://github.com/google/oss-fuzz/blob/master/projects/llvm
 | 
						|
.. _llvm-bugs mailing list:
 | 
						|
   http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
 | 
						|
 | 
						|
 | 
						|
Utilities for Writing Fuzzers
 | 
						|
=============================
 | 
						|
 | 
						|
There are some utilities available for writing fuzzers in LLVM.
 | 
						|
 | 
						|
Some helpers for handling the command line interface are available in
 | 
						|
``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
 | 
						|
line options in a consistent way and to implement standalone main functions so
 | 
						|
your fuzzer can be built and tested when not built against libFuzzer.
 | 
						|
 | 
						|
There is also some handling of the CMake config for fuzzers, where you should
 | 
						|
use the ``add_llvm_fuzzer`` to set up fuzzer targets. This function works
 | 
						|
similarly to functions such as ``add_llvm_tool``, but they take care of linking
 | 
						|
to LibFuzzer when appropriate and can be passed the ``DUMMY_MAIN`` argument to
 | 
						|
enable standalone testing.
 |