See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html
Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).
This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.
The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.
Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow on work.
Differential Revision: https://reviews.llvm.org/D85948
Summary: This patch disables implicit builtin knowledge about memcmp-like functions when compiling the program for fuzzing, i.e., when -fsanitize=fuzzer(-no-link) is given. This allows libFuzzer to always intercept memcmp-like functions as it effectively disables optimizing calls to such functions into different forms. This is done by adding a set of flags (-fno-builtin-memcmp and others) in the clang driver. Individual -fno-builtin-* flags previously used in several libFuzzer tests are now removed, as it is now done automatically in the clang driver.
The patch was once reverted in 8ef9e2bf35, as this patch was dependent on a reverted commit f78d9fceea. This reverted commit was recommitted in 831ae45e3d, so relanding this dependent patch too.
Reviewers: morehouse, hctim
Subscribers: cfe-commits, #sanitizers
Tags: #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D83987
Summary: libFuzzer intercepts certain library functions such as memcmp/strcmp by defining weak hooks. Weak hooks, however, are called only when other runtimes such as ASan is linked. This patch defines libFuzzer's own interceptors, which is linked into the libFuzzer executable when other runtimes are not linked, i.e., when -fsanitize=fuzzer is given, but not others.
The patch once landed but was reverted in 8ef9e2bf35 due to an assertion failure caused by calling an intercepted function, strncmp, while initializing the interceptors in fuzzerInit(). This issue is now fixed by calling libFuzzer's own implementation of library functions (i.e., internal_*) when the fuzzer has not been initialized yet, instead of recursively calling fuzzerInit() again.
Reviewers: kcc, morehouse, hctim
Subscribers: #sanitizers, krytarowski, mgorny, cfe-commits
Tags: #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D83494
This causes binaries linked with this runtime to crash on startup if
dlsym uses any of the intercepted functions. (For example, that happens
when using tcmalloc as the allocator: dlsym attempts to allocate memory
with malloc, and tcmalloc uses strncmp within its implementation.)
Also revert dependent commit "[libFuzzer] Disable implicit builtin knowledge about memcmp-like functions when -fsanitize=fuzzer-no-link is given."
This reverts commit f78d9fceea and 12d1124c49.
Summary: This patch disables implicit builtin knowledge about memcmp-like functions when compiling the program for fuzzing, i.e., when -fsanitize=fuzzer(-no-link) is given. This allows libFuzzer to always intercept memcmp-like functions as it effectively disables optimizing calls to such functions into different forms. This is done by adding a set of flags (-fno-builtin-memcmp and others) in the clang driver. Individual -fno-builtin-* flags previously used in several libFuzzer tests are now removed, as it is now done automatically in the clang driver.
Reviewers: morehouse, hctim
Subscribers: cfe-commits, #sanitizers
Tags: #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D83987
Summary: libFuzzer intercepts certain library functions such as memcmp/strcmp by defining weak hooks. Weak hooks, however, are called only when other runtimes such as ASan is linked. This patch defines libFuzzer's own interceptors, which is linked into the libFuzzer executable when other runtimes are not linked, i.e., when -fsanitize=fuzzer is given, but not others.
Reviewers: kcc, morehouse, hctim
Reviewed By: morehouse, hctim
Subscribers: krytarowski, mgorny, cfe-commits, #sanitizers
Tags: #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D83494
Check that the implicit cast from `id` used to construct the element
variable in an ObjC for-in statement is valid.
This check is included as part of a new `objc-cast` sanitizer, outside
of the main 'undefined' group, as (IIUC) the behavior it's checking for
is not technically UB.
The check can be extended to cover other kinds of invalid casts in ObjC.
Partially addresses: rdar://12903059, rdar://9542496
Differential Revision: https://reviews.llvm.org/D71491
The other backends don't know what this feature is and print a
message to stderr.
I recently tried to rework some target feature stuff in X86 and
this unknown feature tripped an assert I added.
Differential Revision: https://reviews.llvm.org/D83369
Keep deprecated -fsanitize-coverage-{white,black}list as aliases for compatibility for now.
Reviewed By: echristo
Differential Revision: https://reviews.llvm.org/D82244
Summary:
MemTag does not have any runtime at the moment, it's strictly code
instrumentation.
Reviewers: pcc
Subscribers: cryptoad, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79522
This is to avoid checking for the validity of a file that is not used.
This also contains a minor fix for the test, as the cfi sanitizer requires -flto and -fvisibility= arguments.
Differential Revision: https://reviews.llvm.org/D79043
Prior to this change, for a few compiler-rt libraries such as ubsan and
the profile library, Clang would embed "-defaultlib:path/to/rt-arch.lib"
into the .drective section of every object compiled with
-finstr-profile-generate or -fsanitize=ubsan as appropriate.
These paths assume that the link step will run from the same working
directory as the compile step. There is also evidence that sometimes the
paths become absolute, such as when clang is run from a different drive
letter from the current working directory. This is fragile, and I'd like
to get away from having paths embedded in the object if possible. Long
ago it was suggested that we use this for ASan, and apparently I felt
the same way back then:
https://reviews.llvm.org/D4428#56536
This is also consistent with how all other autolinking usage works for
PS4, Mac, and Windows: they all use basenames, not paths.
To keep things working for people using the standard GCC driver
workflow, the driver now adds the resource directory to the linker
library search path when it calls the linker. This is enough to make
check-ubsan pass, and seems like a generally good thing.
Users that invoke the linker directly (most clang-cl users) will have to
add clang's resource library directory to their linker search path in
their build system. I'm not sure where I can document this. Ideally I'd
also do it in the MSBuild files, but I can't figure out where they go.
I'd like to start with this for now.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D65543
Summary:
This flag has been deprecated, with an on-by-default warning encouraging
users to explicitly specify whether they mean "all" or ubsan for 5 years
(released in Clang 3.7). Change it to mean what we wanted and
undeprecate it.
Also make the argument to -fsanitize-trap optional, and likewise default
it to 'all', and express the aliases for these flags in the .td file
rather than in code. (Plus documentation updates for the above.)
Reviewers: kcc
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77753
Summary:
This commit adds two command-line options to clang.
These options let the user decide which functions will receive SanitizerCoverage instrumentation.
This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing.
Patch by Yannis Juglaret of DGA-MI, Rennes, France
libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy.
Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions.
The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`.
Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists:
```
# (a)
src:*
fun:*
# (b)
src:SRC/*
fun:*
# (c)
src:SRC/src/woff2_dec.cc
fun:*
```
Running the built fuzzers shows how many instrumentation points the compiler adds, the fuzzer will output //XXX PCs//. Whitelist (a) is the instrument-everything whitelist, it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumented instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points.
For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s.
We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction.
The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise.
Reviewers: kcc, morehouse, vitalybuka
Reviewed By: morehouse, vitalybuka
Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D63616
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
This required some fixes to the generic code for two issues:
1. -fsanitize=safe-stack is default on x86_64-fuchsia and is *not* incompatible with -fsanitize=leak on Fuchisa
2. -fsanitize=leak and other static-only runtimes must not be omitted under -shared-libsan (which is the default on Fuchsia)
Patch By: mcgrathr
Differential Revision: https://reviews.llvm.org/D73397
With updates to various LLVM tools that use SpecialCastList.
It was tempting to use RealFileSystem as the default, but that makes it
too easy to accidentally forget passing VFS in clang code.
Summary:
This is a follow-up to 590f279c45, which
moved some of the callers to use VFS.
It turned out more code in Driver calls into real filesystem APIs and
also needs an update.
Reviewers: gribozavr2, kadircet
Reviewed By: kadircet
Subscribers: ormris, mgorny, hiraditya, llvm-commits, jkorous, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70440
Previously these were reported from the driver which blocked clang-scan-deps from getting the full set of dependencies from cc1 commands.
Also the default sanitizer blacklist that is added in driver was never reported as a dependency. I introduced -fsanitize-system-blacklist cc1 option to keep track of which blacklists were user-specified and which were added by driver and clang -MD now also reports system blacklists as dependencies.
Differential Revision: https://reviews.llvm.org/D69290
The default behavior of Clang's indirect function call checker will replace
the address of each CFI-checked function in the output file's symbol table
with the address of a jump table entry which will pass CFI checks. We refer
to this as making the jump table `canonical`. This property allows code that
was not compiled with ``-fsanitize=cfi-icall`` to take a CFI-valid address
of a function, but it comes with a couple of caveats that are especially
relevant for users of cross-DSO CFI:
- There is a performance and code size overhead associated with each
exported function, because each such function must have an associated
jump table entry, which must be emitted even in the common case where the
function is never address-taken anywhere in the program, and must be used
even for direct calls between DSOs, in addition to the PLT overhead.
- There is no good way to take a CFI-valid address of a function written in
assembly or a language not supported by Clang. The reason is that the code
generator would need to insert a jump table in order to form a CFI-valid
address for assembly functions, but there is no way in general for the
code generator to determine the language of the function. This may be
possible with LTO in the intra-DSO case, but in the cross-DSO case the only
information available is the function declaration. One possible solution
is to add a C wrapper for each assembly function, but these wrappers can
present a significant maintenance burden for heavy users of assembly in
addition to adding runtime overhead.
For these reasons, we provide the option of making the jump table non-canonical
with the flag ``-fno-sanitize-cfi-canonical-jump-tables``. When the jump
table is made non-canonical, symbol table entries point directly to the
function body. Any instances of a function's address being taken in C will
be replaced with a jump table address.
This scheme does have its own caveats, however. It does end up breaking
function address equality more aggressively than the default behavior,
especially in cross-DSO mode which normally preserves function address
equality entirely.
Furthermore, it is occasionally necessary for code not compiled with
``-fsanitize=cfi-icall`` to take a function address that is valid
for CFI. For example, this is necessary when a function's address
is taken by assembly code and then called by CFI-checking C code. The
``__attribute__((cfi_jump_table_canonical))`` attribute may be used to make
the jump table entry of a specific function canonical so that the external
code will end up taking a address for the function that will pass CFI checks.
Fixes PR41972.
Differential Revision: https://reviews.llvm.org/D65629
llvm-svn: 368495
Globals are instrumented by adding a pointer tag to their symbol values
and emitting metadata into a special section that allows the runtime to tag
their memory when the library is loaded.
Due to order of initialization issues explained in more detail in the comments,
shadow initialization cannot happen during regular global initialization.
Instead, the location of the global section is marked using an ELF note,
and we require libc support for calling a function provided by the HWASAN
runtime when libraries are loaded and unloaded.
Based on ideas discussed with @evgeny777 in D56672.
Differential Revision: https://reviews.llvm.org/D65770
llvm-svn: 368102
These "sanitizers" are hardened ABIs that are wholly orthogonal
to the SanitizerCoverage instrumentation.
Differential Revision: https://reviews.llvm.org/D65715
llvm-svn: 367799
This change introduces a pair of -fsanitize-link-runtime and
-fno-sanitize-link-runtime flags which can be used to control linking of
sanitizer runtimes. This is useful in certain environments like kernels
where existing runtime libraries cannot be used.
Differential Revision: https://reviews.llvm.org/D65029
llvm-svn: 367794
i.e., recent 5745eccef54ddd3caca278d1d292a88b2281528b:
* Bump the function_type_mismatch handler version, as its signature has changed.
* The function_type_mismatch handler can return successfully now, so
SanitizerKind::Function must be AlwaysRecoverable (like for
SanitizerKind::Vptr).
* But the minimal runtime would still unconditionally treat a call to the
function_type_mismatch handler as failure, so disallow -fsanitize=function in
combination with -fsanitize-minimal-runtime (like it was already done for
-fsanitize=vptr).
* Add tests.
Differential Revision: https://reviews.llvm.org/D61479
llvm-svn: 366186
Add "memtag" sanitizer that detects and mitigates stack memory issues
using armv8.5 Memory Tagging Extension.
It is similar in principle to HWASan, which is a software implementation
of the same idea, but there are enough differencies to warrant a new
sanitizer type IMHO. It is also expected to have very different
performance properties.
The new sanitizer does not have a runtime library (it may grow one
later, along with a "debugging" mode). Similar to SafeStack and
StackProtector, the instrumentation pass (in a follow up change) will be
inserted in all cases, but will only affect functions marked with the
new sanitize_memtag attribute.
Reviewers: pcc, hctim, vitalybuka, ostannard
Subscribers: srhines, mehdi_amini, javed.absar, kristof.beyls, hiraditya, cryptoad, steven_wu, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D64169
llvm-svn: 366123
D63793 removed float-divide-by-zero from the "undefined" set but it
failed to add it to getSupportedSanitizers(), thus the sanitizer is
rejected by the driver:
clang-9: error: unsupported option '-fsanitize=float-divide-by-zero' for target 'x86_64-unknown-linux-gnu'
Also, add SanitizerMask::FloatDivideByZero to a few other masks to make -fsanitize-trap, -fsanitize-recover, -fsanitize-minimal-runtime and -fsanitize-coverage work.
Reviewed By: rsmith, vitalybuka
Differential Revision: https://reviews.llvm.org/D64317
llvm-svn: 365587
Disabled by default as this is still an experimental feature.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D59221
llvm-svn: 358285
It hasn't seen active development in years, and it hasn't reached a
state where it was useful.
Remove the code until someone is interested in working on it again.
Differential Revision: https://reviews.llvm.org/D59133
llvm-svn: 355862
enum SanitizerOrdinal has reached maximum capacity, this change extends the capacity to 128 sanitizer checks.
This can eventually allow us to add gcc 8's options "-fsanitize=pointer-substract" and "-fsanitize=pointer-compare".
This is a recommit of r354873 but with a fix for unqualified lookup error in lldb cmake build bot.
Fixes: https://llvm.org/PR39425
Differential Revision: https://reviews.llvm.org/D57914
llvm-svn: 355190
enum SanitizerOrdinal has reached maximum capacity, this change extends the capacity to 128 sanitizer checks.
This can eventually allow us to add gcc 8's options "-fsanitize=pointer-substract" and "-fsanitize=pointer-compare".
Fixes: https://llvm.org/PR39425
Differential Revision: https://reviews.llvm.org/D57914
llvm-svn: 354873
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
Adds a new -f[no]split-lto-unit flag that is disabled by default to
control module splitting during ThinLTO. It is automatically enabled
for -fsanitize=cfi and -fwhole-program-vtables.
The new EnableSplitLTOUnit codegen flag is passed down to llvm
via a new module flag of the same name.
Depends on D53890.
Reviewers: pcc
Subscribers: ormris, mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D53891
llvm-svn: 350949
The problem is similar to D55986 but for threads: a process with the
interceptor hwasan library loaded might have some threads started by
instrumented libraries and some by uninstrumented libraries, and we
need to be able to run instrumented code on the latter.
The solution is to perform per-thread initialization lazily. If a
function needs to access shadow memory or add itself to the per-thread
ring buffer its prologue checks to see whether the value in the
sanitizer TLS slot is null, and if so it calls __hwasan_thread_enter
and reloads from the TLS slot. The runtime does the same thing if it
needs to access this data structure.
This change means that the code generator needs to know whether we
are targeting the interceptor runtime, since we don't want to pay
the cost of lazy initialization when targeting a platform with native
hwasan support. A flag -fsanitize-hwaddress-abi={interceptor,platform}
has been introduced for selecting the runtime ABI to target. The
default ABI is set to interceptor since it's assumed that it will
be more common that users will be compiling application code than
platform code.
Because we can no longer assume that the TLS slot is initialized,
the pthread_create interceptor is no longer necessary, so it has
been removed.
Ideally, lazy initialization should only cost one instruction in the
hot path, but at present the call may cause us to spill arguments
to the stack, which means more instructions in the hot path (or
theoretically in the cold path if the spills are moved with shrink
wrapping). With an appropriately chosen calling convention for
the per-thread initialization function (TODO) the hot path should
always need just one instruction and the cold path should need two
instructions with no spilling required.
Differential Revision: https://reviews.llvm.org/D56038
llvm-svn: 350429