We have SleepForSeconds, SleepForMillis and internal_sleep.
Some are implemented in terms of libc functions, some -- in terms
of syscalls. Some are implemented in per OS files,
some -- in libc/nolibc files. That's unnecessary complex
and libc functions cause crashes in some contexts because
we intercept them. There is no single reason to have calls to libc
when we have syscalls (and we have them anyway).
Add internal_usleep that is implemented in terms of syscalls per OS.
Make SleepForSeconds/SleepForMillis/internal_sleep a wrapper
around internal_usleep that is implemented in sanitizer_common.cpp once.
Also remove return values for internal_sleep, it's not used anywhere.
Eventually it would be nice to remove SleepForSeconds/SleepForMillis/internal_sleep.
There is no point in having that many different names for the same thing.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D105718
LibIgnore is checked in every interceptor.
Currently it has all logic in the single function
in the header, which makes it uninlinable.
Split it into fast path (no libraries ignored)
and slow path (have ignored libraries).
It makes the fast path inlinable (single load).
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D105719
This reverts commit 9a9bc76c0e.
That commit broke "ninja install" when building compiler-rt for mingw
targets, building standalone (pointing cmake at the compiler-rt
directory) with cmake 3.16.3 (the one shipped in ubuntu 20.04), with
errors like this:
-- Install configuration: "Release"
CMake Error at cmake_install.cmake:44 (file):
file cannot create directory: /include/sanitizer. Maybe need
administrative privileges.
Call Stack (most recent call first):
/home/martin/code/llvm-mingw/src/llvm-project/compiler-rt/build-i686-sanitizers/cmake_install.cmake:37 (include)
FAILED: include/CMakeFiles/install-compiler-rt-headers
cd /home/martin/code/llvm-mingw/src/llvm-project/compiler-rt/build-i686-sanitizers/include && /usr/bin/cmake -DCMAKE_INSTALL_COMPONENT="compiler-rt-headers" -P /home/martin/code/llvm-mingw/src/llvm-project/compiler-rt/build-i686-sanitizers/cmake_install.cmake
ninja: build stopped: subcommand failed.
This makes sure we have support for MTE instructions.
Later the check can be extended to support MTE on other compilers.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D105722
Instead of using `COMPILER_RT_INSTALL_PATH` through the CMake for
complier-rt, just use it to define variables for the subdirs which
themselves are used.
This preserves compatibility, but later on we might consider getting rid
of `COMPILER_RT_INSTALL_PATH` and just changing the defaults for the
subdir variables directly.
---
There was a seaming bug where the (non-Apple) per-target libdir was
`${target}` not `lib/${target}`. I suspect that has to do with the docs
on `COMPILER_RT_INSTALL_PATH` saying was the library dir when that's no
longer true, so I just went ahead and fixed it, allowing me to define
fewer and more sensible variables.
That last part should be the only behavior changes; everything else
should be a pure refactoring.
---
D99484 is the main thrust of the `GnuInstallDirs` work. Once this lands,
it should be feasible to follow both of these up with a simple patch for
compiler-rt analogous to the one for libcxx.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D101497
Based on comments in D91466, we can make the current implementation
linux-speciic. The fuchsia implementation will just be a call to
`__sanitizer_fill_shadow`.
Differential Revision: https://reviews.llvm.org/D105663
The new name is something less linux-y and more platform generic. Once we
finalize the tagged pointer ABI in zircon, we will do the appropriate check
for fuchsia to see that TBI is enabled.
Differential Revision: https://reviews.llvm.org/D105667
trapOnAddress is designed to SEGV on a specific address. Unfortunately,
with an IR change, __builtin_unreachable() ends up doing DCE on things
that have side effects, like the load that causes the trap.
Change to __builtin_trap() to avoid the optimisation.
Root cause is still an LLVM bug, and tracked in
https://bugs.llvm.org/show_bug.cgi?id=47480.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105654
This contains all the definitions required by hwasan for the fuchsia
implementation and can be landed independently from the remaining parts of D91466.
Differential Revision: https://reviews.llvm.org/D103936
This disables use of hwasan interceptors which we do not use on Fuchsia. This
explicitly sets the macro for defining the hwasan versions of new/delete.
Differential Revision: https://reviews.llvm.org/D103544
This patch splits up hwasan thread creation between `__sanitizer_before_thread_create_hook`,
`__sanitizer_thread_create_hook`, and `__sanitizer_thread_start_hook`.
The linux implementation creates the hwasan thread object inside the
new thread. On Fuchsia, we know the stack bounds before thread creation,
so we can initialize part of the thread object in `__sanitizer_before_thread_create_hook`,
then initialize the stack ring buffer in `__sanitizer_thread_start_hook`
once we enter the thread.
Differential Revision: https://reviews.llvm.org/D104085
We would find an address with matching tag, only to discover in
ShowCandidate that it's very far away from [stack].
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105197
If the fault address is at the boundary of memory regions, this could
cause us to segfault otherwise.
Ran test with old compiler_rt to make sure it fails.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105032
Compiling compiler-rt with Clang modules and libc++ revealed that the
global `operator new` is being called without including `<new>`.
Differential Revision: https://reviews.llvm.org/D105401
This change introduces libMutagen/libclang_rt.mutagen.a as a subset of libFuzzer/libclang_rt.fuzzer.a. This library contains only the fuzzing strategies used by libFuzzer to produce new test inputs from provided inputs, dictionaries, and SanitizerCoverage feedback.
Most of this change is simply moving sections of code to one side or the other of the library boundary. The only meaningful new code is:
* The Mutagen.h interface and its implementation in Mutagen.cpp.
* The following methods in MutagenDispatcher.cpp:
* UseCmp
* UseMemmem
* SetCustomMutator
* SetCustomCrossOver
* LateInitialize (similar to the MutationDispatcher's original constructor)
* Mutate_AddWordFromTORC (uses callbacks instead of accessing TPC directly)
* StartMutationSequence
* MutationSequence
* DictionaryEntrySequence
* RecommendDictionary
* RecommendDictionaryEntry
* FuzzerMutate.cpp (which now justs sets callbacks and handles printing)
* MutagenUnittest.cpp (which adds tests of Mutagen.h)
A note on performance: This change was tested with a 100 passes of test/fuzzer/LargeTest.cpp with 1000 runs per pass, both with and without the change. The running time distribution was qualitatively similar both with and without the change, and the average difference was within 30 microseconds (2.240 ms/run vs 2.212 ms/run, respectively). Both times were much higher than observed with the fully optimized system clang (~0.38 ms/run), most likely due to the combination of CMake "dev mode" settings (e.g. CMAKE_BUILD_TYPE="Debug", LLVM_ENABLE_LTO=OFF, etc.). The difference between the two versions built similarly seems to be "in the noise" and suggests no meaningful performance degradation.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D102447
We need the compiler generated variable to override the weak symbol of
the same name inside the profile runtime, but using LinkOnceODRLinkage
results in weak symbol being emitted which leads to an issue where the
linker might choose either of the weak symbols potentially disabling the
runtime counter relocation.
This change replaces the use of weak definition inside the runtime with
an external weak reference to address the issue. We also place the
compiler generated symbol inside a COMDAT group so dead definition can
be garbage collected by the linker.
Differential Revision: https://reviews.llvm.org/D105176
If we get here from reallocate, BlockEnd is tagged. Then we
will storeTag(UntaggedEnd) into the header of the next chunk.
Luckily header tag is 0 so unpatched code still works.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D105261
It's already covered by multiple tests, but to trigger
this path we need MTE+GWP which disabled.
Reviewed By: hctim, pcc
Differential Revision: https://reviews.llvm.org/D105232
This allows application code checks if origin tracking is on before
printing out traces.
-dfsan-track-origins can be 0,1,2.
The current code only distinguishes 1 and 2 in compile time, but not at runtime.
Made runtime distinguish 1 and 2 too.
Reviewed By: browneee
Differential Revision: https://reviews.llvm.org/D105128
Users can call HwasanThreadList::GetRingBufferSize rather than RingBufferSize
to prevent having to do the calculation in RingBufferSize. This will be useful
for Fuchsia where we plan to initialize the stack ring buffer separately from
the rest of thread initialization.
Differential Revision: https://reviews.llvm.org/D104823
A heap or global buffer that is far away from the faulting address is
unlikely to be the cause, especially if there is a potential
use-after-free as well, so we want to show it after the other
causes.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D104781
Mmap interceptor is not atomic in the sense that it
exposes unmapped shadow for a brief period of time.
This breaks programs that mmap over another mmap
and access the region concurrently.
Don't unmap shadow in the mmap interceptor to fix this.
Just mapping new shadow on top should be enough to zero it.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D104593
Similar to InitOptions in asan, we can use this optional struct for
initializing some members thread objects before they are created. On
linux, this is unused and can remain undefined. On fuchsia, this will
just be the stack bounds.
Differential Revision: https://reviews.llvm.org/D104553
Bionic <malloc.h> may provide the definitions of M_MEMTAG_TUNING_* constants.
Do not redefine them in that case.
Differential Revision: https://reviews.llvm.org/D104758
These have been broken by https://reviews.llvm.org/D104494.
However, `lib/fuzzer/dataflow/` is unused (?) so addressing this is not a priority.
Added TODOs to re-enable them in the future.
Reviewed By: stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D104568
Once D104553 lands, CreateCurrentThread will be able to accept optional
parameters for initializing the hwasan thread object. On fuchsia, we can get
stack info in the platform-specific InitThreads and pass it through
CreateCurrentThread. On linux, this is a no-op.
Differential Revision: https://reviews.llvm.org/D104561
These other platforms are unsupported and untested.
They could be re-added later based on MSan code.
Reviewed By: gbalats, stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D104481
Explain what the given stack trace means before showing it, rather than
only in the paragraph at the end.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D104523
This allows for other implementations to define their own version of `Thread::Init`.
This will be the case for Fuchsia where much of the thread initialization can be
broken up between different thread hooks (`__sanitizer_before_thread_create_hook`,
`__sanitizer_thread_create_hook`, `__sanitizer_thread_start_hook`). Namely, setting
up the heap ring buffer and stack info and can be setup before thread creation.
The stack ring buffer can also be setup before thread creation, but storing it into
`__hwasan_tls` can only be done on the thread start hook since it's only then we
can access `__hwasan_tls` for that thread correctly.
Differential Revision: https://reviews.llvm.org/D104248
The default callback instrumentation in x86 LAM mode uses ASLR bits
to randomly choose a tag, and thus has a 1/64 chance of choosing a
stack tag of 0, causing stack tests to fail intermittently. By using
__hwasan_generate_tag to pick tags, we guarantee non-zero tags and
eliminate the test flakiness.
aarch64 doesn't seem to have this problem using thread-local addresses
to pick tags, so perhaps we can remove this workaround once we implement
a similar mechanism for LAM.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D104470
These other platforms are unsupported and untested.
They could be re-added later based on MSan code.
Reviewed By: gbalats, stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D104481
This is to fix build on Android. And we don't want to intercept more new/delete operators on Android.
Differential Revision: https://reviews.llvm.org/D104313
Before: ADDR is located -320 bytes to the right of 1072-byte region
After: ADDR is located 752 bytes inside 1072-byte region
Reviewed By: eugenis, walli99
Differential Revision: https://reviews.llvm.org/D104412
Short granule tags as poison cause a UaF to read the referenced
memory to retrieve the tag, and means we do not detect the UaF
if the last granule's tag is still around.
This only increases the change of not catching a UaF from
0.39 % (1 / 256) to 0.42 % (1 / (256 - 17)).
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D104304
The dsb after instruction cache invalidation only needs to be executed
if any instruction cache invalidation did happen.
Without this change, if the CTR_EL0.DIC bit indicates that instruction
cache invalidation is not needed, __clear_cache would execute two dsb
instructions in a row; with the second one being unnecessary.
Differential Revision: https://reviews.llvm.org/D104371
The `MockAllocator` used in `ScudoTSDTest` wasn't allocated
properly aligned, which resulted in the `TSDs` of the shared
registry not being aligned either. This lead to some failures
like: https://reviews.llvm.org/D103119#2822008
This changes how the `MockAllocator` is allocated, same as
Vitaly did in the combined tests, properly aligning it, which
results in the `TSDs` being aligned as well.
Add a `DCHECK` in the shared registry to check that it is.
Differential Revision: https://reviews.llvm.org/D104402
Similar to SHADOW_OFFSET on asan, we can use this for hwasan so platforms that
use a constant value for the start of shadow memory can just use the constant
rather than access a global.
Differential Revision: https://reviews.llvm.org/D104275
Apparently __builtin_abort() is not supported when targetting Windows.
This should fix the following builder errors:
clang_rt.builtins-x86_64.lib(int_util.c.obj) : error LNK2019: unresolved
external symbol __builtin_abort referenced in function __compilerrt_abort_impl
When compiled with -ffreestanding, we should not assume that headers
declaring functions such as abort() are available. While the compiler may
still emit calls to those functions [1], we should not require the headers
to build compiler-rt since that can result in a cyclic dependency graph:
The compiler-rt functions might be required to build libc.so, but the libc
headers such as stdlib.h might only be available once libc has been built.
[1] From https://gcc.gnu.org/onlinedocs/gcc/Standards.html:
GCC requires the freestanding environment provide memcpy, memmove,
memset and memcmp. Finally, if __builtin_trap is used, and the target
does not implement the trap pattern, then GCC emits a call to abort.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D103876
cstddef is needed for size_t definition.
(Multiple headers can provide size_t but none of them exists.)
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D96213
This will simplify integration of this code into LLVM -- The
Simple-Packed-Serialization code can be copied near-verbatim, but
WrapperFunctionResult will require more adaptation.
cmake-3.16+ for AIX changes the default behavior of building a `SHARED` library which breaks AIX's build of libatomic, i.e., cmake-3.16+ builds `SHARED` as an archive of dynamic libraries. To fix it, we have to build `libatomic.so.1` as `MODULE` which keeps `libatomic.so.1` as an normal dynamic library.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D103786
This mostly follows LLVM's InstrProfReader.cpp error handling.
Previously, attempting to merge corrupted profile data would result in
crashes. See https://crbug.com/1216811#c4.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D104050
This fixes an issue introduced by https://reviews.llvm.org/D70662
Function-scope static initialization are guarded in C++, so we should probably
not use it because it introduces a dependency on __cxa_guard* symbols.
In the context of clang, libasan is linked statically, and it currently needs to
the odd situation where compiling C code with clang and asan requires -lstdc++
Differential Revision: https://reviews.llvm.org/D102475
trusty.cpp and trusty.h define Trusty implementations of map and other
platform-specific functions. In addition to adding Trusty configurations
in allocator_config.h and size_class_map.h, MapSizeIncrement and
PrimaryEnableRandomOffset are added as configurable options in
allocator_config.h.
Background on Trusty: https://source.android.com/security/trusty
Differential Revision: https://reviews.llvm.org/D103578
This removes the `__sanitizer_*` allocation function definitions from
`hwasan_interceptors.cpp` and moves them into their own file. This way
implementations that do not use interceptors at all can just ignore
(almost) everything in `hwasan_interceptors.cpp`.
Also remove some unused headers in `hwasan_interceptors.cpp` after the move.
Differential Revision: https://reviews.llvm.org/D103564
Complete support for fast8:
- amend shadow size and mapping in runtime
- remove fast16 mode and -dfsan-fast-16-labels flag
- remove legacy mode and make fast8 mode the default
- remove dfsan-fast-8-labels flag
- remove functions in dfsan interface only applicable to legacy
- remove legacy-related instrumentation code and tests
- update documentation.
Reviewed By: stephan.yichao.zhao, browneee
Differential Revision: https://reviews.llvm.org/D103745
dfsan does not use sanitizer allocator as others. In practice,
we let it use glibc's allocator since tcmalloc needs more work
to be working with dfsan well. With glibc, we observe large
memory leakage. This could relate to two things:
1) glibc allocator has limitation: for example, tcmalloc can reduce memory footprint 2x easily
2) glibc may call unmmap directly as an internal system call by using system call number. so DFSan has no way to release shadow spaces for those unmmap.
Using sanitizer allocator addresses the above issues
1) its memory management is close to tcmalloc
2) we can register callback when sanitizer allocator calls unmmap, so dfsan can release shadow spaces correctly.
Our experiment with internal server-based application proved that with the change, in a-few-day run, memory usage leakage is close to what tcmalloc does w/o dfsan.
This change mainly follows MSan's code.
1) define allocator callbacks at dfsan_allocator.h|cpp
2) mark allocator APIs to be discard
3) intercept allocator APIs
4) make dfsan_set_label consistent with MSan's SetShadow when setting 0 labels, define dfsan_release_meta_memory when unmap is called
5) add flags about whether zeroing memory after malloc/free. dfsan works at byte-level, so bit-level oparations can cause reading undefined shadow. See D96842. zeroing memory after malloc helps this. About zeroing after free, reading after free is definitely UB, but if user code does so, it is hard to debug an overtainting caused by this w/o running MSan. So we add the flag to help debugging.
This change will be split to small changes for review. Before that, a question is
"this code shares a lot of with MSan, for example, dfsan_allocator.* and dfsan_new_delete.*.
Does it make sense to unify the code at sanitizer_common? will that introduce some
maintenance issue?"
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D101204
This resolves an issue tripping a `DCHECK`, as I was checking for the
capacity and not the size. We don't need to 0-init the Vector as it's
done already, and make sure we only 0-out the string on clear if it's
not empty.
Differential Revision: https://reviews.llvm.org/D103716
prepareTaggedChunk uses Tag 0 for header.
Android already PR_MTE_TAG_MASK to 0xfffe,
but with the patch we will not need to deppend
on the system configuration.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D103134
Some platforms (eg: Trusty) are extremelly memory constrained, which
doesn't necessarily work well with some of Scudo's current assumptions.
`Vector` by default (and as such `String` and `ScopedString`) maps a
page, which is a bit of a waste. This CL changes `Vector` to use a
buffer local to the class first, then potentially map more memory if
needed (`ScopedString` currently are all stack based so it would be
stack data). We also want to allow a platform to prevent any dynamic
resizing, so I added a `CanGrow` templated parameter that for now is
always `true` but would be set to `false` on Trusty.
Differential Revision: https://reviews.llvm.org/D103641
This moves the implementations for HandleTagMismatch, __hwasan_tag_mismatch4,
and HwasanAtExit from hwasan_linux.cpp to hwasan.cpp and declares them in hwasan.h.
This way, calls to those functions can be shared with the fuchsia implementation
without duplicating code.
Differential Revision: https://reviews.llvm.org/D103562
WrapperFunctionResult is a C++ wrapper for __orc_rt_CWrapperFunctionResult
that automatically manages the underlying struct.
The Simple Packed Serialization (SPS) utilities support a simple serialization
scheme for wrapper function argument and result buffers:
Primitive typess (bool, char, int8_t, and uint8_t, int16_t, uint16_t, int32_t,
uint32_t, int64_t, uint64_t) are serialized in little-endian form.
SPSTuples are serialized by serializing each of the tuple members in order
without padding.
SPSSequences are serialized by serializing a sequence length (as a uint64_t)
followed by each of the elements of the sequence in order without padding.
Serialization/deserialization always involves a pair of SPS type tag (a tag
representing the serialized format to use, e.g. uint32_t, or
SPSTuple<bool, SPSString>) and a concrete type to be serialized from or
deserialized to (uint32_t, std::pair<bool, std::string>). Serialization for new
types can be implemented by specializing the SPSSerializationTraits type.
When toolchain can supports all of arm, armhf and armv6m architectures compiler-rt
libraries won't compile because architecture specific flags are appended to single
BUILTIN_CFLAGS variable.
Differential revision: https://reviews.llvm.org/D103363
OrcRTCWrapperFunctionResult is a C struct that can be used to return serialized
results from "wrapper functions" -- functions that deserialize an argument
buffer, call through to an actual implementation function, then serialize and
return the result of that function. Wrapper functions allow calls between ORC
and the ORC Runtime to be written using a single signature,
WrapperFunctionResult(const char *ArgData, size_t ArgSize), and without coupling
either side to a particular transport mechanism (in-memory, TCP, IPC, ... the
actual mechanism will be determined by the TargetProcessControl implementation).
OrcRTCWrapperFunctionResult is designed to allow small serialized buffers to
be returned by value, with larger serialized results stored on the heap. They
also provide an error state to report failures in serialization/deserialization.
This change introduces libMutagen/libclang_rt.mutagen.a as a subset of libFuzzer/libclang_rt.fuzzer.a. This library contains only the fuzzing strategies used by libFuzzer to produce new test inputs from provided inputs, dictionaries, and SanitizerCoverage feedback.
Most of this change is simply moving sections of code to one side or the other of the library boundary. The only meaningful new code is:
* The Mutagen.h interface and its implementation in Mutagen.cpp.
* The following methods in MutagenDispatcher.cpp:
* UseCmp
* UseMemmem
* SetCustomMutator
* SetCustomCrossOver
* LateInitialize (similar to the MutationDispatcher's original constructor)
* Mutate_AddWordFromTORC (uses callbacks instead of accessing TPC directly)
* StartMutationSequence
* MutationSequence
* DictionaryEntrySequence
* RecommendDictionary
* RecommendDictionaryEntry
* FuzzerMutate.cpp (which now justs sets callbacks and handles printing)
* MutagenUnittest.cpp (which adds tests of Mutagen.h)
A note on performance: This change was tested with a 100 passes of test/fuzzer/LargeTest.cpp with 1000 runs per pass, both with and without the change. The running time distribution was qualitatively similar both with and without the change, and the average difference was within 30 microseconds (2.240 ms/run vs 2.212 ms/run, respectively). Both times were much higher than observed with the fully optimized system clang (~0.38 ms/run), most likely due to the combination of CMake "dev mode" settings (e.g. CMAKE_BUILD_TYPE="Debug", LLVM_ENABLE_LTO=OFF, etc.). The difference between the two versions built similarly seems to be "in the noise" and suggests no meaningful performance degradation.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D102447
This reverts commit 6911114d8c.
Broke the QEMU sanitizer bots due to a missing header dependency. This
actually needs to be fixed on the bot-side, but for now reverting this
patch until I can fix up the bot.
This patch moves -fsanitize=scudo to link the standalone scudo library,
rather than the original compiler-rt based library. This is one of the
major remaining roadblocks to deleting the compiler-rt based scudo,
which should not be used any more. The standalone Scudo is better in
pretty much every way and is much more suitable for production usage.
As well as patching the litmus tests for checking that the
scudo_standalone lib is linked instead of the scudo lib, this patch also
ports all the scudo lit tests to run under scudo standalone.
This patch also adds a feature to scudo standalone that was under test
in the original scudo - that arguments passed to an aligned operator new
were checked that the alignment was a power of two.
Some lit tests could not be migrated, due to the following issues:
1. Features that aren't supported in scudo standalone, like the rss
limit.
2. Different quarantine implementation where the test needs some more
thought.
3. Small bugs in scudo standalone that should probably be fixed, like
the Secondary allocator having a full page on the LHS of an allocation
that only contains the chunk header, so underflows by <= a page aren't
caught.
4. Slight differences in behaviour that's technically correct, like
'realloc(malloc(1), 0)' returns nullptr in standalone, but a real
pointer in old scudo.
5. Some tests that might be migratable, but not easily.
Tests that are obviously not applicable to scudo standalone (like
testing that no sanitizer symbols made it into the DSO) have been
deleted.
After this patch, the remaining work is:
1. Update the Scudo documentation. The flags have changed, etc.
2. Delete the old version of scudo.
3. Patch up the tests in lit-unmigrated, or fix Scudo standalone.
Reviewed By: cryptoad, vitalybuka
Differential Revision: https://reviews.llvm.org/D102543
Now that everything is forcibly linker initialized, it feels like a
good time to get rid of the `init`/`initLinkerInitialized` split.
This allows to get rid of various `memset` construct in `init` that
gcc complains about (this fixes a Fuchsia open issue).
I added various `DCHECK`s to ensure that we would get a zero-inited
object when entering `init`, which required ensuring that
`unmapTestOnly` leaves the object in a good state (tests are currently
the only location where an allocator can be "de-initialized").
Running the tests with `--gtest_repeat=` showed no issue.
Differential Revision: https://reviews.llvm.org/D103119
The generic approach can still be used by musl and FreeBSD. Note: on glibc
2.31, TLS_PRE_TCB_SIZE is 0x700, larger than ThreadDescriptorSize() by 16, but
this is benign: as long as the range includes pthread::{specific_1stblock,specific}
pthread_setspecific will not cause false positives.
Note: the state before afec953857 underestimated
the TLS size a lot (nearly ThreadDescriptorSize() = 1776).
That may explain why afec953857 actually made some
tests pass.
When building with Clang 11 on Windows, silence the following:
[432/5643] Building C object projects\compiler-rt\lib\profile\CMakeFiles\clang_rt.profile-x86_64.dir\GCDAProfiling.c.obj
F:\aganea\llvm-project\compiler-rt\lib\profile\GCDAProfiling.c(464,13): warning: comparison of integers of different signs: 'uint32_t' (aka 'unsigned int') and 'int' [-Wsign-compare]
if (val != (gcov_version >= 90 ? GCOV_TAG_OBJECT_SUMMARY
~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
Cast of signed types to u64 breaks comparison.
Also remove double () around operands.
Reviewed By: cryptoad, hctim
Differential Revision: https://reviews.llvm.org/D103060
Make sure that if SCUDO_DEBUG=1 in tests
then we had the same in the scudo
library itself.
Reviewed By: cryptoad, hctim
Differential Revision: https://reviews.llvm.org/D103061
Said function had a few shortfalls:
- didn't set an abort message on Android
- was logged on several lines
- didn't provide extra information like the size requested if OOM'ing
This improves the function to address those points.
Differential Revision: https://reviews.llvm.org/D103034
When trying to track down a vaddr-poisoning bug, I found that that the
secondary cache isn't emptied on test teardown. We should probably do
that to make the tests hermetic. Otherwise, repeating the tests lots of
times using --gtest_repeat fails after the mmap vaddr space is
exhausted.
To repro:
$ ninja check-scudo_standalone # build
$ ./projects/compiler-rt/lib/scudo/standalone/tests/ScudoUnitTest-x86_64-Test \
--gtest_filter=ScudoSecondaryTest.*:-ScudoSecondaryTest.SecondaryCombinations \
--gtest_repeat=10000
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D102874
Fix buildbot failure
https://lab.llvm.org/buildbot/#/builders/57/builds/6542/steps/6/logs/stdio
/llvm-project/llvm/utils/unittest/googletest/include/gtest/gtest.h:1629:28:
error: comparison of integers of different signs: 'const unsigned long'
and 'const int' [-Werror,-Wsign-compare]
GTEST_IMPL_CMP_HELPER_(GT, >);
~~~~~~~~~~~~~~~~~~~~~~~~~~^~
/llvm-project/llvm/utils/unittest/googletest/include/gtest/gtest.h:1609:12:
note: expanded from macro 'GTEST_IMPL_CMP_HELPER_'
if (val1 op val2) {\
~~~~ ^ ~~~~
/llvm-project/compiler-rt/lib/scudo/standalone/tests/common_test.cpp:30:3:
note: in instantiation of function template specialization
'testing::internal::CmpHelperGT<unsigned long, int>' requested here
EXPECT_GT(OnStart, 0);
^
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D103029
The Fuchsia allocator config was using the default size class map.
This CL gives Fuchsia its own size class map and changes a couple of
things in the default one:
- make `SizeDelta` configurable in `Config` for a fixed size class map
as it currently is for a table size class map;
- switch `SizeDelta` to 0 for the default config, it allows for size
classes that allow for power of 2s, and overall better wrt pages
filling;
- increase the max number of caches pointers to 14 in the default,
this makes the transfer batch 64/128 bytes on 32/64-bit platforms,
which is cache-line friendly (previous size was 48/96 bytes).
The Fuchsia size class map remains untouched for now, this doesn't
impact Android which uses the table size class map.
Differential Revision: https://reviews.llvm.org/D102783
This method is like StackTrace::Print but instead of printing to stderr
it copies its output to a user-provided buffer.
Part of https://reviews.llvm.org/D102451.
Reviewed By: vitalybuka, stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D102815
Put allocate/deallocate next to memory
access inside EXPECT_DEATH block.
This way we reduce probability that memory is not mapped
by unrelated code.
It's still not absolutely guaranty that mmap does not
happen so we repeat it few times to be sure.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D102886
Looks like secondary pointers don't get unmapped on one of the arm32
bots. In the interests of landing some dependent patches, disable this
test on arm32 so that it can be tested in isolation later.
Reviewed By: cryptoad, vitalybuka
Split from differential patchset (1/2): https://reviews.llvm.org/D102648
The Linux kernel has removed the interface to cyclades from
the latest kernel headers[1] due to them being orphaned for the
past 13 years.
libsanitizer uses this header when compiling against glibc, but
glibcs itself doesn't seem to have any references to cyclades.
Further more it seems that the driver is broken in the kernel and
the firmware doesn't seem to be available anymore.
As such since this is breaking the build of libsanitizer (and so the
GCC bootstrap[2]) I propose to remove this.
[1] https://lkml.org/lkml/2021/3/2/153
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100379
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D102059
The Linux kernel has removed the interface to cyclades from
the latest kernel headers[1] due to them being orphaned for the
past 13 years.
libsanitizer uses this header when compiling against glibc, but
glibcs itself doesn't seem to have any references to cyclades.
Further more it seems that the driver is broken in the kernel and
the firmware doesn't seem to be available anymore.
As such since this is breaking the build of libsanitizer (and so the
GCC bootstrap[2]) I propose to remove this.
[1] https://lkml.org/lkml/2021/3/2/153
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100379
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D102059
These will be used for error propagation and handling in the ORC runtime.
The implementations of these types are cut-down versions of the error
support in llvm/Support/Error.h. Most advice on llvm::Error and llvm::Expected
(e.g. from the LLVM Programmer's manual) applies equally to __orc_rt::Error
and __orc_rt::Expected. The primary difference is the mechanism for testing
and handling error types: The ORC runtime uses a new 'error_cast' operation
to replace the handleErrors family of functions. See error_cast comments in
error.h.
If there are no counters, an mmap() of the counters section would fail
due to the size argument being too small (EINVAL).
rdar://78175925
Differential Revision: https://reviews.llvm.org/D102735
While developing a change to the allocator I ended up breaking
realloc on secondary allocations with increasing sizes. That didn't
cause any of the unit tests to fail, which indicated that we're
missing some test coverage here. Add a unit test for that case.
Differential Revision: https://reviews.llvm.org/D102716
This is a substitute for std::apply, which we can't use until we move to c++17.
apply_tuple will be used in upcoming the upcoming wrapper-function utils code.
Override __cxa_atexit and ignore callbacks.
This prevents crashes in a configuration when the symbolizer
is built into sanitizer runtime and consequently into the test process.
LLVM libraries have some global objects destroyed during exit,
so if the test process triggers any bugs after that, the symbolizer crashes.
An example stack trace of such crash:
For the standalone llvm-symbolizer this does not hurt,
we just don't destroy few global objects on exit.
Reviewed By: kda
Differential Revision: https://reviews.llvm.org/D102470
Since we have both aliasing mode and Intel LAM on x86_64, we need to
choose the mode at either run time or compile time. This patch
implements the plumbing to build both and choose between them at
compile time.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D102286
On AIX, we have to ship `libatomic.a` for compatibility. First, a new `clang_rt.atomic` is added. Second, use added cmake modules for AIX, we are able to build a compatible libatomic.a for AIX. The second step can't be perfectly implemented with cmake now since AIX's archive approach is kinda unique, i.e., archiving shared libraries into a static archive file.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D102155
`-fno-exceptions -fno-asynchronous-unwind-tables` compiled programs don't
produce .eh_frame on Linux and other ELF platforms, so the slow unwinder cannot
print stack traces. Just fall back to the fast unwinder: this allows
-fno-asynchronous-unwind-tables without requiring the sanitizer option
`fast_unwind_on_fatal=1`
Reviewed By: #sanitizers, vitalybuka
Differential Revision: https://reviews.llvm.org/D102046
This removes one of the last dependencies on old Scudo, and should allow
us to delete the old Scudo soon.
Reviewed By: vitalybuka, cryptoad
Differential Revision: https://reviews.llvm.org/D102349
With zero-sized allocations we don't actually end up storing the
address tag to the memory tag space, so store it in the first byte of
the chunk instead so that we can find it later in getInlineErrorInfo().
Differential Revision: https://reviews.llvm.org/D102442
It's more likely that we have a UAF than an OOB in blocks that are
more than 1 block away from the fault address, so the UAF should
appear first in the error report.
Differential Revision: https://reviews.llvm.org/D102379
On x32 size_t == unsigned int, not unsigned long int:
../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp: In function ??void __sanitizer::InitTlsSize()??:
../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:209:55: error: invalid conversion from ??__sanitizer::uptr*?? {aka ??long unsigned int*??} to ??size_t*?? {aka ??unsigned int*??} [-fpermissive]
209 | ((void (*)(size_t *, size_t *))get_tls_static_info)(&g_tls_size, &tls_align);
| ^~~~~~~~~~~
| |
| __sanitizer::uptr* {aka long unsigned int*}
by using size_t on g_tls_size. This is to fix:
https://bugs.llvm.org/show_bug.cgi?id=50332
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D102446
The bounds check that we previously had here was suitable for secondary
allocations but not for UAF on primary allocations, where it is likely
to result in false positives. Fix it by using a different bounds check
for UAF that requires the fault address to be in bounds.
Differential Revision: https://reviews.llvm.org/D102376
We have some significant amount of duplication around
CheckFailed functionality. Each sanitizer copy-pasted
a chunk of code. Some got random improvements like
dealing with recursive failures better. These improvements
could benefit all sanitizers, but they don't.
Deduplicate CheckFailed logic across sanitizers and let each
sanitizer only print the current stack trace.
I've tried to dedup stack printing as well,
but this got me into cmake hell. So let's keep this part
duplicated in each sanitizer for now.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102221
setlocale interceptor imitates a write into result,
which may be located in .rodata section.
This is the only interceptor that tries to do this and
I think the intention was to initialize the range for msan.
So do that instead. Writing into .rodata shouldn't happen
(without crashing later on the actual write) and this
traps on my local tsan experiments.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102161
Currently we have:
sanitizer_posix_libcdep.cpp:146:27: warning: cast between incompatible
function types from ‘__sighandler_t’ {aka ‘void (*)(int)’} to ‘sa_sigaction_t’
146 | sigact.sa_sigaction = (sa_sigaction_t)SIG_DFL;
We don't set SA_SIGINFO, so we need to assign to sa_handler.
And SIG_DFL is meant for sa_handler, so this gets rid of both
compiler warning, type cast and potential runtime misbehavior.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102162
Add unit test infrastructure for the ORC runtime, plus a cut-down
extensible_rtti system and extensible_rtti unit test.
Removes the placeholder.cpp source file.
Differential Revision: https://reviews.llvm.org/D102080
This patch does a few cleanup things:
1. The non-standalone scudo has a problem where GWP-ASan allocations
may not meet alignment requirements where Scudo was requested to have
alignment >= 16. Use the new GWP-ASan API to fix this.
2. The standalone variant loses some debugging information inside of
GWP-ASan because we ask GWP-ASan to allocate an aligned size in the
frontend. This means reports end up with 'UaF on a 16-byte allocation'
for a 1-byte allocation with 16-byte alignment. Also use the new API to
fix this.
3. Add post-alloc hooks for GWP-ASan intercepted allocations, and add
stats tracking for GWP-ASan allocations.
4. Add a small test that checks the alignment of the frontend
allocator, so that it can be used under GWP-ASan torture mode.
5. Add GWP-ASan torture mode as a testing configuration to catch these
regressions.
Depends on D94830, D95889.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D95884
GWP-ASan is the "production" variant as compiled by compiler-rt, and it's useful to be able to benchmark changes in GWP-ASan or Scudo's GWP-ASan hooks across versions. GWP-ASan is sampled, and sampled allocations are much slower, but given the amount of allocations that happen under test here - we actually get a reasonable representation of GWP-ASan's negligent performance impact between runs.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D101865
According to:
https://docs.python.org/3/library/subprocess.html#subprocess.Popen.poll
poll can return None if the process hasn't terminated.
I'm not quite sure how addr2line could end up closing the pipe without
terminating but we did see this happen on one of our bots:
```
<...>scripts/asan_symbolize.py",
line 211, in symbolize
logging.debug("addr2line exited early (broken pipe), returncode=%d"
% self.pipe.poll())
TypeError: %d format: a number is required, not NoneType
```
Handle None by printing a message that we couldn't get the return
code.
Reviewed By: delcypher
Differential Revision: https://reviews.llvm.org/D101891
Address sanitizer can detect stack exhaustion via its SEGV handler, which is
executed on a separate stack using the sigaltstack mechanism. When libFuzzer is
used with address sanitizer, it installs its own signal handlers which defer to
those put in place by the sanitizer before performing additional actions. In the
particular case of a stack overflow, the current setup fails because libFuzzer
doesn't preserve the flag for executing the signal handler on a separate stack:
when we run out of stack space, the operating system can't run the SEGV handler,
so address sanitizer never reports the issue. See the included test for an
example.
This commit fixes the issue by making libFuzzer preserve the SA_ONSTACK flag
when installing its signal handlers; the dedicated signal-handler stack set up
by the sanitizer runtime appears to be large enough to support the additional
frames from the fuzzer.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D101824
Fixes compilation on Android which has a TSDSharedRegistry object in the config.
Reviewed By: cryptoad, vitalybuka
Differential Revision: https://reviews.llvm.org/D101951
Operator new must align allocations for types with large alignment.
Before c++17 behavior was implementation defined and both clang and gc++
before 11 ignored alignment. Miss-aligned objects mysteriously crashed
tests on Ubuntu 14.
Alternatives are compile with -std=c++17 or -faligned-new, but they were
discarded as less portable.
Reviewed By: hctim
Differential Revision: https://reviews.llvm.org/D101874
The problem was introduced in D100348.
It's really hard to trigger the bug in a stress test - the race is just too
narrow - but the new checks in Thread::Init should at least provide usable
diagnostic if the problem ever returns.
Differential Revision: https://reviews.llvm.org/D101881
This relates to https://reviews.llvm.org/D95835.
In DFSan origin tracking we use StackDepot to record
stack traces and origin traces (like MSan origin tracking).
For at least two reasons, we wanted to control StackDepot's memory cost
1) We may use DFSan origin tracking to monitor programs that run for
many days. This may eventually use too much memory for StackDepot.
2) DFSan supports flush shadow memory to reduce overhead. After flush,
all existing IDs in StackDepot are not valid because no one will
refer to them.
Currently, the position hint of an entry in the persistent auto
dictionary is fixed to 1. As a consequence, with a 50% chance, the entry
is applied right after the first byte of the input. As the position 1
does not appear to have any particular significance, this is likely a
bug that may have been caused by confusing the constructor parameter
with a success count.
This commit resolves the issue by preserving any existing position hint
or disabling the hint if the original entry didn't have one.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D101686
Code patterns like this are common, `#` at the line beginning
(https://google.github.io/styleguide/cppguide.html#Preprocessor_Directives),
one space indentation for if/elif/else directives.
```
#if SANITIZER_LINUX
# if defined(__aarch64__)
# endif
#endif
```
However, currently clang-format wants to reformat the code to
```
#if SANITIZER_LINUX
#if defined(__aarch64__)
#endif
#endif
```
This significantly harms readability in my review. Use `IndentPPDirectives:
AfterHash` to defeat the diagnostic. clang-format will now suggest:
```
#if SANITIZER_LINUX
# if defined(__aarch64__)
# endif
#endif
```
Unfortunately there is no clang-format option using indent with 1 for
just preprocessor directives. However, this is still one step forward
from the current behavior.
Reviewed By: #sanitizers, vitalybuka
Differential Revision: https://reviews.llvm.org/D100238
The Scudo C unit tests are currently non-hermetic. In particular, adding
or removing a transfer batch is a global state of the allocator that
persists between tests. This can cause flakiness in
ScudoWrappersCTest.MallInfo, because the creation or teardown of a batch
causes mallinfo's uordblks or fordblks to move up or down by the size of
a transfer batch on malloc/free.
It's my opinion that uordblks and fordblks should track the statistics
related to the user's malloc() and free() usage, and not the state of
the internal allocator structures. Thus, excluding the transfer batches
from stat collection does the trick and makes these tests pass.
Repro instructions of the bug:
1. ninja ./projects/compiler-rt/lib/scudo/standalone/tests/ScudoCUnitTest-x86_64-Test
2. ./projects/compiler-rt/lib/scudo/standalone/tests/ScudoCUnitTest-x86_64-Test --gtest_filter=ScudoWrappersCTest.MallInfo
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D101653
In the overwrite branch of MutationDispatcher::ApplyDictionaryEntry in
FuzzerMutate.cpp, the index Idx at which W.size() bytes are overwritten
with the word W is chosen uniformly at random in the interval
[0, Size - W.size()). This means that Idx + W.size() will always be
strictly less than Size, i.e., the last byte of the current unit will
never be overwritten.
This is fixed by adding 1 to the exclusive upper bound.
Addresses https://bugs.llvm.org/show_bug.cgi?id=49989.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D101625
Currently we have a bit of a mess related to tids:
- sanitizers re-declare kInvalidTid multiple times
- some call it kUnknownTid
- implicit assumptions that main tid is 0
- asan/memprof claim their tids need to fit into 24 bits,
but this does not seem to be true anymore
- inconsistent use of u32/int to store tids
Introduce kInvalidTid/kMainTid in sanitizer_common
and use them consistently.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101428
Commit efd254b636 ("tsan: fix deadlock in pthread_atfork callbacks")
fixed another deadlock related to atfork handling.
But builders with DCHECKs enabled reported failures of
pthread_atfork_deadlock2.c and pthread_atfork_deadlock3.c tests
related to the fact that we hold runtime locks on interceptor exit:
https://lab.llvm.org/buildbot/#/builders/70/builds/6727
This issue is somewhat inherent to the current approach,
we indeed execute user code (atfork callbacks) with runtime lock held.
Refactor fork handling to not run user code (atfork callbacks)
with runtime locks held. This change does this by installing
own atfork callbacks during runtime initialization.
Atfork callbacks run in LIFO order, so the expectation is that
our callbacks run last, right before the actual fork.
This way we lock runtime mutexes around fork, but not around
user callbacks.
Extend tests to also install after fork callbacks just to cover
more scenarios. Some tests also started reporting real races
that we previously suppressed.
Also extend tests to cover fork syscall support.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101517
This is to help review refactor the allocator code.
So it is easy to see which are the real public interfaces.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101586
To see how to extract a shared allocator interface for D101204,
found some unused code. Tests passed. Are they safe to remove?
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101559
We've got a user report about heap block allocator overflow.
Bump the L1 capacity of all dense slab allocators to maximum
and be careful to not page the whole L1 array in from .bss.
If OS uses huge pages, this still may cause a limited RSS increase
due to boundary huge pages, but avoiding that looks hard.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101161
While implementing support for the float128 routines on x86_64, I noticed
that __builtin_isinf() was returning true for 128-bit floating point
values that are not infinite when compiling with GCC and using the
compiler-rt implementation of the soft-float comparison functions.
After stepping through the assembly, I discovered that this was caused by
GCC assuming a sign-extended 64-bit -1 result, but our implementation
returns an enum (which then has zeroes in the upper bits) and therefore
causes the comparison with -1 to fail.
Fix this by using a CMP_RESULT typedef and add a static_assert that it
matches the GCC soft-float comparison return type when compiling with GCC
(GCC has a __libgcc_cmp_return__ mode that can be used for this purpose).
Also move the 3 copies of the same code to a shared .inc file.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D98205
COMPILER_RT_TSAN_DEBUG_OUTPUT enables TSAN_COLLECT_STATS,
which changes layout of runtime structs (some structs contain
stats when the option is enabled).
It's not OK to build runtime with the define, but tests without it.
The error is detected by build_consistency_stats/nostats.
Fix this by defining TSAN_COLLECT_STATS for tests to match the runtime.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101386
Commit efd254b636 ("tsan: fix deadlock in pthread_atfork callbacks")
fixed another deadlock related to atfork handling.
But builders with DCHECKs enabled reported failures of
pthread_atfork_deadlock2.c and pthread_atfork_deadlock3.c tests
related to the fact that we hold runtime locks on interceptor exit:
https://lab.llvm.org/buildbot/#/builders/70/builds/6727
This issue is somewhat inherent to the current approach,
we indeed execute user code (atfork callbacks) with runtime lock held.
Refactor fork handling to not run user code (atfork callbacks)
with runtime locks held. This change does this by installing
own atfork callbacks during runtime initialization.
Atfork callbacks run in LIFO order, so the expectation is that
our callbacks run last, right before the actual fork.
This way we lock runtime mutexes around fork, but not around
user callbacks.
Extend tests to also install after fork callbacks just to cover
more scenarios. Some tests also started reporting real races
that we previously suppressed.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101385
We take report/thread_registry locks around fork.
This means we cannot report any bugs in atfork handlers.
We resolved this by enabling per-thread ignores around fork.
This resolved some of the cases, but not all.
The added test triggers a race report from a signal handler
called from atfork callback, we reset per-thread ignores
around signal handlers, so we tried to report it and deadlocked.
But there are more cases: a signal handler can be called
synchronously if it's sent to itself. Or any other report
types would cause deadlocks as well: mutex misuse,
signal handler spoiling errno, etc.
Disable all reports for the duration of fork with
thr->suppress_reports and don't re-enable them around
signal handlers.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101154
This is undefined if SANITIZER_SYMBOLIZER_MARKUP is 1, which is the case for
Fuchsia, and will result in a undefined symbol error. This function is needed
by hwasan for online symbolization, but is not needed for us since we do
offline symbolization.
Differential Revision: https://reviews.llvm.org/D99386
This reapplies 1e1d75b190, which was reverted in ce1a4d5323 due to build
failures.
The unconditional dependencies on clang and llvm-jitlink in
compiler-rt/test/orc/CMakeLists.txt have been removed -- they don't appear to
be necessary, and I suspect they're the cause of the build failures seen
earlier.
Some builders failed with a missing clang dependency. E.g.
CMake Error at /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build \
/lib/cmake/llvm/AddLLVM.cmake:1786 (add_dependencies):
The dependency target "clang" of target "check-compiler-rt" does not exist.
Reverting while I investigate.
This reverts commit 1e1d75b190.
This patch does a few cleanup things:
1. The non-standalone scudo has a problem where GWP-ASan allocations
may not meet alignment requirements where Scudo was requested to have
alignment >= 16. Use the new GWP-ASan API to fix this.
2. The standalone variant loses some debugging information inside of
GWP-ASan because we ask GWP-ASan to allocate an aligned size in the
frontend. This means reports end up with 'UaF on a 16-byte allocation'
for a 1-byte allocation with 16-byte alignment. Also use the new API to
fix this.
3. Add post-alloc hooks for GWP-ASan intercepted allocations, and add
stats tracking for GWP-ASan allocations.
4. Add a small test that checks the alignment of the frontend
allocator, so that it can be used under GWP-ASan torture mode.
5. Add GWP-ASan torture mode as a testing configuration to catch these
regressions.
Depends on D94830, D95889.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D95884
From a cache perspective it's better to store the header immediately
after loading it. If we delay this operation until after we've
retagged it's more likely that our header will have been evicted from
the cache and we'll need to fetch it again in order to perform the
compare-exchange operation.
For similar reasons, store the deallocation stack before retagging
instead of afterwards.
Differential Revision: https://reviews.llvm.org/D101137
It looks like there's some old version of gcc that doesn't like this
static_assert (I couldn't reproduce the issue with gcc 8, 9 or 10).
Work around the issue by only checking the static_assert under clang,
which should provide sufficient coverage.
Should hopefully fix this buildbot:
https://lab.llvm.org/buildbot/#/builders/112/builds/5356
With AndroidSizeClassMap all of the LSBs are in the range 4-6 so we
only need 2 bits of information per size class. Furthermore we have
32 size classes, which conveniently lets us fit all of the information
into a 64-bit integer. Do so if possible so that we can avoid a table
lookup entirely.
Differential Revision: https://reviews.llvm.org/D101105
In the most common case we call computeOddEvenMaskForPointerMaybe()
from quarantineOrDeallocateChunk(), in which case we need to look up
the class size from the SizeClassMap in order to compute the LSB. Since
we need to do a lookup anyway, we may as well look up the LSB itself
and avoid computing it every time.
While here, switch to a slightly more efficient way of computing the
odd/even mask.
Differential Revision: https://reviews.llvm.org/D101018
The first version of origin tracking tracks only memory stores. Although
this is sufficient for understanding correct flows, it is hard to figure
out where an undefined value is read from. To find reading undefined values,
we still have to do a reverse binary search from the last store in the chain
with printing and logging at possible code paths. This is
quite inefficient.
Tracking memory load instructions can help this case. The main issues of
tracking loads are performance and code size overheads.
With tracking only stores, the code size overhead is 38%,
memory overhead is 1x, and cpu overhead is 3x. In practice #load is much
larger than #store, so both code size and cpu overhead increases. The
first blocker is code size overhead: link fails if we inline tracking
loads. The workaround is using external function calls to propagate
metadata. This is also the workaround ASan uses. The cpu overhead
is ~10x. This is a trade off between debuggability and performance,
and will be used only when debugging cases that tracking only stores
is not enough.
Reviewed By: gbalats
Differential Revision: https://reviews.llvm.org/D100967
Since we already have a tagged pointer available to us, we can just
extract the tag from it and avoid an LDG instruction.
Differential Revision: https://reviews.llvm.org/D101014
Now that we have a more efficient implementation of storeTags(),
we should start using it from resizeTaggedChunk(). With that, plus
a new storeTag() function, resizeTaggedChunk() can be made generic,
and so can prepareTaggedChunk(). Make it so.
Now that the functions are generic, move them to combined.h so that
memtag.h no longer needs to know about chunks.
Differential Revision: https://reviews.llvm.org/D100911
DC GZVA can operate on multiple granules at a time (corresponding to
the CPU's cache line size) so we can generally expect it to be faster
than STZG in a loop.
Differential Revision: https://reviews.llvm.org/D100910
An empty macro that expands to just `... else ;` can get
warnings from some compilers (e.g. GCC's -Wempty-body).
Reviewed By: cryptoad, vitalybuka
Differential Revision: https://reviews.llvm.org/D100693
If these sizes do not match, asan will not work as expected. Previously, we added compile-time checks for non-iOS platforms. We check at run time for iOS because we get the max VM size from the kernel at run time.
rdar://76477969
Reviewed By: delcypher
Differential Revision: https://reviews.llvm.org/D100784
sanitizer_linux_libcdep.cpp doesn't build for Linux sparc (with minimum support
but can build) after D98926. I wasn't aware because the file didn't mention
`__sparc__`.
While here, add the relevant support since it does not add complexity
(the D99566 approach). Adds an explicit `#error` for unsupported
non-Android Linux and FreeBSD architectures.
ThreadDescriptorSize is only used by lsan to scan thread-specific data keys in
the thread control block.
On TLS Variant II architectures (i386/x86_64/s390/sparc), our dl_iterate_phdr
based approach can cover the region from the first byte of the static TLS block
(static TLS surplus) to the thread pointer.
We just need to extend the range to include the first few members of struct
pthread. offsetof(struct pthread, specific_used) satisfies the requirement and
has not changed since 2007-05-10. We don't need to update ThreadDescriptorSize
for each glibc version.
Technically we could use the 524/1552 for x86_64 as well but there is potential
risk that large applications with thousands of shared object dependency may
dislike the time complexity increase if there are many threads, so I don't make
the simplification for now.
Differential Revision: https://reviews.llvm.org/D100892
If these sizes do not match, asan will not work as expected.
If possible, assert at compile time that the vm size is less than or equal to mmap range.
If a compile time assert is not possible, check at run time (for iOS)
rdar://76477969
Reviewed By: delcypher, yln
Differential Revision: https://reviews.llvm.org/D100239
We previously shrunk the mmap range size on ios, but those settings got inherited by apple silicon macs.
Don't shrink the vm range on apple silicon Mac since we have access to the full range.
Also don't shrink vm range for iOS simulators because they have the same range as the host OS, not the simulated OS.
rdar://75302812
Reviewed By: delcypher, kubamracek, yln
Differential Revision: https://reviews.llvm.org/D100234
As mentioned in https://gcc.gnu.org/PR100114 , glibc starting with the
https://sourceware.org/git/?p=glibc.git;a=commit;h=6c57d320484988e87e446e2e60ce42816bf51d53
change doesn't define SIGSTKSZ and MINSIGSTKSZ macros to constants, but to sysconf function call.
sanitizer_posix_libcdep.cpp has
static const uptr kAltStackSize = SIGSTKSZ * 4; // SIGSTKSZ is not enough.
which is generally fine, just means that when SIGSTKSZ is not a compile time constant will be initialized later.
The problem is that kAltStackSize is used in SetAlternateSignalStack which is called very early, from .preinit_array
initialization, i.e. far before file scope variables are constructed, which means it is not initialized and
mmapping 0 will fail:
==145==ERROR: AddressSanitizer failed to allocate 0x0 (0) bytes of SetAlternateSignalStack (error code: 22)
Here is one possible fix, another one could be to make kAltStackSize a preprocessor macro if _SG_SIGSTKSZ is defined
(but perhaps with having an automatic const variable initialized to it so that sysconf isn't at least called twice
during SetAlternateSignalStack.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D100645
In order to integrate libFuzzer with a dynamic symbolic execution tool
Sydr we need to print loaded file paths.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D100303
... so that FreeBSD specific GetTls/glibc specific pthread_self code can be
removed. This also helps FreeBSD arm64/powerpc64 which don't have GetTls
implementation yet.
GetTls is the range of
* thread control block and optional TLS_PRE_TCB_SIZE
* static TLS blocks plus static TLS surplus
On glibc, lsan requires the range to include
`pthread::{specific_1stblock,specific}` so that allocations only referenced by
`pthread_setspecific` can be scanned.
This patch uses `dl_iterate_phdr` to collect TLS blocks. Find the one
with `dlpi_tls_modid==1` as one of the initially loaded module, then find
consecutive ranges. The boundaries give us addr and size.
This allows us to drop the glibc internal `_dl_get_tls_static_info` and
`InitTlsSize`. However, huge glibc x86-64 binaries with numerous shared objects
may observe time complexity penalty, so exclude them for now. Use the simplified
method with non-Android Linux for now, but in theory this can be used with *BSD
and potentially other ELF OSes.
This removal of RISC-V `__builtin_thread_pointer` makes the code compilable with
more compiler versions (added in Clang in 2020-03, added in GCC in 2020-07).
This simplification enables D99566 for TLS Variant I architectures.
Note: as of musl 1.2.2 and FreeBSD 12.2, dlpi_tls_data returned by
dl_iterate_phdr is not desired: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254774
This can be worked around by using `__tls_get_addr({modid,0})` instead
of `dlpi_tls_data`. The workaround can be shared with the workaround for glibc<2.25.
This fixes some tests on Alpine Linux x86-64 (musl)
```
test/lsan/Linux/cleanup_in_tsd_destructor.c
test/lsan/Linux/fork.cpp
test/lsan/Linux/fork_threaded.cpp
test/lsan/Linux/use_tls_static.cpp
test/lsan/many_tls_keys_thread.cpp
test/msan/tls_reuse.cpp
```
and `test/lsan/TestCases/many_tls_keys_pthread.cpp` on glibc aarch64.
The number of sanitizer test failures does not change on FreeBSD/amd64 12.2.
Differential Revision: https://reviews.llvm.org/D98926
While attempting to roll the latest Scudo in Fuchsia, some issues
arose. While trying to debug them, it appeared that `DCHECK`s were
also never exercised in Fuchsia. This CL fixes the following
problems:
- the size of a block in the TransferBatch class must be a multiple
of the compact pointer scale. In some cases, it wasn't true, which
lead to obscure crashes. Now, we round up `sizeof(TransferBatch)`.
This only materialized in Fuchsia due to the specific parameters
of the `DefaultConfig`;
- 2 `DCHECK` statements in Fuchsia were incorrect;
- `map()` & co. require a size multiple of a page (as enforced in
Fuchsia `DCHECK`s), which wasn't the case for `PackedCounters`.
- In the Secondary, a parameter was marked as `UNUSED` while it is
actually used.
Differential Revision: https://reviews.llvm.org/D100524
Do not hold the free/live thread list lock longer than necessary.
This change speeds up the following benchmark 10x.
constexpr int kTopThreads = 50;
constexpr int kChildThreads = 20;
constexpr int kChildIterations = 8;
void Thread() {
for (int i = 0; i < kChildIterations; ++i) {
std::vector<std::thread> threads;
for (int i = 0; i < kChildThreads; ++i)
threads.emplace_back([](){});
for (auto& t : threads)
t.join();
}
}
int main() {
std::vector<std::thread> threads;
for (int i = 0; i < kTopThreads; ++i)
threads.emplace_back(Thread);
for (auto& t : threads)
t.join();
}
Differential Revision: https://reviews.llvm.org/D100348
The GNU assembler can't parse `.arch_extension ...` before a `;`.
So instead uniformly use raw string syntax with separate lines
instead of `;` separators in the assembly code.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D100413
This will allow us to make osx specific changes easier. Because apple silicon macs also run on aarch64, it was easy to confuse it with iOS.
rdar://75302812
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D100157
ASan declares these functions as strongly-defined, which results in
'duplicate symbol' errors when trying to replace them in user code when
linking the runtimes statically.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D100220
D99763 fixed `SizeClassAllocatorLocalCache::drain` but with the
assumption that `BatchClassId` is 0 - which is currently true. I would
rather not make the assumption so that if we ever change the ID of
the batch class, the loop would still work. Since `BatchClassId` is
used more often in `local_cache.h`, introduce a constant so that we
don't have to specify `SizeClassMap::` every time.
Differential Revision: https://reviews.llvm.org/D100062
After a follow-up change (D98332) this header can be included the same time
as fenv.h when running the tests. To avoid enum members conflicting with
the macros/enums defined in the host fenv.h, prefix them with CRT_.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D98333
Fixes the ASan RISC-V memory mapping (originally introduced by D87580 and
D87581). This should be an improvement both in terms of first principles
soundness and observed test failures --- test failures would occur
non-deterministically depending on the ASLR random offset.
On RISC-V Linux (64-bit), `TASK_UNMAPPED_BASE` is currently defined as
`PAGE_ALIGN(TASK_SIZE / 3)`. The non-power-of-two divisor makes the result
be the not very round number 0x1555556000. That address had to be further
rounded to ensure page alignment after the shadow scale shifting is applied.
Still, that value explains why the mapping table may look less regular than
expected.
Further cleanups:
- Moved the mapping table comment, to ensure that the two Linux/AArch64
tables stayed together;
- Removed mention of Sv48. Neither the original mapping nor this one are
compatible with an actual Linux Sv48 address space (mainline Linux still
operates Sv48 in Sv39 mode). A future patch can improve this;
- Removed the additional comments, for consistency.
Differential Revision: https://reviews.llvm.org/D97646
This was reverted by f176803ef1 due to
Ubuntu 16.04 x86-64 glibc 2.23 problems.
This commit additionally calls `__tls_get_addr({modid,0})` to work around the
dlpi_tls_data==NULL issues for glibc<2.25
(https://sourceware.org/bugzilla/show_bug.cgi?id=19826)
GetTls is the range of
* thread control block and optional TLS_PRE_TCB_SIZE
* static TLS blocks plus static TLS surplus
On glibc, lsan requires the range to include
`pthread::{specific_1stblock,specific}` so that allocations only referenced by
`pthread_setspecific` can be scanned.
This patch uses `dl_iterate_phdr` to collect TLS blocks. Find the one
with `dlpi_tls_modid==1` as one of the initially loaded module, then find
consecutive ranges. The boundaries give us addr and size.
This allows us to drop the glibc internal `_dl_get_tls_static_info` and
`InitTlsSize` entirely. Use the simplified method with non-Android Linux for
now, but in theory this can be used with *BSD and potentially other ELF OSes.
This simplification enables D99566 for TLS Variant I architectures.
See https://reviews.llvm.org/D93972#2480556 for analysis on GetTls usage
across various sanitizers.
Differential Revision: https://reviews.llvm.org/D98926
The check was removed in D99786 as it seems that quarantine is
irrelevant for the just created allocator. However there is internal
issues with tagged memory access.
We should be able to fix iterateOverChunks for taggin later.
Existing implementations took up to 30 minutues to execute on my setup.
Now it's more convenient to debug a single test.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D99786
Linux-only for now. Some mac bits stubbed out, but not tested.
Good enough for the tiny_race.c example at
https://clang.llvm.org/docs/ThreadSanitizer.html :
$ out/gn/bin/clang -fsanitize=address -g -O1 tiny_race.c
$ while true; do ./a.out || echo $? ; done
While here, also make `-fsanitize=address` work for .c files.
Differential Revision: https://reviews.llvm.org/D99795
This change adds a SimpleFastHash64 variant of SimpleFastHash which allows call sites to specify a starting value and get a 64 bit hash in return. This allows a hash to be "resumed" with more data.
A later patch needs this to be able to hash a sequence of module-relative values one at a time, rather than just a region a memory.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D94510
Trying to build the builtins code fails because `arm64_32_SOURCES` is
missing. Setting it to the same list used for `aarch64_SOURCES` solves
that problem and allow the builtins to compile for that architecture.
Additionally, arm64_32 is added as a possible architecture for watchos
platforms.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D99690
On 64-bit systems with small VMAs (e.g. 39-bit) we can't use
SizeClassAllocator64 parameterized with size class maps containing a large
number of classes, as that will make the allocator region size too small
(< 2^32). Several tests were already disabled for Android because of this.
This patch provides the correct allocator configuration for RISC-V
(riscv64), generalizes the gating condition for tests that can't be enabled
for small VMA systems, and tweaks the tests that can be made compatible with
those systems to enable them.
I think the previous gating on Android should instead be AArch64+Android, so
the patch reflects that.
Differential Revision: https://reviews.llvm.org/D97234
The previous code may underestimate the static TLS surplus part, which may cause
false positives to LeakSanitizer if a dynamically loaded module uses the surplus
and there is an allocation only referenced by a thread's TLS.
GetTls is the range of
* thread control block and optional TLS_PRE_TCB_SIZE
* static TLS blocks plus static TLS surplus
On glibc, lsan requires the range to include
`pthread::{specific_1stblock,specific}` so that allocations only referenced by
`pthread_setspecific` can be scanned.
This patch uses `dl_iterate_phdr` to collect TLS ranges. Find the one
with `dlpi_tls_modid==1` as one of the initially loaded module, then find
consecutive ranges. The boundaries give us addr and size.
This allows us to drop the glibc internal `_dl_get_tls_static_info` and
`InitTlsSize` entirely. Use the simplified method with non-Android Linux for
now, but in theory this can be used with *BSD and potentially other ELF OSes.
In the future, we can move `ThreadDescriptorSize` code to lsan (and consider
intercepting `pthread_setspecific`) to avoid hacks in generic code.
See https://reviews.llvm.org/D93972#2480556 for analysis on GetTls usage
across various sanitizers.
Differential Revision: https://reviews.llvm.org/D98926
Userspace page aliasing allows us to use middle pointer bits for tags
without untagging them before syscalls or accesses. This should enable
easier experimentation with HWASan on x86_64 platforms.
Currently stack, global, and secondary heap tagging are unsupported.
Only primary heap allocations get tagged.
Note that aliasing mode will not work properly in the presence of
fork(), since heap memory will be shared between the parent and child
processes. This mode is non-ideal; we expect Intel LAM to enable full
HWASan support on x86_64 in the future.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D98875
Make TSan runtime initialization and finalization hooks work
even if these hooks are not built in the main executable. When these
hooks are defined in another library that is not directly linked against
the TSan runtime (e.g., Swift runtime) we cannot rely on the "strong-def
overriding weak-def" mechanics and have to look them up via `dlsym()`.
Let's also define hooks that are easier to use from C-only code:
```
extern "C" void __tsan_on_initialize();
extern "C" int __tsan_on_finalize(int failed);
```
For now, these will call through to the old hooks. Eventually, we want
to adopt the new hooks downstream and remove the old ones.
This is part of the effort to support Swift Tasks (async/await and
actors) in TSan.
rdar://74256720
Reviewed By: vitalybuka, delcypher
Differential Revision: https://reviews.llvm.org/D98810
Userspace page aliasing allows us to use middle pointer bits for tags
without untagging them before syscalls or accesses. This should enable
easier experimentation with HWASan on x86_64 platforms.
Currently stack, global, and secondary heap tagging are unsupported.
Only primary heap allocations get tagged.
Note that aliasing mode will not work properly in the presence of
fork(), since heap memory will be shared between the parent and child
processes. This mode is non-ideal; we expect Intel LAM to enable full
HWASan support on x86_64 in the future.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D98875
Supported ctime_r, fgets, getcwd, get_current_dir_name, gethostname,
getrlimit, getrusage, strcpy, time, inet_pton, localtime_r,
getpwuid_r, epoll_wait, poll, select, sched_getaffinity
Most of them work as calling their non-origin verision directly.
This is a part of https://reviews.llvm.org/D95835.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D98966
Supported strrchr, strrstr, strto*, recvmmsg, recrmsg, nanosleep,
memchr, snprintf, socketpair, sprintf, getocketname, getsocketopt,
gettimeofday, getpeername.
strcpy was added because the test of sprintf need it. It will be
committed by D98966. Please ignore it when reviewing.
This is a part of https://reviews.llvm.org/D95835.
Reviewed By: gbalats
Differential Revision: https://reviews.llvm.org/D99109
The function works like MapDynamicShadow, except that it creates aliased
memory to the right of the shadow. The main use case is for HWASan
aliasing mode, which gets fast IsAlias() checks by exploiting the fact
that the upper bits of the shadow base and aliased memory match.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D98369
The main use case for this change is HWASan aliasing mode, which premaps
the alias space adjacent to the dynamic shadow. With this change, the
primary allocator can allocate from the alias space instead of a
separate region.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D98293
The main use case for this change is HWASan aliasing mode, which premaps
the alias space adjacent to the dynamic shadow. With this change, the
primary allocator can allocate from the alias space instead of a
separate region.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D98293
x86_64 aliasing mode will use fewer than 8 bits for tags, so refactor
existing code to remove hard-coded 0xff and 8 values.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D98072
-mbranch-protection protects the LR on the stack with PAC.
When the frames are walked the LR need to be cleared.
This inline assembly later will be replaced with a new builtin.
Test: build with -DCMAKE_C_FLAGS="-mbranch-protection=standard".
Reviewed By: kubamracek
Differential Revision: https://reviews.llvm.org/D98008