Correct a bug in the code that resets shadow memory introduced as part
of a previous change for the Go race detector (D128909). The bug was
that only the most recently added shadow segment was being reset, as
opposed to the entire extent of the segment created so far. This
fixes a bug identified in Google internal testing (b/240733951).
Differential Revision: https://reviews.llvm.org/D131256
Capture the computed shadow begin/end values at the point where the
shadow is first created and reuse those values on reset. Introduce new
windows-specific function "ZeroMmapFixedRegion" for zeroing out an
address space region previously returned by one of the MmapFixed*
routines; call this function (on windows) from DoResetImpl
tsan_rtl.cpp instead of MmapFixedSuperNoReserve.
See https://github.com/golang/go/issues/53539#issuecomment-1168778740
for context; intended to help with updating the syso for Go's
windows/amd64 race detector.
Differential Revision: https://reviews.llvm.org/D128909
Prevent the following pathological behavior:
Since memory access handling is not synchronized with DoReset,
a thread running concurrently with DoReset can leave a bogus shadow value
that will be later falsely detected as a race. For such false races
RestoreStack will return false and we will not report it.
However, consider that a thread leaves a whole lot of such bogus values
and these values are later read by a whole lot of threads.
This will cause massive amounts of ReportRace calls and lots of
serialization. In very pathological cases the resulting slowdown
can be >100x. This is very unlikely, but it was presumably observed
in practice: https://github.com/google/sanitizers/issues/1552
If this happens, previous access sid+epoch will be the same for all of
these false races b/c if the thread will try to increment epoch, it will
notice that DoReset has happened and will stop producing bogus shadow
values. So, last_spurious_race is used to remember the last sid+epoch
for which RestoreStack returned false. Then it is used to filter out
races with the same sid+epoch very early and quickly.
It is of course possible that multiple threads left multiple bogus shadow
values and all of them are read by lots of threads at the same time.
In such case last_spurious_race will only be able to deduplicate a few
races from one thread, then few from another and so on. An alternative
would be to hold an array of such sid+epoch, but we consider such scenario
as even less likely.
Note: this can lead to some rare false negatives as well:
1. When a legit access with the same sid+epoch participates in a race
as the "previous" memory access, it will be wrongly filtered out.
2. When RestoreStack returns false for a legit memory access because it
was already evicted from the thread trace, we will still remember it in
last_spurious_race. Then if there is another racing memory access from
the same thread that happened in the same epoch, but was stored in the
next thread trace part (which is still preserved in the thread trace),
we will also wrongly filter it out while RestoreStack would actually
succeed for that second memory access.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D130269
We used to deduplicate based on the race address to prevent lots
of repeated reports about the same race.
But now we clear the shadow for the racy address in DoReportRace:
// This prevents trapping on this address in future.
for (uptr i = 0; i < kShadowCnt; i++)
StoreShadow(&shadow_mem[i], i == 0 ? Shadow::kRodata : Shadow::kEmpty);
It should have the same effect of not reporting duplicates
(and actually better because it's automatically reset when the memory is reallocated).
So drop the address deduplication code. Both simpler and faster.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D130240
This is a NFC change to factor out GCD worker thread registration via
the pthread introspection hook.
In a follow-up change we also want to register GCD workers for ASan to
make sure threads are registered before we attempt to print reports on
them.
rdar://93276353
Differential Revision: https://reviews.llvm.org/D126351
If lots of threads do lots of malloc/free and they overflow
per-pthread DenseSlabAlloc cache, it causes lots of contention:
31.97% race.old race.old [.] __sanitizer::StaticSpinMutex::LockSlow
17.61% race.old race.old [.] __tsan_read4
10.77% race.old race.old [.] __tsan::SlotLock
Optimize DenseSlabAlloc to use a lock-free stack of batches of nodes.
This way we don't take any locks in steady state at all and do only
1 push/pop per Refill/Drain.
Effect on the added benchmark:
$ TIME="%e %U %S %M" time ./test.old 36 5 2000000
34.51 978.22 175.67 5833592
32.53 891.73 167.03 5790036
36.17 1005.54 201.24 5802828
36.94 1004.76 226.58 5803188
$ TIME="%e %U %S %M" time ./test.new 36 5 2000000
26.44 720.99 13.45 5750704
25.92 721.98 13.58 5767764
26.33 725.15 13.41 5777936
25.93 713.49 13.41 5791796
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D130002
We already link libunwind explicitly so avoid trying to link toolchain's
default libunwind which may be missing. This matches what we already do
for libcxx and libcxxabi.
Differential Revision: https://reviews.llvm.org/D129472
Callers of TraceSwitchPart expect that TraceAcquire will always succeed
after the call. It's possible that TryTraceFunc/TraceMutexLock in TraceSwitchPart
that restore the current stack/mutexset filled the trace part exactly up
to the TracePart::kAlignment gap and the next TraceAcquire won't succeed.
Skip the alignment gap after writing initial stack/mutexset to avoid that.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D129777
This is a follow up to D118200 which applies a similar cleanup to
headers when using in-tree libc++ to avoid accidentally picking up
the system headers.
Differential Revision: https://reviews.llvm.org/D128035
This is a follow up to D118200 which applies a similar cleanup to
headers when using in-tree libc++ to avoid accidentally picking up
the system headers.
Differential Revision: https://reviews.llvm.org/D128035
While investigating another issue, I noticed that `MaybeReexec()` never
actually "re-executes via `execv()`" anymore. `DyldNeedsEnvVariable()`
only returned true on macOS 10.10 and below.
Usually, I try to avoid "unnecessary" cleanups (it's hard to be certain
that there truly is no fallout), but I decided to do this one because:
* I initially tricked myself into thinking that `MaybeReexec()` was
relevant to my original investigation (instead of being dead code).
* The deleted code itself is quite complicated.
* Over time a few other things were mushed into `MaybeReexec()`:
initializing `MonotonicNanoTime()`, verifying interceptors are
working, and stripping the `DYLD_INSERT_LIBRARIES` env var to avoid
problems when forking.
* This platform-specific thing leaked into `sanitizer_common.h`.
* The `ReexecDisabled()` config nob relies on the "strong overrides weak
pattern", which is now problematic and can be completely removed.
* `ReexecDisabled()` actually hid another issue with interceptors not
working in unit tests. I added an explicit `verify_interceptors`
(defaults to `true`) option instead.
Differential Revision: https://reviews.llvm.org/D129157
While investigating another issue, I noticed that `MaybeReexec()` never
actually "re-executes via `execv()`" anymore. `DyldNeedsEnvVariable()`
only returned true on macOS 10.10 and below.
Usually, I try to avoid "unnecessary" cleanups (it's hard to be certain
that there truly is no fallout), but I decided to do this one because:
* I initially tricked myself into thinking that `MaybeReexec()` was
relevant to my original investigation (instead of being dead code).
* The deleted code itself is quite complicated.
* Over time a few other things were mushed into `MaybeReexec()`:
initializing `MonotonicNanoTime()`, verifying interceptors are
working, and stripping the `DYLD_INSERT_LIBRARIES` env var to avoid
problems when forking.
* This platform-specific thing leaked into `sanitizer_common.h`.
* The `ReexecDisabled()` config nob relies on the "strong overrides weak
pattern", which is now problematic and can be completely removed.
* `ReexecDisabled()` actually hid another issue with interceptors not
working in unit tests. I added an explicit `verify_interceptors`
(defaults to `true`) option instead.
Differential Revision: https://reviews.llvm.org/D129157
Add a missing "#if !SANITIZER_GO" guard for a call to DumpProcessMap
in the Finalize hook (needed to build an updated Go race detector syso
image).
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D128641
We no longer support the use of LLVM_ENABLE_PROJECTS for libcxx and
libcxxabi. We don't use paths to libcxx and libcxxabi in compiler-rt.
Differential Revision: https://reviews.llvm.org/D126905
We no longer support the use of LLVM_ENABLE_PROJECTS for libcxx and
libcxxabi. We don't use paths to libcxx and libcxxabi in compiler-rt.
Differential Revision: https://reviews.llvm.org/D126905
The stack pointer is stored in the second slot in the jump buffer on
AArch64. Use the correct slot value to read this rather than the
following register.
Reviewed by: melver
Differential Revision: https://reviews.llvm.org/D125762
This is a follow up to [Sanitizers][Darwin] Rename Apple macro SANITIZER_MAC -> SANITIZER_APPLE (D125816)
Performed a global search/replace as in title against LLVM sources
Differential Revision: https://reviews.llvm.org/D126263
Fixes:
tsan/tsan_shadow.h:93:32: warning: enumerated and non-enumerated type in conditional expression [-Wextra]
tsan/tsan_shadow.h:94:44: warning: enumerated and non-enumerated type in conditional expression [-Wextra]
Differential Revision: https://reviews.llvm.org/D124828
After cd0a5889d7, unittest would run in shard mode where many tests
share a single process. Need to clear some global state to make the test
results stable.
Reviewed By: thetruestblue, rsundahl
Differential Revision: https://reviews.llvm.org/D124591
An application can use the mere fact of epoll_wait returning an fd
as synchronization with the write on the fd that triggered the notification.
This pattern come up in an internal networking server (b/229276331).
If an fd is added to epoll, setup a link from the fd to the epoll fd
and use it for synchronization as well.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D124518
from compiler-rt/lib/tsan
[NFC] As part of using inclusive language within the llvm project, this
patch rewords comments to remove sanity check and sanity test.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D124390
Currently, we only print how threads involved in data race are created from their parent threads.
Add a runtime flag 'print_full_thread_history' to print thread creation stacks for the threads involved in the data race and their ancestors up to the main thread.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D122131
For errno spoiling reports we only print the stack
where the signal handler is invoked. And the top
frame is the signal handler function, which is supposed
to give the info for debugging.
But in same cases the top frame can be some common thunk,
which does not give much info. E.g. for Go/cgo it's always
runtime.cgoSigtramp.
Print the signal number.
This is what we can easily gather and it may give at least
some hints regarding the issue.
Reviewed By: melver, vitalybuka
Differential Revision: https://reviews.llvm.org/D121979
The false positive fixed by commit f831d6fc80
("tsan: fix false positive during fd close") still happens episodically
on the added more stressful test which does just open/close.
I don't have a coherent explanation as to what exactly happens
but the fix fixes the false positive on this test as well.
The issue may be related to lost writes during asynchronous MADV_DONTNEED.
I've debugged similar unexplainable false positive related to freed and
reused memory and at the time the only possible explanation I found is that
an asynchronous MADV_DONTNEED may lead to lost writes. That's why commit
302ec7b9bc ("tsan: add memory_limit_mb flag") added StopTheWorld around
the memory flush, but unfortunately the commit does not capture these findings.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D121363
FdClose is a subjet to the same atomicity problem as MemoryRangeFreed
(memory state is not "monotoic" wrt race detection).
So we need to lock the thread slot in FdClose the same way we do
in MemoryRangeFreed.
This fixes the modified stress.cpp test.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D121143
The upstream project ships CMake rules for building vanilla gtest/gmock which conflict with the names chosen by LLVM. Since LLVM's build rules here are quite specific to LLVM, prefixing them to avoid collision is the right thing (i.e. there does not appear to be a path to letting someone *replace* LLVM's googletest with one they bring, so co-existence should be the goal).
This allows LLVM to be included with testing enabled within projects that themselves have a dependency on an official gtest release.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D120789
There should be 1-bit unused field between tid field and is_atomic field of Shadow.
Reviewed By: dvyukov, vitalybuka
Differential Revision: https://reviews.llvm.org/D119417
There should be 1-bit unused field between tid field and is_atomic field of Shadow.
Reviewed By: dvyukov, vitalybuka
Differential Revision: https://reviews.llvm.org/D119417
This is a follow up to 4f3f4d6722
("sanitizer_common: fix __sanitizer_get_module_and_offset_for_pc signature mismatch")
which fixes a similar problem for msan build.
I am getting the following error compiling a unit test for code that
uses sanitizer_common headers and googletest transitively includes
sanitizer interface headers:
In file included from third_party/gwp_sanitizers/singlestep_test.cpp:3:
In file included from sanitizer_common/sanitizer_common.h:19:
sanitizer_interface_internal.h:41:5: error: typedef redefinition with different types
('struct __sanitizer_sandbox_arguments' vs 'struct __sanitizer_sandbox_arguments')
} __sanitizer_sandbox_arguments;
common_interface_defs.h:39:3: note: previous definition is here
} __sanitizer_sandbox_arguments;
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D119546
Similar to 60cc1d3218 for NetBSD, add aliases and interceptors for the
following pthread related functions:
- pthread_cond_init(3)
- pthread_cond_destroy(3)
- pthread_cond_signal(3)
- pthread_cond_broadcast(3)
- pthread_cond_wait(3)
- pthread_mutex_init(3)
- pthread_mutex_destroy(3)
- pthread_mutex_lock(3)
- pthread_mutex_trylock(3)
- pthread_mutex_unlock(3)
- pthread_rwlock_init(3)
- pthread_rwlock_destroy(3)
- pthread_rwlock_rdlock(3)
- pthread_rwlock_tryrdlock(3)
- pthread_rwlock_wrlock(3)
- pthread_rwlock_trywrlock(3)
- pthread_rwlock_unlock(3)
- pthread_once(3)
- pthread_sigmask(3)
In FreeBSD's libc, a number of internal aliases of the pthread functions
are invoked, typically with an additional prefixed underscore, e.g.
_pthread_cond_init() and so on.
ThreadSanitizer needs to intercept these aliases too, otherwise some
false positive reports about data races might be produced.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D119034
In glibc before 2.33, include/sys/stat.h defines fstat/fstat64 to
`__fxstat/__fxstat64` and provides `__fxstat/__fxstat64` in libc_nonshared.a.
The symbols are glibc specific and not needed on other systems.
Reviewed By: vitalybuka, #sanitizers
Differential Revision: https://reviews.llvm.org/D118423
When using the in-tree libc++, we should be using the full path to
ensure that we're using the right library and not accidentally pick up
the system library.
Differential Revision: https://reviews.llvm.org/D118200
`_vfork` moved from libsystem_kernel.dylib to libsystem_c.dylib as part
of the below changes. The iOS simulator does not actually have
libsystem_kernel.dylib of its own, it only has the host Mac's. The
umbrella-nature of Libsystem makes this movement transparent to
everyone; except the simulator! So when we "back deploy", i.e., use the
current version of TSan with an older simulator runtime then this symbol
is now missing, when we run on the latest OS (but an older simulator
runtime).
Note we use `SANITIZER_IOS` because usage of vfork is forbidden on iOS
and the API is completely unavailable on watchOS and tvOS, even if this
problem is specific to the iOS simulator.
Caused by:
rdar://74818691 (Shim vfork() to fork syscall on iOS)
rdar://76762076 (Shim vfork() to fork syscall on macOS)
Radar-Id: rdar://8634734