This commit changes the place where TSan runtime turns full path
to binary or shared library into its basename
(/usr/foo/mybinary -> mybinary). Instead of doing it as early as possible
(when we obtained the full path from the symbolizer), we now do it as
late as possible (right before printing the error report).
This seems like a right thing to do - stripping to basename is a detail
of report formatting implementation, and should belong there. Also, we
might need the full path at some point - for example, to match the
suppressions.
llvm-svn: 221225
Summary:
This change removes `__tsan::StackTrace` class. There are
now three alternatives:
# Lightweight `__sanitizer::StackTrace`, which doesn't own a buffer
of PCs. It is used in functions that need stack traces in read-only
mode, and helps to prevent unnecessary allocations/copies (e.g.
for StackTraces fetched from StackDepot).
# `__sanitizer::BufferedStackTrace`, which stores buffer of PCs in
a constant array. It is used in TraceHeader (non-Go version)
# `__tsan::VarSizeStackTrace`, which owns buffer of PCs, dynamically
allocated via TSan internal allocator.
Test Plan: compiler-rt test suite
Reviewers: dvyukov, kcc
Reviewed By: kcc
Subscribers: llvm-commits, kcc
Differential Revision: http://reviews.llvm.org/D6004
llvm-svn: 221194
introduce a BufferedStackTrace class, which owns this array.
Summary:
This change splits __sanitizer::StackTrace class into a lightweight
__sanitizer::StackTrace, which doesn't own array of PCs, and BufferedStackTrace,
which owns it. This would allow us to simplify the interface of StackDepot,
and eventually merge __sanitizer::StackTrace with __tsan::StackTrace.
Test Plan: regression test suite.
Reviewers: kcc, dvyukov
Reviewed By: dvyukov
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5985
llvm-svn: 220635
The current handling (manual execution of atexit callbacks)
is overly complex and leads to constant problems due to mutual ordering of callbacks.
Instead simply wrap callbacks into our wrapper to establish
the necessary synchronization.
Fixes issue https://code.google.com/p/thread-sanitizer/issues/detail?id=80
llvm-svn: 219675
On some tests we see that signals are not delivered
when a thread is blocked in epoll_wait. The hypothesis
is that the signal is delivered right before epoll_wait
call. The signal is queued as in_blocking_func is not set
yet, and then the thread just blocks in epoll_wait forever.
So double check pending signals *after* setting
in_blocking_func. This way we either queue a signal
and handle it in the beginning of a blocking func,
or process the signal synchronously if it's delivered
when in_blocking_func is set.
llvm-svn: 218070
I don't remember that crash on mmap in internal allocator
ever yielded anything useful, only crashes in rare wierd untested situations.
One of the reasons for crash was to catch if tsan starts allocating
clocks using mmap. Tsan does not allocate clocks using internal_alloc anymore.
Solve it once and for all by allowing mmaps.
llvm-svn: 217929
We may as well just use Symbolizer::GetOrInit() in all the cases.
Don't call Symbolizer::Get() early in tools initialization: these days
it doesn't do any important setup work, and we may as well create the
symbolizer the first time it's actually needed.
llvm-svn: 217558
There interceptors do not seem to be strictly necessary for tsan.
But we see cases where the interceptors consume 70% of execution time.
Memory blocks passed to fgetgrent_r are "written to" by tsan several times.
First, there is some recursion (getgrnam_r calls fgetgrent_r), and each
function "writes to" the buffer. Then, the same memory is "written to"
twice, first as buf and then as pwbufp (both of them refer to the same addresses).
llvm-svn: 216904
Vector clocks is the most actively allocated object in tsan runtime.
Current internal allocator is not scalable enough to handle allocation
of clocks in scalable way (too small caches). This changes transforms
clocks to 2-level array with 512-byte blocks. Since all blocks are of
the same size, it's possible to cache them more efficiently in per-thread caches.
llvm-svn: 214912
Suppression context might be used in multiple sanitizers working
simultaneously (e.g. LSan and UBSan) and not knowing about each other.
llvm-svn: 214831
Convert TSan and LSan to the new interface. More changes will follow:
1) "suppressions" should become a common runtime flag.
2) Code for parsing suppressions file should be moved to SuppressionContext::Init().
llvm-svn: 214334
Get rid of Symbolizer::Init(path_to_external) in favor of
thread-safe Symbolizer::GetOrInit(), and use the latter version
everywhere. Implicitly depend on the value of external_symbolizer_path
runtime flag instead of passing it around manually.
No functionality change.
llvm-svn: 214005
It is currently broken because it reads a wrong value from profile (heap instead of total).
Also make it faster by reading /proc/self/statm. Reading of /proc/self/smaps
can consume more than 50% of time on beefy apps if done every 100ms.
llvm-svn: 213942
The tsan's deadlock detector has been used in Chromium for a while;
it found a few real bugs and reported no false positives.
So, it's time to give it a bit more exposure.
llvm-svn: 212533
The bug happens in the following case:
Mutex is located at heap block beginning,
when we call MutexDestroy, s->next is set to 0,
so free can't find the MBlock related to the block.
llvm-svn: 212531
Introduce new public header <sanitizer/allocator_interface.h> and a set
of functions __sanitizer_get_ownership(), __sanitizer_malloc_hook() etc.
that will eventually replace their tool-specific equivalents
(__asan_get_ownership(), __msan_get_ownership() etc.). Tool-specific
functions are now deprecated and implemented as stubs redirecting
to __sanitizer_ versions (which are implemented differently in each tool).
Replace all uses of __xsan_ versions with __sanitizer_ versions in unit
and lit tests.
llvm-svn: 212469
The former used to crash with a null deref if it was given a not owned pointer,
while the latter returned 0. Now they both return 0. This is still not the best possible
behavior: it is better to print an error report with a stack trace, pointing
to the error in user code, as we do in ASan.
llvm-svn: 212112
The optimization is two-fold:
First, the algorithm now uses SSE instructions to
handle all 4 shadow slots at once. This makes processing
faster.
Second, if shadow contains the same access, we do not
store the event into trace. This increases effective
trace size, that is, tsan can remember up to 10x more
previous memory accesses.
Perofrmance impact:
Before:
[ OK ] DISABLED_BENCH.Mop8Read (2461 ms)
[ OK ] DISABLED_BENCH.Mop8Write (1836 ms)
After:
[ OK ] DISABLED_BENCH.Mop8Read (1204 ms)
[ OK ] DISABLED_BENCH.Mop8Write (976 ms)
But this measures only fast-path.
On large real applications the speedup is ~20%.
Trace size impact:
On app1:
Memory accesses : 1163265870
Including same : 791312905 (68%)
on app2:
Memory accesses : 166875345
Including same : 150449689 (90%)
90% of filtered events means that trace size is effectively 10x larger.
llvm-svn: 209897
The new storage (MetaMap) is based on direct shadow (instead of a hashmap + per-block lists).
This solves a number of problems:
- eliminates quadratic behaviour in SyncTab::GetAndLock (https://code.google.com/p/thread-sanitizer/issues/detail?id=26)
- eliminates contention in SyncTab
- eliminates contention in internal allocator during allocation of sync objects
- removes a bunch of ad-hoc code in java interface
- reduces java shadow from 2x to 1/2x
- allows to memorize heap block meta info for Java and Go
- allows to cleanup sync object meta info for Go
- which in turn enabled deadlock detector for Go
llvm-svn: 209810
The refactoring makes suppressions more flexible
and allow to suppress based on arbitrary number of stacks.
In particular it fixes:
https://code.google.com/p/thread-sanitizer/issues/detail?id=64
"Make it possible to suppress deadlock reports by any stack (not just first)"
llvm-svn: 209757
This way does not require a __sanitizer_cov_dump() call. That's
important on Android, where apps can be killed at arbitrary time.
We write raw PCs to disk instead of module offsets; we also write
memory layout to a separate file. This increases dump size by the
factor of 2 on 64-bit systems.
llvm-svn: 209653
Move fflush and fclose interceptors to sanitizer_common.
Use a metadata map to keep information about the external locations
that must be updated when the file is written to.
llvm-svn: 208676
If a multi-threaded program calls fork(), TSan ignores all memory accesses
in the child to prevent deadlocks in TSan runtime. This is OK, as child is
probably going to call exec() as soon as possible. However, a rare deadlocks
could be caused by ThreadIgnoreBegin() function itself.
ThreadIgnoreBegin() remembers the current stack trace and puts it into the
StackDepot to report a warning later if a thread exited with ignores enabled.
Using StackDepotPut in a child process is dangerous: it locks a mutex on
a slow path, which could be already locked in a parent process.
The fix is simple: just don't put current stack traces to StackDepot in
ThreadIgnoreBegin() and ThreadIgnoreSyncBegin() functions if we're
running after a multithreaded fork. We will not report any
"thread exited with ignores enabled" errors in this case anyway.
Submitting this without a testcase, as I believe the standalone reproducer
is pretty hard to construct.
llvm-svn: 205534
Make vector clock operations O(1) for several important classes of use cases.
See comments for details.
Below are stats from a large server app, 77% of all clock operations are handled as O(1).
Clock acquire : 25983645
empty clock : 6288080
fast from release-store : 14917504
contains my tid : 4515743
repeated (fast) : 2141428
full (slow) : 2636633
acquired something : 1426863
Clock release : 2544216
resize : 6241
fast1 : 197693
fast2 : 1016293
fast3 : 2007
full (slow) : 1797488
was acquired : 709227
clear tail : 1
last overflow : 0
Clock release store : 3446946
resize : 200516
fast : 469265
slow : 2977681
clear tail : 0
Clock acquire-release : 820028
llvm-svn: 204656
Extend ParseFlag to accept the |description| parameter, add dummy values for all existing flags.
As the flags are parsed their descriptions are stored in a global linked list.
The tool can later call __sanitizer::PrintFlagDescriptions() to dump all the flag names and their descriptions.
Add the 'help' flag and make ASan, TSan and MSan print the flags if 'help' is set to 1.
llvm-svn: 204339
Introduce DDetector interface between the tool and the DD itself.
It will help to experiment with other DD implementation,
as well as reuse DD in other tools.
llvm-svn: 202485
This reverts commit r201910.
While __func__ may be standard in C++11, it was only recently added to
MSVC in 2013 CTP, and LLVM supports MSVC 2012. __FUNCTION__ may not be
standard, but it's *very* portable.
llvm-svn: 201916
This is covered by existing ASan test.
This does not change anything for TSan by default (but provides a flag to
change the threshold size).
Based on a patch by florent.bruneau here:
https://code.google.com/p/address-sanitizer/issues/detail?id=256
llvm-svn: 201400
Because of the way Bionic sets up signal stack frames, libc unwinder is unable
to step through it, resulting in broken SEGV stack traces.
Luckily, libcorkscrew.so on Android implements an unwinder that can start with
a signal context, thus sidestepping the issue.
llvm-svn: 201151
We do not detect errno spoiling for SIGTERM,
because some SIGTERM handlers do spoil errno but reraise SIGTERM,
tsan reports false positive in such case.
It's difficult to properly detect this situation (reraise),
because in async signal processing case (when handler is called directly
from rtl_generic_sighandler) we have not yet received the reraised
signal; and it looks too fragile to intercept all ways to reraise a signal.
llvm-svn: 200742
Also rename internal_sigaction() into internal_sigaction_norestorer(), as this function doesn't fully
implement the sigaction() functionality on Linux.
This change is a part of refactoring intended to have common signal handling behavior in all tools.
llvm-svn: 200535
allow SIGABRT to spoil errno, because some real programs
reset SIGABRT handler in the handler, re-raise SIGABRT and return from the handler
llvm-svn: 200304
We left ignore_interceptors>0 when calling signal handlers
from blocking interceptors, this leads to missing synchronization in such signal handler.
llvm-svn: 200003
Currently correct programs can deadlock after fork, because atomic operations and async-signal-safe calls are not async-signal-safe under tsan.
With this change:
- if a single-threaded program forks, the child continues running with verification enabled (the tsan background thread is recreated as well)
- if a multi-threaded program forks, then the child runs with verification disabled (memory accesses, atomic operations and interceptors are disabled); it's expected that it will exec soon anyway
- if the child tries to create more threads after multi-threaded fork, the program aborts with error message
- die_after_fork flag is added that allows to continue running, but all bets are off
http://llvm-reviews.chandlerc.com/D2614
llvm-svn: 199993
Intercept and sanitize arguments passed to printf functions in ASan and TSan
(don't do this in MSan for now). The checks are controlled by runtime flag
(off by default for now).
Patch http://llvm-reviews.chandlerc.com/D2480 by Yuri Gribov!
llvm-svn: 199729
This is intended to address the following problem.
Episodically we see CHECK-failures when recursive interceptors call back into user code. Effectively we are not "in_rtl" at this point, but it's very complicated and fragile to properly maintain in_rtl property. Instead get rid of it. It was used mostly for sanity CHECKs, which basically never uncover real problems.
Instead introduce ignore_interceptors flag, which is used in very few narrow places to disable recursive interceptors (e.g. during runtime initialization).
llvm-svn: 197979
I still don't know what is causing our bootstrapped LTO buildbots to fail,
but llvm r194701 seems to be OK and I can't imagine that these changes could
cause the problem.
llvm-svn: 194790
Apple's bootstrapped LTO builds have been failing, and these changes (along
with llvm 194701) are the only things on the blamelist. I will either reapply
these changes or help debug the problem, depending on whether this fixes the
buildbots.
llvm-svn: 194779
Summary:
TSan and MSan need to know if interceptor was called by the
user code or by the symbolizer and use pre- and post-symbolization hooks
for that. Make Symbolizer class responsible for calling these hooks instead.
This would ensure the hooks are only called when necessary (during
in-process symbolization, they are not needed for out-of-process) and
save specific sanitizers from tracing all places in the code where symbolization
will be performed.
Reviewers: eugenis, dvyukov
Reviewed By: eugenis
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2067
llvm-svn: 193807
This moves away from creating the symbolizer object and initializing the
external symbolizer as separate steps. Those steps now always take place
together.
Sanitizers with a legacy requirement to specify their own symbolizer path
should use InitSymbolizer to initialize the symbolizer with the desired
path, and GetSymbolizer to access the symbolizer. Sanitizers with no
such requirement (e.g. UBSan) can use GetOrInitSymbolizer with no need for
initialization.
The symbolizer interface has been made thread-safe (as far as I can
tell) by protecting its member functions with mutexes.
Finally, the symbolizer interface no longer relies on weak externals, the
introduction of which was probably a mistake on my part.
Differential Revision: http://llvm-reviews.chandlerc.com/D1985
llvm-svn: 193448
some tests test libc/filesystem error handling paths (e.g. close(INT_MAX)),
currently such tests fail
with this change they work as expected
llvm-svn: 193400
This allows to increase max shadow stack size to 64K,
and reliably catch shadow stack overflows instead of silently
corrupting memory.
llvm-svn: 192797
Currently data-race-test unittests fail with the following false positive:
WARNING: ThreadSanitizer: data race (pid=20365)
Write of size 8 at 0x7da000008050 by thread T54:
#0 close tsan_interceptors.cc:1483 (racecheck_unittest-linux-amd64-O0+0x0000000eb34a)
#1 NegativeTests_epoll::Worker2() unittest/posix_tests.cc:1148 (racecheck_unittest-linux-amd64-O0+0x0000000cc6b1)
#2 MyThread::ThreadBody(MyThread*) unittest/./thread_wrappers_pthread.h:367 (racecheck_unittest-linux-amd64-O0+0x000000097500)
Previous read of size 8 at 0x7da000008050 by thread T49:
#0 epoll_ctl tsan_interceptors.cc:1646 (racecheck_unittest-linux-amd64-O0+0x0000000e9fee)
#1 NegativeTests_epoll::Worker1() unittest/posix_tests.cc:1140 (racecheck_unittest-linux-amd64-O0+0x0000000cc5b5)
#2 MyThread::ThreadBody(MyThread*) unittest/./thread_wrappers_pthread.h:367 (racecheck_unittest-linux-amd64-O0+0x000000097500)
llvm-svn: 192448
The annotations are AnnotateIgnoreSyncBegin/End,
may be useful to ignore some infrastructure synchronization
that introduces lots of false negatives.
llvm-svn: 192355
The flag allows to bound maximum process memory consumption (best effort).
If RSS reaches memory_limit_mb, tsan flushes all shadow memory.
llvm-svn: 191913
LibIgnore allows to ignore all interceptors called from a particular set
of dynamic libraries. LibIgnore remembers all "called_from_lib" suppressions
from the provided SuppressionContext; finds code ranges for the libraries;
and checks whether the provided PC value belongs to the code ranges.
Also make malloc and friends interceptors use SCOPED_INTERCEPTOR_RAW instead of
SCOPED_TSAN_INTERCEPTOR, because if they are called from an ignored lib,
then must call our internal allocator instead of libc malloc.
llvm-svn: 191897
This change adds a Python script that is invoked for
the just-built sanitizer runtime to generate the list of exported symbols
passed to the linker. By default, it contains interceptors and sanitizer
interface functions, but can be extended with tool-specific lists.
llvm-svn: 189356
If there's a race between a memory access and a free() call in the client program,
it can be reported as a use-after-free (if the access occurs after the free()) or an ordinary race
(if free() occurs after the access).
We've decided to use a single "race:" prefix for both cases instead of introducing a "use-after-free:" one,
because in many cases this allows us to keep a single suppression for both the use-after-free and free-after-use.
This may be misleading if the use-after-free occurs in a non-racy way (e.g. in a single-threaded program).
But normally such bugs shall not be suppressed.
llvm-svn: 187885
This reverts commit r187788.
The test case is unreliable (as the test may be run in a situation in
which it has no affinity with cpu0). This can be recommitted with a more
reliable test - possibly using CPU_COUNT != 0 instead (I wasn't entirely
sure that a process was guaranteed to have at least one affinity, though
it seems reasonable, or I'd have made the change myself).
llvm-svn: 187841
It is required for chromium sandboxing code.
From the description it seems to be indeed synchronous -- called back on syscall with incorrect arguments,
but seems to be unused in practice. So this should be fine.
llvm-svn: 186579
full proper list of dynamic symbols crashes old gold (see bug 16468).
the culprit is 'memcpy' function, if it's added to syms file, gold crashes
llvm-svn: 185078