llvm-project/compiler-rt/lib
Dmitry Vyukov 9f2c6207d5 tsan: optimize sync clock memory consumption
This change implements 2 optimizations of sync clocks that reduce memory consumption:

Use previously unused first level block space to store clock elements.
Currently a clock for 100 threads consumes three 512-byte blocks:

2 second level blocks that store the 64-bit clock elements
+1 first level block that stores 32-bit indices to the second level blocks
Only 8 bytes of the first level block are actually used.
With this change such a clock consumes only 2 blocks.
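
The block counts above can be reproduced with a small calculation. The following is only a sketch under a simplified model: the sizes (512-byte blocks, 64-bit elements, 32-bit indices) come from the description above, while the function names and the exact packing of the first level block are assumptions made for illustration. The real layout may reserve additional space, which this model ignores.

```cpp
#include <cassert>
#include <cstddef>

// Sizes from the commit description.
const std::size_t kBlockSize = 512;  // bytes per clock block
const std::size_t kElemSize = 8;     // one 64-bit clock element
const std::size_t kIndexSize = 4;    // one 32-bit index to a second level block
const std::size_t kElemsPerBlock = kBlockSize / kElemSize;  // 64

// Old layout (for threads >= 1): second level blocks hold every element,
// plus one first level block that holds nothing but indices.
std::size_t BlocksBefore(std::size_t threads) {
  return (threads + kElemsPerBlock - 1) / kElemsPerBlock + 1;
}

// New layout (assumed model): the first level block keeps one index per
// second level block and uses its remaining bytes for clock elements.
// Returns the smallest total block count that fits all elements.
std::size_t BlocksAfter(std::size_t threads) {
  const std::size_t max_second = kBlockSize / kIndexSize;
  for (std::size_t second = 0; second <= max_second; ++second) {
    std::size_t covered = second * kElemsPerBlock;
    std::size_t rest = threads > covered ? threads - covered : 0;
    // Leftover elements and the indices must share the first level block.
    if (rest * kElemSize + second * kIndexSize <= kBlockSize)
      return second + 1;
  }
  assert(false && "clock too large for this two-level model");
  return 0;
}
```

Under this model a 100-thread clock needs 64 elements in one second level block and the remaining 36 elements (288 bytes) plus one index (4 bytes) in the first level block, which gives 2 blocks instead of 3.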

Share similar clocks differing only by a single clock entry for the current thread.
When a thread does several release operations on fresh sync objects without intervening
acquire operations (e.g. initialization of several fields in a constructor),
the resulting clocks differ only by a single entry for the current thread.
This change reuses a single clock for such release operations. The current thread time
(which is different for different clocks) is stored in dirty entries.
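
The sharing idea can be illustrated with a minimal sketch. All types and names below (SyncClockSketch, ThreadSketch, ReleaseToFresh, Acquire) are hypothetical and are not the tsan data structures. The sketch assumes that a thread caches one snapshot of its vector clock, hands the same snapshot to every fresh sync object it releases to, and drops the cache on acquire. Each sync clock keeps its own dirty entry with the releasing thread's time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

typedef std::uint64_t Epoch;

// Hypothetical sync clock: shared element storage plus one dirty entry.
struct SyncClockSketch {
  std::shared_ptr<const std::vector<Epoch> > shared;  // common to several clocks
  std::size_t dirty_tid;  // thread whose entry is overridden
  Epoch dirty_epoch;      // that thread's time at the release

  // The dirty entry takes precedence over the shared storage.
  Epoch Get(std::size_t tid) const {
    if (tid == dirty_tid) return dirty_epoch;
    return tid < shared->size() ? (*shared)[tid] : 0;
  }
};

// Hypothetical thread state with a cached snapshot of its clock.
struct ThreadSketch {
  std::size_t tid;
  std::vector<Epoch> clock;                           // this thread's vector clock
  std::shared_ptr<const std::vector<Epoch> > cached;  // reused across releases

  // Release onto a fresh sync object: only this thread's entry changed since
  // the snapshot was taken, so the snapshot can be reused as is.
  SyncClockSketch ReleaseToFresh() {
    ++clock[tid];  // advance own time
    if (!cached) cached = std::make_shared<std::vector<Epoch> >(clock);
    SyncClockSketch result = {cached, tid, clock[tid]};
    return result;
  }

  // An acquire may change other entries, so the snapshot is dropped.
  void Acquire(const SyncClockSketch& src) {
    for (std::size_t i = 0; i < clock.size(); ++i)
      clock[i] = std::max(clock[i], src.Get(i));
    cached.reset();
  }
};
```

In this sketch two consecutive calls to ReleaseToFresh return clocks that point to the same element storage and differ only in dirty_epoch. A call to Acquire in between forces the next release to allocate a new snapshot.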

We are experiencing issues with a large program that eats all 64M clock blocks
(32GB of non-flushable memory) and crashes with dense allocator overflow.
The maximum number of threads in the program is ~170, which is an unfortunate count for the
current layout (each clock consumes 4 blocks). Currently it crashes after consuming 60+ GB of memory.
The first optimization brings clock block consumption down to ~40M and
allows the program to work. The second optimization further reduces block consumption
to a more modest 16M blocks (~8GB of RAM) and reduces overall RAM consumption to ~30GB.

Measurements on another real-world C++ RPC benchmark show RSS reduction
from 3.491G to 3.186G and a modest speedup of ~5%.

Go parallel client/server HTTP benchmark:
https://github.com/golang/benchmarks/blob/master/http/http.go
shows RSS reduction from 320MB to 240MB and a few percent speedup.

Reviewed in https://reviews.llvm.org/D35323

llvm-svn: 308018
2017-07-14 11:30:06 +00:00
..
BlocksRuntime
asan [asan] For iOS/AArch64, if the dynamic shadow doesn't fit, restrict the VM space 2017-07-12 23:29:21 +00:00
builtins [compiler-rt][X86] Match the detection of cpu's for __cpu_model to the latest version of gcc 2017-07-13 02:56:24 +00:00
cfi CFI: Add a blacklist entry for std::_Sp_counted_ptr_inplace::_Sp_counted_ptr_inplace(). 2017-05-05 18:46:14 +00:00
dfsan [sanitizer-coverage] remove stale code (old coverage); compiler-rt part 2017-05-31 18:26:32 +00:00
esan Refactor MemoryMappingLayout::Next to use a single struct instead of output parameters. NFC. 2017-07-11 18:54:00 +00:00
interception [WinASan] Fix hotpatching new Win 10 build 1703 x64 strnlen prologue 2017-06-16 20:44:00 +00:00
lsan Refactor MemoryMappingLayout::Next to use a single struct instead of output parameters. NFC. 2017-07-11 18:54:00 +00:00
msan [Sanitizers] Consolidate internal errno definitions. 2017-07-06 00:50:57 +00:00
profile [profile] Move __llvm_profile_filename into a separate object 2017-06-29 17:42:24 +00:00
safestack [compiler-rt] Do not introduce __sanitizer namespace globally 2016-09-15 21:02:18 +00:00
sanitizer_common Fix sanitizer build against latest glibc 2017-07-13 21:59:01 +00:00
scudo [scudo] Do not grab a cache for secondary allocation & per related changes 2017-07-13 21:01:19 +00:00
stats Revert "[sancov] moving sancov rt to sancov/ directory" 2017-01-12 01:37:35 +00:00
tsan tsan: optimize sync clock memory consumption 2017-07-14 11:30:06 +00:00
ubsan [ubsan] Teach the pointer overflow check that "p - <unsigned> <= p" (compiler-rt) 2017-07-13 20:55:41 +00:00
xray [XRay][compiler-rt][NFC] Add example always/never instrument files. 2017-06-28 04:44:36 +00:00
CMakeLists.txt Don't build tsan/dd when COMPILER_RT_HAS_TSAN is false 2017-06-27 21:10:46 +00:00