Commit Graph

15 Commits

Author SHA1 Message Date
Kostya Kortchinsky 2c56776a31 [scudo][standalone] Compact pointers for Caches/Batches
This CL introduces configuration options to allow pointers to be
compacted in the thread-specific caches and transfer batches. This
offers the possibility to have them use 32-bit of space instead of
64-bit for the 64-bit Primary, thus cutting the size of the caches
and batches by nearly half (and as such the memory used in size
class 0). The cost is an additional read from the region information
in the fast path.

This is not a new idea, as it's being used in the sanitizer_common
64-bit primary. The difference here is that it is configurable via
the allocator config, with the possibility of not compacting at all.

This CL enables compacting pointers in the Android and Fuchsia default
configurations.

Differential Revision: https://reviews.llvm.org/D96435
2021-02-25 12:14:38 -08:00
Kostya Kortchinsky c753a306fd [scudo][standalone] Various improvements wrt RSS
Summary:
This patch includes several changes to reduce the overall footprint
of the allocator:
- for realloc'd chunks: only keep the same chunk when lowering the size
  if the delta is within a page worth of bytes;
- when draining a cache: drain the beginning, not the end; we add pointers
  at the end, so that meant we were draining the most recently added
  pointers;
- change the release code to account for an freed up last page: when
  scanning the pages, we were looking for pages fully covered by blocks;
  in the event of the last page, if it's only partially covered, we
  wouldn't mark it as releasable - even what follows the last chunk is
  all 0s. So now mark the rest of the page as releasable, and adapt the
  test;
- add a missing `setReleaseToOsIntervalMs` to the cacheless secondary;
- adjust the Android classes based on more captures thanks to pcc@'s
  tool.

Reviewers: pcc, cferris, hctim, eugenis

Subscribers: #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D75142
2020-02-26 12:25:43 -08:00
Peter Collingbourne 87303fd917 scudo: Fix various test failures, mostly on 32-bit.
Differential Revision: https://reviews.llvm.org/D74429
2020-02-11 12:18:35 -08:00
Peter Collingbourne 515e19ae7b Fix errors/warnings in scudo build. 2020-02-11 08:37:37 -08:00
Peter Collingbourne 041547eb4e scudo: Table driven size classes for Android allocator.
Add an optional table lookup after the existing logarithm computation
for MidSize < Size <= MaxSize during size -> class lookups. The lookup is
O(1) due to indexing a precomputed (via constexpr) table based on a size
table. Switch to this approach for the Android size class maps.

Other approaches considered:
- Binary search was found to have an unacceptable (~30%) performance cost.
- An approach using NEON instructions (see older version of D73824) was found
  to be slightly slower than this approach on newer SoCs but significantly
  slower on older ones.

By selecting the values in the size tables to minimize wastage (for example,
by passing the malloc_info output of a target program to the included
compute_size_class_config program), we can increase the density of allocations
at a small (~0.5% on bionic malloc_sql_trace as measured using an identity
table) performance cost.

Reduces RSS on specific Android processes as follows (KB):

                             Before  After
zygote (median of 50 runs)    26836  26792 (-0.2%)
zygote64 (median of 50 runs)  30384  30076 (-1.0%)
dex2oat (median of 3 runs)   375792 372952 (-0.8%)

I also measured the amount of whole-system idle dirty heap on Android by
rebooting the system and then running the following script repeatedly until
the results were stable:

for i in $(seq 1 50); do grep -A5 scudo: /proc/*/smaps | grep Pss: | cut -d: -f2 | awk '{s+=$1} END {print s}' ; sleep 1; done

I did this 3 times both before and after this change and the results were:

Before: 365650, 356795, 372663
After:  344521, 356328, 342589

These results are noisy so it is hard to make a definite conclusion, but
there does appear to be a significant effect.

On other platforms, increase the sizes of all size classes by a fixed offset
equal to the size of the allocation header. This has also been found to improve
density, since it is likely for allocation sizes to be a power of 2, which
would otherwise waste space by pushing the allocation into the next size class.

Differential Revision: https://reviews.llvm.org/D73824
2020-02-10 14:59:49 -08:00
Kostya Kortchinsky fe6e77f6fb [scudo][standalone] 32-bit improvement
Summary:
This tweaks some behaviors of the allocator wrt 32-bit, notably
tailoring the size-class map.

I had to remove a `printStats` from `__scudo_print_stats` since when
within Bionic they share the same slot so they can't coexist at the
same time. I have to find a solution for that later, but right now we
are not using the Svelte configuration.

Reviewers: rengolin

Subscribers: #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D74178
2020-02-07 11:16:48 -08:00
Peter Collingbourne f7de7084f4 scudo: Simplify getClassIdBySize() logic. NFCI.
By subtracting 1 from Size at the beginning we can simplify the
subsequent calculations. This also saves 4 instructions on aarch64
and 9 instructions on x86_64, but seems to be perf neutral.

Differential Revision: https://reviews.llvm.org/D73936
2020-02-04 09:32:27 -08:00
Kostya Kortchinsky 64cb77b946 [scudo][standalone] Change default Android config
Summary:
This changes a couple of parameters in the default Android config to
address some performance and memory footprint issues (well to be closer
to the default Bionic allocator numbers).

Subscribers: #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D73750
2020-01-31 14:46:35 -08:00
Peter Collingbourne 6fd6cfdf72 scudo: Replace a couple of macros with their expansions.
The macros INLINE and COMPILER_CHECK always expand to the same thing (inline
and static_assert respectively). Both expansions are standards compliant C++
and are used consistently in the rest of LLVM, so let's improve consistency
with the rest of LLVM by replacing them with the expansions.

Differential Revision: https://reviews.llvm.org/D70793
2019-11-27 10:12:27 -08:00
Kostya Kortchinsky f018246c20 [scudo][standalone] Enabled SCUDO_DEBUG for tests + fixes
Summary:
`SCUDO_DEBUG` was not enabled for unit tests, meaning the `DCHECK`s
were never tripped. While turning this on, I discovered that a few
of those not-exercised checks were actually wrong. This CL addresses
those incorrect checks.

Not that to work in tests `CHECK_IMPL` has to explicitely use the
`scudo` namespace. Also changes a C cast to a C++ cast.

Reviewers: hctim, pcc, cferris, eugenis, vitalybuka

Subscribers: mgorny, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D70276
2019-11-15 08:33:57 -08:00
Kostya Kortchinsky f7b1489ffc [scudo][standalone] Get statistics in a char buffer
Summary:
Following up on D68471, this CL introduces some `getStats` APIs to
gather statistics in char buffers (`ScopedString` really) instead of
printing them out right away. Ultimately `printStats` will just
output the buffer, but that allows us to potentially do some work
on the intermediate buffer, and can be used for a `mallocz` type
of functionality. This allows us to pretty much get rid of all the
`Printf` calls around, but I am keeping the function in for
debugging purposes.

This changes the existing tests to use the new APIs when required.

I will add new tests as suggested in D68471 in another CL.

Reviewers: morehouse, hctim, vitalybuka, eugenis, cferris

Reviewed By: morehouse

Subscribers: delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D68653

llvm-svn: 374173
2019-10-09 15:09:28 +00:00
Kostya Kortchinsky 161cca266a [scudo][standalone] Android related improvements
Summary:
This changes a few things to improve memory footprint and performances
on Android, and fixes a test compilation error:
- add `stdlib.h` to `wrappers_c_test.cc` to address
  https://bugs.llvm.org/show_bug.cgi?id=42810
- change Android size class maps, based on benchmarks, to improve
  performances and lower the Svelte memory footprint. Also change the
  32-bit region size for said configuration
- change the `reallocate` logic to reallocate in place for sizes larger
  than the original chunk size, when they still fit in the same block.
  This addresses patterns from `memory_replay` dumps like the following:
```
202: realloc 0xb48fd000 0xb4930650 12352
202: realloc 0xb48fd000 0xb48fd000 12420
202: realloc 0xb48fd000 0xb48fd000 12492
202: realloc 0xb48fd000 0xb48fd000 12564
202: realloc 0xb48fd000 0xb48fd000 12636
202: realloc 0xb48fd000 0xb48fd000 12708
202: realloc 0xb48fd000 0xb48fd000 12780
202: realloc 0xb48fd000 0xb48fd000 12852
202: realloc 0xb48fd000 0xb48fd000 12924
202: realloc 0xb48fd000 0xb48fd000 12996
202: realloc 0xb48fd000 0xb48fd000 13068
202: realloc 0xb48fd000 0xb48fd000 13140
202: realloc 0xb48fd000 0xb48fd000 13212
202: realloc 0xb48fd000 0xb48fd000 13284
202: realloc 0xb48fd000 0xb48fd000 13356
202: realloc 0xb48fd000 0xb48fd000 13428
202: realloc 0xb48fd000 0xb48fd000 13500
202: realloc 0xb48fd000 0xb48fd000 13572
202: realloc 0xb48fd000 0xb48fd000 13644
202: realloc 0xb48fd000 0xb48fd000 13716
202: realloc 0xb48fd000 0xb48fd000 13788
...
```
  In this situation we were deallocating the old chunk, and
  allocating a new one for every single one of those, but now we can
  keep the same chunk (we just updated the header), which saves some
  heap operations.

Reviewers: hctim, morehouse, vitalybuka, eugenis, cferris, rengolin

Reviewed By: morehouse

Subscribers: srhines, delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D67293

llvm-svn: 371628
2019-09-11 14:48:41 +00:00
Kostya Kortchinsky 419f1a4185 [scudo][standalone] Optimization pass
Summary:
This introduces a bunch of small optimizations with the purpose of
making the fastpath tighter:
- tag more conditions as `LIKELY`/`UNLIKELY`: as a rule of thumb we
  consider that every operation related to the secondary is unlikely
- attempt to reduce the number of potentially extraneous instructions
- reorganize the `Chunk` header to not straddle a word boundary and
  use more appropriate types

Note that some `LIKELY`/`UNLIKELY` impact might be less obvious as
they are in slow paths (for example in `secondary.cc`), but at this
point I am throwing a pretty wide net, and it's consistant and doesn't
hurt.

This was mosly done for the benfit of Android, but other platforms
benefit from it too. An aarch64 Android benchmark gives:
- before:
```
  BM_youtube/min_time:15.000/repeats:4/manual_time_mean              445244 us       659385 us            4
  BM_youtube/min_time:15.000/repeats:4/manual_time_median            445007 us       658970 us            4
  BM_youtube/min_time:15.000/repeats:4/manual_time_stddev               885 us         1332 us            4
```
- after:
```
  BM_youtube/min_time:15.000/repeats:4/manual_time_mean       415697 us       621925 us            4
  BM_youtube/min_time:15.000/repeats:4/manual_time_median     415913 us       622061 us            4
  BM_youtube/min_time:15.000/repeats:4/manual_time_stddev        990 us         1163 us            4
```

Additional since `-Werror=conversion` is enabled on some platforms we
are built on, enable it upstream to catch things early: a few sign
conversions had slept through and needed additional casting.

Reviewers: hctim, morehouse, eugenis, vitalybuka

Reviewed By: vitalybuka

Subscribers: srhines, mgorny, javed.absar, kristof.beyls, delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D64664

llvm-svn: 366918
2019-07-24 16:36:01 +00:00
Kostya Kortchinsky 21c31f5e7b [scudo][standalone] Add the memory reclaiming mechanism
Summary:
This CL implements the memory reclaiming function `releaseFreeMemoryToOS`
and its associated classes. Most of this code was originally written by
Aleksey for the Primary64 in sanitizer_common, and I made some changes to
be able to implement 32-bit reclaiming as well. The code has be restructured
a bit to accomodate for freelist of batches instead of the freearray used
in the current sanitizer_common code.

Reviewers: eugenis, vitalybuka, morehouse, hctim

Reviewed By: vitalybuka

Subscribers: srhines, mgorny, delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D61214

llvm-svn: 359567
2019-04-30 14:56:18 +00:00
Kostya Kortchinsky 3fad6a206f [scudo][standalone] Introduce the SizeClassMap
Summary:
As with the sanitizer_common allocator, the SCM allows for efficient
mapping between sizes and size-classes, table-free.

It doesn't depart significantly from the original, except that we
allow the use of size-class 0 for other purposes (as opposed to
chunks of size 0). The Primary will use it to hold TransferBatches.

Reviewers: vitalybuka, eugenis, hctim, morehouse

Reviewed By: vitalybuka

Subscribers: srhines, mgorny, delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D61088

llvm-svn: 359199
2019-04-25 15:49:34 +00:00