Commit Graph

22 Commits

Author SHA1 Message Date
Kostya Kortchinsky a45877eea8 [scudo] Get rid of initLinkerInitialized
Now that everything is forcibly linker initialized, it feels like a
good time to get rid of the `init`/`initLinkerInitialized` split.

This allows to get rid of various `memset` construct in `init` that
gcc complains about (this fixes a Fuchsia open issue).

I added various `DCHECK`s to ensure that we would get a zero-inited
object when entering `init`, which required ensuring that
`unmapTestOnly` leaves the object in a good state (tests are currently
the only location where an allocator can be "de-initialized").

Running the tests with `--gtest_repeat=` showed no issue.

Differential Revision: https://reviews.llvm.org/D103119
2021-05-26 09:53:40 -07:00
Christopher Ferris 6fac34251d [scudo] Add initialization for TSDRegistrySharedT
Fixes compilation on Android which has a TSDSharedRegistry object in the config.

Reviewed By: cryptoad, vitalybuka

Differential Revision: https://reviews.llvm.org/D101951
2021-05-05 19:00:54 -07:00
Peter Collingbourne 3f71ce8589 scudo: Support memory tagging in the secondary allocator.
This patch enhances the secondary allocator to be able to detect buffer
overflow, and (on hardware supporting memory tagging) use-after-free
and buffer underflow.

Use-after-free detection is implemented by setting memory page
protection to PROT_NONE on free. Because this must be done immediately
rather than after the memory has been quarantined, we no longer use the
combined allocator quarantine for secondary allocations. Instead, a
quarantine has been added to the secondary allocator cache.

Buffer overflow detection is implemented by aligning the allocation
to the right of the writable pages, so that any overflows will
spill into the guard page to the right of the allocation, which
will have PROT_NONE page protection. Because this would require the
secondary allocator to produce a header at the correct position,
the responsibility for ensuring chunk alignment has been moved to
the secondary allocator.

Buffer underflow detection has been implemented on hardware supporting
memory tagging by tagging the memory region between the start of the
mapping and the start of the allocation with a non-zero tag. Due to
the cost of pre-tagging secondary allocations and the memory bandwidth
cost of tagged accesses, the allocation itself uses a tag of 0 and
only the first four pages have memory tagging enabled.

This is a reland of commit 7a0da88943 which was reverted in commit
9678b07e42. This reland includes the following changes:

- Fix the calculation of BlockSize which led to incorrect statistics
  returned by mallinfo().
- Add -Wno-pedantic to silence GCC warning.
- Optionally add some slack at the end of secondary allocations to help
  work around buggy applications that read off the end of their
  allocation.

Differential Revision: https://reviews.llvm.org/D93731
2021-03-08 14:39:33 -08:00
Peter Collingbourne 9678b07e42 Revert 7a0da88943, "scudo: Support memory tagging in the secondary allocator."
We measured a 2.5 seconds (17.5%) regression in Android boot time
performance with this change.
2021-02-25 16:50:02 -08:00
Peter Collingbourne 7a0da88943 scudo: Support memory tagging in the secondary allocator.
This patch enhances the secondary allocator to be able to detect buffer
overflow, and (on hardware supporting memory tagging) use-after-free
and buffer underflow.

Use-after-free detection is implemented by setting memory page
protection to PROT_NONE on free. Because this must be done immediately
rather than after the memory has been quarantined, we no longer use the
combined allocator quarantine for secondary allocations. Instead, a
quarantine has been added to the secondary allocator cache.

Buffer overflow detection is implemented by aligning the allocation
to the right of the writable pages, so that any overflows will
spill into the guard page to the right of the allocation, which
will have PROT_NONE page protection. Because this would require the
secondary allocator to produce a header at the correct position,
the responsibility for ensuring chunk alignment has been moved to
the secondary allocator.

Buffer underflow detection has been implemented on hardware supporting
memory tagging by tagging the memory region between the start of the
mapping and the start of the allocation with a non-zero tag. Due to
the cost of pre-tagging secondary allocations and the memory bandwidth
cost of tagged accesses, the allocation itself uses a tag of 0 and
only the first four pages have memory tagging enabled.

Differential Revision: https://reviews.llvm.org/D93731
2021-02-22 14:35:39 -08:00
Kostya Kortchinsky 7b51961cd0 [scudo][standalone] Remove the pthread key from the shared TSD
https://reviews.llvm.org/D87420 removed the uses of the pthread key,
but the key itself was left in the shared TSD registry. It is created
on registry initialization, and destroyed on registry teardown.

There is really no use for it now, so we can just remove it.

Differential Revision: https://reviews.llvm.org/D88046
2020-09-22 08:25:27 -07:00
Peter Collingbourne 7bd75b6301 scudo: Add an API for disabling memory initialization per-thread.
Here "memory initialization" refers to zero- or pattern-init on
non-MTE hardware, or (where possible to avoid) memory tagging on MTE
hardware. With shared TSD the per-thread memory initialization state
is stored in bit 0 of the TLS slot, similar to PointerIntPair in LLVM.

Differential Revision: https://reviews.llvm.org/D87739
2020-09-18 12:04:27 -07:00
Peter Collingbourne 84c2c4977d scudo: Introduce a new mechanism to let Scudo access a platform-specific TLS slot
An upcoming change to Scudo will change how we use the TLS slot
in tsd_shared.h, which will be a little easier to deal with if
we can remove the code path that calls pthread_getspecific and
pthread_setspecific. The only known user of this code path is Fuchsia.

We can't eliminate this code path by making Fuchsia use ELF TLS
because although Fuchsia supports ELF TLS, it is not supported within
libc itself. To address this, Roland McGrath on the Fuchsia team has
proposed that Scudo will optionally call a platform-provided function
to access a TLS slot reserved for Scudo. Android also has a reserved
TLS slot, but the code that accesses the TLS slot lives in Scudo.

We can eliminate some complexity and duplicated code by having Android
use the same mechanism that was proposed for Fuchsia, which is what
this change does. A separate change to Android implements it.

Differential Revision: https://reviews.llvm.org/D87420
2020-09-10 19:14:28 -07:00
Kostya Kortchinsky 6f00f3b56e [scudo][standalone] mallopt runtime configuration options
Summary:
Partners have requested the ability to configure more parts of Scudo
at runtime, notably the Secondary cache options (maximum number of
blocks cached, maximum size) as well as the TSD registry options
(the maximum number of TSDs in use).

This CL adds a few more Scudo specific `mallopt` parameters that are
passed down to the various subcomponents of the Combined allocator.

- `M_CACHE_COUNT_MAX`: sets the maximum number of Secondary cached items
- `M_CACHE_SIZE_MAX`: sets the maximum size of a cacheable item in the Secondary
- `M_TSDS_COUNT_MAX`: sets the maximum number of TSDs that can be used (Shared Registry only)

Regarding the TSDs maximum count, this is a one way option, only
allowing to increase the count.

In order to allow for this, I rearranged the code to have some `setOption`
member function to the relevant classes, using the `scudo::Option` class
enum to determine what is to be set.

This also fixes an issue where a static variable (`Ready`) was used in
templated functions without being set back to `false` every time.

Reviewers: pcc, eugenis, hctim, cferris

Subscribers: jfb, llvm-commits, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D84667
2020-07-28 11:57:54 -07:00
Kostya Kortchinsky fc69967a4b [scudo][standalone] Shift some data from dynamic to static
Summary:
Most of our larger data is dynamically allocated (via `map`) but it
became an hindrance with regard to init time, for a cost to benefit
ratio that is not great. So change the `TSD`s, `RegionInfo`, `ByteMap`
to be static.

Additionally, for reclaiming, we used mapped & unmapped a buffer each
time, which is costly. It turns out that we can have a static buffer,
and that there isn't much contention on it.

One of the other things changed here, is that we hard set the number
of TSDs on Android to the maximum number, as there could be a
situation where cores are put to sleep and we could miss some.

Subscribers: mgorny, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D74696
2020-02-18 09:38:50 -08:00
Peter Collingbourne 515e19ae7b Fix errors/warnings in scudo build. 2020-02-11 08:37:37 -08:00
Kostya Kortchinsky 561fa84477 [scudo][standalone] Allow sched_getaffinity to fail
Summary:
In some configuration, `sched_getaffinity` can fail. Some reasons for
that being the lack of `CAP_SYS_NICE` capability or some syscall
filtering and so on.

This should not be fatal to the allocator, so in this situation, we
will fallback to the `MaxTSDCount` value specified in the allocator
configuration.

Reviewers: cferris, eugenis, hctim, morehouse, pcc

Subscribers: #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D73055
2020-01-21 11:18:18 -08:00
Kostya Kortchinsky 9ef6faf496 [scudo][standalone] Fork support
Summary:
fork() wasn't well (or at all) supported in Scudo. This materialized
in deadlocks in children.

In order to properly support fork, we will lock the allocator pre-fork
and unlock it post-fork in parent and child. This is done via a
`pthread_atfork` call installing the necessary handlers.

A couple of things suck here: this function allocates - so this has to
be done post initialization as our init path is not reentrance, and it
doesn't allow for an extra pointer - so we can't pass the allocator we
are currently working with.

In order to work around this, I added a post-init template parameter
that gets executed once the allocator is initialized for the current
thread. Its job for the C wrappers is to install the atfork handlers.

I reorganized a bit the impacted area and added some tests, courtesy
of cferris@ that were deadlocking prior to this fix.

Subscribers: jfb, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D72470
2020-01-14 07:51:48 -08:00
Kostya Kortchinsky 77e906ac78 [scudo][standalone] Implement TSD registry disabling
Summary:
In order to implement `malloc_{enable|disable}` we were just disabling
(or really locking) the Primary and the Secondary. That meant that
allocations could still be serviced from the TSD as long as the cache
wouldn't have to be filled from the Primary.

This wasn't working out for Android tests, so this change implements
registry disabling (eg: locking) so that `getTSDAndLock` doesn't
return a TSD if the allocator is disabled. This also means that the
Primary doesn't have to be disabled in this situation.

For the Shared Registry, we loop through all the TSDs and lock them.
For the Exclusive Registry, we add a `Disabled` boolean to the Registry
that forces `getTSDAndLock` to use the Fallback TSD instead of the
thread local one. Disabling the Registry is then done by locking the
Fallback TSD and setting the boolean in question (I don't think this
needed an atomic variable but I might be wrong).

I clang-formatted the whole thing as usual hence the couple of extra
whiteline changes in this CL.

Reviewers: cferris, pcc, hctim, morehouse, eugenis

Subscribers: jfb, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D71719
2019-12-20 06:52:13 -08:00
Peter Collingbourne 29f0a65671 scudo: Add a basic malloc/free benchmark.
Differential Revision: https://reviews.llvm.org/D71104
2019-12-09 10:10:19 -08:00
Peter Collingbourne f30fe16d49 scudo: Call setCurrentTSD(nullptr) when bringing down the TSD registry in tests.
Otherwise, we will hit a use-after-free when testing multiple instances of
the same allocator on the same thread. This only recently became a problem
with D70552 which caused us to run both ScudoCombinedTest.BasicCombined and
ScudoCombinedTest.ReleaseToOS on the unit tests' main thread.

Differential Revision: https://reviews.llvm.org/D70760
2019-11-27 09:55:14 -08:00
Peter Collingbourne f751a79173 scudo: Only use the Android reserved TLS slot when building libc's copy of the allocator.
When we're not building libc's allocator, just use a regular TLS variable. This
lets the unit tests pass on Android devices whose libc uses Scudo. Otherwise
libc's copy of Scudo and the unit tests' copy will both try to use the same
TLS slot, in likely incompatible ways.

This requires using ELF TLS, so start passing -fno-emulated-tls when building
the library and the unit tests on Android.

Differential Revision: https://reviews.llvm.org/D70472
2019-11-20 11:30:58 -08:00
Kostya Kortchinsky 419f1a4185 [scudo][standalone] Optimization pass
Summary:
This introduces a bunch of small optimizations with the purpose of
making the fastpath tighter:
- tag more conditions as `LIKELY`/`UNLIKELY`: as a rule of thumb we
  consider that every operation related to the secondary is unlikely
- attempt to reduce the number of potentially extraneous instructions
- reorganize the `Chunk` header to not straddle a word boundary and
  use more appropriate types

Note that some `LIKELY`/`UNLIKELY` impact might be less obvious as
they are in slow paths (for example in `secondary.cc`), but at this
point I am throwing a pretty wide net, and it's consistant and doesn't
hurt.

This was mosly done for the benfit of Android, but other platforms
benefit from it too. An aarch64 Android benchmark gives:
- before:
```
  BM_youtube/min_time:15.000/repeats:4/manual_time_mean              445244 us       659385 us            4
  BM_youtube/min_time:15.000/repeats:4/manual_time_median            445007 us       658970 us            4
  BM_youtube/min_time:15.000/repeats:4/manual_time_stddev               885 us         1332 us            4
```
- after:
```
  BM_youtube/min_time:15.000/repeats:4/manual_time_mean       415697 us       621925 us            4
  BM_youtube/min_time:15.000/repeats:4/manual_time_median     415913 us       622061 us            4
  BM_youtube/min_time:15.000/repeats:4/manual_time_stddev        990 us         1163 us            4
```

Additional since `-Werror=conversion` is enabled on some platforms we
are built on, enable it upstream to catch things early: a few sign
conversions had slept through and needed additional casting.

Reviewers: hctim, morehouse, eugenis, vitalybuka

Reviewed By: vitalybuka

Subscribers: srhines, mgorny, javed.absar, kristof.beyls, delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D64664

llvm-svn: 366918
2019-07-24 16:36:01 +00:00
Kostya Kortchinsky aeb3826228 [scudo][standalone] Merge Spin & Blocking mutex into a Hybrid one
Summary:
We ran into a problem on Fuchsia where yielding threads would never
be deboosted, ultimately resulting in several threads spinning on the
same TSD, and no possibility for another thread to be scheduled,
dead-locking the process.

While this was fixed in Zircon, this lead to discussions about if
spinning without a break condition was a good decision, and settled on
a new hybrid model that would spin for a while then block.

Currently we are using a number of iterations for spinning that is
mostly arbitrary (based on sanitizer_common values), but this can
be tuned in the future.

Since we are touching `common.h`, we also use this change as a vehicle
for an Android optimization (the page size is fixed in Bionic, so use
a fixed value too).

Reviewers: morehouse, hctim, eugenis, dvyukov, vitalybuka

Reviewed By: hctim

Subscribers: srhines, delcypher, jfb, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D64358

llvm-svn: 365790
2019-07-11 15:32:26 +00:00
Kostya Kortchinsky 3ad32a037e [scudo] Correct a behavior on the shared TSD registry
Summary:
There is an error in the shared TSD registry logic when looking for a
TSD in the slow path. There is an unlikely event when a TSD's precedence
was 0 after attempting a `tryLock` which indicated that it was grabbed
by another thread in between. We dealt with that case by continuing to
the next iteration, but that meant that the `Index` was not increased
and we ended up trying to lock the same TSD.
This would manifest in heavy contention, and in the end we would still
lock a TSD, but that was a wasted iteration.
So, do not `continue`, just skip the TSD as a potential candidate.

This is in both the standalone & non-standalone versions.

Reviewers: morehouse, eugenis, vitalybuka, hctim

Reviewed By: morehouse

Subscribers: delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D63783

llvm-svn: 364345
2019-06-25 19:58:11 +00:00
Kostya Kortchinsky 624a24e156 [scudo][standalone] Unmap memory in tests
Summary:
The more tests are added, the more we are limited by the size of the
address space on 32-bit. Implement `unmapTestOnly` all around (like it
is in sanitzer_common) to be able to free up some memory.
This is not intended to be a proper "destructor" for an allocator, but
allows us to not fail due to having no memory left.

Reviewers: morehouse, vitalybuka, eugenis, hctim

Reviewed By: morehouse

Subscribers: delcypher, jfb, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D63146

llvm-svn: 363095
2019-06-11 19:50:12 +00:00
Kostya Kortchinsky 52bfd673d1 [scudo][standalone] Introduce the thread specific data structures
Summary:
This CL adds the structures dealing with thread specific data for the
allocator. This includes the thread specific data structure itself and
two registries for said structures: an exclusive one, where each thread
will have its own TSD struct, and a shared one, where a pool of TSD
structs will be shared by all threads, with dynamic reassignment at
runtime based on contention.

This departs from the current Scudo implementation: we intend to make
the Registry a template parameter of the allocator (as opposed to a
single global entity), allowing various allocators to coexist with
different TSD registry models. As a result, TSD registry and Allocator
are tightly coupled.

This also corrects a couple of things in other files that I noticed
while adding this.

Reviewers: eugenis, vitalybuka, morehouse, hctim

Reviewed By: morehouse

Subscribers: srhines, mgorny, delcypher, jfb, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D62258

llvm-svn: 362962
2019-06-10 16:50:52 +00:00