In LAM model X86_64 will use bits 57-62 (of 0-63) as HWASAN tag.
So here we make sure the tag shift position and tag mask is correct for x86-64.
Differential Revision: https://reviews.llvm.org/D102472
Userspace page aliasing allows us to use middle pointer bits for tags
without untagging them before syscalls or accesses. This should enable
easier experimentation with HWASan on x86_64 platforms.
Currently stack, global, and secondary heap tagging are unsupported.
Only primary heap allocations get tagged.
Note that aliasing mode will not work properly in the presence of
fork(), since heap memory will be shared between the parent and child
processes. This mode is non-ideal; we expect Intel LAM to enable full
HWASan support on x86_64 in the future.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D98875
Userspace page aliasing allows us to use middle pointer bits for tags
without untagging them before syscalls or accesses. This should enable
easier experimentation with HWASan on x86_64 platforms.
Currently stack, global, and secondary heap tagging are unsupported.
Only primary heap allocations get tagged.
Note that aliasing mode will not work properly in the presence of
fork(), since heap memory will be shared between the parent and child
processes. This mode is non-ideal; we expect Intel LAM to enable full
HWASan support on x86_64 in the future.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D98875
Subsequent patches will implement page-aliasing mode for x86_64, which
will initially only work for the primary heap allocator. We force
callback instrumentation to simplify the initial aliasing
implementation.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D98069
Fixed section of code that iterated through a SmallDenseMap and added
instructions in each iteration, causing non-deterministic code; replaced
SmallDenseMap with MapVector to prevent non-determinism.
This reverts commit 01ac6d1587.
This caused non-deterministic compiler output; see comment on the
code review.
> This patch updates the various IR passes to correctly handle dbg.values with a
> DIArgList location. This patch does not actually allow DIArgLists to be produced
> by salvageDebugInfo, and it does not affect any pass after codegen-prepare.
> Other than that, it should cover every IR pass.
>
> Most of the changes simply extend code that operated on a single debug value to
> operate on the list of debug values in the style of any_of, all_of, for_each,
> etc. Instances of setOperand(0, ...) have been replaced with with
> replaceVariableLocationOp, which takes the value that is being replaced as an
> additional argument. In places where this value isn't readily available, we have
> to track the old value through to the point where it gets replaced.
>
> Differential Revision: https://reviews.llvm.org/D88232
This reverts commit df69c69427.
This patch updates the various IR passes to correctly handle dbg.values with a
DIArgList location. This patch does not actually allow DIArgLists to be produced
by salvageDebugInfo, and it does not affect any pass after codegen-prepare.
Other than that, it should cover every IR pass.
Most of the changes simply extend code that operated on a single debug value to
operate on the list of debug values in the style of any_of, all_of, for_each,
etc. Instances of setOperand(0, ...) have been replaced with with
replaceVariableLocationOp, which takes the value that is being replaced as an
additional argument. In places where this value isn't readily available, we have
to track the old value through to the point where it gets replaced.
Differential Revision: https://reviews.llvm.org/D88232
This patch updates DbgVariableIntrinsics to support use of a DIArgList for the
location operand, resulting in a significant change to its interface. This patch
does not update all IR passes to support multiple location operands in a
dbg.value; the only change is to update the DbgVariableIntrinsic interface and
its uses. All code outside of the intrinsic classes assumes that an intrinsic
will always have exactly one location operand; they will still support
DIArgLists, but only if they contain exactly one Value.
Among other changes, the setOperand and setArgOperand functions in
DbgVariableIntrinsic have been made private. This is to prevent code from
setting the operands of these intrinsics directly, which could easily result in
incorrect/invalid operands being set. This does not prevent these functions from
being called on a debug intrinsic at all, as they can still be called on any
CallInst pointer; it is assumed that any code directly setting the operands on a
generic call instruction is doing so safely. The intention for making these
functions private is to prevent DIArgLists from being overwritten by code that's
naively trying to replace one of the Values it points to, and also to fail fast
if a DbgVariableIntrinsic is updated to use a DIArgList without a valid
corresponding DIExpression.
This is not nice, but it's the best transient solution possible,
and is better than just duplicating the whole function.
The problem is, this function is widely used,
and it is not at all obvious that all the users
could be painlessly switched to operate on DomTreeUpdater,
and somehow i don't feel like porting all those users first.
This function is one of last three that not operate on DomTreeUpdater.
The function was introduced on Jan 23, 2019 in commit
73078ecd38.
Its definition was removed on Oct 27, 2020 in commit
0930763b4b, leaving the declaration
unused.
Add support for match-all tags and GOT-free runtime calls, which
are both required for the kernel to be able to support outlined
checks. This requires extending the access info to let the backend
know when to enable these features. To make the code easier to maintain
introduce an enum with the bit field positions for the access info.
Allow outlined checks to be enabled with -mllvm
-hwasan-inline-all-checks=0. Kernels that contain runtime support for
outlined checks may pass this flag. Kernels lacking runtime support
will continue to link because they do not pass the flag. Old versions
of LLVM will ignore the flag and continue to use inline checks.
With a separate kernel patch [1] I measured the code size of defconfig
+ tag-based KASAN, as well as boot time (i.e. time to init launch)
on a DragonBoard 845c with an Android arm64 GKI kernel. The results
are below:
code size boot time
before 92824064 6.18s
after 38822400 6.65s
[1] https://linux-review.googlesource.com/id/I1a30036c70ab3c3ee78d75ed9b87ef7cdc3fdb76
Depends on D90425
Differential Revision: https://reviews.llvm.org/D90426
This is a workaround for poor heuristics in the backend where we can
end up materializing the constant multiple times. This is particularly
bad when using outlined checks because we materialize it for every call
(because the backend considers it trivial to materialize).
As a result the field containing the shadow base value will always
be set so simplify the code taking that into account.
Differential Revision: https://reviews.llvm.org/D90425
This lets us support the scenario where a binary is linked from a mix
of object files with both instrumented and non-instrumented globals.
This is likely to occur on Android where the decision of whether to use
instrumented globals is based on the API level, which is user-facing.
Previously, in this scenario, it was possible for the comdat from
one of the object files with non-instrumented globals to be selected,
and since this comdat did not contain the note it would mean that the
note would be missing in the linked binary and the globals' shadow
memory would be left uninitialized, leading to a tag mismatch failure
at runtime when accessing one of the instrumented globals.
It is harmless to include the note when targeting a runtime that does
not support instrumenting globals because it will just be ignored.
Differential Revision: https://reviews.llvm.org/D85871
Along the lines of D77454 and D79968. Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.
Differential Revision: https://reviews.llvm.org/D80044
Summary: This was preventing MemorySanitizerLegacyPass from appearing in --print-after-all.
Reviewers: vitalybuka
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79661
Summary:
Refactor getInterestingMemoryOperands() so that information about the
pointer operand is returned through an array of structures instead of
passing each piece of information separately by-value.
This is in preparation for returning information about multiple pointer
operands from a single instruction.
A side effect is that, instead of repeatedly generating the same
information through isInterestingMemoryAccess(), it is now simply collected
once and then passed around; that's probably more efficient.
HWAddressSanitizer has a bunch of copypasted code from AddressSanitizer,
so these changes have to be duplicated.
This is patch 3/4 of a patch series:
https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling
https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling
https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction
https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments
[glider: renamed llvm::InterestingMemoryOperand::Type to OpType to fix
GCC compilation]
Reviewers: kcc, glider
Reviewed By: glider
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77618
Summary:
In both AddressSanitizer and HWAddressSanitizer, we first collect
instructions whose operands should be instrumented and memory intrinsics,
then instrument them. Both during collection and when inserting
instrumentation, they are handled separately.
Collect them separately and instrument them separately. This is a bit
more straightforward, and prepares for collecting operands instead of
instructions in a future patch.
This is patch 2/4 of a patch series:
https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling
https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling
https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction
https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments
Reviewers: kcc, glider
Reviewed By: glider
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77617
I couldn't make arc land the changes properly, for some reason they all got
squashed. Reverting them now to land cleanly.
Summary: This reverts commit cfb5f89b62.
Reviewers: kcc, thejh
Subscribers:
Summary:
A following commit will split the loop over ToInstrument into two.
To avoid having to duplicate the condition for suppressing instrumentation
sites based on ClDebug{Min,Max}, refactor it out into a new function.
While we're at it, we can also avoid the indirection through
NumInstrumented for setting FunctionModified.
This is patch 1/4 of a patch series:
https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling
https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling
https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction
https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments
Reviewers: kcc, glider
Reviewed By: glider
Subscribers: jfb, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77616
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.
Reviewers: xbolva00, courbet, bollu
Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73099
As of D70146 lld GCs comdats as a group and no longer considers notes in
comdats to be GC roots, so we need to move the note to a comdat with a GC root
section (.init_array) in order to prevent lld from discarding the note.
Differential Revision: https://reviews.llvm.org/D72936
Summary:
Support alloca-referencing dbg.value in hwasan instrumentation.
Update AsmPrinter to emit DW_AT_LLVM_tag_offset when location is in
loclist format.
Reviewers: pcc
Subscribers: srhines, aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70753
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
This was an experiment made possible by a non-standard feature of the
Android dynamic loader.
It required introducing a flag to tell the compiler which ABI was being
targeted.
This flag is no longer needed, since the generated code now works for
both ABI's.
We leave that flag untouched for backwards compatibility. This also
means that if we need to distinguish between targeted ABI's again
we can do that without disturbing any existing workflows.
We leave a comment in the source code and mention in the help text to
explain this for any confused person reading the code in the future.
Patch by Matthew Malcomson
Differential Revision: https://reviews.llvm.org/D69574
We can't use short granules with stack instrumentation when targeting older
API levels because the rest of the system won't understand the short granule
tags stored in shadow memory.
Moreover, we need to be able to let old binaries (which won't understand
short granule tags) run on a new system that supports short granule
tags. Such binaries will call the __hwasan_tag_mismatch function when their
outlined checks fail. We can compensate for the binary's lack of support
for short granules by implementing the short granule part of the check in
the __hwasan_tag_mismatch function. Unfortunately we can't do anything about
inline checks, but I don't believe that we can generate these by default on
aarch64, nor did we do so when the ABI was fixed.
A new function, __hwasan_tag_mismatch_v2, is introduced that lets code
targeting the new runtime avoid redoing the short granule check. Because tag
mismatches are rare this isn't important from a performance perspective; the
main benefit is that it introduces a symbol dependency that prevents binaries
targeting the new runtime from running on older (i.e. incompatible) runtimes.
Differential Revision: https://reviews.llvm.org/D68059
llvm-svn: 373035
Try harder to emulate "old runtime" in the test.
To get the old behavior with the new runtime library, we need both
disable personality function wrapping and enable landing pad
instrumentation.
llvm-svn: 369977
One problem with untagging memory in landing pads is that it only works
correctly if the function that catches the exception is instrumented.
If the function is uninstrumented, we have no opportunity to untag the
memory.
To address this, replace landing pad instrumentation with personality function
wrapping. Each function with an instrumented stack has its personality function
replaced with a wrapper provided by the runtime. Functions that did not have
a personality function to begin with also get wrappers if they may be unwound
past. As the unwinder calls personality functions during stack unwinding,
the original personality function is called and the function's stack frame is
untagged by the wrapper if the personality function instructs the unwinder
to keep unwinding. If unwinding stops at a landing pad, the function is
still responsible for untagging its stack frame if it resumes unwinding.
The old landing pad mechanism is preserved for compatibility with old runtimes.
Differential Revision: https://reviews.llvm.org/D66377
llvm-svn: 369721
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
Globals are instrumented by adding a pointer tag to their symbol values
and emitting metadata into a special section that allows the runtime to tag
their memory when the library is loaded.
Due to order of initialization issues explained in more detail in the comments,
shadow initialization cannot happen during regular global initialization.
Instead, the location of the global section is marked using an ELF note,
and we require libc support for calling a function provided by the HWASAN
runtime when libraries are loaded and unloaded.
Based on ideas discussed with @evgeny777 in D56672.
Differential Revision: https://reviews.llvm.org/D65770
llvm-svn: 368102
Summary:
While there is always a `Value::replaceAllUsesWith()`,
sometimes the replacement needs to be conditional.
I have only cleaned a few cases where `replaceUsesWithIf()`
could be used, to both add test coverage,
and show that it is actually useful.
Reviewers: jdoerfert, spatel, RKSimon, craig.topper
Reviewed By: jdoerfert
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, george.burgess.iv, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65528
llvm-svn: 367548
This will let us instrument globals during initialization. This required
making the new PM pass a module pass, which should still provide access to
analyses via the ModuleAnalysisManager.
Differential Revision: https://reviews.llvm.org/D64843
llvm-svn: 366379
A short granule is a granule of size between 1 and `TG-1` bytes. The size
of a short granule is stored at the location in shadow memory where the
granule's tag is normally stored, while the granule's actual tag is stored
in the last byte of the granule. This means that in order to verify that a
pointer tag matches a memory tag, HWASAN must check for two possibilities:
* the pointer tag is equal to the memory tag in shadow memory, or
* the shadow memory tag is actually a short granule size, the value being loaded
is in bounds of the granule and the pointer tag is equal to the last byte of
the granule.
Pointer tags between 1 to `TG-1` are possible and are as likely as any other
tag. This means that these tags in memory have two interpretations: the full
tag interpretation (where the pointer tag is between 1 and `TG-1` and the
last byte of the granule is ordinary data) and the short tag interpretation
(where the pointer tag is stored in the granule).
When HWASAN detects an error near a memory tag between 1 and `TG-1`, it
will show both the memory tag and the last byte of the granule. Currently,
it is up to the user to disambiguate the two possibilities.
Because this functionality obsoletes the right aligned heap feature of
the HWASAN memory allocator (and because we can no longer easily test
it), the feature is removed.
Also update the documentation to cover both short granule tags and
outlined checks.
Differential Revision: https://reviews.llvm.org/D63908
llvm-svn: 365551
This shaves an instruction (and a GOT entry in PIC code) off prologues of
functions with stack variables.
Differential Revision: https://reviews.llvm.org/D63472
llvm-svn: 364608
This saves roughly 32 bytes of instructions per function with stack objects
and causes us to preserve enough information that we can recover the original
tags of all stack variables.
Now that stack tags are deterministic, we no longer need to pass
-hwasan-generate-tags-with-calls during check-hwasan. This also means that
the new stack tag generation mechanism is exercised by check-hwasan.
Differential Revision: https://reviews.llvm.org/D63360
llvm-svn: 363636
The goal is to improve hwasan's error reporting for stack use-after-return by
recording enough information to allow the specific variable that was accessed
to be identified based on the pointer's tag. Currently we record the PC and
lower bits of SP for each stack frame we create (which will eventually be
enough to derive the base tag used by the stack frame) but that's not enough
to determine the specific tag for each variable, which is the stack frame's
base tag XOR a value (the "tag offset") that is unique for each variable in
a function.
In IR, the tag offset is most naturally represented as part of a location
expression on the llvm.dbg.declare instruction. However, the presence of the
tag offset in the variable's actual location expression is likely to confuse
debuggers which won't know about tag offsets, and moreover the tag offset
is not required for a debugger to determine the location of the variable on
the stack, so at the DWARF level it is represented as an attribute so that
it will be ignored by debuggers that don't know about it.
Differential Revision: https://reviews.llvm.org/D63119
llvm-svn: 363635
Summary:
Adds a call to __hwasan_handle_vfork(SP) at each landingpad entry.
Reusing __hwasan_handle_vfork instead of introducing a new runtime call
in order to be ABI-compatible with old runtime library.
Reviewers: pcc
Subscribers: kubamracek, hiraditya, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D61968
llvm-svn: 360959
Port hardware assisted address sanitizer to new PM following the same guidelines as msan and tsan.
Changes:
- Separate HWAddressSanitizer into a pass class and a sanitizer class.
- Create new PM wrapper pass for the sanitizer class.
- Use the getOrINsert pattern for some module level initialization declarations.
- Also enable kernel-kwasan in new PM
- Update llvm tests and add clang test.
Differential Revision: https://reviews.llvm.org/D61709
llvm-svn: 360707
It's been on in Android for a while without causing problems, so it's time
to make it the default and remove the flag.
Differential Revision: https://reviews.llvm.org/D60355
llvm-svn: 357960
This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57173
llvm-svn: 352913
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57172
llvm-svn: 352911
Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc
doesn't choke on it, hopefully.
Original Message:
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.
Then:
- update the CallInst/InvokeInst instruction creation functions to
take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.
One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.
However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)
Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.
Differential Revision: https://reviews.llvm.org/D57315
llvm-svn: 352827
This reverts commit f47d6b38c7 (r352791).
Seems to run into compilation failures with GCC (but not clang, where
I tested it). Reverting while I investigate.
llvm-svn: 352800
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.
Then:
- update the CallInst/InvokeInst instruction creation functions to
take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.
One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.
However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)
Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.
Differential Revision: https://reviews.llvm.org/D57315
llvm-svn: 352791
Otherwise they are treated as dynamic allocas, which ends up increasing
code size significantly. This reduces size of Chromium base_unittests
by 2MB (6.7%).
Differential Revision: https://reviews.llvm.org/D57205
llvm-svn: 352152
This saves a cbz+cold call in the interceptor ABI, as well as a realign
in both ABIs, trading off a dcache entry against some branch predictor
entries and some code size.
Unfortunately the functionality is hidden behind a flag because ifunc is
known to be broken on static binaries on Android.
Differential Revision: https://reviews.llvm.org/D57084
llvm-svn: 351989
Each hwasan check requires emitting a small piece of code like this:
https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#memory-accesses
The problem with this is that these code blocks typically bloat code
size significantly.
An obvious solution is to outline these blocks of code. In fact, this
has already been implemented under the -hwasan-instrument-with-calls
flag. However, as currently implemented this has a number of problems:
- The functions use the same calling convention as regular C functions.
This means that the backend must spill all temporary registers as
required by the platform's C calling convention, even though the
check only needs two registers on the hot path.
- The functions take the address to be checked in a fixed register,
which increases register pressure.
Both of these factors can diminish the code size effect and increase
the performance hit of -hwasan-instrument-with-calls.
The solution that this patch implements is to involve the aarch64
backend in outlining the checks. An intrinsic and pseudo-instruction
are created to represent a hwasan check. The pseudo-instruction
is register allocated like any other instruction, and we allow the
register allocator to select almost any register for the address to
check. A particular combination of (register selection, type of check)
triggers the creation in the backend of a function to handle the check
for specifically that pair. The resulting functions are deduplicated by
the linker. The pseudo-instruction (really the function) is specified
to preserve all registers except for the registers that the AAPCS
specifies may be clobbered by a call.
To measure the code size and performance effect of this change, I
took a number of measurements using Chromium for Android on aarch64,
comparing a browser with inlined checks (the baseline) against a
browser with outlined checks.
Code size: Size of .text decreases from 243897420 to 171619972 bytes,
or a 30% decrease.
Performance: Using Chromium's blink_perf.layout microbenchmarks I
measured a median performance regression of 6.24%.
The fact that a perf/size tradeoff is evident here suggests that
we might want to make the new behaviour conditional on -Os/-Oz.
But for now I've enabled it unconditionally, my reasoning being that
hwasan users typically expect a relatively large perf hit, and ~6%
isn't really adding much. We may want to revisit this decision in
the future, though.
I also tried experimenting with varying the number of registers
selectable by the hwasan check pseudo-instruction (which would result
in fewer variants being created), on the hypothesis that creating
fewer variants of the function would expose another perf/size tradeoff
by reducing icache pressure from the check functions at the cost of
register pressure. Although I did observe a code size increase with
fewer registers, I did not observe a strong correlation between the
number of registers and the performance of the resulting browser on the
microbenchmarks, so I conclude that we might as well use ~all registers
to get the maximum code size improvement. My results are below:
Regs | .text size | Perf hit
-----+------------+---------
~all | 171619972 | 6.24%
16 | 171765192 | 7.03%
8 | 172917788 | 5.82%
4 | 177054016 | 6.89%
Differential Revision: https://reviews.llvm.org/D56954
llvm-svn: 351920
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
The problem is similar to D55986 but for threads: a process with the
interceptor hwasan library loaded might have some threads started by
instrumented libraries and some by uninstrumented libraries, and we
need to be able to run instrumented code on the latter.
The solution is to perform per-thread initialization lazily. If a
function needs to access shadow memory or add itself to the per-thread
ring buffer its prologue checks to see whether the value in the
sanitizer TLS slot is null, and if so it calls __hwasan_thread_enter
and reloads from the TLS slot. The runtime does the same thing if it
needs to access this data structure.
This change means that the code generator needs to know whether we
are targeting the interceptor runtime, since we don't want to pay
the cost of lazy initialization when targeting a platform with native
hwasan support. A flag -fsanitize-hwaddress-abi={interceptor,platform}
has been introduced for selecting the runtime ABI to target. The
default ABI is set to interceptor since it's assumed that it will
be more common that users will be compiling application code than
platform code.
Because we can no longer assume that the TLS slot is initialized,
the pthread_create interceptor is no longer necessary, so it has
been removed.
Ideally, lazy initialization should only cost one instruction in the
hot path, but at present the call may cause us to spill arguments
to the stack, which means more instructions in the hot path (or
theoretically in the cold path if the spills are moved with shrink
wrapping). With an appropriately chosen calling convention for
the per-thread initialization function (TODO) the hot path should
always need just one instruction and the cold path should need two
instructions with no spilling required.
Differential Revision: https://reviews.llvm.org/D56038
llvm-svn: 350429
Summary:
The change is needed to support ELF TLS in Android. See D55581 for the
same change in compiler-rt.
Reviewers: srhines, eugenis
Reviewed By: eugenis
Subscribers: srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D55592
llvm-svn: 348983
Summary:
At compile-time, create an array of {PC,HumanReadableStackFrameDescription}
for every function that has an instrumented frame, and pass this array
to the run-time at the module-init time.
Similar to how we handle pc-table in SanitizerCoverage.
The run-time is dummy, will add the actual logic in later commits.
Reviewers: morehouse, eugenis
Reviewed By: eugenis
Subscribers: srhines, llvm-commits, kubamracek
Differential Revision: https://reviews.llvm.org/D53227
llvm-svn: 344985
This requires updating a number of .cpp files to adapt to the new API.
I've just systematically updated all uses of `TerminatorInst` within
these files te `Instruction` so thta I won't have to touch them again in
the future.
llvm-svn: 344498
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.
The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.
Developed in collaboration with Kostya Serebryany.
Reviewers: kcc
Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D52249
llvm-svn: 342923
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.
The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.
Developed in collaboration with Kostya Serebryany.
Reviewers: kcc
Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D52249
llvm-svn: 342921
Summary: Similar to asan's flag, it can be used to disable the use of ifunc to access hwasan shadow address.
Reviewers: vitalybuka, kcc
Subscribers: srhines, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D50544
llvm-svn: 339447
Summary:
Retagging allocas before returning from the function might help
detecting use after return bugs, but it does not work at all in real
life, when instrumented and non-instrumented code is intermixed.
Consider the following code:
F_non_instrumented() {
T x;
F1_instrumented(&x);
...
}
{
F_instrumented();
F_non_instrumented();
}
- F_instrumented call leaves the stack below the current sp tagged
randomly for UAR detection
- F_non_instrumented allocates its own vars on that tagged stack,
not generating any tags, that is the address of x has tag 0, but the
shadow memory still contains tags left behind by F_instrumented on the
previous step
- F1_instrumented verifies &x before using it and traps on tag mismatch,
0 vs whatever tag was set by F_instrumented
Reviewers: eugenis
Subscribers: srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D48664
llvm-svn: 336011
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
Summary:
Support the dynamic shadow memory offset (the default case for user
space now) and static non-zero shadow memory offset
(-hwasan-mapping-offset option). Keeping the the latter case around
for functionality and performance comparison tests (and mostly for
-hwasan-mapping-offset=0 case).
The implementation is stripped down ASan one, picking only the relevant
parts in the following assumptions: shadow scale is fixed, the shadow
memory is dynamic, it is accessed via ifunc global, shadow memory address
rematerialization is suppressed.
Keep zero-based shadow memory for kernel (-hwasan-kernel option) and
calls instreumented case (-hwasan-instrument-with-calls option), which
essentially means that the generated code is not changed in these cases.
Reviewers: eugenis
Subscribers: srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D45840
llvm-svn: 330475
Sometimes instead of storing addresses as is, the kernel stores the address of
a page and an offset within that page, and then computes the actual address
when it needs to make an access. Because of this the pointer tag gets lost
(gets set to 0xff). The solution is to ignore all accesses tagged with 0xff.
This patch adds a -hwasan-match-all-tag flag to hwasan, which allows to ignore
accesses through pointers with a particular pointer tag value for validity.
Patch by Andrey Konovalov.
Differential Revision: https://reviews.llvm.org/D44827
llvm-svn: 329228
Summary:
Porting HWASan to Linux x86-64, first of the three patches, LLVM part.
The approach is similar to ARM case, trap signal is used to communicate
memory tag check failure. int3 instruction is used to generate a signal,
access parameters are stored in nop [eax + offset] instruction immediately
following the int3 one.
One notable difference is that x86-64 has to untag the pointer before use
due to the lack of feature comparable to ARM's TBI (Top Byte Ignore).
Reviewers: eugenis
Subscribers: kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D44699
llvm-svn: 328342
This patch changes hwasan inline instrumentation:
Fixes address untagging for shadow address calculation (use 0xFF instead of 0x00 for the top byte).
Emits brk instruction instead of hlt for the kernel and user space.
Use 0x900 instead of 0x100 for brk immediate (0x100 - 0x800 are unavailable in the kernel).
Fixes and adds appropriate tests.
Patch by Andrey Konovalov.
Differential Revision: https://reviews.llvm.org/D43135
llvm-svn: 325711
Summary:
Kernel addresses have 0xFF in the most significant byte.
A tag can not be pushed there with OR (tag << 56);
use AND ((tag << 56) | 0x00FF..FF) instead.
Reviewers: kcc, andreyknvl
Subscribers: srhines, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D42941
llvm-svn: 324691
Summary:
-hwasan-mapping-offset defines the non-zero shadow base address.
-hwasan-kernel disables calls to __hwasan_init in module constructors.
Unlike ASan, -hwasan-kernel does not force callback instrumentation.
This is controlled separately with -hwasan-instrument-with-calls.
Reviewers: kcc
Subscribers: srhines, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D42141
llvm-svn: 322785
Summary:
Very basic stack instrumentation using tagged pointers.
Tag for N'th alloca in a function is built as XOR of:
* base tag for the function, which is just some bits of SP (poor
man's random)
* small constant which is a function of N.
Allocas are aligned to 16 bytes. On every ReturnInst allocas are
re-tagged to catch use-after-return.
This implementation has a bunch of issues that will be taken care of
later:
1. lifetime intrinsics referring to tagged pointers are not
recognized in SDAG. This effectively disables stack coloring.
2. Generated code is quite inefficient. There is one extra
instruction at each memory access that adds the base tag to the
untagged alloca address. It would be better to keep tagged SP in a
callee-saved register and address allocas as an offset of that XOR
retag, but that needs better coordination between hwasan
instrumentation pass and prologue/epilogue insertion.
3. Lifetime instrinsics are ignored and use-after-scope is not
implemented. This would be harder to do than in ASan, because we
need to use a differently tagged pointer depending on which
lifetime.start / lifetime.end the current instruction is dominated
/ post-dominated.
Reviewers: kcc, alekseyshl
Subscribers: srhines, kubamracek, javed.absar, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D41602
llvm-svn: 322324