Commit Graph

748 Commits

Author SHA1 Message Date
Nikita Popov 941c8e0ea5 [Bitcode] Support expanding constant expressions into instructions
This implements an autoupgrade from constant expressions to
instructions, which is needed for
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.

The basic approach is that constant expressions (CST_CODE_CE_*
records) now initially only create a BitcodeConstant value that
holds opcode, flags and operands IDs. Then, when the value actually
gets used, it can be converted either into a constant expression
(if that expression type is still supported) or into a sequence of
instructions. As currently all expressions are still supported,
-expand-constant-exprs is added for testing purposes, to force
expansion.

PHI nodes require special handling, because the constant expression
needs to be evaluated on the incoming edge. We do this by putting
it into a temporary block and then wiring it up appropriately
afterwards (for non-critical edges, we could also move the
instructions into the predecessor).

This also removes the need for the forward referenced constants
machinery, as the BitcodeConstants only hold value IDs. At the
point where the value is actually materialized, no forward
references are needed anymore.

Differential Revision: https://reviews.llvm.org/D127729
2022-06-28 11:09:46 +02:00
Bradley Smith a83aa33d1b [IR] Move vector.insert/vector.extract out of experimental namespace
These intrinsics are now fundemental for SVE code generation and have been
present for a year and a half, hence move them out of the experimental
namespace.

Differential Revision: https://reviews.llvm.org/D127976
2022-06-27 10:48:45 +00:00
Mitch Phillips 8db981d463 Add sanitizer-specific GlobalValue attributes.
Plan is the migrate the global variable metadata for sanitizers, that's
currently carried around generally in the 'llvm.asan.globals' section,
onto the global variable itself.

This patch adds the attribute and plumbs it through the LLVM IR and
bitcode formats, but is a no-op other than that so far.

Reviewed By: vitalybuka, kstoimenov

Differential Revision: https://reviews.llvm.org/D126100
2022-06-10 12:28:18 -07:00
Nikita Popov 471bfb7016 [Bitcode] Re-enable verify-uselistorder test (NFC)
This issue has since been fixed, so re-enable the commented RUN
line.
2022-06-08 11:29:28 +02:00
Augie Fackler 42861faa8e attributes: introduce allockind attr for describing allocator fn behavior
I chose to encode the allockind information in a string constant because
otherwise we would get a bit of an explosion of keywords to deal with
the possible permutations of allocation function types.

I'm not sure that CodeGen.h is the correct place for this enum, but it
seemed to kind of match the UWTableKind enum so I put it in the same
place. Constructive suggestions on a better location most certainly
encouraged.

Differential Revision: https://reviews.llvm.org/D123088
2022-05-31 10:01:17 -04:00
Takafumi Arakaki 18e6b8234a Allow pointer types for atomicrmw xchg
This adds support for pointer types for `atomic xchg` and let us write
instructions such as `atomicrmw xchg i64** %0, i64* %1 seq_cst`. This
is similar to the patch for allowing atomicrmw xchg on floating point
types: https://reviews.llvm.org/D52416.

Differential Revision: https://reviews.llvm.org/D124728
2022-05-25 16:20:26 +00:00
Ahmed Bougacha 362b4066f0 [ObjCARC] Drop nullary clang.arc.attachedcall bundles in autoupgrade.
In certain use-cases, these can be emitted by old compilers, but the
operand is now always required.  These are only used for optimizations,
so it's safe to drop them if they happen to have the now-invalid format.
The semantically-required call is already a separate instruction.

Differential Revision: https://reviews.llvm.org/D123811
2022-05-20 15:27:29 -07:00
Wende Tan 6baaad740a [Bitcode] Include indirect users of BlockAddresses in bitcode
The original fix (commit 23ec5782c3) of
https://github.com/llvm/llvm-project/issues/52787 only adds `Function`s
that have `Instruction`s that directly use `BlockAddress`es into the
bitcode (`FUNC_CODE_BLOCKADDR_USERS`).

However, in either @rickyz's original reproducing code:

```
void f(long);

__attribute__((noinline)) static void fun(long x) {
  f(x + 1);
}

void repro(void) {
  fun(({
    label:
      (long)&&label;
  }));
}
```

```
...
define dso_local void @repro() #0 {
entry:
  br label %label

label:                                            ; preds = %entry
  tail call fastcc void @fun()
  ret void
}

define internal fastcc void @fun() unnamed_addr #1 {
entry:
  tail call void @f(i64 add (i64 ptrtoint (i8* blockaddress(@repro, %label) to i64), i64 1)) #3
  ret void
}
...
```

or the xfs and overlayfs in the Linux kernel, `BlockAddress`es (e.g.,
`i8* blockaddress(@repro, %label)`) may first compose `ConstantExpr`s
(e.g., `i64 ptrtoint (i8* blockaddress(@repro, %label) to i64)`) and
then used by `Instruction`s. This case is not handled by the original
fix.

This patch adds *indirect* users of `BlockAddress`es, i.e., the
`Instruction`s using some `Constant`s which further use the
`BlockAddress`es, into the bitcode as well, by doing depth-first
searches.

Fixes: https://github.com/llvm/llvm-project/issues/52787
Fixes: 23ec5782c3 ("[Bitcode] materialize Functions early when BlockAddress taken")

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D124878
2022-05-10 16:03:42 -07:00
Augie Fackler a907d36cfe Attributes: add a new `allocptr` attribute
This continues the push away from hard-coded knowledge about functions
towards attributes. We'll use this to annotate free(), realloc() and
cousins and obviate the hard-coded list of free functions.

Differential Revision: https://reviews.llvm.org/D123083
2022-04-26 13:57:11 -04:00
Alex Richardson 9107cd632d [AutoUpgrade] Don't lose attributes when upgrading mem intrinsics
The original AutoUpgrade code from 1e68724d24
did not retain existing attributes. I noticed this in some downstream test
cases, but it turns out there are also two affected testcase upstream.

Differential Revision: https://reviews.llvm.org/D121971
2022-04-13 09:30:10 +00:00
Daniel Kiss b0343a38a5 Support the min of module flags when linking, use for AArch64 BTI/PAC-RET
LTO objects might compiled with different `mbranch-protection` flags which will cause an error in the linker.
Such a setup is allowed in the normal build with this change that is possible.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D123493
2022-04-13 09:31:51 +02:00
Nick Desaulniers 23ec5782c3 [Bitcode] materialize Functions early when BlockAddress taken
IRLinker builds a work list of functions to materialize, then moves them
from a source module to a destination module one at a time.

This is a problem for blockaddress Constants, since they need not refer
to the function they are used in; IPSCCP is quite good at sinking these
constants deep into other functions when passed as arguments.

This would lead to curious errors during LTO:
  ld.lld: error: Never resolved function from blockaddress ...
based on the ordering of function definitions in IR.

The problem was that IRLinker would basically do:

  for function f in worklist:
    materialize f
    splice f from source module to destination module

in one pass, with Functions being lazily added to the running worklist.
This confuses BitcodeReader, which cannot disambiguate whether a
blockaddress is referring to a function which has not yet been parsed
("materialized") or is simply empty because its body was spliced out.
This causes BitcodeReader to insert Functions into its BasicBlockFwdRefs
list incorrectly, as it will never re-materialize an already
materialized (but spliced out) function.

Because of the possibility that blockaddress Constants may appear in
Functions other than the ones they reference, this patch adds a new
bitcode function code FUNC_CODE_BLOCKADDR_USERS that is a simple list of
Functions that contain BlockAddress Constants that refer back to this
Function, rather then the Function they are scoped in. We then
materialize those functions when materializing `f` from the example loop
above. This might over-materialize Functions should the user of
BitcodeReader ultimately decide not to link those Functions, but we can
at least now we can avoid this ordering related issue with blockaddresses.

Fixes: https://github.com/llvm/llvm-project/issues/52787
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1215

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D120781
2022-04-12 11:38:35 -07:00
Nikita Popov 8a72391f60 [IR] Require intrinsic struct return type to be anonymous
This is an alternative to D122376. Rather than working around the
problem, this patch requires that struct return types in intrinsics
are anonymous/literal and adds auto-upgrade code to convert
existing uses of intrinsics with named struct types.

This ensures that the mapping between intrinsic name and
intrinsic function type is actually bijective, as it is supposed
to be.

This also fixes https://github.com/llvm/llvm-project/issues/37891.

Differential Revision: https://reviews.llvm.org/D122471
2022-03-30 09:51:24 +02:00
Johannes Doerfert a81fff8afd Reapply "[Intrinsics] Add `nocallback` to the default intrinsic attributes"
This reverts commit c5f789050d and
reapplies 7aea3ea8c3 with additional test
changes.
2022-03-25 09:36:50 -05:00
Arthur Eubanks 2362c4ecdc Revert "Revert "[OpaquePtr][LLParser] Automatically detect opaque pointers in .ll files""
This reverts commit 9c96a6bbfd.

Issues were already fixed at head.
2022-03-21 17:24:56 -07:00
Mitch Phillips 9c96a6bbfd Revert "[OpaquePtr][LLParser] Automatically detect opaque pointers in .ll files"
This reverts commit 295172ef51.

Reason: Broke the ASan buildbot. More details are available on the
original Phab review at https://reviews.llvm.org/D119482.
2022-03-21 16:04:36 -07:00
Arthur Eubanks 295172ef51 [OpaquePtr][LLParser] Automatically detect opaque pointers in .ll files
This allows us to not have to specify -opaque-pointers when updating
IR tests from typed pointers to opaque pointers.

We detect opaque pointers in .ll files by looking for relevant tokens,
either "ptr" or "*".

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D119482
2022-03-17 08:37:18 -07:00
Arthur Eubanks 2371c5a0e0 [OpaquePtr][ARM] Use elementtype on ldrex/ldaex/stlex/strex
Includes verifier changes checking the elementtype, clang codegen
changes to emit the elementtype, and ISel changes using the elementtype.

Basically the same as D120527.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D121847
2022-03-16 14:11:53 -07:00
Arthur Eubanks 250620f76e [OpaquePtr][AArch64] Use elementtype on ldxr/stxr
Includes verifier changes checking the elementtype, clang codegen
changes to emit the elementtype, and ISel changes using the elementtype.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D120527
2022-03-14 10:09:59 -07:00
Nikita Popov 2182665305 [Bitcode] Don't confuse type attributes on declaration and call
We should not be using APIs here that try to fetch the attribute
from both the call attributes and the function attributes. Otherwise
we'll try to upgrade a non-existent sret attribute on the call using
the attribute on the function.
2022-03-11 17:32:40 +01:00
Nikita Popov b190108693 [Bitcode] Encode alloca address space
Since D101045, allocas are no longer required to be part of the
default alloca address space. There may be allocas in multiple
different address spaces. However, the bitcode reader would
simply assume the default alloca address space, resulting in
either an error or incorrect IR.

Add an optional record for allocas which encodes the address
space.
2022-03-11 16:08:38 +01:00
Nikita Popov cda82d39f3 [Bitcode] Check for type mismatch when assigning value
If the value is forward-declared, then the type must match,
otherwise we can't RAUW.
2022-03-11 12:12:54 +01:00
Nikita Popov e732f69ea1 [Bitcode] Report error for missing element type for attr upgrade
Otherwise this is going to crash either the TypeFinder or the
Verifier.
2022-03-11 10:18:42 +01:00
Nikita Popov 22f9159bed [BitcodeReader] Support GEP without indices
LLVM considers these to be legal, so make sure the bitcode reader
can read them. I broke this when implementing opaque pointer
auto upgrade support.
2022-03-10 15:48:47 +01:00
Augie Fackler d664c4b73c Attributes: add a new allocalign attribute
This will let us start moving away from hard-coded attributes in
MemoryBuiltins.cpp and put the knowledge about various attribute
functions in the compilers that emit those calls where it probably
belongs.

Differential Revision: https://reviews.llvm.org/D117921
2022-03-04 15:57:53 -05:00
Nikita Popov 7a258c6a37 [Bitcode] Move x86_intrcc upgrade to bitcode reader
This upgrade requires access the legacy pointer element type, so
it needs to happen inside the bitcode reader.
2022-03-04 10:30:50 +01:00
Nikita Popov e3a9f68e2c [Bitcode] Fully support opaque pointer auto upgrade
This completes the propagation of type IDs through bitcode reading,
and switches remaining uses of getPointerElementType() to use
contained type IDs.

The main new thing here is that sometimes we need to create a type
ID for a type that was not explicitly encoded in bitcode (or we
don't know its ID at the current point). For such types we create a
"virtual" type ID, which is cached based on the type and the
contained type IDs. Luckily, we generally only need zero or one
contained type IDs, and in the one case where we need two, we can
get away with not including it in the cache key.

With this change, we pass the entirety of llvm-test-suite at O3
with opaque pointers.

Differential Revision: https://reviews.llvm.org/D120471
2022-03-04 10:23:06 +01:00
Tong Zhang 17ce89fa80 [SanitizerBounds] Add support for NoSanitizeBounds function
Currently adding attribute no_sanitize("bounds") isn't disabling
-fsanitize=local-bounds (also enabled in -fsanitize=bounds). The Clang
frontend handles fsanitize=array-bounds which can already be disabled by
no_sanitize("bounds"). However, instrumentation added by the
BoundsChecking pass in the middle-end cannot be disabled by the
attribute.

The fix is very similar to D102772 that added the ability to selectively
disable sanitizer pass on certain functions.

In this patch, if no_sanitize("bounds") is provided, an additional
function attribute (NoSanitizeBounds) is attached to IR to let the
BoundsChecking pass know we want to disable local-bounds checking. In
order to support this feature, the IR is extended (similar to D102772)
to make Clang able to preserve the information and let BoundsChecking
pass know bounds checking is disabled for certain function.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D119816
2022-03-01 18:47:02 +01:00
Jan Korous 4a794d848c [Bitstream] Improve error propagation from ReadVBR
Differential Revision: https://reviews.llvm.org/D119307
2022-02-15 17:12:38 -08:00
Nikita Popov 1c456a8220 [Bitcode] Improve support for opaque-pointer bitcode upgrade
This is step two of supporting autoupgrade of old bitcode to opaque
pointers. Rather than tracking the element type ID of pointers in
particular, track all type IDs that a type contains. This allows us
to recover the element type in more complex situations, e.g. when
we need to determine the pointer element type of a vector element
or function type parameter.

Differential Revision: https://reviews.llvm.org/D119339
2022-02-15 09:39:48 +01:00
Ahmed Bougacha c703f852c9 [IR] Define "ptrauth" operand bundle.
This introduces a new "ptrauth" operand bundle to be used in
call/invoke. At the IR level, it's semantically equivalent to an
@llvm.ptrauth.auth followed by an indirect call, but it additionally
provides additional hardening, by preventing the intermediate raw
pointer from being exposed.

This mostly adds the IR definition, verifier checks, and support in
a couple of general helper functions. Clang IRGen and backend support
will come separately.

Note that we'll eventually want to support this bundle in indirectbr as
well, for similar reasons.  indirectbr currently doesn't support bundles
at all, and the IR data structures need to be updated to allow that.

Differential Revision: https://reviews.llvm.org/D113685
2022-02-14 11:27:35 -08:00
Momchil Velikov 6398903ac8 Extend the `uwtable` attribute with unwind table kind
We have the `clang -cc1` command-line option `-funwind-tables=1|2` and
the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind
tables (1) or asynchronous unwind tables (2)`. However, this is
encoded in LLVM IR by the presence or the absence of the `uwtable`
attribute, i.e.  we lose the information whether to generate want just
some unwind tables or asynchronous unwind tables.

Asynchronous unwind tables take more space in the runtime image, I'd
estimate something like 80-90% more, as the difference is adding
roughly the same number of CFI directives as for prologues, only a bit
simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even
more, if you consider tail duplication of epilogue blocks.
Asynchronous unwind tables could also restrict code generation to
having only a finite number of frame pointer adjustments (an example
of *not* having a finite number of `SP` adjustments is on AArch64 when
untagging the stack (MTE) in some cases the compiler can modify `SP`
in a loop).
Having the CFI precise up to an instruction generally also means one
cannot bundle together CFI instructions once the prologue is done,
they need to be interspersed with ordinary instructions, which means
extra `DW_CFA_advance_loc` commands, further increasing the unwind
tables size.

That is to say, async unwind tables impose a non-negligible overhead,
yet for the most common use cases (like C++ exceptions), they are not
even needed.

This patch extends the `uwtable` attribute with an optional
value:
      -  `uwtable` (default to `async`)
      -  `uwtable(sync)`, synchronous unwind tables
      -  `uwtable(async)`, asynchronous (instruction precise) unwind tables

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D114543
2022-02-14 14:35:02 +00:00
Nikita Popov c28b0b9d18 [Bitcode] Add partial support for opaque pointer auto-upgrade
Auto-upgrades that rely on the pointer element type do not work in
opaque pointer mode. The idea behind this patch is that we can
instead work with type IDs, for which we can retain the pointer
element type. For typed pointer bitcode, we will have a distinct
type ID for pointers with distinct element type, even if there will
only be a single corresponding opaque pointer type.

The disclaimer here is that this is only the first step of the change,
and there are still more getPointerElementType() calls to remove.
I expect that two more patches will be needed:
1. Track all "contained" type IDs, which will allow us to handle
function params (which are contained in the function type) and GEPs
(which may use vectors of pointers)
2. Track type IDs for values, which is e.g. necessary to handle loads.

Differential Revision: https://reviews.llvm.org/D118694
2022-02-11 09:32:46 +01:00
Nikita Popov ea93ca60ef [Bitcode] Fix size check for DIImportedEntity record
This was using && instead of ||.
2022-02-09 14:23:30 +01:00
Nikita Popov 72248712e5 [Bitcode] Check minimum size of constant GEP record
Checking this early, because we may end up reading up to two
records before the operands.
2022-02-09 14:23:30 +01:00
Nikita Popov bf17cb294a [Bitstream] Make MaxChunkSize word-size independent
We only support chunks <= 32 bits regardless of whether we're
running on a 64-bit platform or not. Chunk size > 32 <= 64 would
cause UB in the reading code.
2022-02-08 10:14:07 +01:00
Nikita Popov 6d52ea885f [Bitcode] Prevent OOB read for invalid name size 2022-02-08 09:49:39 +01:00
Nikita Popov 67348c8acf [Bitstream] Check for unterminated VBR
This avoid shift larger than bitwidth UB.
2022-02-08 09:49:39 +01:00
Andrew Browne 77f5a3fbba [Bitstream] Temporarily disable UBSan for invalid bitcode tests. This fixes failing sanitizer buildbots.
Tests added by
  3c86642edd
  f4fca0fbb0
cause existing code to exhibit UB:
  b4c6d1bb37/llvm/include/llvm/Bitstream/BitstreamReader.h (L244)

llvm/include/llvm/Bitstream/BitstreamReader.h:244:51:
runtime error: shift exponent 35 is too large for 32-bit type 'unsigned int'

Hitting the UBSan error before the expected error causes the test to fail under UBSan:

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D119182
2022-02-07 13:55:15 -08:00
Nikita Popov f4fca0fbb0 [Bitcode] Replace assertion with check 2022-02-07 12:39:35 +01:00
Nikita Popov 747fab903f [Bitcode] Add missing test file
This was supposed to be part of
ec18030f5f.
2022-02-07 12:37:21 +01:00
Nikita Popov fdf8cb978f [Bitcode] Handle invalid data layout gracefully 2022-02-07 12:28:15 +01:00
Nikita Popov 0c553bff8e [Bitcode] Guard against out of bounds value reference
We should make sure that the value ID is in bounds, otherwise
we will assert / read out of bounds.
2022-02-07 12:16:13 +01:00
Nikita Popov ec18030f5f [Bitstream] Check that there is enough space for blob
Instead of simply assuming that it will be zero. I double checked
that the bitstream reader doesn't have any special handling for
all-zero blobs, it will always write out the full contents.
2022-02-07 12:16:13 +01:00
Nikita Popov 89017772d9 [Bitcode] Don't assert on invalid attribute group record
Report an error instead.
2022-02-07 12:16:12 +01:00
Nikita Popov 3c86642edd [Bitstream] Reject implausibly large reservations
If we're trying to reserve more memory than bits in the stream,
reject this early to avoid OOM.
2022-02-07 12:16:12 +01:00
Nikita Popov 8a71854183 [Bitcode] Handle invalid abbrev number error more gracefully
Avoid report_fatal_error(), propagate the error upwards instead.
2022-02-07 10:34:34 +01:00
Nikita Popov c00ef03d73 [BitstreamReader] Handle errors more gracefully
Use proper error reporting instead of report_fatal_error().
2022-02-04 17:23:08 +01:00
Augie Fackler aade28ea70 add IR compatability test for (upcoming) allocsize attribute 2022-02-03 08:36:36 -08:00
Chih-Ping Chen 28bfa57a73 [DebugInfo] Add stringLocationExp field to DIStringType
DIStringType is used to encode the debug info of a character object
in Fortran. A Fortran deferred-length character object is typically
implemented as a pair of the following two pieces of info: An address
of the raw storage of the characters, and the length of the object.
The stringLocationExp field contains the DIExpression to get to the
raw storage.

This patch also enables the emission of DW_AT_data_location attribute
in a DW_TAG_string_type debug info entry based on stringLocationExp
in DIStringType.

A test is also added to ensure that the bitcode reader is backward
compatible with the old DIStringType format.

Differential Revision: https://reviews.llvm.org/D117586
2022-01-26 11:56:57 -05:00