Add basic support for the MemProf metadata (!memprof and !callsite)
which was initially described in "RFC: IR metadata format for MemProf"
(https://discourse.llvm.org/t/rfc-ir-metadata-format-for-memprof/59165).
The bulk of the patch is verification support, along with some tests.
There are a couple of changes to the format described in the original
RFC:
Initial measurements suggested that a tree format for the stack ids in
the contexts would be more efficient, but subsequent evaluation with
large applications showed that in fact the cost of the additional
metadata nodes required by this deduplication scheme overwhelmed the
benefit from sharing stack id nodes. Therefore, the implementation here
and in follow on patches utilizes a simpler scheme of lists of stack id
integers in the memprof profile contexts and callsite metadata. The
follow on matching patch employs context trimming optimizations to
reduce the cost.
Secondly, instead of verbosely listing all profiled fields in each
profiled context (memory info block or MIB), and deferring the
interpretation of the profile data, the profile data is evaluated and
converted into string tags specifying the behavior (e.g. "cold") during
profile matching. This reduces the verbosity of the profile metadata,
and allows additional context trimming optimizations. As a result, the
named metadata schema description is also no longer needed.
Differential Revision: https://reviews.llvm.org/D128141
D123493 introduced llvm::Module::Min to encode module flags metadata for AArch64
BTI/PAC-RET. llvm::Module::Min does not take effect when the flag is absent in
one module. This behavior is misleading and does not address backward
compatibility problems (when a bitcode with "branch-target-enforcement"==1 and
another without the flag are merged, the merge result is 1 instead of 0).
To address the problems, require Min flags to be non-negative and treat absence
as having a value of zero. For an old bitcode without
"branch-target-enforcement"/"sign-return-address", its value is as if 0.
Differential Revision: https://reviews.llvm.org/D129911
Undef tokens may appear in unreached code as result of RAUW of some optimization,
and it should not be considered as bad IR.
Patch by Dmitry Bakunevich!
Differential Revision: https://reviews.llvm.org/D128904
Reviewed By: mkazantsev
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:
; Before:
%res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
to label %asm.fallthrough [label %foo]
; After:
%res = callbr i8* asm "", "=r,r,!i"(i8* %x)
to label %asm.fallthrough [label %foo]
The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:
* Allow unrolling/peeling/rotation of callbr, or any other
clone-based optimizations
(https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
(https://github.com/llvm/llvm-project/issues/45248)
This is just the IR representation change though, I will follow up
with patches to remove limtations in various transformation passes
that are no longer needed.
Differential Revision: https://reviews.llvm.org/D129288
These intrinsics are now fundemental for SVE code generation and have been
present for a year and a half, hence move them out of the experimental
namespace.
Differential Revision: https://reviews.llvm.org/D127976
In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`.
The idea is to get support from the compiler to easily write efficient memory function implementations.
This patch could be split in two:
- one for the LLVM part adding the `llvm.memset.inline.*` intrinsics.
- and another one for the Clang part providing the instrinsic as a builtin.
Differential Revision: https://reviews.llvm.org/D126903
I chose to encode the allockind information in a string constant because
otherwise we would get a bit of an explosion of keywords to deal with
the possible permutations of allocation function types.
I'm not sure that CodeGen.h is the correct place for this enum, but it
seemed to kind of match the UWTableKind enum so I put it in the same
place. Constructive suggestions on a better location most certainly
encouraged.
Differential Revision: https://reviews.llvm.org/D123088
This adds support for pointer types for `atomic xchg` and let us write
instructions such as `atomicrmw xchg i64** %0, i64* %1 seq_cst`. This
is similar to the patch for allowing atomicrmw xchg on floating point
types: https://reviews.llvm.org/D52416.
Differential Revision: https://reviews.llvm.org/D124728
This emits an `st_size` that represents the actual useable size of an object before the redzone is added.
Reviewed By: vitalybuka, MaskRay, hctim
Differential Revision: https://reviews.llvm.org/D123010
LTO objects might compiled with different `mbranch-protection` flags which will cause an error in the linker.
Such a setup is allowed in the normal build with this change that is possible.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D123493
The LLVM IR verifier and analysis linter defines and uses several macros in
code that performs validation of IR expectations. Previously, these macros
were named with an 'Assert' prefix. These names were misleading since the
macro definitions are not conditioned on build kind; they are defined
identically in builds that have asserts enabled and those that do not. This
was confusing since an LLVM developer might expect these macros to be
conditionally enabled as 'assert' is. Further confusion was possible since
the LLVM IR verifier is implicitly disabled (in Clang::ConstructJob()) for
builds without asserts enabled, but only for Clang driver invocations; not
for clang -cc1 invocations. This could make it appear that the macros were
not active for builds without asserts enabled, e.g. when investigating
behavior using the Clang driver, and thus lead to surprises when running
tests that exercise the clang -cc1 interface.
This change renames this set of macros as follows:
Assert -> Check
AssertDI -> CheckDI
AssertTBAA -> CheckTBAA
This patch mostly follows up on D121292 which introduced the vp.fcmp
intrinsic.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122729
This patch adds the first support for vector-predicated comparison
intrinsics, starting with vp.fcmp. It uses metadata to encode its
condition code, like the llvm.experimental.constrained.fcmp intrinsic.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D121292
In DAGISel, the parameter alignment only have 4 bits to hold the value.
The encode(alignment) would plus the value by 1, so the max aligment that
ISel can support is 2^14. This patch verify align attribute for parameter.
Differential Revision: https://reviews.llvm.org/D122130
In DAGISel, the parameter alignment only have 4 bits to hold the value.
The encode(alignment) would plus the shift value by 1, so the max aligment
ISel can support is 2^14. This patch verify the parameter and return
value for alignment.
Differential Revision: https://reviews.llvm.org/D121898
Includes verifier changes checking the elementtype, clang codegen
changes to emit the elementtype, and ISel changes using the elementtype.
Basically the same as D120527.
Reviewed By: #opaque-pointers, nikic
Differential Revision: https://reviews.llvm.org/D121847
Includes verifier changes checking the elementtype, clang codegen
changes to emit the elementtype, and ISel changes using the elementtype.
Reviewed By: #opaque-pointers, nikic
Differential Revision: https://reviews.llvm.org/D120527
According to LangRef, an access scope must have zero operands and
be distinct. The access group may either be a single access scope
or a list of access scopes.
LoopInfo may assert if this is not the case.
Per LangRef, swifterror alloca must be a pointer.
Not checking this may result in a verifier error after transforms
instead, so make sure it's discarded early.
Now that clang no longer emits GlobalIFunc-s with a
declaration for a resolver, we can restore that check.
In addition, add a linkage check like the one we have
on GlobalAlias-es, and a Verifier test for ifuncs.
Signed-off-by: Itay Bookstein <ibookstein@gmail.com>
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D120267
This introduces a new "ptrauth" operand bundle to be used in
call/invoke. At the IR level, it's semantically equivalent to an
@llvm.ptrauth.auth followed by an indirect call, but it additionally
provides additional hardening, by preventing the intermediate raw
pointer from being exposed.
This mostly adds the IR definition, verifier checks, and support in
a couple of general helper functions. Clang IRGen and backend support
will come separately.
Note that we'll eventually want to support this bundle in indirectbr as
well, for similar reasons. indirectbr currently doesn't support bundles
at all, and the IR data structures need to be updated to allow that.
Differential Revision: https://reviews.llvm.org/D113685
Add a new llvm.fptrunc.round intrinsic to precisely control
the rounding mode when converting from f32 to f16.
Differential Revision: https://reviews.llvm.org/D110579
Based on the output of include-what-you-use.
This is a big chunk of changes. It is very likely to break downstream code
unless they took a lot of care in avoiding hidden ehader dependencies, something
the LLVM codebase doesn't do that well :-/
I've tried to summarize the biggest change below:
- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h
And the usual count of preprocessed lines:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after: 6189948
200k lines less to process is no that bad ;-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118652
Currently, the clang.arc.attachedcall bundle takes an optional function
argument. Depending on whether the argument is present, calls with this
bundle have the following semantics:
- on x86, with the argument present, the call is lowered to:
call _target
mov rax, rdi
call _objc_retainAutoreleasedReturnValue
- on AArch64, without the argument, the call is lowered to:
bl _target
mov x29, x29
and the objc runtime call is expected to be emitted separately.
That's because, on x86, the objc runtime checks for both the mov and
the call on x86, and treats the combination as the ARC autorelease elision
marker.
But on AArch64, it only checks for the dedicated NOP marker, as that's
historically been sufficiently unique. Thanks to that, the runtime call
wasn't required to be adjacent to the NOP marker, so it wasn't emitted
as part of the bundle sequence.
This patch unifies both architectures: on AArch64, we now emit all
3 instructions for the bundle. This guarantees that the runtime call
is adjacent to the marker in the sequence, and that's information the
runtime can use to further optimize this.
This helps simplify some of the handling, in particular
BundledRetainClaimRVs, which no longer needs to know whether the bundle
is sufficient or not: it now always should be.
Note that this does not include an AutoUpgrade for the nullary bundles,
as they are only produced in ObjCContract as part of the obj/asm emission
pipeline, and are not expected to be in bitcode.
Differential Revision: https://reviews.llvm.org/D118214
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
The invalid undef value already triggers a verifier failure, but
then the upwards scan from the cleanuppad ends up asserting. Make
sure this is handled gacefully instead.
I've changed the definition of the experimental.vector.splice
instrinsic to reject indices that are known to be or possibly
out-of-bounds. In practice, this means changing the definition so that
the index is now only valid in the range [-VL, VL-1] where VL is the
known minimum vector length. We use the vscale_range attribute to
take the minimum vscale value into account so that we can permit
more indices when the attribute is present.
The splice intrinsic is currently only ever generated by the vectoriser,
which will never attempt to splice vectors with out-of-bounds values.
Changing the definition also makes things simpler for codegen since we
can always assume that the index is valid.
This patch was created in response to review comments on D115863
Differential Revision: https://reviews.llvm.org/D115933
This class is solely used as a lightweight and clean way to build a set of
attributes to be removed from an AttrBuilder. Previously AttrBuilder was used
both for building and removing, which introduced odd situation like creation of
Attribute with dummy value because the only relevant part was the attribute
kind.
Differential Revision: https://reviews.llvm.org/D116110
Indirect inline asm operands may require the materialization of a
memory access according to the pointer element type. As this will
no longer be available with opaque pointers, we require it to be
explicitly annotated using the elementtype attribute, for example:
define void @test(i32* %p, i32 %x) {
call void asm "addl $1, $0", "=*rm,r"(i32* elementtype(i32) %p, i32 %x)
ret void
}
This patch only includes the LangRef change and Verifier updates to
allow adding the elementtype attribute in this position. It does not
yet enforce this, as this will require changes on the clang side
(and test updates) first.
Something I'm a bit unsure about is whether we really need the
elementtype for all indirect constraints, rather than only indirect
register constraints. I think indirect memory constraints might not
strictly need it (though the backend code is written in a way that
does require it). I think it's okay to just make this a general
requirement though, as this means we don't need to carefully deal
with multiple or alternative constraints. In addition, I believe
that MemorySanitizer benefits from having the element type even in
cases where it may not be strictly necessary for normal lowering
(cd2b050fa4/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (L4066)).
Differential Revision: https://reviews.llvm.org/D116531
The recursive implementation can run into stack overflows, e.g. like in PR52844.
The order the users are visited changes, but for the current use case
this only impacts the order error messages are emitted.
Rust allows enums to be scopes, as shown by the previous change. Sadly,
D111770 disallowed enums-as-scopes in the LLVM Verifier, which means
that LLVM HEAD stopped working for Rust compiles. As a result, we back
out the verifier part of D111770 with a modification to the testcase so
we don't break this in the future.
The testcase is now actual IR from rustc at commit 8f8092cc3, which is
the nightly as of 2021-09-28. I would expect rustc 1.57 to produce
similar or identical IR if someone wants to reproduce this IR in the
future with minimal changes. A recipe for reproducing the IR using rustc
is included in the test file.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D115353