Generate btf_tag annotations for DIDrived types. More specifically,
clang frontend generates the btf_tag annotations for record
fields. The annotations are represented as an DINodeArray
in DebugInfo. The following example illustrate how
annotations are encoded in IR:
distinct !DIDerivedType(tag: DW_TAG_member, ..., annotations: !10)
!10 = !{!11, !12}
!11 = !{!"btf_tag", !"a"}
!12 = !{!"btf_tag", !"b"}
Differential Revision: https://reviews.llvm.org/D106616
The purpose of __attribute__((disable_sanitizer_instrumentation)) is to
prevent all kinds of sanitizer instrumentation applied to a certain
function, Objective-C method, or global variable.
The no_sanitize(...) attribute drops instrumentation checks, but may
still insert code preventing false positive reports. In some cases
though (e.g. when building Linux kernel with -fsanitize=kernel-memory
or -fsanitize=thread) the users may want to avoid any kind of
instrumentation.
Differential Revision: https://reviews.llvm.org/D108029
Clang patch D106614 added attribute btf_tag support. This patch
generates btf_tag annotations for DIComposite types.
A field "annotations" is introduced to DIComposite, and the
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates
how annotations are encoded in IR:
distinct !DICompositeType(..., annotations: !10)
!10 = !{!11, !12}
!11 = !{!"btf_tag", !"a"}
!12 = !{!"btf_tag", !"b"}
Each btf_tag annotation is represented as a 2D array of
meta strings. Each record may have more than one
btf_tag annotations, as in the above example.
Reland with additional fixes for llvm/unittests/IR/DebugTypeODRUniquingTest.cpp.
Differential Revision: https://reviews.llvm.org/D106615
Clang patch D106614 added attribute btf_tag support. This patch
generates btf_tag annotations for DIComposite types.
A field "annotations" is introduced to DIComposite, and the
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates
how annotations are encoded in IR:
distinct !DICompositeType(..., annotations: !10)
!10 = !{!11, !12}
!11 = !{!"btf_tag", !"a"}
!12 = !{!"btf_tag", !"b"}
Each btf_tag annotation is represented as a 2D array of
meta strings. Each record may have more than one
btf_tag annotations, as in the above example.
Differential Revision: https://reviews.llvm.org/D106615
In the textual format, `noduplicates` means no COMDAT/section group
deduplication is performed. Therefore, if both sets of sections are retained, and
they happen to define strong external symbols with the same names,
there will be a duplicate definition linker error.
In PE/COFF, the selection kind lowers to `IMAGE_COMDAT_SELECT_NODUPLICATES`.
The name describes the corollary instead of the immediate semantics. The name
can cause confusion to other binary formats (ELF, wasm) which have implemented/
want to implement the "no deduplication" selection kind. Rename it to be clearer.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D106319
Use the elementtype attribute introduced in D105407 for the
llvm.preserve.array/struct.index intrinsics. It carries the
element type of the GEP these intrinsics effectively encode.
This patch:
* Adds a verifier check that the attribute is required.
* Adds it in the IRBuilder methods for these intrinsics.
* Autoupgrades old bitcode without the attribute.
* Updates the lowering code to use the attribute rather than
the pointer element type.
* Updates lots of tests to specify the attribute.
* Adds -force-opaque-pointers to the intrinsic-array.ll test
to demonstrate they work now.
https://reviews.llvm.org/D106184
This implements the elementtype attribute specified in D105407. It
just adds the attribute and the specified verifier rules, but
doesn't yet make use of it anywhere.
Differential Revision: https://reviews.llvm.org/D106008
This reverts commit 8cd35ad854.
It breaks `TestMembersAndLocalsWithSameName.py` on GreenDragon and
Mikael Holmén points out in D104827 that bitcode files created with the
patch cannot be parsed with binaries built before it.
While this should not matter for most architectures (where the program
address space is 0), it is important for CHERI (and therefore Arm Morello).
We use address space 200 for all of our code pointers and without this
change we assert in the SelectionDAG handling of BlockAddress nodes.
It is also useful for AVR: previously programs targeting
AVR that attempt to read their own machine code
via a pointer to a label would instead read from RAM
using a pointer relative to the the start of program flash.
Reviewed By: dylanmckay, theraven
Differential Revision: https://reviews.llvm.org/D48803
Add UNIQUED and DISTINCT properties in Metadata.def and use them to
implement restrictions on the `distinct` property of MDNodes:
* DIExpression can currently be parsed from IR or read from bitcode
as `distinct`, but this property is silently dropped when printing
to IR. This causes accepted IR to fail to round-trip. As DIExpression
appears inline at each use in the canonical form of IR, it cannot
actually be `distinct` anyway, as there is no syntax to describe it.
* Similarly, DIArgList is conceptually always uniqued. It is currently
restricted to only appearing in contexts where there is no syntax for
`distinct`, but for consistency it is treated equivalently to
DIExpression in this patch.
* DICompileUnit is already restricted to always being `distinct`, but
along with adding general support for the inverse restriction I went
ahead and described this in Metadata.def and updated the parser to be
general. Future nodes which have this restriction can share this
support.
The new UNIQUED property applies to DIExpression and DIArgList, and
forbids them to be `distinct`. It also implies they are canonically
printed inline at each use, rather than via MDNode ID.
The new DISTINCT property applies to DICompileUnit, and requires it to
be `distinct`.
A potential alternative change is to forbid the non-inline syntax for
DIExpression entirely, as is done with DIArgList implicitly by requiring
it appear in the context of a function. For example, we would forbid:
!named = !{!0}
!0 = !DIExpression()
Instead we would only accept the equivalent inlined version:
!named = !{!DIExpression()}
This essentially removes the ability to create a `distinct` DIExpression
by construction, as there is no syntax for `distinct` inline. If this
patch is accepted as-is, the result would be that the non-canonical
version is accepted, but the following would be an error and produce a diagnostic:
!named = !{!0}
; error: 'distinct' not allowed for !DIExpression()
!0 = distinct !DIExpression()
Also update some documentation to consistently use the inline syntax for
DIExpression, and to describe the restrictions on `distinct` for nodes
where applicable.
Reviewed By: StephenTozer, t-tye
Differential Revision: https://reviews.llvm.org/D104827
Currently, AsmWriter will stick uselistorder directives for global
values inside individual functions. This doesn't make a lot of sense,
and interacts badly with D104950, as use list order adjustments will
be performed while still working on a forward reference.
This patch instead always prints uselistorder directives for globals
at the module level. This isn't really compatible with the previously
used implementation approach. Rather than walking through all values
again, use the OrderMap (after stabilizing its order) to go through
all values and compute the use list shuffles for them. Classify them
per-function, or nullptr for globals.
Even independently of D104950, this seems to fix a few
verify-uselistorder failures. Conveniently, there is even a
pre-existing failing test that this fixes.
Differential Revision: https://reviews.llvm.org/D104976
After D104475 / D104658, building the Linux kernel with ThinLTO is
broken:
ld.lld: error: Unknown attribute kind (73) (Producer: 'LLVM13.0.0git'
Reader: 'LLVM 13.0.0git')
getAttrFromCode() has never handled this attribute so it is written
during the ThinLTO phase but it cannot be handled during the linking
phase.
Add noprofile to getAttrFromCode() so that disassembly works properly.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D104995
This can be seen as a follow up to commit 0ee439b705,
that changed the second argument of __powidf2, __powisf2 and
__powitf2 in compiler-rt from si_int to int. That was to align with
how those runtimes are defined in libgcc.
One thing that seem to have been missing in that patch was to make
sure that the rest of LLVM also handle that the argument now depends
on the size of int (not using the si_int machine mode for 32-bit).
When using __builtin_powi for a target with 16-bit int clang crashed.
And when emitting libcalls to those rtlib functions, typically when
lowering @llvm.powi), the backend would always prepare the exponent
argument as an i32 which caused miscompiles when the rtlib was
compiled with 16-bit int.
The solution used here is to use an overloaded type for the second
argument in @llvm.powi. This way clang can use the "correct" type
when lowering __builtin_powi, and then later when emitting the libcall
it is assumed that the type used in @llvm.powi matches the rtlib
function.
One thing that needed some extra attention was that when vectorizing
calls several passes did not support that several arguments could
be overloaded in the intrinsics. This patch allows overload of a
scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with
an entry for powi.
Differential Revision: https://reviews.llvm.org/D99439
Since the opaque pointer type won't contain the pointee type, we need to
separately encode the value type for an atomicrmw.
Emit this new code for atomicrmw.
Handle this new code and the old one in the bitcode reader.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103123
We really ought to support no_sanitize("coverage") in line with other
sanitizers. This came up again in discussions on the Linux-kernel
mailing lists, because we currently do workarounds using objtool to
remove coverage instrumentation. Since that support is only on x86, to
continue support coverage instrumentation on other architectures, we
must support selectively disabling coverage instrumentation via function
attributes.
Unfortunately, for SanitizeCoverage, it has not been implemented as a
sanitizer via fsanitize= and associated options in Sanitizers.def, but
rolls its own option fsanitize-coverage. This meant that we never got
"automatic" no_sanitize attribute support.
Implement no_sanitize attribute support by special-casing the string
"coverage" in the NoSanitizeAttr implementation. To keep the feature as
unintrusive to existing IR generation as possible, define a new negative
function attribute NoSanitizeCoverage to propagate the information
through to the instrumentation pass.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=49035
Reviewed By: vitalybuka, morehouse
Differential Revision: https://reviews.llvm.org/D102772
Revert testcase changed in D87304 now the upgrader can correctly handle
the align attribute in upgrader.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102880
These checks already exist as asserts when creating the corresponding
instruction. Anybody creating these instructions already need to take
care to not break these checks.
Move the checks for success/failure ordering in cmpxchg from the
verifier to the LLParser and BitcodeReader plus an assert.
Add some tests for cmpxchg ordering. The .bc files are created from the
.ll files with an llvm-as with these checks disabled.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102803
This is a follow-up of D102201. After some discussion, it is a better idea
to upgrade all invalid uses of alignment attributes on function return
values and parameters, not just limited to void function return types.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102726
Swift's new concurrency features are going to require guaranteed tail calls so
that they don't consume excessive amounts of stack space. This would normally
mean "tailcc", but there are also Swift-specific ABI desires that don't
naturally go along with "tailcc" so this adds another calling convention that's
the combination of "swiftcc" and "tailcc".
Support is added for AArch64 and X86 for now.
This extends any frame record created in the function to include that
parameter, passed in X22.
The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001
in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect
of this is that tools walking the stack should expect to see one of three
values there:
* 0b0000 => a normal, non-extended record with just [FP, LR]
* 0b0001 => the extended record [X22, FP, LR]
* 0b1111 => kernel space, and a non-extended record.
All other values are currently reserved.
If compiling for arm64e this context pointer is address-discriminated with the
discriminator 0xc31a and the DB (process-specific) key.
There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing
front-ends access to this slot (and forcing its creation initialized to nullptr
if necessary).
Since D87304, `align` become an invalid attribute on none pointer types and
verifier will reject bitcode that has invalid `align` attribute.
The problem is before the change, DeadArgumentElimination can easily
turn a pointer return type into a void return type without removing
`align` attribute. Teach Autograde to remove invalid `align` attribute
from return types to maintain bitcode compatibility.
rdar://77022993
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102201
When auto-upgrade was replacing a call to a masked intrinsic, it would
not copy the metadata from the original call.
If an intrinsic had metadata, but did not need any updates, the metadata
would stay, but if an update was needed, the would end up being removed.
A similar effect could be observed with masked_expandload and
masked_compressstore, which at the moment are not handled by auto-upgrade:
the metadata remained untouched.
Differential Revision: https://reviews.llvm.org/D101201
When we pass a AArch64 Homogeneous Floating-Point
Aggregate (HFA) argument with increased alignment
requirements, for example
struct S {
__attribute__ ((__aligned__(16))) double v[4];
};
Clang uses `[4 x double]` for the parameter, which is passed
on the stack at alignment 8, whereas it should be at
alignment 16, following Rule C.4 in
AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)
Currently we don't have a way to express in LLVM IR the
alignment requirements of the function arguments. The align
attribute is applicable to pointers only, and only for some
special ways of passing arguments (e..g byval). When
implementing AAPCS32/AAPCS64, clang resorts to dubious hacks
of coercing to types, which naturally have the needed
alignment. We don't have enough types to cover all the
cases, though.
This patch introduces a new use of the stackalign attribute
to control stack slot alignment, when and if an argument is
passed in memory.
The attribute align is left as an optimizer hint - it still
applies to pointer types only and pertains to the content of
the pointer, whereas the alignment of the pointer itself is
determined by the stackalign attribute.
For byval arguments, the stackalign attribute assumes the
role, previously perfomed by align, falling back to align if
stackalign` is absent.
On the clang side, when passing arguments using the "direct"
style (cf. `ABIArgInfo::Kind`), now we can optionally
specify an alignment, which is emitted as the new
`stackalign` attribute.
Patch by Momchil Velikov and Lucas Prates.
Differential Revision: https://reviews.llvm.org/D98794
I think byval/sret and the others are close to being able to rip out
the code to support the missing type case. A lot of this code is
shared with inalloca, so catch this up to the others so that can
happen.
This attribute represents the minimum and maximum values vscale can
take. For now this attribute is not hooked up to anything during
codegen, this will be added in the future when such codegen is
considered stable.
Additionally hook up the -msve-vector-bits=<x> clang option to emit this
attribute.
Differential Revision: https://reviews.llvm.org/D98030
This patch adds support for intrinsic overloading on unnamed types.
This fixes PR38117 and PR48340 and will also be needed for the Full Restrict Patches (D68484).
The main problem is that the intrinsic overloading name mangling is using 's_s' for unnamed types.
This can result in identical intrinsic mangled names for different function prototypes.
This patch changes this by adding a '.XXXXX' to the intrinsic mangled name when at least one of the types is based on an unnamed type, ensuring that we get a unique name.
Implementation details:
- The mapping is created on demand and kept in Module.
- It also checks for existing clashes and recycles potentially existing prototypes and declarations.
- Because of extra data in Module, Intrinsic::getName needs an extra Module* argument and, for speed, an optional FunctionType* argument.
- I still kept the original two-argument 'Intrinsic::getName' around which keeps the original behavior (providing the base name).
-- Main reason is that I did not want to change the LLVMIntrinsicGetName version, as I don't know how acceptable such a change is
-- The current situation already has a limitation. So that should not get worse with this patch.
- Intrinsic::getDeclaration and the verifier are now using the new version.
Other notes:
- As far as I see, this should not suffer from stability issues. The count is only added for prototypes depending on at least one anonymous struct
- The initial count starts from 0 for each intrinsic mangled name.
- In case of name clashes, existing prototypes are remembered and reused when that makes sense.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D91250
explicitly emitting retainRV or claimRV calls in the IR
This reapplies ed4718eccb, which was reverted
because it was causing a miscompile. The bug that was causing the miscompile
has been fixed in 75805dce5f.
Original commit message:
Background:
This fixes a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.attachedcall" to calls,
which indicates the call is implicitly followed by a marker
instruction and an implicit retainRV/claimRV call that consumes the
call result. In addition, it emits a call to
@llvm.objc.clang.arc.noop.use, which consumes the call result, to
prevent the middle-end passes from changing the return type of the
called function. This is currently done only when the target is arm64
and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
claimRV is attached to the call since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since the ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if retainRV is attached to the call and
does nothing if claimRV is attached to it.
- SCCP refrains from replacing the return value of a call with a
constant value if the call has the operand bundle. This ensures the
call always has at least one user (the call to
@llvm.objc.clang.arc.noop.use).
- This patch also fixes a bug in replaceUsesOfNonProtoConstant where
multiple operand bundles of the same kind were being added to a call.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
This caused miscompiles of Chromium tests for iOS due clobbering of live
registers. See discussion on the code review for details.
> Background:
>
> This fixes a longstanding problem where llvm breaks ARC's autorelease
> optimization (see the link below) by separating calls from the marker
> instructions or retainRV/claimRV calls. The backend changes are in
> https://reviews.llvm.org/D92569.
>
> https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
>
> What this patch does to fix the problem:
>
> - The front-end adds operand bundle "clang.arc.attachedcall" to calls,
> which indicates the call is implicitly followed by a marker
> instruction and an implicit retainRV/claimRV call that consumes the
> call result. In addition, it emits a call to
> @llvm.objc.clang.arc.noop.use, which consumes the call result, to
> prevent the middle-end passes from changing the return type of the
> called function. This is currently done only when the target is arm64
> and the optimization level is higher than -O0.
>
> - ARC optimizer temporarily emits retainRV/claimRV calls after the calls
> with the operand bundle in the IR and removes the inserted calls after
> processing the function.
>
> - ARC contract pass emits retainRV/claimRV calls after the call with the
> operand bundle. It doesn't remove the operand bundle on the call since
> the backend needs it to emit the marker instruction. The retainRV and
> claimRV calls are emitted late in the pipeline to prevent optimization
> passes from transforming the IR in a way that makes it harder for the
> ARC middle-end passes to figure out the def-use relationship between
> the call and the retainRV/claimRV calls (which is the cause of
> PR31925).
>
> - The function inliner removes an autoreleaseRV call in the callee if
> nothing in the callee prevents it from being paired up with the
> retainRV/claimRV call in the caller. It then inserts a release call if
> claimRV is attached to the call since autoreleaseRV+claimRV is
> equivalent to a release. If it cannot find an autoreleaseRV call, it
> tries to transfer the operand bundle to a function call in the callee.
> This is important since the ARC optimizer can remove the autoreleaseRV
> returning the callee result, which makes it impossible to pair it up
> with the retainRV/claimRV call in the caller. If that fails, it simply
> emits a retain call in the IR if retainRV is attached to the call and
> does nothing if claimRV is attached to it.
>
> - SCCP refrains from replacing the return value of a call with a
> constant value if the call has the operand bundle. This ensures the
> call always has at least one user (the call to
> @llvm.objc.clang.arc.noop.use).
>
> - This patch also fixes a bug in replaceUsesOfNonProtoConstant where
> multiple operand bundles of the same kind were being added to a call.
>
> Future work:
>
> - Use the operand bundle on x86-64.
>
> - Fix the auto upgrader to convert call+retainRV/claimRV pairs into
> calls with the operand bundles.
>
> rdar://71443534
>
> Differential Revision: https://reviews.llvm.org/D92808
This reverts commit ed4718eccb.
The llvm.ptr.annotation and llvm.var.annotation intrinsics were changed
since the 11.0 release to add an additional parameter. This patch
auto-upgrades IR containing the four-parameter versions of these
intrinsics, adding a null pointer as the fifth argument.
Differential Revision: https://reviews.llvm.org/D95993
Perform DSOLocal propagation within summary list of every GV. This
avoids the repeated query of this information during function
importing.
Differential Revision: https://reviews.llvm.org/D96398
explicitly emitting retainRV or claimRV calls in the IR
Background:
This fixes a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.attachedcall" to calls,
which indicates the call is implicitly followed by a marker
instruction and an implicit retainRV/claimRV call that consumes the
call result. In addition, it emits a call to
@llvm.objc.clang.arc.noop.use, which consumes the call result, to
prevent the middle-end passes from changing the return type of the
called function. This is currently done only when the target is arm64
and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
claimRV is attached to the call since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since the ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if retainRV is attached to the call and
does nothing if claimRV is attached to it.
- SCCP refrains from replacing the return value of a call with a
constant value if the call has the operand bundle. This ensures the
call always has at least one user (the call to
@llvm.objc.clang.arc.noop.use).
- This patch also fixes a bug in replaceUsesOfNonProtoConstant where
multiple operand bundles of the same kind were being added to a call.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
This is a follow up patch to D83136 adding the align attribute to `cmpxchg`.
See also D83465 for `atomicrmw`.
Differential Revision: https://reviews.llvm.org/D87443
emitting retainRV or claimRV calls in the IR
This reapplies 3fe3946d9a without the
changes made to lib/IR/AutoUpgrade.cpp, which was violating layering.
Original commit message:
Background:
This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.rv" to calls, which
indicates the call is implicitly followed by a marker instruction and
an implicit retainRV/claimRV call that consumes the call result. In
addition, it emits a call to @llvm.objc.clang.arc.noop.use, which
consumes the call result, to prevent the middle-end passes from changing
the return type of the called function. This is currently done only when
the target is arm64 and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
the call is annotated with claimRV since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if the implicit call is a call to
retainRV and does nothing if it's a call to claimRV.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls annotated with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
emitting retainRV or claimRV calls in the IR
Background:
This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.rv" to calls, which
indicates the call is implicitly followed by a marker instruction and
an implicit retainRV/claimRV call that consumes the call result. In
addition, it emits a call to @llvm.objc.clang.arc.noop.use, which
consumes the call result, to prevent the middle-end passes from changing
the return type of the called function. This is currently done only when
the target is arm64 and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
the call is annotated with claimRV since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if the implicit call is a call to
retainRV and does nothing if it's a call to claimRV.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls annotated with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
Imported functions and variable get the visibility from the module supplying the
definition. However, non-imported definitions do not get the visibility from
(ELF) the most constraining visibility among all modules (Mach-O) the visibility
of the prevailing definition.
This patch
* adds visibility bits to GlobalValueSummary::GVFlags
* computes the result visibility and propagates it to all definitions
Protected/hidden can imply dso_local which can enable some optimizations (this
is stronger than GVFlags::DSOLocal because the implied dso_local can be
leveraged for ELF -shared while default visibility dso_local has to be cleared
for ELF -shared).
Note: we don't have summaries for declarations, so for ELF if a declaration has
the most constraining visibility, the result visibility may not be that one.
Differential Revision: https://reviews.llvm.org/D92900
The main change is to add a 'IsDecl' field to DIModule so
that when IsDecl is set to true, the debug info entry generated
for the module would be marked as a declaration. That way, the debugger
would look up the definition of the module in the gloabl scope.
Please see the comments in llvm/test/DebugInfo/X86/dimodule.ll
for what the debug info entries would look like.
Differential Revision: https://reviews.llvm.org/D93462
Clang FE currently has hot/cold function attribute. But we only have
cold function attribute in LLVM IR.
This patch adds support of hot function attribute to LLVM IR. This
attribute will be used in setting function section prefix/suffix.
Currently .hot and .unlikely suffix only are added in PGO (Sample PGO)
compilation (through isFunctionHotInCallGraph and
isFunctionColdInCallGraph).
This patch changes the behavior. The new behavior is:
(1) If the user annotates a function as hot or isFunctionHotInCallGraph
is true, this function will be marked as hot. Otherwise,
(2) If the user annotates a function as cold or
isFunctionColdInCallGraph is true, this function will be marked as
cold.
The changes are:
(1) user annotated function attribute will used in setting function
section prefix/suffix.
(2) hot attribute overwrites profile count based hotness.
(3) profile count based hotness overwrite user annotated cold attribute.
The intention for these changes is to provide the user a way to mark
certain function as hot in cases where training input is hard to cover
all the hot functions.
Differential Revision: https://reviews.llvm.org/D92493
Currently the backend special cases x86_intrcc and treats the first
parameter as byval. Make the IR require byval for this parameter to
remove this special case, and avoid the dependence on the pointee
element type.
Fixes bug 46672.
I'm not sure the IR is enforcing all the calling convention
constraints. clang seems to ignore the attribute for empty parameter
lists, but the IR tolerates it.
Define ConstantData::PoisonValue.
Add support for poison value to LLLexer/LLParser/BitcodeReader/BitcodeWriter.
Add support for poison value to llvm-c interface.
Add support for poison value to OCaml binding.
Add m_Poison in PatternMatch.
Differential Revision: https://reviews.llvm.org/D71126
Add a calling convention called amdgpu_gfx for real function calls
within graphics shaders. For the moment, this uses the same calling
convention as other calls in amdgpu, with registers excluded for return
address, stack pointer and stack buffer descriptor.
Differential Revision: https://reviews.llvm.org/D88540
This is needed to support fortran assumed rank arrays which
have runtime rank.
Summary:
Fortran assumed rank arrays have dynamic rank. DWARF TAG
DW_TAG_generic_subrange is needed to support that.
Testing:
unit test cases added (hand-written)
check llvm
check debug-info
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D89218
D70365 allows us to make attributes default. This is a follow up to
actually make nosync, nofree and willreturn default. The approach we
chose, for now, is to opt-in to default attributes to avoid introducing
problems to target specific intrinsics. Intrinsics with default
attributes can be created using `DefaultAttrsIntrinsic` class.
This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions with in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D85393
D18714 introduced writeonly attribute:
"Also start using the attribute for memset, memcpy, and memmove intrinsics,
and remove their special-casing in BasicAliasAnalysis."
But actually, writeonly was not attached to memmove - oversight, it seems.
So let's add it. As we can see, this helps DSE to eliminate redundant stores.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D89724
LLVM rejects DWARF operator DW_OP_over. This DWARF operator is needed
for Flang to support assumed rank array.
Summary:
Currently LLVM rejects DWARF operator DW_OP_over. Below error is
produced when llvm finds this operator.
[..]
invalid expression
!DIExpression(151, 20, 16, 48, 30, 35, 80, 34, 6)
warning: ignoring invalid debug info in over.ll
[..]
There were some parts missing in support of this operator, which are
now completed.
Testing
-added a unit testcase
-check-debuginfo
-check-llvm
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D89208
Happened to notice some of these printing as UnknownCode while running llvm-bcanalyzer on a bc file I had.
Differential Revision: https://reviews.llvm.org/D86900
This patch adds support for DWARF attribute DW_AT_rank.
Summary:
Fortran assumed rank arrays have dynamic rank. DWARF attribute
DW_AT_rank is needed to support that.
Testing:
unit test cases added (hand-written)
check llvm
check debug-info
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D89141
Alignment attributes need to be dropped for non-pointer values.
This also introduces a check into the verifier to ensure you don't use
`align` on anything but a pointer. Test needed to be adjusted
accordingly.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D87304
This reverts commit 55c4ff91bd.
Issues were introduced as discussed in https://reviews.llvm.org/D88241
where this change made previous bugs in the linker and BitCodeWriter
visible.
Make the corresponding change that was made for byval in
b7141207a4. Like byval, this requires a
bulk update of the test IR tests to include the type before this can
be mandatory.
This patch adjusts the following ARM/AArch64 LLVM IR intrinsics:
- neon_bfmmla
- neon_bfmlalb
- neon_bfmlalt
so that they take and return bf16 and float types. Previously these
intrinsics used <8 x i8> and <4 x i8> vectors (a rudiment from
implementation lacking bf16 IR type).
The neon_vbfdot[q] intrinsics are adjusted similarly. This change
required some additional selection patterns for vbfdot itself and
also for vector shuffles (in a previous patch) because of SelectionDAG
transformations kicking in and mangling the original code.
This patch makes the generated IR cleaner (less useless bitcasts are
produced), but it does not affect the final assembly.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D86146
As discussed in
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143801.html.
Currently no users outside of unit tests.
Replace all instances in tests of -constprop with -instsimplify.
Notable changes in tests:
* vscale.ll - @llvm.sadd.sat.nxv16i8 is evaluated by instsimplify, use a fake intrinsic instead
* InsertElement.ll - insertelement undef is removed by instsimplify in @insertelement_undef
llvm/test/Transforms/ConstProp moved to llvm/test/Transforms/InstSimplify/ConstProp
Reviewed By: lattner, nikic
Differential Revision: https://reviews.llvm.org/D85159
This avoid GUID lookup in Index.findSummaryInModule.
Follow up for D81242.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D85269
Summary:
The upgrading path from old ModuleFlag based linker options to the new
NamedMetadata based linker option in in materializeMetadata() which gets
called once for the module and once for every GV. The linker options are
getting dup'ed every time and it can create massive amount of the linker
options in the object file that gets created from old bitcode. Fix the
problem by checking if the new option exists or not before upgrade
again.
rdar://64543389
Reviewers: pcc, t.p.northover, dexonsmith, arphaman
Reviewed By: arphaman
Subscribers: hiraditya, jkorous, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83688
Summary:
If bitcode reader gets an invalid branch weight, drop that from the
inputs. This allows us to read the broken modules we generated before
the verifier was able to catch this.
rdar://64870641
Reviewers: yrouban, t.p.northover, dexonsmith, arphaman, aprantl
Reviewed By: aprantl
Subscribers: aprantl, hiraditya, jkorous, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83699
Summary:
This support is needed for the Fortran array variables with pointer/allocatable
attribute. This support enables debugger to identify the status of variable
whether that is currently allocated/associated.
for pointer array (before allocation/association)
without DW_AT_associated
(gdb) pt ptr
type = integer (140737345375288:140737354129776)
(gdb) p ptr
value requires 35017956 bytes, which is more than max-value-size
with DW_AT_associated
(gdb) pt ptr
type = integer (:)
(gdb) p ptr
$1 = <not associated>
for allocatable array (before allocation)
without DW_AT_allocated
(gdb) pt arr
type = integer (140737345375288:140737354129776)
(gdb) p arr
value requires 35017956 bytes, which is more than max-value-size
with DW_AT_allocated
(gdb) pt arr
type = integer, allocatable (:)
(gdb) p arr
$1 = <not allocated>
Testing
- unit test cases added
- check-llvm
- check-debuginfo
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D83544
This allows tracking the in-memory type of a pointer argument to a
function for ABI purposes. This is essentially a stripped down version
of byval to remove some of the stack-copy implications in its
definition.
This includes the base IR changes, and some tests for places where it
should be treated similarly to byval. Codegen support will be in a
future patch.
My original attempt at solving some of these problems was to repurpose
byval with a different address space from the stack. However, it is
technically permitted for the callee to introduce a write to the
argument, although nothing does this in reality. There is also talk of
removing and replacing the byval attribute, so a new attribute would
need to take its place anyway.
This is intended avoid some optimization issues with the current
handling of aggregate arguments, as well as fixes inflexibilty in how
frontends can specify the kernel ABI. The most honest representation
of the amdgpu_kernel convention is to expose all kernel arguments as
loads from constant memory. Today, these are raw, SSA Argument values
and codegen is responsible for turning these into loads.
Background:
There currently isn't a satisfactory way to represent how arguments
for the amdgpu_kernel calling convention are passed. In reality,
arguments are passed in a single, flat, constant memory buffer
implicitly passed to the function. It is also illegal to call this
function in the IR, and this is only ever invoked by a driver of some
kind.
It does not make sense to have a stack passed parameter in this
context as is implied by byval. It is never valid to write to the
kernel arguments, as this would corrupt the inputs seen by other
dispatches of the kernel. These argumets are also not in the same
address space as the stack, so a copy is needed to an alloca. From a
source C-like language, the kernel parameters are invisible.
Semantically, a copy is always required from the constant argument
memory to a mutable variable.
The current clang calling convention lowering emits raw values,
including aggregates into the function argument list, since using
byval would not make sense. This has some unfortunate consequences for
the optimizer. In the aggregate case, we end up with an aggregate
store to alloca, which both SROA and instcombine turn into a store of
each aggregate field. The optimizer never pieces this back together to
see that this is really just a copy from constant memory, so we end up
stuck with expensive stack usage.
This also means the backend dictates the alignment of arguments, and
arbitrarily picks the LLVM IR ABI type alignment. By allowing an
explicit alignment, frontends can make better decisions. For example,
there's real no advantage to an aligment higher than 4, so a frontend
could choose to compact the argument layout. Similarly, there is a
high penalty to using an alignment lower than 4, so a frontend could
opt into more padding for small arguments.
Another design consideration is when it is appropriate to expose the
fact that these arguments are all really passed in adjacent
memory. Currently we have a late IR optimization pass in codegen to
rewrite the kernel argument values into explicit loads to enable
vectorization. In most programs, unrelated argument loads can be
merged together. However, exposing this property directly from the
frontend has some disadvantages. We still need a way to track the
original argument sizes and alignments to report to the driver. I find
using some side-channel, metadata mechanism to track this
unappealing. If the kernel arguments were exposed as a single buffer
to begin with, alias analysis would be unaware that the padding bits
betewen arguments are meaningless. Another family of problems is there
are still some gaps in replacing all of the available parameter
attributes with metadata equivalents once lowered to loads.
The immediate plan is to start using this new attribute to handle all
aggregate argumets for kernels. Long term, it makes sense to migrate
all kernel arguments, including scalars, to be passed indirectly in
the same manner.
Additional context is in D79744.
This restores commit 80d0a137a5, and the
follow on fix in 873c0d0786, with a new
fix for test failures after a 2-stage clang bootstrap, and a more robust
fix for the Chromium build failure that an earlier version partially
fixed. See also discussion on D75201.
Reviewers: evgeny777
Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73242
The `noundef` attribute indicates an argument or return value which
may never have an undef value representation.
This patch allows LLVM to parse the attribute.
Differential Revision: https://reviews.llvm.org/D83412
The indexing was messed up, so the result was completely broken.
Shuffle constant exprs are rare in practice; without vscale types,
constant folding generally elminates them. So sort of hard to trip over.
Fixes regression from D72467.
(Recommitting after fix for memory leak.)
Differential Revision: https://reviews.llvm.org/D80330
The indexing was messed up, so the result was completely broken.
Shuffle constant exprs are rare in practice; without vscale types,
constant folding generally elminates them. So sort of hard to trip over.
Fixes regression from D72467.
Differential Revision: https://reviews.llvm.org/D80330
In preparation for a patch that will enforce new rules for the usage of
the strictfp attribute, this patch introduces auto-upgrade behavior that
will replace the strictfp attribute on callsites with nobuiltin if the
enclosing function declaration doesn't also have the strictfp attribute.
This auto-upgrade isn't being performed on .ll files because that would
prevent us from writing a test for the forthcoming verifier behavior.
Differential Revision: https://reviews.llvm.org/D70096
Summary:
This patch adds optional field into function summary,
implements asm and bitcode serialization. YAML
serialization is omitted and can be added later if
needed.
This patch includes this information into summary only
if module contains at least one sanitize_memtag function.
In a near future MTE is the user of the analysis.
Later if needed we can provede more direct control
on when information is included into summary.
Reviewers: eugenis
Subscribers: hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80908
Currently, gc.relocates are defined in terms of indices into the statepoint's operand list. Given the gc args are at the end of a variable length list of operands, this makes interpreting their indices by hand a tad challenging. We can simplify the statepoint sequence and improve readability quite a bit by pulling these new operands into their own named operand bundle.
This patch defines a new operand bundle tag "gc-live". The semantics of the bundle are the same as the existing gc arguments of a statepoint. This patch simply introduces the definition and codegen for the bundle, future patches will migrate RS4GC to emitting the new form.
Interestingly, with this done and the recent migration to using deopt and gc-transition bundles, we really don't have much left in the statepoint itself. It really looks like the existing ID and flags fields are redundant; we have (existing!) attributes for all of them. I think we'll be able to reduce the gc.statepoint signature to simply a wrapped call (e.g. actual target and actual arguments).
Differential Revision: https://reviews.llvm.org/D80937
Summary:
Count the per-module number of basic blocks when the module summary is computed
and sum them up during Thin LTO indexing.
This is used to estimate the working set size under the partial sample PGO.
This is split off of D79831.
Reviewers: davidxl, espindola
Subscribers: emaste, inglorion, hiraditya, MaskRay, steven_wu, dexonsmith, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80403
This patch upgrades DISubrange to support fortran requirements.
Summary:
Below are the updates/addition of fields.
lowerBound - Now accepts signed integer or DIVariable or DIExpression,
earlier it accepted only signed integer.
upperBound - This field is now added and accepts signed interger or
DIVariable or DIExpression.
stride - This field is now added and accepts signed interger or
DIVariable or DIExpression.
This is required to describe bounds of array which are known at runtime.
Testing:
unit test cases added (hand-written)
check clang
check llvm
check debug-info
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D80197
This is D77454, except for stores. All the infrastructure work was done
for loads, so the remaining changes necessary are relatively small.
Differential Revision: https://reviews.llvm.org/D79968
The "null-pointer-is-valid" attribute needs to be checked by many
pointer-related combines. To make the check more efficient, convert
it from a string into an enum attribute.
In the future, this attribute may be replaced with data layout
properties.
Differential Revision: https://reviews.llvm.org/D78862
This patch adds support for DWARF attribute DW_AT_data_location.
Summary:
Dynamic arrays in fortran are described by array descriptor and
data allocation address. Former is mapped to DW_AT_location and
later is mapped to DW_AT_data_location.
Testing:
unit test cases added (hand-written)
check llvm
check debug-info
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D79592
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.
The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.
Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.
This change had some sort of weird effects for certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.
Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.
Differential Revision: https://reviews.llvm.org/D78403
This patch extends DIModule Debug metadata in LLVM to support
Fortran modules. DIModule is extended to contain File and Line
fields, these fields will be used by Flang FE to create debug
information necessary for representing Fortran modules at IR level.
Furthermore DW_TAG_module is also extended to contain these fields.
If these fields are missing, debuggers like GDB won't be able to
show Fortran modules information correctly.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D79484
Add llvm.call.preallocated.{setup,arg} instrinsics.
Add "preallocated" operand bundle which takes a token produced by llvm.call.preallocated.setup.
Add "preallocated" parameter attribute, which is like byval but without the copy.
Verifier changes for these IR constructs.
See https://github.com/rnk/llvm-project/blob/call-setup-docs/llvm/docs/CallSetup.md
Subscribers: hiraditya, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74651
Summary:
The change is to fix conflict value for metadata "Objective-C Garbage Collection" in the mix of swift and Objective-C bitcode.
The purpose is to provide the support of LTO for swift and Objective-C mixed project.
Reviewers: rjmccall, ahatanak, steven_wu
Reviewed By: rjmccall, steven_wu
Subscribers: manmanren, mehdi_amini, hiraditya, dexonsmith, llvm-commits, jinlin
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71219
This reverts commit 80d0a137a5, and the
follow on fix in 873c0d0786. It is
causing test failures after a multi-stage clang bootstrap. See
discussion on D73242 and D75201.
in C++ templates."
This was reverted in 802b22b5c8 due to
missing .bc file and a chromium bot failure.
https://bugs.chromium.org/p/chromium/issues/detail?id=1057559#c1
This revision address both of them.
Summary:
This patch adds support for debuginfo generation for defaulted
parameters in clang and also extends corresponding DebugMetadata/IR to support this feature.
Reviewers: probinson, aprantl, dblaikie
Reviewed By: aprantl, dblaikie
Differential Revision: https://reviews.llvm.org/D73462
The Bitcode/DITemplateParameter-5.0.ll test is failing:
FAIL: LLVM :: Bitcode/DITemplateParameter-5.0.ll (5894 of 36324)
******************** TEST 'LLVM :: Bitcode/DITemplateParameter-5.0.ll' FAILED ********************
Script:
--
: 'RUN: at line 1'; /usr/local/google/home/thakis/src/llvm-project/out/gn/bin/llvm-dis -o - /usr/local/google/home/thakis/src/llvm-project/llvm/test/Bitcode/DITemplateParameter-5.0.ll.bc | /usr/local/google/home/thakis/src/llvm-project/out/gn/bin/FileCheck /usr/local/google/home/thakis/src/llvm-project/llvm/test/Bitcode/DITemplateParameter-5.0.ll
--
Exit Code: 2
Command Output (stderr):
--
It looks like the Bitcode/DITemplateParameter-5.0.ll.bc file was never checked in.
This reverts commit c2b437d53d.
in C++ templates.
Summary:
This patch adds support for debuginfo generation for defaulted
parameters in clang and also extends corresponding DebugMetadata/IR to support this feature.
Reviewers: probinson, aprantl, dblaikie
Reviewed By: aprantl, dblaikie
Differential Revision: https://reviews.llvm.org/D73462
This reverts commit 80a34ae311 with fixes.
Previously, since bots turning on EXPENSIVE_CHECKS are essentially turning on
MachineVerifierPass by default on X86 and the fact that
inline-asm-avx-v-constraint-32bit.ll and inline-asm-avx512vl-v-constraint-32bit.ll
are not expected to generate functioning machine code, this would go
down to `report_fatal_error` in MachineVerifierPass. Here passing
`-verify-machineinstrs=0` to make the intent explicit.
This reverts commit 80a34ae311 with fixes.
On bots llvm-clang-x86_64-expensive-checks-ubuntu and
llvm-clang-x86_64-expensive-checks-debian only,
llc returns 0 for these two tests unexpectedly. I tweaked the RUN line a little
bit in the hope that LIT is the culprit since this change is not in the
codepath these tests are testing.
llvm\test\CodeGen\X86\inline-asm-avx-v-constraint-32bit.ll
llvm\test\CodeGen\X86\inline-asm-avx512vl-v-constraint-32bit.ll
This reverts commit rGcd5b308b828e, rGcd5b308b828e, rG8cedf0e2994c.
There are issues to be investigated for polly bots and bots turning on
EXPENSIVE_CHECKS.
This restores commit 748bb5a0f1, along
with a fix for a Chromium test suite build issue (and a new test for
that case).
Differential Revision: https://reviews.llvm.org/D73242
Summary:
Currently type test assume sequences inserted for devirtualization are
removed during WPD. This patch delays their removal until later in the
optimization pipeline. This is an enabler for upcoming enhancements to
indirect call promotion, for example streamlined promotion guard
sequences that compare against vtable address instead of the target
function, when there are small number of possible vtables (either
determined via WPD or by in-progress type profiling). We need the type
tests to correlate the callsites with the address point offset needed in
the compare sequence, and optionally to associated type summary info
computed during WPD.
This depends on work in D71913 to enable invocation of LowerTypeTests to
drop type test assume sequences, which will now be invoked following ICP
in the ThinLTO post-LTO link pipelines, and also after the existing
export phase LowerTypeTests invocation in regular LTO (which is already
after ICP). We cannot simply move the existing import phase
LowerTypeTests pass later in the ThinLTO post link pipelines, as the
comment in PassBuilder.cpp notes (it must run early because when
performing CFI other passes may disturb the sequences it looks for).
This necessitated adding a new type test resolution "Unknown" that we
can use on the type test assume sequences previously removed by WPD,
that we now want LTT to ignore.
Depends on D71913.
Reviewers: pcc, evgeny777
Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73242
Pass the Scalability test to VectorType::get in order to be able to
deserialize bitcode that contains scalable vector operations
Differential Revision: https://reviews.llvm.org/D73144
Summary:
* Pass the Scalability test to VectorType::get in order to be
able to deserialize bitcode that contains scalable vector operations
Change-Id: I37fe5b1c0c237a9153130deefdc1a6d595c7f12e
Reviewers: efriedma, pcc, sdesmalen, apazos, huihuiz, chrisj
Reviewed By: sdesmalen
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72792
Summary:
This patch could be treated as a rebase of D33960. It also fixes PR35547.
A fix for `llvm/test/Other/close-stderr.ll` is proposed in D68164. Seems
the consensus is that the test is passing by chance and I'm not
sure how important it is for us. So it is removed like in D33960 for now.
The rest of the test fixes are just adding `--crash` flag to `not` tool.
** The reason it fixes PR35547 is
`exit` does cleanup including calling class destructor whereas `abort`
does not do any cleanup. In multithreading environment such as ThinLTO or JIT,
threads may share states which mostly are ManagedStatic<>. If faulting thread
tearing down a class when another thread is using it, there are chances of
memory corruption. This is bad 1. It will stop error reporting like pretty
stack printer; 2. The memory corruption is distracting and nondeterministic in
terms of error message, and corruption type (depending one the timing, it
could be double free, heap free after use, etc.).
Reviewers: rnk, chandlerc, zturner, sepavloff, MaskRay, espindola
Reviewed By: rnk, MaskRay
Subscribers: wuzish, jholewinski, qcolombet, dschuff, jyknight, emaste, sdardis, nemanjai, jvesely, nhaehnle, sbc100, arichardson, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, lenary, s.egerton, pzheng, cfe-commits, MaskRay, filcab, davide, MatzeB, mehdi_amini, hiraditya, steven_wu, dexonsmith, rupprecht, seiya, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D67847
This patch imports constant variables even when they can't be internalized
(which results in promotion). This offers some extra constant folding
opportunities.
Differential revision: https://reviews.llvm.org/D70404
We were upgrading it to faddp, but a version taking two type parameters instead
of one. This then got upgraded a second time to the version with just one
parameter, but occasionally (for reasons I don't understand) this unusual
two-stage process corrupted a use-list, leading to a crash when the two faddp
declarations didn't match.
D56351 (included in LLVM 8.0.0) introduced "frame-pointer". All tests
which use "no-frame-pointer-elim" or "no-frame-pointer-elim-non-leaf"
have been migrated to use "frame-pointer".
Implement UpgradeFramePointerAttributes to upgrade the two obsoleted
function attributes for bitcode. Their semantics are ignored.
Differential Revision: https://reviews.llvm.org/D71863
Summary:
This patch redefines freeze instruction from being UnaryOperator to a subclass of UnaryInstruction.
ConstantExpr freeze is removed, as discussed in the previous review.
FreezeOperator is not added because there's no ConstantExpr freeze.
`freeze i8* null` test is added to `test/Bindings/llvm-c/freeze.ll` as well, because the null pointer-related bug in `tools/llvm-c/echo.cpp` is now fixed.
InstVisitor has visitFreeze now because freeze is not unaryop anymore.
Reviewers: whitequark, deadalnix, craig.topper, jdoerfert, lebedev.ri
Reviewed By: craig.topper, lebedev.ri
Subscribers: regehr, nlopes, mehdi_amini, hiraditya, steven_wu, dexonsmith, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69932
Patch allows importing declarations of functions and variables, referenced
by the initializer of some other readonly variable.
Differential revision: https://reviews.llvm.org/D69561
Summary:
This extends the rules for when a call instruction is deemed to be an
FPMathOperator, which is based on the type of the call (i.e. the return
type of the function being called). Previously we only allowed
floating-point and vector-of-floating-point types. Now we also allow
arrays (nested to any depth) of floating-point and
vector-of-floating-point types.
This was motivated by llpc, the pipeline compiler for AMD GPUs
(https://github.com/GPUOpen-Drivers/llpc). llpc has many math library
functions that operate on vectors, typically represented as <4 x float>,
and some that operate on matrices, typically represented as
[4 x <4 x float>], and it's useful to be able to decorate calls to all
of them with fast math flags.
Reviewers: spatel, wristow, arsenm, hfinkel, aemerson, efriedma, cameron.mcinally, mcberg2017, jmolloy
Subscribers: wdng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69161
Summary:
A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64) or
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the `cfguard` CC1 option.
Reviewers: thakis, rnk, theraven, pcc
Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65761
new functions are compatible before upgrading a function call to an
intrinsic call.
Sometimes users insert calls to ARC runtime functions that are not
compatible with the corresponding intrinsic functions (for example,
'i8* @objc_storeStrong' instead of 'void @objc_storeStrong'). Don't
upgrade those calls.
rdar://problem/56447127
In particular, the function definition is not marked strictfp despite
containing a function marked strictfp. Also, if any function call is marked
strictfp then all function calls in that function must be marked.
This change to move the one strictfp call to a new properly marked function
meets all the new rules.
Tested with a stricter version of D68233.
Reviewed by: spatel
Approved by: spatel
Differential Revision: https://reviews.llvm.org/D68713
llvm-svn: 374186
The changes here are based on the corresponding diffs for allowing FMF on 'select':
D61917 <https://reviews.llvm.org/D61917>
As discussed there, we want to have fast-math-flags be a property of an FP value
because the alternative (having them on things like fcmp) leads to logical
inconsistency such as:
https://bugs.llvm.org/show_bug.cgi?id=38086
The earlier patch for select made almost no practical difference because most
unoptimized conditional code begins life as a phi (based on what I see in clang).
Similarly, I don't expect this patch to do much on its own either because
SimplifyCFG promptly drops the flags when converting to select on a minimal
example like:
https://bugs.llvm.org/show_bug.cgi?id=39535
But once we have this plumbing in place, we should be able to wire up the FMF
propagation and start solving cases like that.
The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a
regression in a LoopVectorize test. We are intersecting the FMF of any
FPMathOperator there, so if a phi is not properly annotated, new math
instructions may not be either. Once we fix the propagation in SimplifyCFG, it
may be safe to remove that hack.
Differential Revision: https://reviews.llvm.org/D67564
llvm-svn: 372878
The changes here are based on the corresponding diffs for allowing FMF on 'select':
D61917
As discussed there, we want to have fast-math-flags be a property of an FP value
because the alternative (having them on things like fcmp) leads to logical
inconsistency such as:
https://bugs.llvm.org/show_bug.cgi?id=38086
The earlier patch for select made almost no practical difference because most
unoptimized conditional code begins life as a phi (based on what I see in clang).
Similarly, I don't expect this patch to do much on its own either because
SimplifyCFG promptly drops the flags when converting to select on a minimal
example like:
https://bugs.llvm.org/show_bug.cgi?id=39535
But once we have this plumbing in place, we should be able to wire up the FMF
propagation and start solving cases like that.
The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a
regression in a LoopVectorize test. We are intersecting the FMF of any
FPMathOperator there, so if a phi is not properly annotated, new math
instructions may not be either. Once we fix the propagation in SimplifyCFG, it
may be safe to remove that hack.
Differential Revision: https://reviews.llvm.org/D67564
llvm-svn: 372866
Summary:
Add function to AutoUpgrade to change the datalayout of old X86 datalayout strings.
This adds "-p270:32:32-p271:32:32-p272:64:64" to X86 datalayouts that are otherwise valid
and don't already contain it.
This also removes the compatibility changes in https://reviews.llvm.org/D66843.
Datalayout change in https://reviews.llvm.org/D64931.
Reviewers: rnk, echristo
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67631
llvm-svn: 372267
We cannot create null constants for certain types, e.g. VoidTy,
FunctionTy or LabelTy. getNullValue asserts if we pass in an
unsupported type. We should also check for opaque types, but I'm not
sure how.
This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14795.
Reviewers: t.p.northover, jfb, vsk
Reviewed By: vsk
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65897
llvm-svn: 369557
Summary:
This is a tweak to r368311 and r368646 which auto upgrades the calls to
objc runtime functions to objc runtime intrinsics, in order to make sure
that the auto upgrader does not trigger with up-to-date bitcode.
It is possible for bitcode that is up-to-date to contain direct calls to
objc runtime function and those are not inserted by compiler as part of
ARC and they should not be upgraded. Now auto upgrader only triggers as
when the old style of ARC marker is used so it is guaranteed that it
won't trigger on update-to-date bitcode.
This also means it won't do this upgrade for bitcode from llvm-8 and
llvm-9, which preserves the behavior of those releases. Ideally they
should be upgraded as well but it is more important to make sure
AutoUpgrader will not trigger on up-to-date bitcode.
Reviewers: ahatanak, rjmccall, dexonsmith, pete
Reviewed By: dexonsmith
Subscribers: hiraditya, jkorous, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66153
llvm-svn: 368730
to intrinsic calls
This fixes a bug in r368311.
It turns out that the ARC runtime functions in the IR can have pointer
parameter types that are not i8* or i8**. Instead of RAUWing normal
functions with intrinsics, manually bitcast the arguments before passing
them to the intrinsic functions and bitcast the return value back to the
type of the original call instruction.
This recommits r368634, which was reverted in r368637. The loop in the
patch was iterating over uses of a function and deleting function calls
inside it, which caused bots to crash.
rdar://problem/54125406
Differential Revision: https://reviews.llvm.org/D66047
llvm-svn: 368646
to intrinsic calls
This fixes a bug in r368311.
It turns out that the ARC runtime functions in the IR can have pointer
parameter types that are not i8* or i8**. Instead of RAUWing normal
functions with intrinsics, manually bitcast the arguments before passing
them to the intrinsic functions and bitcast the return value back to the
type of the original call instruction.
rdar://problem/54125406
llvm-svn: 368634
the bitcode has the arm64 retainAutoreleasedReturnValue marker
The ARC middle-end passes stopped optimizing or transforming bitcode
that has been compiled with old compilers after we started emitting
calls to ARC runtime functions as intrinsic calls instead of normal
function calls in the front-end and made changes to teach the ARC
middle-end passes about those intrinsics (see r349534). This patch
converts calls to ARC runtime functions that are not intrinsic functions
to intrinsic function calls if the bitcode has the arm64
retainAutoreleasedReturnValue marker. Checking for the presence of the
marker is necessary to make sure we aren't changing ARC function calls
that were originally MRR message sends (see r349952).
rdar://problem/53280660
Differential Revision: https://reviews.llvm.org/D65902
llvm-svn: 368311
For consistency with normal instructions and clarity when reading IR,
it's best to print the %0, %1, ... names of function arguments in
definitions.
Also modifies the parser to accept IR in that form for obvious reasons.
llvm-svn: 367755
Add a new serializer, using a binary format based on the LLVM bitstream
format.
This format provides a way to serialize the remarks in two modes:
1) Separate mode: the metadata is separate from the remark entries.
2) Standalone mode: the metadata and the remark entries are in the same
file.
The format contains:
* a meta block: container version, container type, string table,
external file path, remark version
* a remark block: type, remark name, pass name, function name, debug
file, debug line, debug column, hotness, arguments (key, value, debug
file, debug line, debug column)
A string table is required for this format, which will be dumped in the
meta block to be consumed before parsing the remark blocks.
On clang itself, we noticed a size reduction of 13.4x compared to YAML,
and a compile-time reduction of between 1.7% and 3.5% on CTMark.
Differential Revision: https://reviews.llvm.org/D63466
Original llvm-svn: 367364
Revert llvm-svn: 367370
llvm-svn: 367372