To make uses of the deprecated constructor easier to spot, and to
ensure that no new uses are introduced, rename it to
Address::deprecated().
While doing the rename, I've filled in element types in cases
where it was relatively obvious, but we're still left with 135
calls to the deprecated constructor.
The "-fzero-call-used-regs" option tells the compiler to zero out
certain registers before the function returns. It's also available as a
function attribute: zero_call_used_regs.
The two upper categories are:
- "used": Zero out used registers.
- "all": Zero out all registers, whether used or not.
The individual options are:
- "skip": Don't zero out any registers. This is the default.
- "used": Zero out all used registers.
- "used-arg": Zero out used registers that are used for arguments.
- "used-gpr": Zero out used registers that are GPRs.
- "used-gpr-arg": Zero out used GPRs that are used as arguments.
- "all": Zero out all registers.
- "all-arg": Zero out all registers used for arguments.
- "all-gpr": Zero out all GPRs.
- "all-gpr-arg": Zero out all GPRs used for arguments.
This is used to help mitigate Return-Oriented Programming exploits.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110869
Following the discussion on D118229, this marks all pointer-typed
kernel arguments as having ABI alignment, per section 6.3.5 of
the OpenCL spec:
> For arguments to a __kernel function declared to be a pointer to
> a data type, the OpenCL compiler can assume that the pointee is
> always appropriately aligned as required by the data type.
Differential Revision: https://reviews.llvm.org/D118894
Take the following as an example
struct z {
z (*p)();
};
z f();
When we attempt to get the LLVM type of f, we recurse into z. z itself
has a function pointer with the same type as f. Given the recursion,
Clang simply treats z::p as a pointer to an empty struct `{}*`. The
LLVM type of f is as expected. So we have two different potential
LLVM types for a given Clang type. If we store one of those into the
cache, when we access the cache with a different context (e.g. we
are/aren't recursing on z) we may get an incorrect result. There is some
attempt to clear the cache in these cases, but it doesn't seem to handle
all cases.
This change makes it so we only use the cache when we are not in any
sort of function context, i.e. `noRecordsBeingLaidOut() &&
FunctionsBeingProcessed.empty()`, which are the cases where we may
decide to choose a different LLVM type for a given Clang type. LLVM
types for builtin types are never recursive so they're always ok.
This allows us to clear the type cache less often (as seen with the
removal of one of the calls to `TypeCache.clear()`). We
still need to clear it when we use a placeholder type then replace it
later with the final type and other dependent types need to be
recalculated.
I've added a check that the cached type matches what we compute. It
triggered in this test case without the fix. It's currently not
check-clang clean so it's not on by default for something like expensive
checks builds.
This change uncovered another issue where the LLVM types for an argument
and its local temporary don't match. For example in type-cache-3, when
expanding z::dc's argument into a temporary alloca, we ConvertType() the
type of z::p which is `void ({}*)*`, which doesn't match the alloca GEP
type of `{}*`.
No noticeable compile time changes:
https://llvm-compile-time-tracker.com/compare.php?from=3918dd6b8acf8c5886b9921138312d1c638b2937&to=50bdec9836ed40e38ece0657f3058e730adffc4c&stat=instructionsFixes#53465.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D118744
After fa87fa97fb, this was no longer guaranteed to be the cleanup
just added by this code, if IsEHCleanup got disabled. Instead, use
stable_begin(), which _is_ guaranteed to be the cleanup just added.
This caused a crash when a object that is callee destroyed (e.g. with the MS ABI) was passed in a call from a noexcept function.
Added a test to verify.
Fixes: fa87fa97fb
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
When adding new attributes, existing attributes are dropped. While
this appears to be a longstanding issue, this was highlighted by D105169
which dropped a lot of attributes due to adding the new noundef
attribute.
Ahmed Bougacha (@ab) tracked down the issue and provided the fix in
CGCall.cpp. I bundled it up and updated the tests.
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
Minor adjustment in order of noundef analysis to be a bit more optimal (when disabled).
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D117078
sret is special in that it does not use the memory type
representation. Manually construct the LValue using ConvertType
instead of ConvertTypeForMem here.
This fixes matrix-lowering-opt-levels.c on s390x.
This is the last cleanup step resulting from D115804 .
Now that clang uses intrinsics when we're in the special FP mode,
we don't need a function attribute as an indicator to the backend.
The LLVM part of the change is in D115885.
Differential Revision: https://reviews.llvm.org/D115886
This no longer allows creating an invalid Address through the regular
constructor. There were only two places that did this (AggValueSlot
and EHCleanupScope) which did this by converting a potential nullptr
into an Address. I've fixed both of these by directly storing an
Address instead.
This is intended as a bit of preliminary cleanup for D103465.
Differential Revision: https://reviews.llvm.org/D115630
WG14 adopted the _ExtInt feature from Clang for C23, but renamed the
type to be _BitInt. This patch does the vast majority of the work to
rename _ExtInt to _BitInt, which accounts for most of its size. The new
type is exposed in older C modes and all C++ modes as a conforming
extension. However, there are functional changes worth calling out:
* Deprecates _ExtInt with a fix-it to help users migrate to _BitInt.
* Updates the mangling for the type.
* Updates the documentation and adds a release note to warn users what
is going on.
* Adds new diagnostics for use of _BitInt to call out when it's used as
a Clang extension or as a pre-C23 compatibility concern.
* Adds new tests for the new diagnostic behaviors.
I want to call out the ABI break specifically. We do not believe that
this break will cause a significant imposition for early adopters of
the feature, and so this is being done as a full break. If it turns out
there are critical uses where recompilation is not an option for some
reason, we can consider using ABI tags to ease the transition.
This patch removes the assumption propagation that was added in D110655
primarily to get assumption informatino on opaque call sites for
optimizations. The analysis done in D111445 allows us to do this more
intelligently in the back-end.
Depends on D111445
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D111463
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169.
Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:
(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.
(2) The remaining tests are updated manually.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D108453
Resolve lit failures in clang after 8ca4b3e's land
Fix lit test failures in clang-ppc* and clang-x64-windows-msvc
Fix missing failures in clang-ppc64be* and retry fixing clang-x64-windows-msvc
Fix internal_clone(aarch64) inline assembly
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
Currently, there're multiple float types that can be represented by
__attribute__((mode(xx))). It's parsed, and then a corresponding type is
created if available.
This refactor moves the enum for mode into a global enum class visible
to ASTContext.
Reviewed By: aaron.ballman, erichkeane
Differential Revision: https://reviews.llvm.org/D111391
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
This patch adds OpenMP assumption attributes to call sites in applicable
regions. Currently this applies the caller's assumption attributes to
any calls contained within it. So, if a call occurs inside an OpenMP
assumes region to a function outside that region, we will assume that
call respects the assumptions. This is primarily useful for inline
assembly calls used heavily in the OpenMP GPU device runtime, which
allows us to then make judgements about what the ASM will do.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110655
Add support for the GNU C style __attribute__((error(""))) and
__attribute__((warning(""))). These attributes are meant to be put on
declarations of functions whom should not be called.
They are frequently used to provide compile time diagnostics similar to
_Static_assert, but which may rely on non-ICE conditions (ie. relying on
compiler optimizations). This is also similar to diagnose_if function
attribute, but can diagnose after optimizations have been run.
While users may instead simply call undefined functions in such cases to
get a linkage failure from the linker, these provide a much more
ergonomic and actionable diagnostic to users and do so at compile time
rather than at link time. Users instead may be able use inline asm .err
directives.
These are used throughout the Linux kernel in its implementation of
BUILD_BUG and BUILD_BUG_ON macros. These macros generally cannot be
converted to use _Static_assert because many of the parameters are not
ICEs. The Linux kernel still needs to be modified to make use of these
when building with Clang; I have a patch that does so I will send once
this feature is landed.
To do so, we create a new IR level Function attribute, "dontcall" (both
error and warning boil down to one IR Fn Attr). Then, similar to calls
to inline asm, we attach a !srcloc Metadata node to call sites of such
attributed callees.
The backend diagnoses these during instruction selection, while we still
know that a call is a call (vs say a JMP that's a tail call) in an arch
agnostic manner.
The frontend then reconstructs the SourceLocation from that Metadata,
and determines whether to emit an error or warning based on the callee's
attribute.
Link: https://bugs.llvm.org/show_bug.cgi?id=16428
Link: https://github.com/ClangBuiltLinux/linux/issues/1173
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106030
For fixed SVE types, predicates are represented using vectors of i8,
where as for scalable types they are represented using vectors of i1. We
can avoid going through memory for casts between these by bitcasting the
i1 scalable vectors to/from a scalable i8 vector of matching size, which
can then use the existing vector insert/extract logic.
Differential Revision: https://reviews.llvm.org/D106860
This change is intended as initial setup. The plan is to add
more semantic checks later. I plan to update the documentation
as more semantic checks are added (instead of documenting the
details up front). Most of the code closely mirrors that for
the Swift calling convention. Three places are marked as
[FIXME: swiftasynccc]; those will be addressed once the
corresponding convention is introduced in LLVM.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D95561
In preparation for dropping support for it. I've replaced it with
a proper type where the correct type was obvious and left an
explicit getPointerElementType() where it wasn't.
C++ constructors/destructors need to go through a different constructor to construct a GlobalDecl object in order to retrieve their linkage type. This causes an assert failure in the default constructor of GlobalDecl. I'm chaning it to using the exsiting GlobalDecl object.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102356
If the memory object is scalable type, we do not know the exact size of
it at compile time. Set the size of lifetime marker to unknown if the
object is scalable one.
Differential Revision: https://reviews.llvm.org/D102822
I wouldn't recommend writing code like the testcase; a function
parameter isn't atomic, so using an atomic type doesn't really make
sense. But it's valid, so clang shouldn't crash on it.
The code was assuming hasAggregateEvaluationKind(Ty) implies Ty is a
RecordType, which isn't true. Just use isRecordType() instead.
Differential Revision: https://reviews.llvm.org/D102015
The original change was reverted because it was discovered
that clang mishandles thunks, and they receive wrong
attributes for their this/return types - the ones for the function
they will call, not the ones they have.
While i have tried to fix this in https://reviews.llvm.org/D100388
that patch has been up and stuck for a month now,
with little signs of progress.
So while it will be good to solve this for real,
for now we can simply avoid introducing the bug,
by not annotating this/return for thunks.
This reverts commit 6270b3a1ea,
relanding 0aa0458f14.
As it was discovered in post-commit feedback
for 0aa0458f14,
we handle thunks incorrectly, and end up annotating
their this/return with attributes that are valid
for their callees, not for thunks themselves.
While it would be good to fix this properly,
and keep annotating them on thunks,
i've tried doing that in https://reviews.llvm.org/D100388
with little success, and the patch is stuck for a month now.
So for now, as a stopgap measure, subj.
LLVM should be smarter about *known* malloc's alignment and this knowledge may enable other optimizations.
Originally started as LLVM patch - https://reviews.llvm.org/D100862 but this logic should be really in Clang.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D100879
This is a Clang-only change and depends on the existing "musttail"
support already implemented in LLVM.
The [[clang::musttail]] attribute goes on a return statement, not
a function definition. There are several constraints that the user
must follow when using [[clang::musttail]], and these constraints
are verified by Sema.
Tail calls are supported on regular function calls, calls through a
function pointer, member function calls, and even pointer to member.
Future work would be to throw a warning if a users tries to pass
a pointer or reference to a local variable through a musttail call.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D99517
When we pass a AArch64 Homogeneous Floating-Point
Aggregate (HFA) argument with increased alignment
requirements, for example
struct S {
__attribute__ ((__aligned__(16))) double v[4];
};
Clang uses `[4 x double]` for the parameter, which is passed
on the stack at alignment 8, whereas it should be at
alignment 16, following Rule C.4 in
AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)
Currently we don't have a way to express in LLVM IR the
alignment requirements of the function arguments. The align
attribute is applicable to pointers only, and only for some
special ways of passing arguments (e..g byval). When
implementing AAPCS32/AAPCS64, clang resorts to dubious hacks
of coercing to types, which naturally have the needed
alignment. We don't have enough types to cover all the
cases, though.
This patch introduces a new use of the stackalign attribute
to control stack slot alignment, when and if an argument is
passed in memory.
The attribute align is left as an optimizer hint - it still
applies to pointer types only and pertains to the content of
the pointer, whereas the alignment of the pointer itself is
determined by the stackalign attribute.
For byval arguments, the stackalign attribute assumes the
role, previously perfomed by align, falling back to align if
stackalign` is absent.
On the clang side, when passing arguments using the "direct"
style (cf. `ABIArgInfo::Kind`), now we can optionally
specify an alignment, which is emitted as the new
`stackalign` attribute.
Patch by Momchil Velikov and Lucas Prates.
Differential Revision: https://reviews.llvm.org/D98794
As it is being noted in D99249, lack of alignment information on `this`
has been preventing LICM from happening.
For some time now, lack of alignment attribute does *not* imply
natural alignment, but an alignment of `1`.
Also, we used to treat dereferenceable as implying alignment,
but we no longer do, so it's a bugfix.
Differential Revision: https://reviews.llvm.org/D99790
I think byval/sret and the others are close to being able to rip out
the code to support the missing type case. A lot of this code is
shared with inalloca, so catch this up to the others so that can
happen.
These are incompatible with opaque pointers. This is in preparation
of dropping this API on the IRBuilder side as well.
Instead explicitly pass the loaded type.
It's available both in CodeGenOptions and in LangOptions, and LangOptions
implementation is slightly better as it uses a StringRef instead of a char
pointer, so use it.
Differential Revision: https://reviews.llvm.org/D98175
The option -funique-internal-linkage-names was added in D73307 and D78243 as a
LLVM early pass to insert a unique suffix to internal linkage functions and
vars. The unique suffix was the hash of the module path. However, we found
that this can be done more cleanly in clang early and the fixes that need to
be done later can be completely avoided. The fixes in particular are trying
to modify the DW_AT_linkage_name and finding the right place to insert the
pass.
This patch ressurects the original implementation proposed in D73307 which
was reviewed and then ditched in favor of the pass based approach.
Differential Revision: https://reviews.llvm.org/D96109
Use a WeakTrackingVH to cope with the stmt emission logic that cleans up
unreachable blocks. This invalidates the reference to the deferred
replacement placeholder. Cope with it.
Fixes PR25102 (from 2015!)
This change adds a new IR noundef attribute, which denotes when a function call argument or return val may never contain uninitialized bits.
In MemorySanitizer, this attribute enables optimizations which decrease instrumented code size by up to 17% (measured with an instrumented build of clang) . I'll introduce the change allowing msan to take advantage of this information in a separate patch.
Differential Revision: https://reviews.llvm.org/D81678
This takes advantage of the implicit default behavior to reduce the number of
attributes, which in turns reduces compilation time. I've observed -3% in
instruction count when compiling sqlite3 amalgamation with -O0
Differential Revision: https://reviews.llvm.org/D96400
Move nomerge attribute from function declaration/definition to callsites to
allow virtual function calls attach the attribute.
Differential Revision: https://reviews.llvm.org/D94537
VLST return values are coerced to VLATs in the function epilog for
consistency with the VLAT ABI. Previously, this coercion was done
through memory. It is preferable to use the
llvm.experimental.vector.insert intrinsic to avoid going through memory
here.
Reviewed By: c-rhodes
Differential Revision: https://reviews.llvm.org/D94290
VLST arguments are coerced to VLATs at the function boundary for
consistency with the VLAT ABI. They are then bitcast back to VLSTs in
the function prolog. Previously, this conversion is done through memory.
With the introduction of the llvm.vector.{insert,extract} intrinsic, we
can avoid going through memory here.
Depends on D92761
Differential Revision: https://reviews.llvm.org/D92762
Clang FE currently has hot/cold function attribute. But we only have
cold function attribute in LLVM IR.
This patch adds support of hot function attribute to LLVM IR. This
attribute will be used in setting function section prefix/suffix.
Currently .hot and .unlikely suffix only are added in PGO (Sample PGO)
compilation (through isFunctionHotInCallGraph and
isFunctionColdInCallGraph).
This patch changes the behavior. The new behavior is:
(1) If the user annotates a function as hot or isFunctionHotInCallGraph
is true, this function will be marked as hot. Otherwise,
(2) If the user annotates a function as cold or
isFunctionColdInCallGraph is true, this function will be marked as
cold.
The changes are:
(1) user annotated function attribute will used in setting function
section prefix/suffix.
(2) hot attribute overwrites profile count based hotness.
(3) profile count based hotness overwrite user annotated cold attribute.
The intention for these changes is to provide the user a way to mark
certain function as hot in cases where training input is hard to cover
all the hot functions.
Differential Revision: https://reviews.llvm.org/D92493
The `assume` attribute is a way to provide additional, arbitrary
information to the optimizer. For now, assumptions are restricted to
strings which will be accumulated for a function and emitted as comma
separated string function attribute. The key of the LLVM-IR function
attribute is `llvm.assume`. Similar to `llvm.assume` and
`__builtin_assume`, the `assume` attribute provides a user defined
assumption to the compiler.
A follow up patch will introduce an LLVM-core API to query the
assumptions attached to a function. We also expect to add more options,
e.g., expression arguments, to the `assume` attribute later on.
The `omp [begin] asssumes` pragma will leverage this attribute and
expose the functionality in the absence of OpenMP.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D91979
This patch adds the functionality to compare BFI counts with real
profile
counts right after reading the profile. It will print remarks under
-Rpass-analysis=pgo, or the internal option -pass-remarks-analysis=pgo.
Differential Revision: https://reviews.llvm.org/D91813
Swiftcall does it's own target-independent argument type classification,
since it is not designed to be ABI compatible with anything local on the
target that isn't LLVM-based. This means it never uses inalloca.
However, we have duplicate logic for checking for inalloca parameters
that runs before call argument setup. This logic needs to know ahead of
time if inalloca will be used later, and we can't move the
CGFunctionInfo calculation earlier.
This change gets the calling convention from either the
FunctionProtoType or ObjCMethodDecl, checks if it is swift, and if so
skips the stackbase setup.
Depends on D92883.
Differential Revision: https://reviews.llvm.org/D92944
This template exists to abstract over FunctionPrototype and
ObjCMethodDecl, which have similar APIs for storing parameter types. In
place of a template, use a PointerUnion with two cases to handle this.
Hopefully this improves readability, since the type of the prototype is
easier to discover. This allows me to sink this code, which is mostly
assertions, out of the header file and into the cpp file. I can also
simplify the overloaded methods for computing isGenericMethod, and get
rid of the second EmitCallArgs overload.
Differential Revision: https://reviews.llvm.org/D92883
After D17993, with -fno-delete-null-pointer-checks we add the dereferenceable attribute to the `this` pointer.
We have observed that one internal target which worked before fails even with -fno-delete-null-pointer-checks.
Switching to dereferenceable_or_null fixes the problem.
dereferenceable currently does not always respect NullPointerIsValid and may
imply nonnull and lead to aggressive optimization. The optimization may be
related to `CallBase::isReturnNonNull`, `Argument::hasNonNullAttr`, or
`Value::getPointerDereferenceableBytes`. See D66664 and D66618 for some discussions.
Reviewed By: bkramer, rsmith
Differential Revision: https://reviews.llvm.org/D92297
arguments.
* Adds 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments
* Gates 'nonnull' on -f(no-)delete-null-pointer-checks
* Introduces this-nonnull.cpp and microsoft-abi-this-nullable.cpp tests to
explicitly test the behavior of this change
* Refactors hundreds of over-constrained clang tests to permit these
attributes, where needed
* Updates Clang12 patch notes mentioning this change
Reviewed-by: rsmith, jdoerfert
Differential Revision: https://reviews.llvm.org/D17993
Followup to D85191.
This changes getTypeInfoInChars to return a TypeInfoChars
struct instead of a std::pair of CharUnits. This lets the
interface match getTypeInfo more closely.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D86447
- `-cl-fp32-correctly-rounded-divide-sqrt` is already handled in a
per-instruction manner by annotating the accuracy required. There's no
need to add that fn-attr. So far, there's no in-tree backend handling
that attr and that OpenCL specific option.
- In case that out-of-tree backends are broken, this change could be
reverted if those backends could not be fixed.
Differential Revision: https://reviews.llvm.org/D88424
This reverts commit 55c4ff91bd.
Issues were introduced as discussed in https://reviews.llvm.org/D88241
where this change made previous bugs in the linker and BitCodeWriter
visible.
Extend -fsanitize=nullability-arg to handle call sites which accept C++
member pointers.
rdar://62476022
Differential Revision: https://reviews.llvm.org/D88336
- `-cl-fp32-correctly-rounded-divide-sqrt` is an OpenCL-specific option
and `correctly-rounded-divide-sqrt-fp-math` should be added for OpenCL
at most.
Differential revision: https://reviews.llvm.org/D88303
Make the corresponding change that was made for byval in
b7141207a4. Like byval, this requires a
bulk update of the test IR tests to include the type before this can
be mandatory.
After the recent discussion on cfe-dev 'Can indirect class parameters be
noalias?' [1], it seems like using using noalias is problematic for
current C++, but should be allowed for C-only code.
This patch introduces a new option to let the user indicate that it is
safe to mark indirect class parameters as noalias. Note that this also
applies to external callers, e.g. it might not be safe to use this flag
for C functions that are called by C++ functions.
In targets that allocate indirect arguments in the called function, this
enables more agressive optimizations with respect to memory operations
and brings a ~1% - 2% codesize reduction for some programs.
[1] : http://lists.llvm.org/pipermail/cfe-dev/2020-July/066353.html
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D85473
This relands D85743 with a fix for test
CodeGen/attr-arm-sve-vector-bits-call.c that disables the new pass
manager with '-fno-experimental-new-pass-manager'. Test was failing due
to IR differences with the new pass manager which broke the Fuchsia
builder [1]. Reverted in 2e7041f.
[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375
Original summary:
This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.
VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:
* Implicit casting between VLA <-> VLS types.
* Coercion of VLS types in function args/return.
* Mangling of VLS types.
Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.
Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.
The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:
__SVE_VLS<typename, unsigned>
where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:
#if __ARM_FEATURE_SVE_BITS==512
// Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
// Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
#endif
The latest ACLE specification (00bet5) does not contain details of this
mangling scheme, it will be specified in the next revision. The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.
[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D85743