Removed AArch64 usage of the getMaxVScale interface, replacing it with
the vscale_range(min, max) IR Attribute.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D106277
In LLVM IR terms the ACLE type 'data512_t' is essentially an aggregate
type { [8 x i64] }. When emitting code for inline assembly operands,
clang tries to scalarize aggregate types to an integer of the equivalent
length, otherwise it passes them by-reference. This patch adds a target
hook to tell whether a given inline assembly operand is scalarizable
so that clang can emit code to pass/return it by-value.
Differential Revision: https://reviews.llvm.org/D94098
This change is intended as initial setup. The plan is to add
more semantic checks later. I plan to update the documentation
as more semantic checks are added (instead of documenting the
details up front). Most of the code closely mirrors that for
the Swift calling convention. Three places are marked as
[FIXME: swiftasynccc]; those will be addressed once the
corresponding convention is introduced in LLVM.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D95561
This patch contains a couple of minor corrections to my previous
crypto patch:
Since both AArch32 and AArch64 are now correctly setting the aes and
sha2 features individually, it is not necessary to continue to check
the crypto feature when defining feature macros.
In the AArch32 driver, the feature vector is only modified when the
crypto feature is actually in the vector. If crypto is not present,
there is no need to split it and explicitly define crypto/sha2/aes.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D102406
This patch changes the AArch32 crypto instructions (sha2 and aes) to
require the specific sha2 or aes features. These features have
already been implemented and can be controlled through the command
line, but do not have the expected result (i.e. `+noaes` will not
disable aes instructions). The crypto feature retains its existing
meaning of both sha2 and aes.
Several small changes are included due to the knock-on effect this has:
- The AArch32 driver has been modified to ensure sha2/aes is correctly
set based on arch/cpu/fpu selection and feature ordering.
- Crypto extensions are permitted for AArch32 v8-R profile, but not
enabled by default.
- ACLE feature macros have been updated with the fine grained crypto
algorithms. These are also used by AArch64.
- Various tests updated due to the change in feature lists and macros.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D99079
Reverts parts of https://reviews.llvm.org/D17183, but keeps the
resetDataLayout() API and adds an assert that checks that datalayout string and
user label prefix are in sync.
Approach 1 in https://reviews.llvm.org/D17183#2653279
Reduces number of TUs build for 'clang-format' from 689 to 575.
I also implemented approach 2 in D100764. If someone feels motivated
to make us use DataLayout more, it's easy to revert this change here
and go with D100764 instead. I don't plan on doing more work in this
area though, so I prefer going with the smaller, more self-consistent change.
Differential Revision: https://reviews.llvm.org/D100776
This patch implements the __rndr and __rndrrs intrinsics to provide access to the random
number instructions introduced in Armv8.5-A. They are only defined for the AArch64
execution state and are available when __ARM_FEATURE_RNG is defined.
These intrinsics store the random number in their pointer argument and return a status
code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter
intrinsic reseeds the random number generator.
The instructions write the NZCV flags indicating the success of the operation that we can
then read with a CSET.
[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
[2] https://bugs.llvm.org/show_bug.cgi?id=47838
Differential Revision: https://reviews.llvm.org/D98264
Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc
This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes
atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`,
and `__arm_st64bv0`. These are selected into the LS64 instructions
LD64B, ST64B, ST64BV and ST64BV0, respectively.
Based on patches written by Simon Tatham.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D93232
This introduces command-line support for the 'armv8.7-a' architecture name
(and an alias without the '-', as usual), and for the 'ls64' extension name.
Based on patches written by Simon Tatham.
Reviewed By: ostannard
Differential Revision: https://reviews.llvm.org/D91776
This also teaches MachO writers/readers about the MachO cpu subtype,
beyond the minimal subtype reader support present at the moment.
This also defines a preprocessor macro to allow users to distinguish
__arm64__ from __arm64e__.
arm64e defaults to an "apple-a12" CPU, which supports v8.3a, allowing
pointer-authentication codegen.
It also currently defaults to ios14 and macos11.
Differential Revision: https://reviews.llvm.org/D87095
shouldRTTIBeUnique() returns false for iOS64CXXABI, which causes
RTTI objects to be emitted hidden. Update two tests that didn't
expect this to happen for the default triple.
Also rename iOS64CXXABI to AppleARM64CXXABI, since it's used for
arm64-apple-macos triples too.
Part of PR46644.
Differential Revision: https://reviews.llvm.org/D91904
This patch implements the definition of __ARM_FEATURE_ATOMICS and fixes the
missing definition of __ARM_FEATURE_CRC32 for Armv8.1-A.
Differential Revision: https://reviews.llvm.org/D91438
The macro is emitted when wargeting SVE code generation with the additional command line option `-msve-vector-bits=<N>`.
The behavior implied by the macro is described in sections "3.7.3.3. Behavior specific to SVE vectors" of the SVE ACLE (Version 00bet6) that can be found at https://developer.arm.com/documentation/100987/latest
Reviewed By: rengolin, rsandifo-arm
Differential Revision: https://reviews.llvm.org/D90956
This adds support for -mcpu=cortex-r82. Some more information about this
core can be found here:
https://www.arm.com/products/silicon-ip-cpu/cortex-r/cortex-r82
One note about the system register: that is a bit of a refactoring because of
small differences between v8.4-A AArch64 and v8-R AArch64.
This is based on patches from Mark Murray and Mikhail Maltsev.
Differential Revision: https://reviews.llvm.org/D88660
The __ARM_FEATURE_SVE_BITS feature macro is specified in the Arm C
Language Extensions (ACLE) for SVE [1] (version 00bet5). From the spec,
where __ARM_FEATURE_SVE_BITS==N:
When N is nonzero, indicates that the implementation is generating
code for an N-bit SVE target and that the arm_sve_vector_bits(N)
attribute is available.
This was defined in D83550 as __ARM_FEATURE_SVE_BITS_EXPERIMENTAL and
enabled under the -msve-vector-bits flag to simplify initial tests.
This patch drops _EXPERIMENTAL now there is support for the feature.
[1] https://developer.arm.com/documentation/100987/latest
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D86720
Summary:
This patch implements parsing support for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE, version 00bet5,
section 3.7.3) for SVE [1].
The purpose of this attribute is to define fixed-length (VLST) versions
of existing sizeless types (VLAT). For example:
#if __ARM_FEATURE_SVE_BITS==512
typedef svint32_t fixed_svint32_t __attribute__((arm_sve_vector_bits(512)));
#endif
Creates a type 'fixed_svint32_t' that is a fixed-length version of
'svint32_t' that is normal-sized (rather than sizeless) and contains
exactly 512 bits. Unlike 'svint32_t', this type can be used in places
such as structs and arrays where sizeless types can't.
Implemented in this patch is the following:
* Defined and tested attribute taking single argument.
* Checks the argument is an integer constant expression.
* Attribute can only be attached to a single SVE vector or predicate
type, excluding tuple types such as svint32x4_t.
* Added the `-msve-vector-bits=<bits>` flag. When specified the
`__ARM_FEATURE_SVE_BITS__EXPERIMENTAL` macro is defined.
* Added a language option to store the vector size specified by the
`-msve-vector-bits=<bits>` flag. This is used to validate `N ==
__ARM_FEATURE_SVE_BITS`, where N is the number of bits passed to the
attribute and `__ARM_FEATURE_SVE_BITS` is the feature macro defined under
the same flag.
The `__ARM_FEATURE_SVE_BITS` macro will be made non-experimental in the final
patch of the series.
[1] https://developer.arm.com/documentation/100987/latest
This is patch 1/4 of a patch series.
Reviewers: sdesmalen, rsandifo-arm, efriedma, ctetreau, cameron.mcinally, rengolin, aaron.ballman
Reviewed By: sdesmalen, aaron.ballman
Differential Revision: https://reviews.llvm.org/D83550
Summary:
The following feature macros have been added:
__ARM_FEATURE_SVE_BF16
__ARM_FEATURE_SVE_MATMUL_INT8
__ARM_FEATURE_SVE_MATMUL_FP32
__ARM_FEATURE_SVE_MATMUL_FP64
The driver has been updated to enable them accordingly to the value of
the target feature passed at command line.
The SVE ACLE tests using the macros have been modified to work with
the target feature instead of passing the macro at command line.
Reviewers: sdesmalen, efriedma, c-rhodes, kmclaughlin, SjoerdMeijer, rengolin
Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82623
This patch enables the following macros when their corresponding
target attributes are set:
__ARM_FEATURE_SVE (+sve)
__ARM_FEATURE_SVE2 (+sve2)
__ARM_FEATURE_SVE2_AES (+sve2-aes)
__ARM_FEATURE_SVE2_BITPERM (+sve2-bitperm)
__ARM_FEATURE_SVE2_SHA3 (+sve2-sha3)
__ARM_FEATURE_SVE2_SM4 (+sve2-sm4)
This implies that the base SVE and SVE2 ACLE (00bet2) are now feature
complete, meaning that all intrinsics are implemented in LLVM and Clang.
Disclaimer:
To implement the ACLE we have had to fix up many parts of LLVM to make it
support scalable vectors. We have also used many target-specific intrinsics
to reduce reliance on parts of LLVM where we know scalable vectors may
not yet be handled properly (e.g. some transformation might drop the
'scalable' flag on a vector type). While we've done a best effort with
the limited testing that is available to us, we're still working to improve the
stability of the implementation. Additionally, Clang may print warnings
that code may have miscompiled. We find this often to be a false alarm
where the wrong interfaces have been used in LLVM and where resulting
code is not actually incorrect. However, this warrants a bug report
and investigation. If you find any bugs or issues, please raise them on
bugs.llvm.org and let us know!
Reviewers: rengolin, efriedma, david-arm, SjoerdMeijer
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D81725
Summary:
This patch upstreams support for a new storage only bfloat16 C type.
This type is used to implement primitive support for bfloat16 data, in
line with the Bfloat16 extension of the Armv8.6-a architecture, as
detailed here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:
https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
In detail this patch:
- introduces an opaque, storage-only C-type __bf16, which introduces a new bfloat IR type.
This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.
The following people contributed to this patch:
- Luke Cheeseman
- Momchil Velikov
- Alexandros Lamprineas
- Luke Geeson
- Simon Tatham
- Ties Stuij
Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, fpetrogalli
Reviewed By: SjoerdMeijer
Subscribers: labrinea, majnemer, asmith, dexonsmith, kristof.beyls, arphaman, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76077
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch64 only (no SVE or Neon)
- Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions (No bfloat16 matrix multiplication)
No IR types or C Types are needed for this extension.
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: ostannard, t.p.northover, rengolin, kmclaughlin
Reviewed By: kmclaughlin
Subscribers: kmclaughlin, kristof.beyls, hiraditya, danielkiss,
cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77871
Summary:
As defined by Arm C Language Extensions (ACLE) these macro defines
should be set to specific values depending on -mbranch-protection.
Reviewers: chill
Reviewed By: chill
Subscribers: danielkiss, cfe-commits, kristof.beyls
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77134
Summary:
This patch introduces command-line support for the Armv8.6-a architecture and assembly support for BFloat16. Details can be found
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
in addition to the GCC patch for the 8..6-a CLI:
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg02647.html
In detail this patch
- march options for armv8.6-a
- BFloat16 assembly
This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.
Based on work by:
- labrinea
- MarkMurrayARM
- Luke Cheeseman
- Javed Asbar
- Mikhail Maltsev
- Luke Geeson
Reviewers: SjoerdMeijer, craig.topper, rjmccall, jfb, LukeGeeson
Reviewed By: SjoerdMeijer
Subscribers: stuij, kristof.beyls, hiraditya, dexonsmith, danielkiss, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D76062
Reworked the patch to avoid sharing a header (SVETypeFlags.h) between
include/clang/Basic and utils/TableGen/SveEmitter.cpp. Now the patch
generates the enum/flags which is included in TargetBuiltins.h.
Also renamed one of the SveEmitter options to be in line with MVE.
Summary:
This is a first patch in a series for the SveEmitter to generate the arm_sve.h
header file and builtins.
I've tried my best to strip down this patch as best as I could, but there
are still a few changes that are not necessarily exercised by the load intrinsics
in this patch, mostly around the SVEType class which has some common logic to
represent types from a type and prototype string. I thought it didn't make
much sense to remove that from this patch and split it up.
Summary:
Adds the constraints described below to ensure that we
can tie variables of SVE ACLE types to operands in inline-asm:
- y: SVE registers Z0-Z7
- Upl: One of the low eight SVE predicate registers (P0-P7)
- Upa: Full range of SVE predicate registers (P0-P15)
Reviewers: sdesmalen, huntergr, rovka, cameron.mcinally, efriedma, rengolin
Reviewed By: efriedma
Subscribers: miyuki, tschuett, rkruppe, psnobl, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75690
This is a first patch in a series for the SveEmitter to generate the arm_sve.h
header file and builtins.
I've tried my best to strip down this patch as best as I could, but there
are still a few changes that are not necessarily exercised by the load intrinsics
in this patch, mostly around the SVEType class which has some common logic to
represent types from a type and prototype string. I thought it didn't make
much sense to remove that from this patch and split it up.
Reviewers: efriedma, rovka, SjoerdMeijer, rsandifo-arm, rengolin
Reviewed By: SjoerdMeijer
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75470
Make Clang on aarch64 targets predefine `__AARCH64_CMODEL_SMALL__`
or `__AARCH64_CMODEL_TINY__`, etc. These are the names that GCC
uses for its predefines.
Reviewed By: tamur, MaskRay
Differential Revision: https://reviews.llvm.org/D75002
Summary:
Add support for vcadd_* family of intrinsics. This set of intrinsics is
available in Armv8.3-A.
The fp16 versions require the FP16 extension, which has been available
(opt-in) since Armv8.2-A.
Reviewers: t.p.northover
Reviewed By: t.p.northover
Subscribers: t.p.northover, kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70862
This patch adds the SVE built-in types defined by the Procedure Call
Standard for the Arm Architecture:
https://developer.arm.com/docs/100986/0000
It handles the types in all relevant places that deal with built-in types.
At the moment, some of these places bail out with an error, including:
(1) trying to generate LLVM IR for the types
(2) trying to generate debug info for the types
(3) trying to mangle the types using the Microsoft C++ ABI
(4) trying to @encode the types in Objective C
(1) and (2) are fixed by follow-on patches but (unlike this patch)
they deal mostly with target-specific LLVM details, so seemed like
a logically separate change. There is currently no spec for (3) and
(4), so reporting an error seems like the correct behaviour for now.
The intention is that the types will become sizeless types:
http://lists.llvm.org/pipermail/cfe-dev/2019-June/062523.html
The main purpose of the sizeless type extension is to diagnose
impossible or dangerous uses of the types, such as any that would
require sizeof to have a meaningful defined value.
Until then, the patch sets the alignments of the types to the values
specified in the link above. It also sets the sizes of the types to
zero, which is chosen to be consistently wrong and shouldn't affect
correctly-written code (i.e. code that would compile even with the
sizeless type extension).
The patch adds the common subset of functionality needed to test the
sizeless type extension on the one hand and to provide SVE intrinsic
functions on the other. After this patch, the two pieces of work are
essentially independent.
The patch is based on one by Graham Hunter:
https://reviews.llvm.org/D59245
Differential Revision: https://reviews.llvm.org/D62960
llvm-svn: 368413
The Arm C Language Extensions for SVE document specifies that
__ARM_FEATURE_SVE should be set when the compiler supports SVE and
implements all the extensions described in the document.
This is currently not yet the case, so the feature should be disabled
until the compiler can provide all the extensions as described.
Reviewers: c-rhodes, rengolin, rovka, ktkachov
Reviewed By: rengolin
Differential Revision: https://reviews.llvm.org/D65404
llvm-svn: 367301
Adds the SVE vector and predicate registers to the list of known registers.
Patch by Kerry McLaughlin.
Reviewers: erichkeane, sdesmalen, rengolin
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D64739
llvm-svn: 366878
The jcvt intrinsic defined in ACLE [1] is available when ARM_FEATURE_JCVT is defined.
This change introduces the AArch64 intrinsic, wires it up to the instruction and a new clang builtin function.
The __ARM_FEATURE_JCVT macro is now defined when an Armv8.3-A or higher target is used.
I've implemented the target detection logic in Clang so that this feature is enabled for architectures from armv8.3-a onwards (so -march=armv8.4-a also enables this, for example).
make check-all didn't show any new failures.
[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
Differential Revision: https://reviews.llvm.org/D64495
llvm-svn: 366197
These macro definitions don't depend on the template parameter, so they
don't need to be part of the template. Move them to a .cpp file.
llvm-svn: 365556
According to alignment section in below ARM64 ABI document, MSVC could increase
alignment of global data based on its total size. Clang doesn't do this. Compile
the same symbol into different alignments by Clang and MSVC could cause link
error because some instruction encodings, like 64-bit LDR/STR with immediate,
require the target to be 8 bytes aligned, and linker could choose code stream
with such LDR/STR instruction from MSVC and 4 bytes aligned data from Clang into
final image, which actually cannot be linked together
(see https://bugs.llvm.org/show_bug.cgi?id=41506 for more details).
https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#alignment
Differential Revision: https://reviews.llvm.org/D61225
llvm-svn: 359744
This provides intrinsics support for Memory Tagging Extension (MTE),
which was introduced with the Armv8.5-a architecture.
These intrinsics are available when __ARM_FEATURE_MEMORY_TAGGING is defined.
Each intrinsic is described in detail in the ACLE Q1 2019 documentation:
https://developer.arm.com/docs/101028/latest
Reviewed By: Tim Nortover, David Spickett
Differential Revision: https://reviews.llvm.org/D60485
llvm-svn: 359348
As Discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2019-January/129543.html
There are problems exposing the _Float16 type on architectures that
haven't defined the ABI/ISel for the type yet, so we're temporarily
disabling the type and making it opt-in.
Differential Revision: https://reviews.llvm.org/D57188
Change-Id: I5db7366dedf1deb9485adb8948b1deb7e612a736
llvm-svn: 352221