We do not want to define __PRIVILEGED__. There is no use case for the
definition and gcc does not define it. This patch removes that definition.
Reviewed By: lei, NeHuang
Differential Revision: https://reviews.llvm.org/D107461
%%%
This patch defines the macro __HOS_AIX__ when the target is AIX and without any dependency on the host. The macro indicates that the host is AIX. Defining the macro will help minimize porting pain for existing code compiled with xlc/xlC. xlC never shipped cross-compiling support, so the difference is not observable anyway.
%%%
This is a follow up to the discussion in https://reviews.llvm.org/D107242.
Reviewed By: cebowleratibm, joerg
Differential Revision: https://reviews.llvm.org/D107825
This patch allows target specific addr space in target builtins for HIP. It inserts implicit addr
space cast for non-generic pointer to generic pointer in general, and inserts implicit addr
space cast for generic to non-generic for target builtin arguments only.
It is NFC for non-HIP languages.
Differential Revision: https://reviews.llvm.org/D102405
%%%
This patch defines __HOS_AIX__ macro for AIX in case of a cross compiler implementation.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107242
%%%
This patch defines the macro __THW_PPC__ for AIX.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107243
%%%
This patch defines the macro __THW_BIG_ENDIAN__ for AIX.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107241
This matches the behavior of GCC.
Patch does not change remapping logic itself, so adding one simple smoke test should be enough.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107393
In LLVM IR terms the ACLE type 'data512_t' is essentially an aggregate
type { [8 x i64] }. When emitting code for inline assembly operands,
clang tries to scalarize aggregate types to an integer of the equivalent
length, otherwise it passes them by-reference. This patch adds a target
hook to tell whether a given inline assembly operand is scalarizable
so that clang can emit code to pass/return it by-value.
Differential Revision: https://reviews.llvm.org/D94098
'pipe' keyword is introduced in OpenCL C 2.0: so do checks for OpenCL C version while
parsing and then later on check for language options to construct actual pipe. This feature
requires support of __opencl_c_generic_address_space, so diagnostics for that is provided as well.
This is the same patch as in D106748 but with a tiny fix in checking of diagnostic messages.
Also added tests when program scope global variables are not supported.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D107154
'pipe' keyword is introduced in OpenCL C 2.0: so do checks for OpenCL C version while
parsing and then later on check for language options to construct actual pipe. This feature
requires support of __opencl_c_generic_address_space, so diagnostics for that is provided as well.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D106748
This feature requires support of __opencl_c_images, so diagnostics for that is provided as well.
Also, ensure that cl_khr_3d_image_writes feature macro is set to the same value.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D106260
The Intel compiler ICC supports the option "-fp-model=(source|double|extended)"
which causes the compiler to use a wider type for intermediate floating point
calculations. Also supported is a way to embed this effect in the source
program with #pragma float_control(source|double|extended).
This patch extends pragma float_control syntax, and also adds support
for a new floating point option "-ffp-eval-method=(source|double|extended)".
source: intermediate results use source precision
double: intermediate results use double precision
extended: intermediate results use extended precision
Reviewed By: Aaron Ballman
Differential Revision: https://reviews.llvm.org/D93769
Remove overriding MinGlobalAlign to 0 for z/OS target to be consistent with SystemZ.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D106890
This patch adds support for the next-generation arch14
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Detection of arch14 as host processor.
- Assembler/disassembler support for new instructions.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10304.
Note: No currently available Z system supports the arch14
architecture. Once new systems become available, the
official system name will be added as supported -march name.
Reland of 31859f896.
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D104797
This patch defines the macro __LONGDOUBLE64 for AIX when long double is 8 bytes.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D105477
This commit adds driver support for the Mac Catalyst target,
as supported by the Apple clang compile
Differential Revision: https://reviews.llvm.org/D105960
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtin and intrinsic for "__stbcx".
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D106484
This commit adds supports for clang to remap macOS availability attributes that have introduced,
deprecated or obsoleted versions to appropriate Mac Catalyst availability attributes. This
mapping is done using the version mapping provided in the macOS SDK, in the SDKSettings.json file.
The mappings in the SDKSettings json file will also be used in the clang driver for the driver
Mac Catalyst patch, and they could also be used in the future for other platforms as well.
Differential Revision: https://reviews.llvm.org/D105257
This patch is in a series of patches to provide
builtins for compatibility with the XL compiler.
This patch adds builtins related to floating point
operations
Reviewed By: #powerpc, nemanjai, amyk, NeHuang
Differential Revision: https://reviews.llvm.org/D103986
This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.
NFC: this patch introduces typedefs for the integer type used by
SourceLocation and makes all the boring changes to use the typedefs
everywhere, but for the moment, they are unconditionally defined to
uint32_t.
Patch originally by Mikhail Maltsev.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D105492
Implemented builtins for mtmsr, mfspr, mtspr on PowerPC;
the patch is intended for XL Compatibility.
Differential revision: https://reviews.llvm.org/D106130
This patch implements store, load, move from and to registers related
builtins, as well as the builtin for stfiw. The patch aims to provide
feature parady with xlC on AIX.
Differential revision: https://reviews.llvm.org/D105946
The Intel compiler ICC supports the option "-fp-model=(source|double|extended)"
which causes the compiler to use a wider type for intermediate floating point
calculations. Also supported is a way to embed this effect in the source
program with #pragma float_control(source|double|extended).
This patch extends pragma float_control syntax, and also adds support
for a new floating point option "-ffp-eval-method=(source|double|extended)".
source: intermediate results use source precision
double: intermediate results use double precision
extended: intermediate results use extended precision
Reviewed By: Aaron Ballman
Differential Revision: https://reviews.llvm.org/D93769
This commit adds support for Mac Catalyst availability attribute, as
supported by the Apple clang compiler. A follow-up commit will provide
additional support for inferring Mac Catalyst availability from macOS
availability using the mapping in the SDKSettings.json.
Differential Revision: https://reviews.llvm.org/D105052
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch add the builtin and emit target independent
code for __cmpb.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D105194
Added a number of different builtins that exist in the XL compiler. Most of
these builtins already exist in clang under a different name.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D104386
This patch is in a series of patches to provide builtins for
compatibility with the XL compiler. This patch adds software divide
builtins with no checking. These builtins are each emitted as a fast
fdiv.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D106150
Use _Float16 as the half-precision floating point type. Define a new
type specifier 'x' for the _Float16 type.
Differential Revision: https://reviews.llvm.org/D105001
Implement a subset of builtins required for compatiblilty with AIX XL compiler.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D105930
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and instrisics for population
count, reversed load and store related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D106021
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and emit target independent
code for rotate related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D104744
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and instrisics for compare
and multiply related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D102875
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.
Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>
Differential revision: https://reviews.llvm.org/D105501
Similar to D46745, "S" represents an absolute symbolic operand, which
can be used to specify the access models, e.g.
extern int var;
void *addr_via_asm() {
void *ret;
asm("lui %0, %%hi(%1)\naddi %0,%0,%%lo(%1)" : "=r"(ret) : "S"(&var));
return ret;
}
'S' is documented in trunk GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101275
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D105254
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.
Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>
Differential revision: https://reviews.llvm.org/D105501
This feature requires support of __opencl_c_images, so diagnostics for that is provided as well
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D104915
This patch implements trap and FP to and from double conversions. The builtins
generate code that mirror what is generated from the XL compiler. Intrinsics
are named conventionally with builtin_ppc, but are aliased to provide the same
builtin names as the XL compiler.
Differential Revision: https://reviews.llvm.org/D103668
OpenMP 5.1 added support for writing OpenMP directives using [[]]
syntax in addition to using #pragma and this introduces support for the
new syntax.
In OpenMP, the attributes take one of two forms:
[[omp::directive(...)]] or [[omp::sequence(...)]]. A directive
attribute contains an OpenMP directive clause that is identical to the
analogous #pragma syntax. A sequence attribute can contain either
sequence or directive arguments and is used to ensure that the
attributes are processed sequentially for situations where the order of
the attributes matter (remember:
https://eel.is/c++draft/dcl.attr.grammar#4.sentence-4).
The approach taken here is somewhat novel and deserves mention. We
could refactor much of the OpenMP parsing logic to work for either
pragma annotation tokens or for attribute clauses. It would be a fair
amount of effort to share the logic for both, but it's certainly
doable. However, the semantic attribute system is not designed to
handle the arbitrarily complex arguments that OpenMP directives
contain. Adding support to thread the novel parsed information until we
can produce a semantic attribute would be considerably more effort.
What's more, existing OpenMP constructs are not (often) represented as
semantic attributes. So doing this through Attr.td would be a massive
undertaking that would likely only benefit OpenMP and comes with
additional risks. Rather than walk down that path, I am taking
advantage of the fact that the syntax of the directives within the
directive clause is identical to that of the #pragma form. Once the
parser recognizes that we're processing an OpenMP attribute, it caches
all of the directive argument tokens and then replays them as though
the user wrote a pragma. This reuses the same OpenMP parsing and
semantic logic directly, but does come with a risk if the OpenMP
committee decides to purposefully diverge their pragma and attribute
syntaxes. So, despite this being a novel approach that does token
replay, I think it's actually a better approach than trying to do this
through the declarative syntax in Attr.td.
This change is intended as initial setup. The plan is to add
more semantic checks later. I plan to update the documentation
as more semantic checks are added (instead of documenting the
details up front). Most of the code closely mirrors that for
the Swift calling convention. Three places are marked as
[FIXME: swiftasynccc]; those will be addressed once the
corresponding convention is introduced in LLVM.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D95561
C++23 will make these conversions ambiguous - so fix them to make the
codebase forward-compatible with C++23 (& a follow-up change I've made
will make this ambiguous/invalid even in <C++23 so we don't regress
this & it generally improves the code anyway)
%%%
Transfer the predefined macro, __TOS_AIX__, from the AIX XL C/C++ compilers.
__TOS_AIX__ indicates that the target operating system is AIX.
%%%
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D103587
This patch implaments the load and reserve and store conditional
builtins for the PowerPC target, in order to have feature parody with
xlC on AIX.
Differential revision: https://reviews.llvm.org/D105236
This means `max_align_t` is 8 bytes which also sets the alignment
malloc. Since this is technically and ABI breaking change we have
limited to just the emscripten OS target. It is also relatively low
import breakage since it will only effect the alignement of struct that
contai `long double`s (extremerly rare I imagine).
Emscripten's malloc implementation already use 8 byte alignement
(dlmalloc uses and alignement of 2*sizeof(void*) == 8 rather than
checking max_align_t) so will not be effected by this change. By
bringing the ABI in line with the current malloc code this will fix
several issue we have seen in the wild.
See: https://github.com/emscripten-core/emscripten/pull/14456
Differential Revision: https://reviews.llvm.org/D104808
Reland of 31859f896.
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.
Differential Revision: https://reviews.llvm.org/D104797
This patch adds a new clang builtin, __arithmetic_fence. The purpose of the
builtin is to provide the user fine control, at the expression level, over
floating point optimization when -ffast-math (-ffp-model=fast) is enabled.
The builtin prevents the optimizer from rearranging floating point expression
evaluation. The new option fprotect-parens has the same effect on
parenthesized expressions, forcing the optimizer to respect the parentheses.
Reviewed By: aaron.ballman, kpn
Differential Revision: https://reviews.llvm.org/D100118
This patch adds a new clang builtin, __arithmetic_fence. The purpose of the
builtin is to provide the user fine control, at the expression level, over
floating point optimization when -ffast-math (-ffp-model=fast) is enabled.
The builtin prevents the optimizer from rearranging floating point expression
evaluation. The new option fprotect-parens has the same effect on
parenthesized expressions, forcing the optimizer to respect the parentheses.
Reviewed By: aaron.ballman, kpn
Differential Revision: https://reviews.llvm.org/D100118
This is mostly a mechanical change, but a testcase that contains
parts of the StringRef class (clang/test/Analysis/llvm-conventions.cpp)
isn't touched.
Moving the definition of the defineXLCompatMacros function from
the header file clang/lib/Basic/Targets/PPC.h to the source file
clang/lib/Basic/Targets/PPC.cpp.
Differential revision: https://reviews.llvm.org/D104125
Also:
- add driver test (fsanitize-use-after-return.c)
- add basic IR test (asan-use-after-return.cpp)
- (NFC) cleaned up logic for generating table of __asan_stack_malloc
depending on flag.
for issue: https://github.com/google/sanitizers/issues/1394
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D104076
Implementation of the unroll directive introduced in OpenMP 5.1. Follows the approach from D76342 for the tile directive (i.e. AST-based, not using the OpenMPIRBuilder). Tries to use `llvm.loop.unroll.*` metadata where possible, but has to fall back to an AST representation of the outer loop if the partially unrolled generated loop is associated with another directive (because it needs to compute the number of iterations).
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D99459
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D95425
The MathExtras.h header is included purely for the countPopulation() method - by moving this into Sanitizers.cpp we can remove the use of this costly header.
We only ever use isPowerOf2() / countPopulation() inside asserts so this shouldn't have any performance effects on production code.
Differential Revision: https://reviews.llvm.org/D103953
So far, support for x86_64-linux-gnux32 has been handled by explicit
comparisons of Triple.getEnvironment() to GNUX32. This worked as long as
x86_64-linux-gnux32 was the only X32 environment to worry about, but we
now have x86_64-linux-muslx32 as well. To support this, this change adds
an isX32() function and uses it. It replaces all checks for GNUX32 or
MuslX32 by isX32(), except for the following:
- Triple::isGNUEnvironment() and Triple::isMusl() are supposed to treat
GNUX32 and MuslX32 differently.
- computeTargetTriple() needs to be able to transform triples to add or
remove X32 from the environment and needs to map GNU to GNUX32, and
Musl to MuslX32.
- getMultiarchTriple() completely lacks any Musl support and retains the
explicit check for GNUX32 as it can only return x86_64-linux-gnux32.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D103777
Extend debug info handling by adding DWARF address space mapping for
SPIR, with corresponding test case.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D103097
This is the first in a series of patches to provide builtins for
compatibility with the XL compiler. Most of the builtins already had
intrinsics and only needed to be implemented in the front end.
Intrinsics were created for the three iospace builtins, eieio, and icbt.
Pseudo instructions were created for eieio and iospace_eieio to
ensure that nops were inserted before the eieio instruction.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D102443
As discussed in PR50385, strict-fp on PowerPC SPE has not been handled
well. This patch disables it by default for SPE.
Reviewed By: nemanjai, vit9696, jhibbits
Differential Revision: https://reviews.llvm.org/D103235
The original version of this was reverted, and @rjmcall provided some
advice to architect a new solution. This is that solution.
This implements a builtin to provide a unique name that is stable across
compilations of this TU for the purposes of implementing the library
component of the unnamed kernel feature of SYCL. It does this by
running the Itanium mangler with a few modifications.
Because it is somewhat common to wrap non-kernel-related lambdas in
macros that aren't present on the device (such as for logging), this
uniquely generates an ID for all lambdas involved in the naming of a
kernel. It uses the lambda-mangling number to do this, except replaces
this with its own number (starting at 10000 for readabililty reasons)
for lambdas used to name a kernel.
Additionally, this implements itself as constexpr with a slight catch:
if a name would be invalidated by the use of this lambda in a later
kernel invocation, it is diagnosed as an error (see the Sema tests).
Differential Revision: https://reviews.llvm.org/D103112
WG14 adopted N2645 and WG21 EWG has accepted P2334 in principle (still
subject to full EWG vote + CWG review + plenary vote), which add
support for #elifdef as shorthand for #elif defined and #elifndef as
shorthand for #elif !defined. This patch adds support for the new
preprocessor directives.
GCC allows each target to define a set of non-letter and non-digit
escaped characters for inline assembly that will be replaced by another
string (They call this "punctuation" characters. The existing "%%" and
"%{" -- replaced by '%' and '{' at the end -- can be seen as special
cases shared by all targets).
This patch implements this feature by adding a new hook in `TargetInfo`.
Differential Revision: https://reviews.llvm.org/D103036
Allow use of bit-fields as a clang extension
in OpenCL. The extension can be enabled using
pragma directives.
This fixes PR45339!
Differential Revision: https://reviews.llvm.org/D101843
There already exists cl_khr_fp64 extension. So OpenCL C 3.0
and higher should use the feature, earlier versions still
use the extension. OpenCL C 3.0 API spec states that extension
will be not described in the option string if corresponding
optional functionality is not supported (see 4.2. Querying Devices).
Due to that fact the usage of features for OpenCL C 3.0 must
be as follows:
```
$ clang -Xclang -cl-ext=+cl_khr_fp64,+__opencl_c_fp64 ...
$ clang -Xclang -cl-ext=-cl_khr_fp64,-__opencl_c_fp64 ...
```
e.g. the feature and the equivalent extension (if exists)
must be set to the same values
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D96524
This patch adds supports for inline assembly operands and some simple
operand constraints, including register and constant operands.
Differential Revision: https://reviews.llvm.org/D102585
This patch enables explicitly building inferred modules.
Effectively a cherry-pick of https://github.com/apple/llvm-project/pull/699 authored by @Bigcheese with libclang and dependency scanner changes omitted.
Contains the following changes:
1. [Clang] Fix the header paths in clang::Module for inferred modules.
* The UmbrellaAsWritten and NameAsWritten fields in clang::Module are a lie for framework modules. For those they actually are the path to the header or umbrella relative to the clang::Module::Directory.
* The exception to this case is for inferred modules. Here it actually is the name as written, because we print out the module and read it back in when implicitly building modules. This causes a problem when explicitly building an inferred module, as we skip the printing out step.
* In order to fix this issue this patch adds a new field for the path we want to use in getInputBufferForModule. It also makes NameAsWritten actually be the name written in the module map file (or that would be, in the case of an inferred module).
2. [Clang] Allow explicitly building an inferred module.
* Building the actual module still fails, but make sure it fails for the right reason.
Split from D100934.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102491
This patch contains a couple of minor corrections to my previous
crypto patch:
Since both AArch32 and AArch64 are now correctly setting the aes and
sha2 features individually, it is not necessary to continue to check
the crypto feature when defining feature macros.
In the AArch32 driver, the feature vector is only modified when the
crypto feature is actually in the vector. If crypto is not present,
there is no need to split it and explicitly define crypto/sha2/aes.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D102406
I've taken the following steps to add unwinding support from inline assembly:
1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax:
```
invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"()
to label %exit unwind label %uexit
```
2.) Add Bitcode writing/reading support + LLVM-IR parsing.
3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled.
4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind.
5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled.
6.) Don't allow unwinding callbr.
Reviewed By: Amanieu
Differential Revision: https://reviews.llvm.org/D95745
Add user-facing front end option to turn off power10 prefixed instructions.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D102191
This commit brought build break in some f128 related tests. But that's
not the root cause. There exists some differences between Clang and
GCC's definition for 128-bit float types on PPC, so macros/functions in
glibc may not work with clang -mfloat128 well. We need to handle this
carefully and reland it.
This patch adds support for WebAssembly globals in LLVM IR, representing
them as pointers to global values, in a non-default, non-integral
address space. Instruction selection legalizes loads and stores to
these pointers to new WebAssemblyISD nodes GLOBAL_GET and GLOBAL_SET.
Once the lowering creates the new nodes, tablegen pattern matches those
and converts them to Wasm global.get/set of the appropriate type.
Based on work by Paulo Matos in https://reviews.llvm.org/D95425.
Reviewed By: pmatos
Differential Revision: https://reviews.llvm.org/D101608
Added __cl_clang_non_portable_kernel_param_types extension that
allows using non-portable types as kernel parameters. This allows
bypassing the portability guarantees from the restrictions specified
in C++ for OpenCL v1.0 s2.4.
Currently this only disables the restrictions related to the data
layout. The programmer should ensure the compiler generates the same
layout for host and device or otherwise the argument should only be
accessed on the device side. This extension could be extended to other
case (e.g. permitting size_t) if desired in the future.
Patch by olestrohm (Ole Strohm)!
https://reviews.llvm.org/D101168
Warn when a declaration uses an identifier that doesn't obey the reserved
identifier rule from C and/or C++.
Differential Revision: https://reviews.llvm.org/D93095
Reduces numbers of files built for clang-format from 575 to 449.
Requires two small changes:
1. Don't use llvm::ExceptionHandling in LangOptions. This isn't
even quite the right type since we don't use all of its values.
Tweaks the changes made in:
- https://reviews.llvm.org/D93215
- https://reviews.llvm.org/D93216
2. Move section name validation code added (long ago) in commit 30ba67439 out
of libBasic into Sema and base the check on the triple. This is a bit less
OOP-y, but completely in line with what we do in many other places in Sema.
No behavior change.
Differential Revision: https://reviews.llvm.org/D101463
This patch changes the AArch32 crypto instructions (sha2 and aes) to
require the specific sha2 or aes features. These features have
already been implemented and can be controlled through the command
line, but do not have the expected result (i.e. `+noaes` will not
disable aes instructions). The crypto feature retains its existing
meaning of both sha2 and aes.
Several small changes are included due to the knock-on effect this has:
- The AArch32 driver has been modified to ensure sha2/aes is correctly
set based on arch/cpu/fpu selection and feature ordering.
- Crypto extensions are permitted for AArch32 v8-R profile, but not
enabled by default.
- ACLE feature macros have been updated with the fine grained crypto
algorithms. These are also used by AArch64.
- Various tests updated due to the change in feature lists and macros.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D99079
Language options are not available when a target is being created,
thus, a new method is introduced. Also, some refactoring is done,
such as removing OpenCL feature macros setting from TargetInfo.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D101087
Reverts parts of https://reviews.llvm.org/D17183, but keeps the
resetDataLayout() API and adds an assert that checks that datalayout string and
user label prefix are in sync.
Approach 1 in https://reviews.llvm.org/D17183#2653279
Reduces number of TUs build for 'clang-format' from 689 to 575.
I also implemented approach 2 in D100764. If someone feels motivated
to make us use DataLayout more, it's easy to revert this change here
and go with D100764 instead. I don't plan on doing more work in this
area though, so I prefer going with the smaller, more self-consistent change.
Differential Revision: https://reviews.llvm.org/D100776
Commit e3d8ee35e4 ("reland "[DebugInfo] Support to emit debugInfo
for extern variables"") added support to emit debugInfo for
extern variables if requested by the target. Currently, only
BPF target enables this feature by default.
As BPF ecosystem grows, callback function started to get
support, e.g., recently bpf_for_each_map_elem() is introduced
(https://lwn.net/Articles/846504/) with a callback function as an
argument. In the future we may have something like below as
a demonstration of use case :
extern int do_work(int);
long bpf_helper(void *callback_fn, void *callback_ctx, ...);
long prog_main() {
struct { ... } ctx = { ... };
return bpf_helper(&do_work, &ctx, ...);
}
Basically bpf helper may have a callback function and the
callback function is defined in another file or in the kernel.
In this case, we would like to know the debuginfo types for
do_work(), so the verifier can proper verify the safety of
bpf_helper() call.
For the following example,
extern int do_work(int);
long bpf_helper(void *callback_fn);
long prog() {
return bpf_helper(&do_work);
}
Currently, there is no debuginfo generated for extern function do_work().
In the IR, we have,
...
define dso_local i64 @prog() local_unnamed_addr #0 !dbg !7 {
entry:
%call = tail call i64 @bpf_helper(i8* bitcast (i32 (i32)* @do_work to i8*)) #2, !dbg !11
ret i64 %call, !dbg !12
}
...
declare dso_local i32 @do_work(i32) #1
...
This patch added support for the above callback function use case, and
the generated IR looks like below:
...
declare !dbg !17 dso_local i32 @do_work(i32) #1
...
!17 = !DISubprogram(name: "do_work", scope: !1, file: !1, line: 1, type: !18, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, retainedNodes: !2)
!18 = !DISubroutineType(types: !19)
!19 = !{!20, !20}
!20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
The TargetInfo.allowDebugInfoForExternalVar is renamed to
TargetInfo.allowDebugInfoForExternalRef as now it guards
both extern variable and extern function debuginfo generation.
Differential Revision: https://reviews.llvm.org/D100567
The headers shipped with the XMOS XCore compiler expect __xcore__ to be defined.
The __XS1B__ macro, already defined, is for the default subtarget.
No other targets affected.
Default address space (applies when no explicit address space was
specified) maps to generic (4) address space.
Added SYCL named address spaces `sycl_global`, `sycl_local` and
`sycl_private` defined as sub-sets of the default address space.
Static variables without address space now reside in global address
space when compile for SPIR target, unless they have an explicit address
space qualifier in source code.
Differential Revision: https://reviews.llvm.org/D89909
Clang only defines __VFP_FP__ when the FPU is enabled. However, gcc
defines it unconditionally.
This patch aligns Clang with gcc.
Reviewed By: peter.smith, rengolin
Differential Revision: https://reviews.llvm.org/D100372
The `ppc32` cpu model was introduced a while ago in a9321059b9 as an independent copy of the `ppc` one but was never wired into clang.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D100933
Clang _requires_ every target to provide a va_list kind so we shouldn't
put a llvm_unreachable there. Using `VoidPtrBuiltinVaList` because m68k
doesn't have any special ABI for variadic args.
After https://reviews.llvm.org/D90484 libclang is unable to read a serialized diagnostic file
which contains a diagnostic which came from a file with an empty filename. The reason being is
that the serialized diagnostic reader is creating a virtual file for the "" filename, which now
fails after the changes in https://reviews.llvm.org/D90484. This patch restores the previous
behavior in getVirtualFileRef by allowing it to construct a file entry ref with an empty name by
pretending its name is "." so that the directory entry can be created.
Differential Revision: https://reviews.llvm.org/D100428
This patch fixes an issue with the SVE prefetch and qinc/qdec intrinsics
that take an `enum` argument, but where the builtin prototype encodes
these as `int`. Some code in SemaDecl found the mismatch and chose
to forget about the builtin altogether, which meant that any future
code using that builtin would fail. The code that forgets about the
builtin was actually obsolete after D77491 and should have been removed.
This patch now removes that code.
This patch also fixes another issue with the SVE prefetch intrinsic
when built with C++, where the builtin didn't accept the correct
pointer type, which should be `const void *`.
Reviewed By: tambre
Differential Revision: https://reviews.llvm.org/D100046
The backend can't handle this and will throw a fatal error from
type legalization. It's easy enough to fix that for this intrinsic
by just splitting the IR intrinsic since it works on individual bytes.
There will be other intrinsics in the future that would be harder
to support through splitting, for example grev, gorc, and shfl. Those
would require a compare and a select be inserted to check the MSB of
their control input.
This patch adds support for preventing this in the frontend with
a nice diagnostic.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D99984
Clang spends a decent amount of time in the LineOffsetMapping::get(...)
function. This function used to be vectorized (through SSE2) then the
optimization got dropped because the sequential version was on-par performance
wise.
This provides an optimization of the sequential version that works on a word at
a time, using (documented) bithacks to provide a portable vectorization.
When preprocessing the sqlite amalgamation, this yields a sweet 3% speedup.
This is a recommit of 6951b72334 with endianness
and unsigned long vs uint64_t issues fixed (hopefully).
Differential Revision: https://reviews.llvm.org/D99409
Clang spends a decent amount of time in the LineOffsetMapping::get(...)
function. This function used to be vectorized (through SSE2) then the
optimization got dropped because the sequential version was on-par performance
wise.
This provides an optimization of the sequential version that works on a word at
a time, using (documented) bithacks to provide a portable vectorization.
When preprocessing the sqlite amalgamation, this yields a sweet 3% speedup.
Differential Revision: https://reviews.llvm.org/D99409
On z/OS there is a hard limitation on on the maximum requestable alignment in aligned attribute for static variables. We need to truncate values greater than that.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D98864
Zero length bitfield alignment is not respected if they are leading members on z/OS target.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D98890
Add builtin function __builtin_get_device_side_mangled_name
to get device side manged name for functions and global
variables, which can be used to get symbol address of kernels
or variables by mangled name in dynamically loaded
bundled code objects at run time.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D99301
Add an option to tell the compiler that it can use privileged instructions.
This patch only adds the option. Backend implementation will be added in a
future patch.
Reviewed By: lei, amyk
Differential Revision: https://reviews.llvm.org/D99193
In order to have the same option on power PC LLVM and power PC gcc
the option will be changed from -mrop-protection to -mrop-protect.
The feature will be off by default and turned on when the option is used.
Reviewed By: lei, amyk
Differential Revision: https://reviews.llvm.org/D99185
* List inferred lists of imports in `#pragma clang __debug module_map`.
* Add `#pragma clang __debug modules {all,visible,building}` to dump
lists of known / visible module names or the building modules stack.
Now that the WebAssembly SIMD specification is finalized and engines are
generally up-to-date, there is no need for a separate target feature for gating
SIMD instructions that engines have not implemented. With this change,
v128.const is now enabled by default with the simd128 target feature.
Differential Revision: https://reviews.llvm.org/D98457
Split out some of the instructions predicated on the dot2-insts target
feature into a new dot7-insts, in preparation for subtargets that have
some but not all of these instructions. NFCI.
Differential Revision: https://reviews.llvm.org/D98717
This patch adds the `__PCREL__` define when PC Relative addressing is enabled.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D98546
This patch implements the __rndr and __rndrrs intrinsics to provide access to the random
number instructions introduced in Armv8.5-A. They are only defined for the AArch64
execution state and are available when __ARM_FEATURE_RNG is defined.
These intrinsics store the random number in their pointer argument and return a status
code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter
intrinsic reseeds the random number generator.
The instructions write the NZCV flags indicating the success of the operation that we can
then read with a CSET.
[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
[2] https://bugs.llvm.org/show_bug.cgi?id=47838
Differential Revision: https://reviews.llvm.org/D98264
Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc
There is no need to check for enabled pragma for core or optional core features,
thus this check is removed
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D97058
It's available both in CodeGenOptions and in LangOptions, and LangOptions
implementation is slightly better as it uses a StringRef instead of a char
pointer, so use it.
Differential Revision: https://reviews.llvm.org/D98175
This is the first patch supporting M68k in Clang
- Register M68k as a target
- Target specific CodeGen support
- Target specific attribute support
Authors: myhsu, m4yers, glaubitz
Differential Revision: https://reviews.llvm.org/D88393
This changes the target data layout to make stack align to 16 bytes
on Power10. Before this change, stack was being aligned to 32 bytes.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D96265
gfx1030 added a new way to implement readcyclecounter using the
SHADER_CYCLES hardware register, but the s_memtime instruction still
exists, so the MC layer should still accept it and the
llvm.amdgcn.s.memtime intrinsic should still work.
Differential Revision: https://reviews.llvm.org/D97928
Use that to print the diagnostic in SemaChecking instead of
listing all of the builtins in a switch.
With the required features, IR generation will also be able
to error on this. Checking this here allows us to have a RISCV
focused error message.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D97826
This commit refactors extension support to allow
specifying whether pragma is needed or not explicitly.
For backward compatibility pragmas are set to required
for all extensions that were added prior to this but
not for OpenCL 3.0 features.
Differential Revision: https://reviews.llvm.org/D97052
Exchanging types, reordering fields and borrowing a bit from OptionGroupIndex shrinks this from 12 bytes to 8.
This knocks ~20k from the binary size.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D97553
Use the fact that the number of line break is lower than printable characters to
guide the optimization process. Also use a fuzzy test that catches both \n and
\r in a single check to speedup the computation.
Differential Revision: https://reviews.llvm.org/D97320
Post D96572, a warning started showing up for me:
`clang/lib/Basic/Sanitizers.cpp:73:1: warning: control reaches end of non-void function [-Wreturn-type]`
So this adds a default to the case to return invalid, which seems appropriate,
and appears to correct the issue.
Differential Revision: https://reviews.llvm.org/D97496
The new `-fsanitize-address-destructor-kind=` option allows control over how module
destructors are emitted by ASan.
The new option is consumed by both the driver and the frontend and is propagated into
codegen options by the frontend.
Both the legacy and new pass manager code have been updated to consume the new option
from the codegen options.
It would be nice if the new utility functions (`AsanDtorKindToString` and
`AsanDtorKindFromString`) could live in LLVM instead of Clang so they could be
consumed by other language frontends. Unfortunately that doesn't work because
the clang driver doesn't link against the LLVM instrumentation library.
rdar://71609176
Differential Revision: https://reviews.llvm.org/D96572
Demonstrate how to add RISC-V V builtins and lower them to IR intrinsics for V extension.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93446
This (mostly) reverts 32c501dd88. Hit a
case where this causes a behaviour change, perhaps the same root cause
that triggered the revert of a40db5502b in
7799ef7121.
(The API changes in DirectoryEntry.h have NOT been reverted as a number
of subsequent commits depend on those.)
https://reviews.llvm.org/D90497#2582166
This patch responds to a comment from @vitalybuka in D96203: suggestion to
do the change incrementally, and start by modifying this file name. I modified
the file name and made the other changes that follow from that rename.
Reviewers: vitalybuka, echristo, MaskRay, jansvoboda11, aaron.ballman
Differential Revision: https://reviews.llvm.org/D96974
Added -mrop-protection for Power PC to turn on codegen that provides some
protection from ROP attacks.
The option is off by default and can be turned on for Power 8, Power 9 and
Power 10.
This patch is for the option only. The feature will be implemented by a later
patch.
Reviewed By: amyk
Differential Revision: https://reviews.llvm.org/D96512
Add the types for the RISC-V V extension builtins.
These types will be used by the RISC-V V intrinsics which require
types of the form <vscale x 1 x i64>(LMUL=1 element size=64) or
<vscale x 4 x i32>(LMUL=2 element size=32), etc. The vector_size
attribute does not work for us as it doesn't create a scalable
vector type. We want these types to be opaque and have no operators
defined for them. We want them to be sizeless. This makes them
similar to the ARM SVE builtin types. But we will have quite a bit
more types. This patch adds around 60. Later patches will add
another 230 or so types representing tuples of these types similar
to the x2/x3/x4 types in ARM SVE. But with extra complexity that
these types are combined with the LMUL concept that is unique to
RISCV.
For more background see this RFC
http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html
Authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D92715
The tile directive is in OpenMP's Technical Report 8 and foreseeably will be part of the upcoming OpenMP 5.1 standard.
This implementation is based on an AST transformation providing a de-sugared loop nest. This makes it simple to forward the de-sugared transformation to loop associated directives taking the tiled loops. In contrast to other loop associated directives, the OMPTileDirective does not use CapturedStmts. Letting loop associated directives consume loops from different capture context would be difficult.
A significant amount of code generation logic is taking place in the Sema class. Eventually, I would prefer if these would move into the CodeGen component such that we could make use of the OpenMPIRBuilder, together with flang. Only expressions converting between the language's iteration variable and the logical iteration space need to take place in the semantic analyzer: Getting the of iterations (e.g. the overload resolution of `std::distance`) and converting the logical iteration number to the iteration variable (e.g. overload resolution of `iteration + .omp.iv`). In clang, only CXXForRangeStmt is also represented by its de-sugared components. However, OpenMP loop are not defined as syntatic sugar. Starting with an AST-based approach allows us to gradually move generated AST statements into CodeGen, instead all at once.
I would also like to refactor `checkOpenMPLoop` into its functionalities in a follow-up. In this patch it is used twice. Once for checking proper nesting and emitting diagnostics, and additionally for deriving the logical iteration space per-loop (instead of for the loop nest).
Differential Revision: https://reviews.llvm.org/D76342
The patch only plumbs through the option necessary for targeting sm_86 GPUs w/o
adding any new functionality.
Differential Revision: https://reviews.llvm.org/D95974
This patch implements generation of remaining codegen options and tests it by performing parse-generate-parse round trip.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D96056
This patch implements generation of remaining language options and tests it by performing parse-generate-parse round trip (on by default for assert builds, off otherwise).
This patch also correctly reports failures in `parseSanitizerKinds`, which is necessary for emitting diagnostics when an invalid sanitizer is passed to `-fsanitize=` during round-trip.
This patch also removes TableGen marshalling classes from two options:
* `fsanitize_blacklist` When parsing: it's first initialized via the generated code, but then also changed by manually written code, which is confusing.
* `fopenmp` When parsing: it's first initialized via generated code, but then conditionally changed by manually written code. This is also confusing. Moreover, we need to do some extra checks when generating it, which would be really cumbersome in TableGen. (Specifically, not emitting it when `-fopenmp-simd` was present.)
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D95793
Currently the emscripten frontend driver injects this when building
with thread support. Moving this into the clang driver itself makes
the emscripten python driver less magical.
Differential Revision: https://reviews.llvm.org/D96171
This patch adds possibility to define OpenCL C 3.0 feature macros
via command line option or target setting.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D95776
Commit 6bf29dbb enables float128 feature by default for Power9 targets.
But float128 may cause build failure in libcxx testing. Revert this
commit first to unblock LLVM 12 release.
When the -matomics feature is not enabled, disable POSIXThreads
mode and set the thread model to Single, so that we don't predefine
macros like `__STDCPP_THREADS__`.
Differential Revision: https://reviews.llvm.org/D96091
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.
Differential Revision: https://reviews.llvm.org/D94820
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.
Differential Revision: https://reviews.llvm.org/D94820
The `LangStandard::Kind` parsed from command line arguments is used to set up some `LangOption` defaults, but isn't stored anywhere.
To be able to generate `-std=` (in future patch), we need `CompilerInvocation` to not forget it.
This patch demonstrates another use-case: using `LangStd` to set up defaults of marshalled options.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D95342
Change `SourceManager::getOrCreateContentCache` to take a `FileEntryRef`
and update call sites (mostly internal to SourceManager.cpp). In a
couple of cases this temporarily relies on `FileEntry::getLastRef`, but
those can be cleaned up once other APIs switch over.
The one change outside of SourceManager.cpp is in ASTReader.cpp, which
stops relying on the auto-degrade-to-`FileEntry*` behaviour from
`InputFile::getFile` since it now needs a `FileEntryRef`.
No functionality change here.
Differential Revision: https://reviews.llvm.org/D92983
Change `SourceManager::createFileID(const FileEntry*)` to defer to
`SourceManager::createFileID(FileEntryRef)`. This fixes an unexercised
bug where the latter gained support for named pipes and the former
didn't, but since we're trying to remove all calls to the former it
doesn't really make sense to test this explicitly now that the
implementation is hollowed out.
This is a belated follow-up to 245218bb35,
which sunk named pipe support into FileManager and SourceManager. The
original version of that patch was based on top of
https://reviews.llvm.org/D92984, which removed the `FileEntry` overload
of `createFileID()`, and I missed the subtle difference when it was
rebased.
Currently, there is some refactoring needed in existing interface of OpenCL option
settings to support OpenCL C 3.0. The problem is that OpenCL extensions and features
are not only determined by the target platform but also by the OpenCL version.
Also, there are core extensions/features which are supported unconditionally in
specific OpenCL C version. In fact, these rules are not being followed for all targets.
For example, there are some targets (as nvptx and r600) which don't support
OpenCL C 2.0 core features (nvptx.languageOptsOpenCL.cl, r600.languageOptsOpenCL.cl).
After the change there will be explicit differentiation between optional core and core
OpenCL features which allows giving diagnostics if target doesn't support any of
necessary core features for specific OpenCL version.
This patch also eliminates `OpenCLOptions` instance duplication from `TargetOptions`.
`OpenCLOptions` instance should take place in `Sema` as it's going to be modified
during parsing. Removing this duplication will also allow to generally simplify
`OpenCLOptions` class for parsing purposes.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D92277