Similar to D131064, this alters most of the intrinsics in arm_neon.h to
be target based, not preprocessor based. The intrinsics that are changed
are the ones with obvious target features (fp16, fp16fml, cryptos, i8mm
and bf16). The ones that are not yet altered are the ones without target
features like rdma (8.1) and complex (8.3). Those will be switched in a
followup patch that allows targeting architecture versions.
The existing ArchGuard in arm_neon.td is split into ArchGuard that still
adds ifdef defines (for example for intrinsics that require __aarch64__),
and TargetGuards for intrinsics dependant on target features. From there
the TargetGuards are used in two ways:
- For intrinsics emitted as functions, __attribute__((target(TargetGuard)))
is added to the definition of the function. Along with the existing
always_inline intrinsic, this will give a compile time error if the
function is used in a context where the target feature is not available.
- For intrinsics emitted as macros, the __builtins are emitted into
arm_neon.inc using TARGET_BUILTIN as opposed to BUILTIN, which includes
the target feature and gives an error if the builtin is found in a
function without the required features, similar to arm_sve.h.
The second method requires that the intrinsics be separable from the
existing _v intrinsics used in other types. For example
__builtin_neon_splat_lane_bf16 is used as opposed to
__builtin_neon_splat_lane_v. There are some adjustments to the CGBuiltin
to account for intrinsics that can be treated similarly, except for
their target features.
Differential Revision: https://reviews.llvm.org/D132034
This makes it possible to be used in all modes, instead of just when
`-fms-extensions` is enabled. Also moves the `-fms-extensions`-exclusive
traits into their own file so we can check the others aren't dependent
on this flag.
This is information that the compiler already has, and should be exposed
so that the library doesn't need to reimplement the exact same
functionality.
This was originally a part of D116280.
Depends on D135177.
Differential Revision: https://reviews.llvm.org/D135339
... as builtins.
This is information that the compiler already has, and should be exposed
so that the library doesn't need to reimplement the exact same
functionality.
This was originally a part of D116280.
Depends on D135175.
Differential Revision: https://reviews.llvm.org/D135177
This is information that the compiler already has, and should be exposed
so that the library doesn't need to reimplement the exact same
functionality.
This was originally a part of D116280.
Differential Revision: https://reviews.llvm.org/D135175
Summary:
This test started failing locally due to a misspelling of `-save-temps`
for the linker wrapper and the `ls` command not having the glob
arguments put in a string. This patch should fix it.
MSVC allows bit-fields to be specified as instruction operands in inline asm
blocks. Such references are resolved to the address of the allocation unit that
the named bitfield is a part of. The result is that reads and writes of such
operands will read or mutate nearby bit-fields stored in the same allocation
unit. This is a surprising behavior for which MSVC issues warning C4401,
"'<identifier>': member is bit field". Intel's icc compiler also allows such
bit-field references, but does not issue a diagnostic.
Prior to this change, Clang fails the following assertion when a bit-field is
referenced in such instructions:
clang/lib/CodeGen/CGValue.h:338: llvm::Value* clang::CodeGen::LValue::getPointer(clang::CodeGen::CodeGenFunction&) const: Assertion `isSimple()' failed.
In non-assert enabled builds, Clang's behavior appears to match the behavior
of the MSVC and icc compilers, though it is unknown if that is by design or
happenstance.
Following this change, attempts to use a bit-field as an instruction operand
in Microsoft style asm blocks is diagnosed as an error due to the risk of
unintentionally reading or writing nearby bit-fields.
Fixes https://github.com/llvm/llvm-project/issues/57791
Reviewed By: erichkeane, aaron.ballman
Differential Revision: https://reviews.llvm.org/D135500
unevaluated context
Closes https://github.com/llvm/llvm-project/issues/58133
The direct cause for this issue is that the compilation process
continues after it found it is in a invalid state. [expr.await]p2 says
clearly that the co_await expressions are not allowed to appear in
unevaluated context. So we can exit early in this case. It also reduces
many redundant diagnostic messages (Such as 'expression with side
effects has no effect in an unevaluated context').
Generally, with PGO enabled the C++20 likelyhood attributes shall be
dropped assuming the profile has a good coverage. However, currently
this is not the case for the following code:
if (always_false()) [[likely]] {
...
}
The patch fixes this and drops the attribute, if the parent context was
executed in the profile. The patch still preserves the attribute, if the
parent context was not executed, e.g. to support the cases when the
profile has insufficient coverage.
Differential Revision: https://reviews.llvm.org/D134456
Closes https://github.com/llvm/llvm-project/issues/58196.
The root cause for the problem is an oversight in
https://reviews.llvm.org/D127624, which allows the redeclarations in
partitions. However, we took a mistake there that we should only allow
it if the redeclarations in the one same module instead of return
directly if either the redeclaration lives in a partition. The original
implementation makes no sense and I believe it was an oversight.
Closes https://github.com/llvm/llvm-project/issues/58199
Previously, when we act on a import statement, we'll assume there is a
module declaration in the current TU if the command line tells us we're
compiling a module unit. This makes since on valid codes. However, for
invalid codes, it is possible. See
https://github.com/llvm/llvm-project/issues/58199 for example.
This patch removes the assertion. And the assertion is a noop and it
should be safe to remove it.
Currently baremetal driver adds <sysroot>/include/c++/v1
for libc++ headers. However on ChromeOS, all include files
are inside <sysroot>/usr/include. So add
<sysroot>/usr/include/c++/v1 if it exists in baremetal driver.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D134478
Two another atomic compare capture forms, `{ v = x; expr-stmt }` and `{ expr-stmt; v = x; }`
where `expr-stmt` could be `cond-expr-stmt` are missing.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D135236
Promoting kernel arg pointer to global addr space is only
available with registered amdgcn target.
Fix test so that it does not require registered amdgcn target.
Currently there is a middle-end or backend issue
https://github.com/llvm/llvm-project/issues/58176
which causes values loaded from bool pointer incorrect when
bool range metadata is emitted. Temporarily
disable bool range metadata until the backend issue
is fixed.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D135269
Fixes: SWDEV-344137
As another regression from the Deferred Concepts Instantiation patch, we
weren't properly detecting that a friend referenced its containing
Record when it referred to it without its template parameters. This
patch makes sure that we do.
Without this patch `VarDecl::hasDependent()` checks only undeduced auto types, so can give false negatives result for other undeduced types.
This lead to crashes in sequence `!VarDecl::hasDepentent()` => `getDeclAlign()`.
It seems this problem appeared since D105380
Reviewed By: mizvekov
Differential Revision: https://reviews.llvm.org/D135362
This caused false positives, see comment on the code review.
> When support for copy elision was initially added in e97654b2f2, it
> was taking attributes from a constructor call, although that constructor
> call is actually not involved. It seems more natural to use attributes
> on the function returning the scoped capability, which is where it's
> actually coming from. This would also support a number of interesting
> use cases, like producing different scope kinds without the need for tag
> types, or producing scopes from a private mutex.
>
> Changing the behavior was surprisingly difficult: we were not handling
> CXXConstructorExpr calls like regular calls but instead handled them
> through the DeclStmt they're contained in. This was based on the
> assumption that constructors are basically only called in variable
> declarations (not true because of temporaries), and that variable
> declarations necessitate constructors (not true with C++17 anymore).
>
> Untangling this required separating construction from assigning a
> variable name. When a call produces an object, we use a placeholder
> til::LiteralPtr for `this`, and we collect the call expression and
> placeholder in a map. Later when going through a DeclStmt, we look up
> the call expression and set the placeholder to the new VarDecl.
>
> The change has a couple of nice side effects:
> * We don't miss constructor calls not contained in DeclStmts anymore,
> allowing patterns like
> MutexLock{&mu}, requiresMutex();
> The scoped lock temporary will be destructed at the end of the full
> statement, so it protects the following call without the need for a
> scope, but with the ability to unlock in case of an exception.
> * We support lifetime extension of temporaries. While unusual, one can
> now write
> const MutexLock &scope = MutexLock(&mu);
> and have it behave as expected.
> * Destructors used to be handled in a weird way: since there is no
> expression in the AST for implicit destructor calls, we instead
> provided a made-up DeclRefExpr to the variable being destructed, and
> passed that instead of a CallExpr. Then later in translateAttrExpr
> there was special code that knew that destructor expressions worked a
> bit different.
> * We were producing dummy DeclRefExprs in a number of places, this has
> been eliminated. We now use til::SExprs instead.
>
> Technically this could break existing code, but the current handling
> seems unexpected enough to justify this change.
>
> Reviewed By: aaron.ballman
>
> Differential Revision: https://reviews.llvm.org/D129755
This reverts commit 0041a69495 and the follow-up
warning fix in 83d93d3c11.
Previously we were stripping these normally inherited attributes during
explicit specialization. However for class template member functions
(but not function templates), MSVC keeps the attribute.
This makes Clang match that behavior, and fixes GitHub issue #54717
Differential revision: https://reviews.llvm.org/D135154
Summary:
This added a check for no unknown argument warnings. This apparently
occurs in the Windows toolchain as it cannot find a toolchain. This
patch fixes it by just ignoring this warning.
The `cl-uniform-work-group` attribute asserts that the global work-size
be a multiple of the work-group specified work group size. This should
allow optimizations. It is already present by default in the AMD
compiler and for HIP kernels so it should be safe to allow this for
OpenMP kernels by default.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D135374
This is a follow-up to D134224. The original patch added new `ExcludedHeader` enumerator to `ModuleMap::ModuleHeaderRole` and started associating headers with the modules they were excluded from. This was necessary to consider their module maps as "affecting" in certain situations and in turn serialize them into the PCM.
The association of the header and module needs to be handled when deserializing the PCM as well, though. This patch fixes a potential assertion failure and a regression. This essentially reverts parts of feb54b6ded.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D135381
The offloading toolchain makes heavy use of options beginning with
`--o`. This is problematic when combined with the joined `-o` flag. In
the following situation, the user will not get the expected output and
will not notice as the expected output will still be written.
```
clang++ -x cuda foo.cu -offload-arch=sm_80 -o foo
```
This patch introduces a warning that checks for joined `-o` arguments
that would also be a valid driver argument if an additional `-` were
added. I believe this situation is uncommon enough to warrant a warning,
and can be trivially fixed by the end user by using the more common
separate form instead.
Reviewed By: tra, MaskRay
Differential Revision: https://reviews.llvm.org/D135389
After generated call for ctor/dtor for entry, global variable for ctor/dtor are useless.
Remove them for non-lib profiles.
Lib profile still need these in case export function used the global variable which require ctor/dtor.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D133993
When support for copy elision was initially added in e97654b2f2, it
was taking attributes from a constructor call, although that constructor
call is actually not involved. It seems more natural to use attributes
on the function returning the scoped capability, which is where it's
actually coming from. This would also support a number of interesting
use cases, like producing different scope kinds without the need for tag
types, or producing scopes from a private mutex.
Changing the behavior was surprisingly difficult: we were not handling
CXXConstructorExpr calls like regular calls but instead handled them
through the DeclStmt they're contained in. This was based on the
assumption that constructors are basically only called in variable
declarations (not true because of temporaries), and that variable
declarations necessitate constructors (not true with C++17 anymore).
Untangling this required separating construction from assigning a
variable name. When a call produces an object, we use a placeholder
til::LiteralPtr for `this`, and we collect the call expression and
placeholder in a map. Later when going through a DeclStmt, we look up
the call expression and set the placeholder to the new VarDecl.
The change has a couple of nice side effects:
* We don't miss constructor calls not contained in DeclStmts anymore,
allowing patterns like
MutexLock{&mu}, requiresMutex();
The scoped lock temporary will be destructed at the end of the full
statement, so it protects the following call without the need for a
scope, but with the ability to unlock in case of an exception.
* We support lifetime extension of temporaries. While unusual, one can
now write
const MutexLock &scope = MutexLock(&mu);
and have it behave as expected.
* Destructors used to be handled in a weird way: since there is no
expression in the AST for implicit destructor calls, we instead
provided a made-up DeclRefExpr to the variable being destructed, and
passed that instead of a CallExpr. Then later in translateAttrExpr
there was special code that knew that destructor expressions worked a
bit different.
* We were producing dummy DeclRefExprs in a number of places, this has
been eliminated. We now use til::SExprs instead.
Technically this could break existing code, but the current handling
seems unexpected enough to justify this change.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D129755
We might have a CK_NoOp cast and a further CK_ConstructorConversion.
As an optimization, drop some IgnoreParens calls: inside of the
CK_{Constructor,UserDefined}Conversion should be no more parentheses,
and inside the CXXBindTemporaryExpr should also be none.
Lastly, we factor out the unpacking so that we can reuse it for
MaterializeTemporaryExprs later on.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D129752
The documentation specifies that the parameters for the vcipher builtins are
```
vector unsigned char
```
The code used
```
vector unsigned long long
```
This patch fixes the types for the vcipher builtins.
Reviewed By: amyk
Differential Revision: https://reviews.llvm.org/D135300
Make the empty headers used by cl-pch-showincludes.cpp unique so that
filesystems that link these files together by contents will not see
different behaviour in this test, which is not testing linked files
specifically.
This was uncovered by 5ea78c4113 which made us stop mutating the name
of the presumed loc for the file in ContentCache, but that just surfaced
an underlying issue that the filename of multiple includes of linked
files are not separately tracked.
Differential Revision: https://reviews.llvm.org/D135373
The new driver supports LTO for RDC-mode compilations. However, this was
not correctly handled for non-LTO compilations. HIP can handle this as
it is fed to `lld` which will perform the LTO itself. CUDA however would
require every work which is wholly useless in non-RDC mode so it should
report an error.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D135305
We would issue the same diagnostic twice in the case that the K&R C
function definition is preceded by a static declaration of the function
with a prototype.
Fixes#58181
Remove the ability to disable opaque pointers by default in clang.
It is still possible to explicitly disable them via cc1
-no-opaque-pointers.
Differential Revision: https://reviews.llvm.org/D135259
In HIP a library is usually compiled with default target ID e.g. gfx906 so that
it can be used in all GPU configurations. The bitcode is saved in bundled
bitcode with gfx906 in entry ID.
In runtime compilation, a HIP program is compiled with a target ID matching
the GPU configuration, e.g. gfx906:xnack-. This program needs to link with
a library bundled bitcode with target ID gfx906.
For example:
clang --offload-arch=gfx906 -o lib.o lib.hip
clang --offload-arch=gfx906:xnack- program.hip lib.o
This common use case requires that clang-offlod-bundler to be able to extract
entry with compatible target ID, e.g. extracting an gfx906 entry when requesting
gfx906:xnack-.
Currently clang-offload-bundler only allow extracting entry with exact match
of target ID. This patch relaxes that so that it can extract entries with compatible
target ID.
Reviewed by: Artem Belevich, Saiyedul Islam
Differential Revision: https://reviews.llvm.org/D134546
When dep-scanning, canonicalize the module map path as much as we can.
This avoids unnecessarily needing to build multiple versions of a module
due to symlinks or case-insensitive file paths.
Despite the name `tryGetRealPathName`, the previous implementation did
not actually return the realpath most of the time, and indeed it would
be incorrect to do so since the realpath could be outside the module
directory, which would have broken finding headers relative to the
module.
Instead, use a canonicalization that is specific to the needs of
modulemap files (canonicalize the directory separately from the
filename).
Differential Revision: https://reviews.llvm.org/D134923
Update SourceManager::ContentCache::OrigEntry to keep the original
FileEntryRef, and use that to enable ModuleMap::getModuleMapFile* to
return the original FileEntryRef. This change should be NFC for
most users of SourceManager::ContentCache, but it could affect behaviour
for users of getNameAsRequested such as in compileModuleImpl. I have not
found a way to detect that difference without additional functional
changes, other than incidental cases like changes from / to \ on
Windows so there is no new test.
Differential Revision: https://reviews.llvm.org/D135220
In the context of caching clang invocations it is important to emit diagnostics in deterministic order;
the same clang invocation should result in the same diagnostic output.
rdar://100336989
Differential Revision: https://reviews.llvm.org/D135118
We use protected visibility for almost everything with offloading. This
is because it provides us with the ability to read things from the host
without the expectation that it will be preempted by a shared library
load, bugs related to this have happened when offloading to the host.
This patch just makes the `exec_mode` global generated for each plugin
have protected visibility.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D135285
Allow register binding attribute on variables.
Report warning when register binding attribute applies to local variable or static variable.
It will be ignored in this case.
Type check for register binding is tracked with https://github.com/llvm/llvm-project/issues/57886.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D134617
LookupSpecialMember might fail, so changes the cast to cast_or_null.
Inside Sema, skip a particular base, similar to other cases, rather than
asserting on dtor showing up.
Other option would be to mark classes with invalid destructors as invalid, but
that seems like a lot more invasive and we do lose lots of diagnostics that
currently work on classes with broken members.
Differential Revision: https://reviews.llvm.org/D135254
Currently the following case fails:
```
template<typename Ty>
Ty foo(Ty *addr, Ty val) {
Ty v;
#pragma omp atomic compare capture
{
v = *addr;
if (*addr > val)
*addr = val;
}
return v;
}
```
The compiler complains `addr` is not a lvalue. That's because when an expression
is instantiation dependent, we cannot tell if it is lvalue or not.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D135224
Details posted here: https://reviews.llvm.org/D119051#3747201
3 cases that were inconsistent with the MSABI without this patch applied:
https://godbolt.org/z/GY48qxh3G - field with protected member
https://godbolt.org/z/Mb1PYhjrP - non-static data member initializer
https://godbolt.org/z/sGvxcEPjo - defaulted copy constructor
I'm not sure what's suitable/sufficient testing for this - I did verify
the three cases above. Though if it helps to add them as explicit tests,
I can do that too.
Also, I was wondering if the other use of isTrivialForAArch64MSVC in
isPermittedToBeHomogenousAggregate could be another source of bugs - I
tried changing the function to unconditionally call
isTrivialFor(AArch64)MSVC without testing AArch64 first, but no tests
fail, so it looks like this is undertested in any case. But I had
trouble figuring out how to exercise this functionality properly to add
test coverage and then compare that to MSVC itself... - I got very
confused/turned around trying to test this, so I've given up enough to
send what I have out for review, but happy to look further into this
with help.
Differential Revision: https://reviews.llvm.org/D133817
This patch removes the ability of a dependency scanning worker to share a `FileManager` instance between individual scans. It's not sound and doesn't provide performance benefits (due to the underlying caching VFS).
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D134976
As @mizvekov suggested in D134772. This works great for D128750 when
dealing with AutoType's.
Reviewed By: mizvekov, erichkeane
Differential Revision: https://reviews.llvm.org/D135088
Turn it into a single Expr::isFlexibleArrayMemberLike method, as discussed in
https://discourse.llvm.org/t/rfc-harmonize-flexible-array-members-handling
Keep different behavior with respect to macro / template substitution, and
harmonize sharp edges: ObjC interface now behave as C struct wrt. FAM and
-fstrict-flex-arrays.
This does not impact __builtin_object_size interactions with FAM.
Differential Revision: https://reviews.llvm.org/D134791
Adds a fix to the diagnostic of replacing the `= default` to `= delete`
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D134549
When -fmodule-file-home-is-cwd and the path to the PCM is relative, we
shouldn't assume that the path to the PCM is relative to the modulemap
that produced it. To respect the option -fmodule-file-home-is-cwd, we
should assume the path is relative to the current working directory.
Reviewed By: rmaz
Differential Revision: https://reviews.llvm.org/D134911
requires-expression
As reported: https://github.com/llvm/llvm-project/issues/57487
We properly treated a failed instantiation of a concept as a
unsatisified constraint, however, we need to do this at the 'requires
clause' level as well. This ensures that the parameters on a requires
clause that fail instantiation will cause a satisfaction failure.
This patch implements this by running requires parameter clause
instantiation under a SFINAE trap, then stores any such failure as a
requirement failure, so it can be diagnosed later.
The linker requires at least a "major.minor" for the SDK version, so it will fail when we don't have
a minor version in the case we don't actually have an SDK info.
Previously, a lambda expression in a dependent context with a default argument
containing an immediately invoked lambda expression would produce a closure
class object that, if invoked such that the default argument was used, resulted
in a compiler crash or one of the following assertion failures during code
generation. The failures occurred regardless of whether the lambda expressions
were dependent.
clang/lib/CodeGen/CGCall.cpp:
Assertion `(isGenericMethod || Ty->isVariablyModifiedType() || Ty.getNonReferenceType()->isObjCRetainableType() || getContext() .getCanonicalType(Ty.getNonReferenceType()) .getTypePtr() == getContext().getCanonicalType((*Arg)->getType()).getTypePtr()) && "type mismatch in call argument!"' failed.
clang/lib/AST/Decl.cpp:
Assertion `!Init->isValueDependent()' failed.
Default arguments in declarations in local context are instantiated along with
their enclosing function or variable template (since such declarations can't
be explicitly specialized). Previously, such instantiations were performed at
the same time that their associated parameters were instantiated. However, that
approach fails in cases like the following in which the context for the inner
lambda is the outer lambda, but construction of the outer lambda is dependent
on the parameters of the inner lambda. This change resolves this dependency by
delyaing instantiation of default arguments in local contexts until after
construction of the enclosing context.
template <typename T>
auto f() {
return [](T = []{ return T{}; }()) { return 0; };
}
Refactoring included with this change results in the same code now being used
to instantiate default arguments that appear in local context and those that
are only instantiated when used at a call site; previously, such code was
duplicated and out of sync.
Fixes https://github.com/llvm/llvm-project/issues/49178
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D133500
If 'order(concurrent)' clause is specified, then the iterations of SIMD loop
can be executed concurrently.
This patch adds support for LLVM IR codegen via OMPIRBuilder for SIMD loop
with 'order(concurrent)' clause. The functionality added to OMPIRBuilder is
similar to the functionality implemented in 'CodeGenFunction::EmitOMPSimdInit'.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D134046
Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
D128745 handled DR1432 for the partial ordering of partial specializations, but
missed the handling for the partial ordering of function templates. This patch
implements the latter. While at it, also simplifies the previous implementation to
be more close to the wording without functional changes.
Fixes https://github.com/llvm/llvm-project/issues/56090
Reviewed By: erichkeane, #clang-language-wg, mizvekov
Differential Revision: https://reviews.llvm.org/D133683
* Remove -no-canonical-prefixes. See 980679981f
* Test a.out and %t.out in few tests and remove excess `-o %t.o`
* Avoid wrapping lines too aggresstively
* Replace `"{{[^"]*}}ld{{(\.(lld|bfd|gold))?}}{{(\.exe)?}}"` with `ld{{(.exe)?}}"`. The new pattern supports more CLANG_DEFAULT_LINKER.
This option forces constructors and non-deleting destructors to return
`this` pointer in C++ ABI (except for Microsoft ABI, on which this flag
has no effect).
This is similar to ARM32, Apple ARM64, or Fuchsia C++ ABI, but can be
applied to any target triple.
Differential Revision: https://reviews.llvm.org/D119209
With this patch, TypedefTypes and UsingTypes can have an
underlying type which diverges from their corresponding
declarations.
For the TypedefType case, this can be seen when getting
the common sugared type between two redeclarations with
different sugar.
For both cases, this will become important as resugaring
is implemented, as this will allow us to resugar these
when they were dependent before instantiation.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D133468
As fallout of the Deferred Concept Instantiation patch (babdef27c5), we
got a number of reports of a regression, where we asserted when
instantiating a constraint on a generic lambda inside of a variable
template. See: https://github.com/llvm/llvm-project/issues/57958
The problem was that getTemplateInstantiationArgs function only walked
up declaration contexts, and missed that this is not necessarily the
case with a lambda (which can ALSO be in a separate context).
This patch refactors the getTemplateInstantiationArgs function in a way
that is hopefully more readable, and fixes the problem with the concepts
on a generic lambda.
Differential Revision: https://reviews.llvm.org/D134874
D132236 would have introduced regressions in the symbol lifetime
handling. However, the testsuite did not catch this, so here we have
some tests, which would have break if D132236 had landed.
This patch addresses the comment https://reviews.llvm.org/D132236#3753238
Co-authored-by: Balazs Benics <balazs.benics@sonarsource.com>
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D134941
This adds support under AArch64 for the target("..") attributes. The
current parsing is very X86-shaped, this patch attempts to bring it line
with the GCC implementation from
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes.
The supported formats are:
- "arch=<arch>" strings, that specify the architecture features for a
function as per the -march=arch+feature option.
- "cpu=<cpu>" strings, that specify the target-cpu and any implied
atributes as per the -mcpu=cpu+feature option.
- "tune=<cpu>" strings, that specify the tune-cpu cpu for a function as
per -mtune.
- "+<feature>", "+no<feature>" enables/disables the specific feature, for
compatibility with GCC target attributes.
- "<feature>", "no-<feature>" enabled/disables the specific feature, for
backward compatibility with previous releases.
To do this, the parsing of target attributes has been moved into
TargetInfo to give the target the opportunity to override the existing
parsing. The only non-aarch64 change should be a minor alteration to the
error message, specifying using "CPU" to describe the cpu, not
"architecture", and the DuplicateArch/Tune from ParsedTargetAttr have
been combined into a single option.
Differential Revision: https://reviews.llvm.org/D133848
Define the feature test macro for named character escapes.
I assume this was not done because it was implemented before formally accepted, right? cxx_status says the paper is implemented.
Reviewed By: cor3ntin
Differential Revision: https://reviews.llvm.org/D134898
Add support for a CLANG_NO_DEFAULT_CONFIG envvar that works like
the --no-default-config option when set to a non-empty value. Use it
to disable loading system configuration files during the test suite
runs.
Configuration files can change the driver behavior in extensive ways,
and it is neither really possible nor feasible to account for or undo
the effects of even the most common configuration uses. Therefore,
the most reasonable option seems to be to ignore configuration files
while running the majority of tests (with the notable exception of tests
for configuration file support).
Due to the diversity of ways that %clang is used in the test suite,
including using it to copy or symlink the clang executable, as well to
call -cc1 and -cc1as modes, it is not feasible to pass the explicit
options to disable config loading either. Using an environment variable
has the advantage of being easily applied across the test suite
and easily unset for default configuration file loading tests.
Differential Revision: https://reviews.llvm.org/D134905
This change exposes the ceil library function for HLSL,
excluding long, int, and long long doubles.
Ceil is supported for all scalar, vector, and matrix types.
Long and long long double support is missing in this patch because those types
don't exist in HLSL. Int is missing because the ceil function only works on floating type arguments.
The full documentation of the HLSL ceil function is available here:
https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-ceil
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D134319
D103221 changed HIP's default to C++14, removing the driver logic to
force it into a different std.
Change-Id: I9f5220a7456687039b0bd3b3574f3124d3cc7665
Differential Revision: https://reviews.llvm.org/D134314
Change-Id: I40513f2ebe93ee53ea95c8bb3cc704487d970263
This partially reverts commit 1609a5d771 (the
test/Driver part). We want to discourage %clang_cc1 and clang -cc1 in
test/Driver. The clang -cc1 uses in hlsl/offload/etc are not good examples.
Use the `%clang_cc1` substitution consistently across the test suite,
replacing inline `%clang -cc1` invocations, except for one Preprocessor
test where this is causing breakage. This is necessary to ensure that
additional parameters passed via `%clang` do not interfere with `-cc1`
that must always be passed as the first command-line argument.
Remove the additional substitution blocking `%clang_cc1` use in Driver
tests. It has been added in 2013 and was supposed to prevent tests
calling `clang -cc1` from being added to Driver. The state of the test
suite proves that it did not succeed at all.
Differential Revision: https://reviews.llvm.org/D134880
Move the `%clang_dxc` substitution from local definition in clang/test
to lit's `llvm/config.py` module where all other driver definitions
are found. This improves consistency and makes it easier to control
global clang options.
Differential Revision: https://reviews.llvm.org/D134871
Change the default config file loading logic to be more flexible
and more readable at the same time. The new algorithm focuses on four
locations, in order:
1. <triple>-<mode>.cfg using real driver mode
2. <triple>-<mode>.cfg using executable suffix
3. <triple>.cfg + <mode>.cfg using real driver mode
4. <triple>.cfg + <mode>.cfg using executable suffix
This is meant to preserve reasonable level of compatibility with
the existing use, while introducing more flexibility and making the code
simpler. Notably:
1. In this layout, the actual target triple is normally respected,
and e.g. in `-m32` build the `x86_64-*` configs will never be used.
2. Both real driver mode (preferable) and executable suffix are
supported. This permits correctly handling calls with explicit
`--driver-mode=` while at the same time preserving compatibility
with the existing code.
3. The first two locations provide users with the ability to override
configuration per specific target+mode combinaton, while the next two
make it possible to independently specify per-target and per-mode
configuration.
4. All config file locations are applicable independently of whether
clang is started via a prefixed executable, or bare `clang`.
5. If the target is not explicitly specified and the executable prefix
does not name a valid triple, it is used instead of the actual target
triple for backwards compatibility.
This is particularly meant to address Gentoo's use case for
configuration files: to configure the default runtimes (i.e. `-rtlib=`,
`-stdlib=`) and `--gcc-install-dir=` for all the relevant drivers,
as well as to make it more convenient for users to override `-W` flags
to test compatibility with future versions of Clang easier.
Differential Revision: https://reviews.llvm.org/D134337
There are lots of options interacting in complex ways here, and when moving to
`getDefaultUnwindTableLevel` I had refactored this and changed behaviour in
some cases. So this reverts the basic structure of the logic back to the
original, while leaving the hook in the new style.
The -fallow-half-arguments-and-returns option was removed in
59528e4bdb27ed4ab3, replaced with an always-on target option under
AArch64/Arm. There are two tests - fp16-sema.c and renderscripts.rs that
test that an error is produced for __fp16 function args/returns, which
are now expected to pass for Arm/AArch64. i.e they no longer give the
same error as before on native Arm/AArch64 machines. Alter the targets
of those tests to compensate.
This cc1 option -fallow-half-arguments-and-returns allows __fp16 to be
passed by argument and returned, without giving an error. It is
currently always enabled for Arm and AArch64, by forcing the option in
the driver. This means any cc1 tests (especially those needing
arm_neon.h) need to specify the option too, to prevent the error from
being emitted.
This changes it to a target option instead, set to true for Arm and
AArch64. This allows the option to be removed. Previously it was implied
by -fnative_half_arguments_and_returns, which is set for certain
languages like open_cl, renderscript and hlsl, so that option now too
controls the errors. There were are few other non-arm uses of
-fallow-half-arguments-and-returns but I believe they were unnecessary.
The strictfp_builtins.c tests were converted from __fp16 to _Float16 to
avoid the issues.
Differential Revision: https://reviews.llvm.org/D133885
Although the instruction names begin "frint", the ACLE spec states that
the intrinsic names begin "__rint", without the "f".
Differential Revision: https://reviews.llvm.org/D134824
Driver options usually use `Joined` instead of `Separate`. It is also weird that
`--config-system-dir=`/etc exist while `--config=` did not exist.
Reviewed By: mgorny
Differential Revision: https://reviews.llvm.org/D134790
k: A memory operand whose address is formed by a base register and
(optionally scaled) index register.
m: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as st.w and ld.w.
ZB: An address that is held in a general-purpose register. The offset
is zero.
ZC: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as ll.w and sc.w.
Differential Revision: https://reviews.llvm.org/D134638
Currently, clang does not emit debuginfo for the switch stmt
case value if it is an enum value. For example,
$ cat test.c
enum { AA = 1, BB = 2 };
int func1(int a) {
switch(a) {
case AA: return 10;
case BB: return 11;
default: break;
}
return 0;
}
$ llvm-dwarfdump test.o | grep AA
$
Note that gcc does emit debuginfo for the same test case.
This patch added such a support with similar implementation
to CodeGenFunction::EmitDeclRefExprDbgValue(). With this patch,
$ clang -g -c test.c
$ llvm-dwarfdump test.o | grep AA
DW_AT_name ("AA")
$
Differential Revision: https://reviews.llvm.org/D134705
This amends fd874e5fb1 to correctly set
the bit width of a '!' operator to be the same width as an 'int'. This
fixes a failed assertion about unexpected bit widths that was reported
during post-commit testing.
This implements WG14 N2927 and WG14 N2930, which together define the
feature for typeof and typeof_unqual, which get the type of their
argument as either fully qualified or fully unqualified. The argument
to either operator is either a type name or an expression. If given a
type name, the type information is pulled directly from the given name.
If given an expression, the type information is pulled from the
expression. Recursive use of these operators is allowed and has the
expected behavior (the innermost operator is resolved to a type, and
that's used to resolve the next layer of typeof specifier, until a
fully resolved type is determined.
Note, we already supported typeof in GNU mode as a non-conforming
extension and we are *not* exposing typeof_unqual as a non-conforming
extension in that mode, nor are we exposing typeof or typeof_unqual as
a nonconforming extension in other language modes. The GNU variant of
typeof supports a form where the parentheses are elided from the
operator when given an expression (e.g., typeof 0 i = 12;). When in C2x
mode, we do not support this extension.
Differential Revision: https://reviews.llvm.org/D134286
This patch implements P0634r3 that removes the need for 'typename' in certain contexts.
For example,
```
template <typename T>
using foo = T::type; // ok
```
This is also allowed in previous language versions as an extension, because I think it's pretty useful. :)
Reviewed By: #clang-language-wg, erichkeane
Differential Revision: https://reviews.llvm.org/D53847
While working on D53847, I noticed that this test would fail once we
started recognizing the types in the modified `export` statement [0].
The tests would fail because Clang would emit a "declaration does not
declare anything" diagnostic instead of the expected namespace scope
diagnostic.
I believe that the test is currently incorrectly passing because Clang
doesn't parse the type and therefore doesn't treat the statement as a
declaration. My understanding is that the intention of this test case is
that it wants to export a `struct` type, which I believe requires a
`struct` keyword, even for types with template parameters. With this
change, the only error with these two statements should be the
namespace scope issue.
[0]: https://reviews.llvm.org/D53847?id=462032#inline-1297053
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D134578
This eagerly reports use of undef values when passed to noundef
parameters or returned from noundef functions.
This also decreases binary sizes under msan.
To go back to the previous behavior, pass `-fno-sanitize-memory-param-retval`.
Reviewed By: vitalybuka, MaskRay
Differential Revision: https://reviews.llvm.org/D134669
Although using-enum's grammar is 'using elaborated-enum-specifier',
the lookup for the enum is ordinary lookup (and not the tagged-type
lookup that normally occurs wth an tagged-type specifier). Thus (a)
we can find typedefs and (b) do not find enum tags hidden by a non-tag
name (the struct stat thing).
This reimplements that part of using-enum handling, to address DR2621,
where clang's behaviour does not match std intent (and other
compilers).
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D134283
This is extremely useful for language tooling as it allows
for providing go-to-def/find-references/etc. for many
more situations than what is currently possible.
Differential Revision: https://reviews.llvm.org/D134087
After https://reviews.llvm.org/D134461, Clang will diagnose a warning if
trying to deference void pointers in C mode. However, this causes a lot
of noises when compiling a 5.19.11 Linux kernel.
This patch reduces the warning by marking deferencing void pointers in
unevaluated context OK, like `sizeof(*void_ptr)`, `typeof(*void_ptr)`
and etc.
Fixes https://github.com/ClangBuiltLinux/linux/issues/1720
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D134702
It is data mapping ordering problem.
According omp spec
If one or more map clauses are present, the list item conversions that
are performed for any use_device_ptr or use_device_addr clause occur
after all variables are mapped on entry to the region according to those
map clauses.
The change is to put mapping data for use_device_addr at end of data
mapping array.
Differential Revision: https://reviews.llvm.org/D134556
The following three static functions in `clang/lib/Driver/ToolChains/CommonArgs.cpp`
```
static void renderRpassOptions(...)
static void renderRemarksOptions(...)
static void renderRemarksHotnessOptions(...)
```
use `--plugin-opt` for the plugin option prefix, while the function `tools::addLTOOptions` uses `-plugin-opt`. This patch makes sure that we only use `-plugin-opt` (single dash) everywhere. It is not clear to me that why we decided to use `--plugin-opt` in https://reviews.llvm.org/D85810. If using `--plugin-opt` is intended, I'd love to hear the reason and I will close this patch.
We intend to followup this patch with a few other patches that teach `clang` to pass plugin options to the AIX linker, which uses a different prefix (`-bplugin_opt:`).
Reviewed By: w2yehia
Differential Revision: https://reviews.llvm.org/D134668
The ptxas assembler does not allow the `-g` flag along with
optimizations. Normally this is degraded to line info in the driver, but
when using LTO we did not have this step and the linker wrapper was not
correctly degrading the option. Note that this will not work if the user
does not pass `-g` again to the linker invocation. That will require
setting some flags in the binary to indicate that debugging was used
when building.
This fixes#57990
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D134660
As reported in GH #57945, this would crash because the decl context for
the lambda was being loaded via 'getNonClosureContext', which only gets
CODE contexts, so a global lambda was getting 'nullptr' here instead.
This patch does some work to make sure we get a valid/valuable
declcontext here instead.
A given function is compatible with all previous arch versions.
To avoid compering values of the attribute this logic adds all predecessor
architecture values.
Reviewed By: dmgreen, DavidSpickett
Differential Revision: https://reviews.llvm.org/D134353
HIP is able to unbundle archive of bundled bitcode.
However currently there are two bugs:
1. archives passed by -l: are not unbundled.
2. archives passed as input files are not unbundled
The actual file name of an archive passed by -l: should
not be prefixed with lib and appended with '.a',
but the file path is prefixed with paths in '-L' options.
The actual file name of an archive passed as an input file
stays the same, not affected by the '-L' options.
This reverts commit 192d69f7e6.
This fixes the condition to check whether this is a situation where we
are in a recovery-expr'ed concept a little better, so we don't access an
inactive member of a union, which should make the bots happy.
Differential Revision: https://reviews.llvm.org/D134542
This reverts commit e3d14bee23.
There are apparently a large number of crashes in libcxx and some JSON
Parser thing, so clearly this has some sort of serious issue. Reverting
so I can take some time to figure out what is going on.
Discovered by reducing a different problem, we currently assert because
we failed to make the constraint expressions not dependent, since a
RecoveryExpr cannot be transformed.
This patch fixes that, and gets reasonably nice diagnostics by
introducing a concept (hah!) of "ContainsErrors" to the Satisfaction
types, which causes us to treat the candidate as non-viable.
However, just making THAT candidate non-viable would result in choosing
the 'next best' canddiate, which can result in awkward errors, where we
start evaluating a candidate that is not intended to be selected.
Because of this, and to make diagnostics more relevant, we now just
cause the entire lookup to result in a 'no-viable-candidates'.
This means we will only emit the list of candidates, rather than any
cascading failures.
In case when the prvalue is returned from the function (kind is one
of `SimpleReturnedValueKind`, `CXX17ElidedCopyReturnedValueKind`),
then it construction happens in context of the caller.
We pass `BldrCtx` explicitly, as `currBldrCtx` will always refer to callee
context.
In the following example:
```
struct Result {int value; };
Result create() { return Result{10}; }
int accessValue(Result r) { return r.value; }
void test() {
for (int i = 0; i < 2; ++i)
accessValue(create());
}
```
In case when the returned object was constructed directly into the
argument to a function call `accessValue(create())`, this led to
inappropriate value of `blockCount` being used to locate parameter region,
and as a consequence resulting object (from `create()`) was constructed
into a different region, that was later read by inlined invocation of
outer function (`accessValue`).
This manifests itself only in case when calling block is visited more
than once (loop in above example), as otherwise there is no difference
in `blockCount` value between callee and caller context.
This happens only in case when copy elision is disabled (before C++17).
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D132030