Leverage the method OpenCL uses that adds C intrinsics when the lookup
failed. There is no need to define C intrinsics in the header file any
more. It could help to avoid the large header file to speed up the
compilation of RVV source code. Besides that, only the C intrinsics used
by the users will be added into the declaration table.
This patch is based on https://reviews.llvm.org/D103228 and inspired by
OpenCL implementation.
### Experimental Results
#### TL;DR:
- Binary size of clang increase ~200k, which is +0.07% for debug build and +0.13% for release build.
- Single file compilation speed up ~33x for debug build and ~8.5x for release build
- Regression time reduce ~10% (`ninja check-all`, enable all targets)
#### Header size change
```
| size | LoC |
------------------------------
Before | 4,434,725 | 69,749 |
After | 6,140 | 162 |
```
#### Single File Compilation Time
Testcase:
```
#include <riscv_vector.h>
vint32m1_t test_vadd_vv_vfloat32m1_t(vint32m1_t op1, vint32m1_t op2, size_t vl) {
return vadd(op1, op2, vl);
}
```
##### Debug build:
Before:
```
real 0m19.352s
user 0m19.252s
sys 0m0.092s
```
After:
```
real 0m0.576s
user 0m0.552s
sys 0m0.024s
```
~33x speed up for debug build
##### Release build:
Before:
```
real 0m0.773s
user 0m0.741s
sys 0m0.032s
```
After:
```
real 0m0.092s
user 0m0.080s
sys 0m0.012s
```
~8.5x speed up for release build
#### Regression time
Note: the failed case is `tools/llvm-debuginfod-find/debuginfod.test` which is unrelated to this patch.
##### Debug build
Before:
```
Testing Time: 1358.38s
Skipped : 11
Unsupported : 446
Passed : 75767
Expectedly Failed: 190
Failed : 1
```
After
```
Testing Time: 1220.29s
Skipped : 11
Unsupported : 446
Passed : 75767
Expectedly Failed: 190
Failed : 1
```
##### Release build
Before:
```
Testing Time: 381.98s
Skipped : 12
Unsupported : 1407
Passed : 74765
Expectedly Failed: 176
Failed : 1
```
After:
```
Testing Time: 346.25s
Skipped : 12
Unsupported : 1407
Passed : 74765
Expectedly Failed: 176
Failed : 1
```
#### Binary size of clang
##### Debug build
Before
```
text data bss dec hex filename
335261851 12726004 552812 348540667 14c64efb bin/clang
```
After
```
text data bss dec hex filename
335442803 12798708 552940 348794451 14ca2e53 bin/clang
```
+253K, +0.07% code size
##### Release build
Before
```
text data bss dec hex filename
144123975 8374648 483140 152981763 91e5103 bin/clang
```
After
```
text data bss dec hex filename
144255762 8447296 483268 153186326 9217016 bin/clang
```
+204K, +0.13%
Authored-by: Kito Cheng <kito.cheng@sifive.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Reviewed By: khchen, aaron.ballman
Differential Revision: https://reviews.llvm.org/D111617
Currently in Sema::ActOnEnumBody(...) when calculating NumPositiveBits we miss
the case where there is only a single enumerator with value zero and the case of
an empty enum. In both cases we end up with zero positive bits when in fact we
need one bit to store the value zero.
This PR updates the calculation to account for these cases.
Differential Revision: https://reviews.llvm.org/D130301
This includes the revised provisions of [basic.lookup.argdep] p4
1. ADL is amended to handle p 4.3 where functions in trasitively imported modules may
become visible when they are exported in the same namespace as a visible type.
2. If a function is in a different modular TU, and has internal-linkage, we invalidate
its entry in an overload set.
[basic.lookup.argdep] p5 ex 2 now passes.
Differential Revision: https://reviews.llvm.org/D129174
Currently the C++20 concepts are only merged in `ASTReader`, i.e. when
coming from different TU. This can causes ambiguious reference errors when
trying to access the same concept that should otherwise be merged.
Please see the added test for an example.
Note that we currently use `ASTContext::isSameEntity` to check for ODR
violations. However, it will not check that concept requirements match.
The same issue holds for mering concepts from different TUs, I added a
FIXME and filed a GH issue to track this:
https://github.com/llvm/llvm-project/issues/56310
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D128921
As per P2327R1,
|=, &= and ^= are no longer deprecated in all languages mode.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D130421
In cases where a non-template function is defined inside a function
template, we don't have information about the original uninstantiated
version. In the case of concepts instantiation, we will need the
ability to get back to the original template. This patch splits a piece
of the deferred concepts instantaition patch off to accomplish the
storage of this, with minor runtime overhead, and zero additional
storage.
The re-land fixes module map module dependencies seen on Greendragon, but
not in the clang test suite.
---
Currently we only implement this for the Itanium ABI since the correct
mangling for the initializers in other ABIs is not yet known.
Intended result:
For a module interface [which includes partition interface and implementation
units] (instead of the generic CXX initializer) we emit a module init that:
- wraps the contained initializations in a control variable to ensure that
the inits only happen once, even if a module is imported many times by
imports of the main unit.
- calls module initializers for imported modules first. Note that the
order of module import is not significant, and therefore neither is the
order of imported module initializers.
- We then call initializers for the Global Module Fragment (if present)
- We then call initializers for the current module.
- We then call initializers for the Private Module Fragment (if present)
For a module implementation unit, or a non-module TU that imports at least one
module we emit a regular CXX init that:
- Calls the initializers for any imported modules first.
- Then proceeds as normal with remaining inits.
For all module unit kinds we include a global constructor entry, this allows
for the (in most cases unusual) possibility that a module object could be
included in a final binary without a specific call to its initializer.
Implementation:
- We provide the module pointer in the AST Context so that CodeGen can act
on it and its sub-modules.
- We need to account for module build lines like this:
` clang -cc1 -std=c++20 Foo.pcm -emit-obj -o Foo.o` or
` clang -cc1 -std=c++20 -xc++-module Foo.cpp -emit-obj -o Foo.o`
- in order to do this, we add to ParseAST to set the module pointer in
the ASTContext, once we establish that this is a module build and we
know the module pointer. To be able to do this, we make the query for
current module public in Sema.
- In CodeGen, we determine if the current build requires a CXX20-style module
init and, if so, we defer any module initializers during the "Eagerly
Emitted" phase.
- We then walk the module initializers at the end of the TU but before
emitting deferred inits (which adds any hidden and static ones, fixing
https://github.com/llvm/llvm-project/issues/51873 ).
- We then proceed to emit the deferred inits and continue to emit the CXX
init function.
Differential Revision: https://reviews.llvm.org/D126189
EnumDecl's promotion type is set either to the parsed type or calculated type
after completing its definition. When it's bool type and has no definition,
its promotion type is bool which is not allowed by clang.
Fixes#56560.
Differential Revision: https://reviews.llvm.org/D130210
Currently, clang may meet an infinite loop in a very tricky case when it
iterates the default args. This patch tries to fix this by adding a
`fixed` check.
re-land fixes an unwanted interaction with module-map modules, seen in
Greendragon testing.
This provides updates to
[class.mfct]:
Pre C++20 [class.mfct]p2:
A member function may be defined (8.4) in its class definition, in
which case it is an inline member function (7.1.2)
Post C++20 [class.mfct]p1:
If a member function is attached to the global module and is defined
in its class definition, it is inline.
and
[class.friend]:
Pre-C++20 [class.friend]p5
A function can be defined in a friend declaration of a
class . . . . Such a function is implicitly inline.
Post C++20 [class.friend]p7
Such a function is implicitly an inline function if it is attached
to the global module.
We add the output of implicit-inline to the TextNodeDumper, and amend
a couple of existing tests to account for this, plus add tests for the
cases covered above.
Differential Revision: https://reviews.llvm.org/D129045
Some code [0] consider that trailing arrays are flexible, whatever their size.
Support for these legacy code has been introduced in
f8f6324983 but it prevents evaluation of
__builtin_object_size and __builtin_dynamic_object_size in some legit cases.
Introduce -fstrict-flex-arrays=<n> to have stricter conformance when it is
desirable.
n = 0: current behavior, any trailing array member is a flexible array. The default.
n = 1: any trailing array member of undefined, 0 or 1 size is a flexible array member
n = 2: any trailing array member of undefined or 0 size is a flexible array member
This takes into account two specificities of clang: array bounds as macro id
disqualify FAM, as well as non standard layout.
Similar patch for gcc discuss here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836
[0] https://docs.freebsd.org/en/books/developers-handbook/sockets/#sockets-essential-functions
Previously, we forget to handle reachability for deduction guide.
The deduction guide is a hint to the compiler. And the deduction guide
should be able to use if the corresponding template decl is reachable.
This reverts commit 7c51f02eff because it
stills breaks the LLDB tests. This was re-landed without addressing the
issue or even agreement on how to address the issue. More details and
discussion in https://reviews.llvm.org/D112374.
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.
The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.
An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.
---
Troubleshooting list to deal with any breakage seen with this patch:
1) The most likely effect one would see by this patch is a change in how
a type is printed. The type printer will, by design and default,
print types as written. There are customization options there, but
not that many, and they mainly apply to how to print a type that we
somehow failed to track how it was written. This patch fixes a
problem where we failed to distinguish between a type
that was written without any elaborated-type qualifiers,
such as a 'struct'/'class' tags and name spacifiers such as 'std::',
and one that has been stripped of any 'metadata' that identifies such,
the so called canonical types.
Example:
```
namespace foo {
struct A {};
A a;
};
```
If one were to print the type of `foo::a`, prior to this patch, this
would result in `foo::A`. This is how the type printer would have,
by default, printed the canonical type of A as well.
As soon as you add any name qualifiers to A, the type printer would
suddenly start accurately printing the type as written. This patch
will make it print it accurately even when written without
qualifiers, so we will just print `A` for the initial example, as
the user did not really write that `foo::` namespace qualifier.
2) This patch could expose a bug in some AST matcher. Matching types
is harder to get right when there is sugar involved. For example,
if you want to match a type against being a pointer to some type A,
then you have to account for getting a type that is sugar for a
pointer to A, or being a pointer to sugar to A, or both! Usually
you would get the second part wrong, and this would work for a
very simple test where you don't use any name qualifiers, but
you would discover is broken when you do. The usual fix is to
either use the matcher which strips sugar, which is annoying
to use as for example if you match an N level pointer, you have
to put N+1 such matchers in there, beginning to end and between
all those levels. But in a lot of cases, if the property you want
to match is present in the canonical type, it's easier and faster
to just match on that... This goes with what is said in 1), if
you want to match against the name of a type, and you want
the name string to be something stable, perhaps matching on
the name of the canonical type is the better choice.
3) This patch could exposed a bug in how you get the source range of some
TypeLoc. For some reason, a lot of code is using getLocalSourceRange(),
which only looks at the given TypeLoc node. This patch introduces a new,
and more common TypeLoc node which contains no source locations on itself.
This is not an inovation here, and some other, more rare TypeLoc nodes could
also have this property, but if you use getLocalSourceRange on them, it's not
going to return any valid locations, because it doesn't have any. The right fix
here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive
into the inner TypeLoc to get the source range if it doesn't find it on the
top level one. You can use getLocalSourceRange if you are really into
micro-optimizations and you have some outside knowledge that the TypeLocs you are
dealing with will always include some source location.
4) Exposed a bug somewhere in the use of the normal clang type class API, where you
have some type, you want to see if that type is some particular kind, you try a
`dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an
ElaboratedType which has a TypeDefType inside of it, which is what you wanted to match.
Again, like 2), this would usually have been tested poorly with some simple tests with
no qualifications, and would have been broken had there been any other kind of type sugar,
be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType.
The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper
into the type. Or use `getAsAdjusted` when dealing with TypeLocs.
For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast.
5) It could be a bug in this patch perhaps.
Let me know if you need any help!
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D112374
The code would assume that SubstExpr() cannot fail on concept
specialization. This is incorret - we give up on some things after fatal
error occurred, since there's no value in doing futher work that the
user will not see anyway. In this case, this lead to crash.
The fatal error is simulated in tests with -ferror-limit=1, but this
could happen in other cases too.
Fixes https://github.com/llvm/llvm-project/issues/55401
Differential Revision: https://reviews.llvm.org/D129499
Previous warning went on whenever a struct with a struct member with alignment => 16
was declared. This led to too many false positives and led to diagnostic lit failures
due to it being emitted too frequently. Only emit the warning when such a struct and
that struct contains a member that has an alignment of 16 bytes is passed to a caller
function since this is where the potential binary compatibility issue with XL 16.1.0
and older exists.
Reviewed By: sfertile, aaron.ballman
Differential Revision: https://reviews.llvm.org/D118350
This reverts commit bdc6974f92 because it
breaks all the LLDB tests that import the std module.
import-std-module/array.TestArrayFromStdModule.py
import-std-module/deque-basic.TestDequeFromStdModule.py
import-std-module/deque-dbg-info-content.TestDbgInfoContentDequeFromStdModule.py
import-std-module/forward_list.TestForwardListFromStdModule.py
import-std-module/forward_list-dbg-info-content.TestDbgInfoContentForwardListFromStdModule.py
import-std-module/list.TestListFromStdModule.py
import-std-module/list-dbg-info-content.TestDbgInfoContentListFromStdModule.py
import-std-module/queue.TestQueueFromStdModule.py
import-std-module/stack.TestStackFromStdModule.py
import-std-module/vector.TestVectorFromStdModule.py
import-std-module/vector-bool.TestVectorBoolFromStdModule.py
import-std-module/vector-dbg-info-content.TestDbgInfoContentVectorFromStdModule.py
import-std-module/vector-of-vectors.TestVectorOfVectorsFromStdModule.py
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45301/
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.
The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.
An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D112374
Otherwise these functions are not instantiated and we end up with an undefined
symbol.
Fix#55560
Differential Revision: https://reviews.llvm.org/D128119
This provides updates to
[class.mfct]:
Pre C++20 [class.mfct]p2:
A member function may be defined (8.4) in its class definition, in
which case it is an inline member function (7.1.2)
Post C++20 [class.mfct]p1:
If a member function is attached to the global module and is defined
in its class definition, it is inline.
and
[class.friend]:
Pre-C++20 [class.friend]p5
A function can be defined in a friend declaration of a
class . . . . Such a function is implicitly inline.
Post C++20 [class.friend]p7
Such a function is implicitly an inline function if it is attached
to the global module.
We add the output of implicit-inline to the TextNodeDumper, and amend
a couple of existing tests to account for this, plus add tests for the
cases covered above.
Differential Revision: https://reviews.llvm.org/D129045
Currently we only implement this for the Itanium ABI since the correct
mangling for the initializers in other ABIs is not yet known.
Intended result:
For a module interface [which includes partition interface and implementation
units] (instead of the generic CXX initializer) we emit a module init that:
- wraps the contained initializations in a control variable to ensure that
the inits only happen once, even if a module is imported many times by
imports of the main unit.
- calls module initializers for imported modules first. Note that the
order of module import is not significant, and therefore neither is the
order of imported module initializers.
- We then call initializers for the Global Module Fragment (if present)
- We then call initializers for the current module.
- We then call initializers for the Private Module Fragment (if present)
For a module implementation unit, or a non-module TU that imports at least one
module we emit a regular CXX init that:
- Calls the initializers for any imported modules first.
- Then proceeds as normal with remaining inits.
For all module unit kinds we include a global constructor entry, this allows
for the (in most cases unusual) possibility that a module object could be
included in a final binary without a specific call to its initializer.
Implementation:
- We provide the module pointer in the AST Context so that CodeGen can act
on it and its sub-modules.
- We need to account for module build lines like this:
` clang -cc1 -std=c++20 Foo.pcm -emit-obj -o Foo.o` or
` clang -cc1 -std=c++20 -xc++-module Foo.cpp -emit-obj -o Foo.o`
- in order to do this, we add to ParseAST to set the module pointer in
the ASTContext, once we establish that this is a module build and we
know the module pointer. To be able to do this, we make the query for
current module public in Sema.
- In CodeGen, we determine if the current build requires a CXX20-style module
init and, if so, we defer any module initializers during the "Eagerly
Emitted" phase.
- We then walk the module initializers at the end of the TU but before
emitting deferred inits (which adds any hidden and static ones, fixing
https://github.com/llvm/llvm-project/issues/51873 ).
- We then proceed to emit the deferred inits and continue to emit the CXX
init function.
Differential Revision: https://reviews.llvm.org/D126189
Add a fix-it for the common case of setters/constructors using parameters with the same name as fields
```lang=c++
struct A{
int X;
A(int X) { /*this->*/X = X; }
void setX(int X) { /*this->*/X = X;
};
```
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D129202
The existing provision is not sufficient, it did not allow for the cases
where an implementation partition includes the primary module interface,
or for the case that an exported interface partition is contains a decl
that is then implemented in a regular implementation unit.
It is somewhat unfortunate that we have to compare top level module names
to achieve this, since built modules are not necessarily available.
TODO: It might be useful to cache a hash of the primary module name if
this test proves to be a significant load.
Differential Revision: https://reviews.llvm.org/D127624
See https://github.com/cplusplus/draft/pull/5204 for a detailed
background.
Simply, the test redundant-template-default-arg.cpp attached to this
patch should be accepted instead of being complained about the
redefinition.
Reviewed By: urnathan, rsmith, ChuanqiXu
Differential Revision: https://reviews.llvm.org/D118034
D127041 introduced the support for `fmax` and `fmin` such that we can also reprent
`atomic compare` and `atomic compare capture` with `atomicrmw` instruction. This
patch simply lifts the limitation we set before.
Depend on D127041.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D127042
Clang only allows you to use __attribute__((format)) on variadic functions. There are legit use cases for __attribute__((format)) on non-variadic functions, such as:
(1) variadic templates
```c++
template<typename… Args>
void print(const char *fmt, Args… &&args) __attribute__((format(1, 2))); // error: format attribute requires variadic function
```
(2) functions which take fixed arguments and a custom format:
```c++
void print_number_string(const char *fmt, unsigned number, const char *string) __attribute__((format(1, 2)));
// ^error: format attribute requires variadic function
void foo(void) {
print_number_string(“%08x %s\n”, 0xdeadbeef, “hello”);
print_number_string(“%d %s”, 0xcafebabe, “bar”);
}
```
This change allows Clang users to attach __attribute__((format)) to non-variadic functions, including functions with C++ variadic templates. It replaces the error with a GCC compatibility warning and improves the type checker to ensure that received arrays are treated like pointers (this is a possibility in C++ since references to template types can bind to arrays).
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D112579
rdar://84629099
Based on feedback from @Aaron.Ballman.
Remove the unused static ID char (can re-add it later if needed).
Add test to cover some invalid HLSL vector instantations ensuring
that the appropriate error messages are generated.
HLSL vector types are ext_vector types, but they are also exposed via a
template syntax `vector<T, #>`. This is morally equavalent to the code:
```c++
template <typename T, int Size>
using vector = T __attribute__((ext_vector_type(Size)))
```
The problem is that templates aren't supported before HLSL 2021, and
type aliases still aren't supported in HLSL.
To resolve this (and other issues where HLSL can't represent its own
types), we rely on an external AST & Sema source being registered for
HLSL code.
This patch adds the HLSLExternalSemaSource and registers the vector
type alias.
Depends on D127802
Differential Revision: https://reviews.llvm.org/D128012
This is a recommit of b822efc740,
reverted in dc34d8df4c. The commit caused
fails because the test ast-print-fp-pragmas.c did not specify particular
target, and it failed on targets which do not support constrained
intrinsics. The original commit message is below.
AST does not have special nodes for pragmas. Instead a pragma modifies
some state variables of Sema, which in turn results in modified
attributes of AST nodes. This technique applies to floating point
operations as well. Every AST node that can depend on FP options keeps
current set of them.
This technique works well for options like exception behavior or fast
math options. They represent instructions to the compiler how to modify
code generation for the affected nodes. However treatment of FP control
modes has problems with this technique. Modifying FP control mode
(like rounding direction) usually requires operations on hardware, like
writing to control registers. It must be done prior to the first
operation that depends on the control mode. In particular, such
operations are required for implementation of `pragma STDC FENV_ROUND`,
compiler should set up necessary rounding direction at the beginning of
compound statement where the pragma occurs. As there is no representation
for pragmas in AST, the code generation becomes a complicated task in
this case.
To solve this issue FP options are kept inside CompoundStmt. Unlike to FP
options in expressions, these does not affect any operation on FP values,
but only inform the codegen about the FP options that act in the body of
the statement. As all pragmas that modify FP environment may occurs only
at the start of compound statement or at global level, such solution
works for all relevant pragmas. The options are kept as a difference
from the options in the enclosing compound statement or default options,
it helps codegen to set only changed control modes.
Differential Revision: https://reviews.llvm.org/D123952
region with implicit default inside the member function.
This is to fix assert when field is referenced in OpenMP region with
default (first|private) clause inside member function.
The problem of assert is that the capture is not generated for the field.
This patch is to generate capture when the field is used with implicit
default, use it in the code, and save the capture off to make sure it is
considered from that point and add first/private clauses.
1> Add new field ImplicitDefaultFirstprivateFDs in SharingMapTy, used to
store generated capture fields info.
2> In function isOpenMPCaptureDecl: the caputer is generated and saved
in ImplicitDefaultFirstprivateFDs.
3> Add new help functions:
getImplicitFDCapExprDecl
isImplicitDefaultFirstprivateFD
addImplicitDefaultFirstprivateFD
4> Add addition argument in hasDSA to check default attribute for
default(first|private).
5> The isImplicitDefaultFirstprivateFD is used in VisitDeclRefExpr to
build the implicit clause.
6> Add new parameter "Context" for buildCaptureDecl, due to when capture
field, the parent context is needed to be used.
7> Change in isOpenMPPrivateDecl where stop propagate the capture from
the enclosing region for private variable.
8> In ActOnOpenMPFirstprivate/ActOnOpenMPPrivate, using captured info
to generate first|private clause.
9> Add new function isOpenMPRebuildMemberExpr: use to determine if field
needs to be rebuild during template instantiation.
Differential Revision: https://reviews.llvm.org/D127803