In OpenMP target offloading an in other offloading languages, we
maintain a difference between device functions and kernel functions.
Kernel functions must be visible to the host and act as the entry point
to the target device. Device functions however cannot be called directly
by the host and must be called by a kernel function. Currently, we make
all definitions on the device protected by default. Because device
functions cannot be called or used by the host they should have hidden
visibility. This allows for the definitions to be better optimized via
LTO or other passes.
This patch marks every device function in the AST as having `hidden`
visibility. The kernel function is generated later at code-gen and we
set its visibility explicitly so it should not be affected. This
prevents the user from overriding the visibility, but since the user
can't do anything with these symbols anyway there is no point exporting
them right now.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D136111
This is a change to how we represent type subsitution in the AST.
Instead of only storing the replaced type, we track the templated
entity we are substituting, plus an index.
We modify MLTAL to track the templated entity at each level.
Otherwise, it's much more expensive to go from the template parameter back
to the templated entity, and not possible to do in some cases, as when
we instantiate outer templates, parameters might still reference the
original entity.
This also allows us to very cheaply lookup the templated entity we saw in
the naming context and find the corresponding argument it was replaced
from, such as for implementing template specialization resugaring.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D131858
Seems there's a narrow case - where a packed type doesn't pack its base
subobjects (only fields), but when packing a field of the derived type,
GCC does pack the resulting total object - effectively packing the base
subobject.
So ensure that this non-pod type (owing to it having a base class) that
is packed, gets packed when placed in /another/ type that is also
packed.
This is a (smallish?) ABI fix to a regression introduced by D117616 -
but that regression/ABI break hasn't been released in LLVM as-yet (it's
been reverted on the release branch from the last two LLVM releases - I
probably should've just reverted the whole patch while we hashed out
this and other issues) so this change isn't itself an ABI break, as far
as LLVM releases are concerned (for folks releasing their own copies of
LLVM from ToT/without the LLVM release branch, and didn't opt into the
clang-abi-compat 14 or below (soon to be 15 or below, I guess I should
say) then this would be an ABI break against clang from the last 9
months or so)
Differential Revision: https://reviews.llvm.org/D135916
This introduces support for nullptr and nullptr_t in C2x mode. The
proposal accepted by WG14 is:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3042.htm
Note, there are quite a few incompatibilities with the C++ feature in
some of the edge cases of this feature. Therefore, there are some FIXME
comments in tests for testing behavior that might change after WG14 has
resolved national body comments (a process we've not yet started). So
this implementation might change slightly depending on the resolution
of comments. This is called out explicitly in the release notes as
well.
Differential Revision: https://reviews.llvm.org/D135099
The diagnostics engine is very smart about being passed a NamedDecl to
print as part of a diagnostic; it gets the "right" form of the name,
quotes it properly, etc. However, the result of using an unnamed tag
declaration was to print '' instead of anything useful.
This patch causes us to print the same information we'd have gotten if
we had printed the type of the declaration rather than the name of it,
as that's the most relevant information we can display.
Differential Revision: https://reviews.llvm.org/D134813
Just keep one version of the function and differentiate between
std::is_signed() and unsigned using a constexpr if, instead of having
two different versions for the signed and unsigned cases.
Implement ArrayInitLoopExprs, which are used in copy constructors to
copy arrays. Also fix problems encountered while doing that.
Differential Revision: https://reviews.llvm.org/D134361
Implement passing the this pointer to member functions and constructors.
The this pointer is passed via the stack. This changes the functions to
explicitly track whether they have a RVO pointer and a this pointer.
Differential Revision: https://reviews.llvm.org/D134699
This doesn't make much sense with functions that expect valid parameters
and/or a certain call stack on the caller side like This/RVO pointers.
Differential Revision: https://reviews.llvm.org/D135569
We can ignore casts where FromT and ToT are the same type. But that's a
performance optimization that I'd like to do later. For now, this code
is doing the right thing.
- store NestedNameSpecifier & Loc for the qualifiers
This information was entirely missing from the AST.
- expose the location information for qualifier/identifier/typedefs as typeloc
This allows many traversals/astmatchers etc to handle these generically along
with other references. The decl vs type split can help preserve typedef
sugar when https://github.com/llvm/llvm-project/issues/57659 is resolved.
- fix the SourceRange of UsingEnumDecl to include 'using'.
Fixes https://github.com/clangd/clangd/issues/1283
Differential Revision: https://reviews.llvm.org/D134303
Without this patch `VarDecl::hasDependent()` checks only undeduced auto types, so can give false negatives result for other undeduced types.
This lead to crashes in sequence `!VarDecl::hasDepentent()` => `getDeclAlign()`.
It seems this problem appeared since D105380
Reviewed By: mizvekov
Differential Revision: https://reviews.llvm.org/D135362
As with static bool for whatever reason printing them on their own
worked fine but wasn't handled when you printed the whole type.
I don't see a good way to test this from clang's side so our existing
tests will have to do.
We can now print all of the struct "A", so there's no need for a separate
one for static bool testing. I've not checked the output, just that it
succeeds. This saves us having to handle different min/max between systems.
Depends on D135169
Reviewed By: aeubanks, shafik
Differential Revision: https://reviews.llvm.org/D135170
In the context of caching clang invocations it is important to emit diagnostics in deterministic order;
the same clang invocation should result in the same diagnostic output.
rdar://100336989
Differential Revision: https://reviews.llvm.org/D135118
(Re-Apply with fixes to clang MicrosoftMangle.cpp)
This is a first step towards high level representation for fp8 types
that have been built in to hardware with near term roadmaps. Like the
BFLOAT16 type, the family of fp8 types are inspired by IEEE-754 binary
floating point formats but, due to the size limits, have been tweaked in
various ways in order to maximally use the range/precision in various
scenarios. The list of variants is small/finite and bounded by real
hardware.
This patch introduces the E5M2 FP8 format as proposed by Nvidia, ARM,
and Intel in the paper: https://arxiv.org/pdf/2209.05433.pdf
As the more conformant of the two implemented datatypes, we are plumbing
it through LLVM's APFloat type and MLIR's type system first as a
template. It will be followed by the range optimized E4M3 FP8 format
described in the paper. Since that format deviates further from the
IEEE-754 norms, it may require more debate and implementation
complexity.
Given that we see two parts of the FP8 implementation space represented
by these cases, we are recommending naming of:
* `F8M<N>` : For FP8 types that can be conceived of as following the
same rules as FP16 but with a smaller number of mantissa/exponent
bits. Including the number of mantissa bits in the type name is enough
to fully specify the type. This naming scheme is used to represent
the E5M2 type described in the paper.
* `F8M<N>F` : For FP8 types such as E4M3 which only support finite
values.
The first of these (this patch) seems fairly non-controversial. The
second is previewed here to illustrate options for extending to the
other known variant (but can be discussed in detail in the patch
which implements it).
Many conversations about these types focus on the Machine-Learning
ecosystem where they are used to represent mixed-datatype computations
at a high level. At that level (which is why we also expose them in
MLIR), it is important to retain the actual type definition so that when
lowering to actual kernels or target specific code, the correct
promotions, casts and rescalings can be done as needed. We expect that
most LLVM backends will only experience these types as opaque `I8`
values that are applicable to some instructions.
MLIR does not make it particularly easy to add new floating point types
(i.e. the FloatType hierarchy is not open). Given the need to fully
model FloatTypes and make them interop with tooling, such types will
always be "heavy-weight" and it is not expected that a highly open type
system will be particularly helpful. There are also a bounded number of
floating point types in use for current and upcoming hardware, and we
can just implement them like this (perhaps looking for some cosmetic
ways to reduce the number of places that need to change). Creating a
more generic mechanism for extending floating point types seems like it
wouldn't be worth it and we should just deal with defining them one by
one on an as-needed basis when real hardware implements a new scheme.
Hopefully, with some additional production use and complete software
stacks, hardware makers will converge on a set of such types that is not
terribly divergent at the level that the compiler cares about.
(I cleaned up some old formatting and sorted some items for this case:
If we converge on landing this in some form, I will NFC commit format
only changes as a separate commit)
Differential Revision: https://reviews.llvm.org/D133823
As @mizvekov suggested in D134772. This works great for D128750 when
dealing with AutoType's.
Reviewed By: mizvekov, erichkeane
Differential Revision: https://reviews.llvm.org/D135088
Turn it into a single Expr::isFlexibleArrayMemberLike method, as discussed in
https://discourse.llvm.org/t/rfc-harmonize-flexible-array-members-handling
Keep different behavior with respect to macro / template substitution, and
harmonize sharp edges: ObjC interface now behave as C struct wrt. FAM and
-fstrict-flex-arrays.
This does not impact __builtin_object_size interactions with FAM.
Differential Revision: https://reviews.llvm.org/D134791