The way this code checks whether a pointer is null is wrong for other
reasons; it doesn't actually check whether a null pointer constant is a
"constant" in the C++ standard sense. But this fix at least makes sure
we don't treat a non-null pointer as if it were null.
Fixes https://github.com/llvm/llvm-project/issues/57883
Differential Revision: https://reviews.llvm.org/D134928
Should have done this from the start. Since all the injected AST types
are in the hlsl namespace we should also put the header-defined types
and functions in there too.
This updates the basic_types test to run once with the namespaced types
and once without, and adds using declarations or namespaces calls in
other tests.
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D135973
This is an alternative of D120395 and D120411.
Previously we use `__bfloat16` as a typedef of `unsigned short`. The
name may give user an impression it is a brand new type to represent
BF16. So that they may use it in arithmetic operations and we don't have
a good way to block it.
To solve the problem, we introduced `__bf16` to X86 psABI and landed the
support in Clang by D130964. Now we can solve the problem by switching
intrinsics to the new type.
Reviewed By: LuoYuanke, RKSimon
Differential Revision: https://reviews.llvm.org/D132329
Previously, `LazyCompoundVal` bindings to subregions referred by
`LazyCopoundVals`, were not marked as //lazily copied//.
This change returns `LazyCompoundVals` from `getInterestingValues()`,
so their regions can be marked as //lazily copied// in `RemoveDeadBindingsWorker::VisitBinding()`.
Depends on D134947
Authored by: Tomasz Kamiński <tomasz.kamiński@sonarsource.com>
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D135136
To illustrate our current understanding, let's start with the following program:
https://godbolt.org/z/33f6vheh1
```lang=c++
void clang_analyzer_printState();
struct C {
int x;
int y;
int more_padding;
};
struct D {
C c;
int z;
};
C foo(D d, int new_x, int new_y) {
d.c.x = new_x; // B1
assert(d.c.x < 13); // C1
C c = d.c; // L
assert(d.c.y < 10); // C2
assert(d.z < 5); // C3
d.c.y = new_y; // B2
assert(d.c.y < 10); // C4
return c; // R
}
```
In the code, we create a few bindings to subregions of root region `d` (`B1`, `B2`), a constrain on the values (`C1`, `C2`, ….), and create a `lazyCompoundVal` for the part of the region `d` at point `L`, which is returned at point `R`.
Now, the question is which of these should remain live as long the return value of the `foo` call is live. In perfect a word we should preserve:
# only the bindings of the subregions of `d.c`, which were created before the copy at `L`. In our example, this includes `B1`, and not `B2`. In other words, `new_x` should be live but `new_y` shouldn’t.
# constraints on the values of `d.c`, that are reachable through `c`. This can be created both before the point of making the copy (`L`) or after. In our case, that would be `C1` and `C2`. But not `C3` (`d.z` value is not reachable through `c`) and `C4` (the original value of`d.c.y` was overridden at `B2` after the creation of `c`).
The current code in the `RegionStore` covers the use case (1), by using the `getInterestingValues()` to extract bindings to parts of the referred region present in the store at the point of copy. This also partially covers point (2), in case when constraints are applied to a location that has binding at the point of the copy (in our case `d.c.x` in `C1` that has value `new_x`), but it fails to preserve the constraints that require creating a new symbol for location (`d.c.y` in `C2`).
We introduce the concept of //lazily copied// locations (regions) to the `SymbolReaper`, i.e. for which a program can access the value stored at that location, but not its address. These locations are constructed as a set of regions referred to by `lazyCompoundVal`. A //readable// location (region) is a location that //live// or //lazily copied// . And symbols that refer to values in regions are alive if the region is //readable//.
For simplicity, we follow the current approach to live regions and mark the base region as //lazily copied//, and consider any subregions as //readable//. This makes some symbols falsy live (`d.z` in our example) and keeps the corresponding constraints alive.
The rename `Regions` to `LiveRegions` inside `RegionStore` is NFC change, that was done to make it clear, what is difference between regions stored in this two sets.
Regression Test: https://reviews.llvm.org/D134941
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
Reviewed By: martong, xazax.hun
Differential Revision: https://reviews.llvm.org/D134947
This is a fix related to D135414. The original intention was to keep `BaseFS` as a member of the worker and conditionally overlay it with local in-memory FS. The `move` of ref-counted `BaseFS` was not intended, and it's a bug.
Disabling parallelism in the "by-module-name" test reliably reproduces this, and the test itself doesn't *need* parallelism. (I think `-j 4` was cargo culted from another test.) Reusing that test to check for correct behavior...
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D136124
Support SV_DispatchThreadID attribute.
Translate it into dx.thread.id in clang codeGen.
Reviewed By: beanz, aaron.ballman
Differential Revision: https://reviews.llvm.org/D133983
This patch changes the kernels generated by OpenMP to have protected
visibility. This is unlikely to change anything functionally. However,
protected visibility better matches the behaviour of these GPU kernels.
We do not expect any pending shared library load to preempt these
kernels so we can specify a more restrictive visibility.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D136198
On AIX, `strict-dwarf` defaults to `true`. This patch implement this default behaviour. Additionally, it adds debug tuning tests.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D135908
The parser assumes that the lexer never emits coloncolon token for C code, but this assumption no longer holds in C2x attribute namespaces. As a result, stray coloncolon tokens out of attributes cause assertion failures and hangs in release build, which this patch tries to handle.
Crash input minimal example: `T n::v`
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D133248
GCC behavior regarding defining __SOFTFP__ when (implicitly) specifying
-mfloat-abi=softfp:
- compile without (implicit) FP: define __SOFTFP__
- compile with (implicit) FP: don't define __SOFTFP__
Currently Clang doesn't define __SOFTFP__ when softfp is specified, either with
or without FP. This patch brings Clang in line with GCC behavior.
This was raised by itaig1 over on Github:
https://github.com/llvm/llvm-project/issues/55755
Reviewed By: pratlucas
Differential Revision: https://reviews.llvm.org/D135680
In OpenMP target offloading an in other offloading languages, we
maintain a difference between device functions and kernel functions.
Kernel functions must be visible to the host and act as the entry point
to the target device. Device functions however cannot be called directly
by the host and must be called by a kernel function. Currently, we make
all definitions on the device protected by default. Because device
functions cannot be called or used by the host they should have hidden
visibility. This allows for the definitions to be better optimized via
LTO or other passes.
This patch marks every device function in the AST as having `hidden`
visibility. The kernel function is generated later at code-gen and we
set its visibility explicitly so it should not be affected. This
prevents the user from overriding the visibility, but since the user
can't do anything with these symbols anyway there is no point exporting
them right now.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D136111
Reported as it showed up as a constriants failure after the deferred
instantiation patch, we were checking constraints TWICE after overload
resolution. The first is during overload resolution, the second is when
diagnosing a use.
This patch modifies DiagnoseUseOfDecl to skip the trailing requires
clause check in some cases. First, of course, after choosing a candidate
after overload resolution.
The second is when evaluating a shadow using constructor, which had its
constraints checked when picking a constructor (as this is ALWAYS an
overload situation!).
Differential Revision: https://reviews.llvm.org/D135772
Make MTE intrinsics available in function scope too.
Followup from D133359.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D136062
Make MTE intrinsics available in function scope too.
Followup from D133359.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D136062
Currently generation of align assumptions for OpenMP simd construct is done
outside OMPIRBuilder for C code and it is not supported for Fortran.
According to OpenMP 5.0 standard (2.9.3) only pointers and arrays can be
aligned for C code.
If given aligned variable is pointer, then Clang generates the following set
of the LLVM IR isntructions to support simd align clause:
; memory allocation for pointer address:
%A.addr = alloca ptr, align 8
; some LLVM IR code
; Alignment instructions (alignment is equal to 32):
%0 = load ptr, ptr %A.addr, align 8
call void @llvm.assume(i1 true) [ "align"(ptr %0, i64 32) ]
If given aligned variable is array, then Clang generates the following set
of the LLVM IR isntructions to support simd align clause:
; memory allocation for array:
%B = alloca [10 x i32], align 16
; some LLVM IR code
; Alignment instructions (alignment is equal to 32):
%arraydecay = getelementptr inbounds [10 x i32], ptr %B, i64 0, i64 0
call void @llvm.assume(i1 true) [ "align"(ptr %arraydecay, i64 32) ]
OMPIRBuilder was modified to generate aligned assumptions. It generates only
llvm.assume calls. Frontend is responsible for generation of aligned pointer
and getting the default alignment value if user does not specify it in aligned
clause.
Unit and regression tests were added to check if aligned clause was handled correctly.
Differential Revision: https://reviews.llvm.org/D133578
Reviewed By: jdoerfert
In O0, all the functions (except the always-inline-functions) in the importee
shouldn't be imported for compilation speeds.
But with optimizations, all the potentially called function in the
importee should be imported to not prevent any inter-procedural
optimizations (primarily inline), which is pretty important for runtime
performances.
This patch adds the tests for the feature.
Extra NUL does not impact functionality of the generated code, but it confuses
various NVIDIA tools used to examine embedded GPU binaries.
Differential Revision: https://reviews.llvm.org/D135832
''register(ID, space)'' like register(t3, space1) will be translated into
i32 3, i32 1 as the last 2 operands for resource annotation metadata.
NamedMetadata for CBuffers and SRVs are added as "hlsl.srvs" and "hlsl.cbufs".
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D130951
This follows the path that AArch64 SVE has taken. Doing this via a function attribute set in the frontend is basically a workaround for the fact that several analyzes which need the information (i.e. known bits, lvi, scev) can't easily use TTI without significant amounts of plumbing changes.
This patch hard codes "v" numbers, and directly follows the SVE precedent as a result. In a follow up, I hope to drive this from RISCVISAInfo.h/cpp instead, but the MinVLen number being returned from that interface seemed to always be 0 (which is wrong), and I haven't figured out what's going wrong there.
Differential Revision: https://reviews.llvm.org/D135894
Calling `getFunctionLinkage(CalleeInfo.getCalleeDecl())` will crash when the declaration does not have a body, e.g., `extern void foo();`. Instead, we can use `isExternallyVisible()` to see if the delcaration has internal linkage.
I believe using `!isExternallyVisible()` is correct because the clang linkage must be `InternalLinkage` or `UniqueExternalLinkage`, both of which are "internal linkage" in llvm.
9c26f51f5e/clang/include/clang/Basic/Linkage.h (L28-L40)
Fixes https://github.com/llvm/llvm-project/issues/54139
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D135926
PPC64 ABI pass aggregates smaller than a register into the least
significant bits of the register. In the case of variadic functions,
they will end up right-aligned in their argument slots in the argument
area on big-endian targets. Apply right-alignment for these aggregates.
Fixes#55900.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D133338
This is a change to how we represent type subsitution in the AST.
Instead of only storing the replaced type, we track the templated
entity we are substituting, plus an index.
We modify MLTAL to track the templated entity at each level.
Otherwise, it's much more expensive to go from the template parameter back
to the templated entity, and not possible to do in some cases, as when
we instantiate outer templates, parameters might still reference the
original entity.
This also allows us to very cheaply lookup the templated entity we saw in
the naming context and find the corresponding argument it was replaced
from, such as for implementing template specialization resugaring.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D131858
The extension that allows for pointer arithmetic on 'void' types treats
the 'void' as a 'char'. We should use the 'char' size in bits instead of
1 (the number of bytes) to allow warning when pointer arithmetic would
go out of bounds.
Differential Revision: https://reviews.llvm.org/D135989
D121868 provided support for -darwin-target-variant-triple, but the
support for -darwin-target-variant-sdk-version was still missing for
cc1as. These changes build upon the previous and provides such support.
- Extracted the common code to handle -darwin-target-variant-triple and
-darwin-target-variant-sdk-version in the Darwin toolchain to a method
that can be used for both the cc1 and the cc1as job construction.
cc1as does not support some of the parameters that were provided to
cc1, so the same code cannot be used for both.
- Invoke that new common code when constructing a cc1as invocation.
- Parse the new -darwin-target-variant-sdk-version in the cc1as driver.
Apply its value to the MCObjectFileInfo to generate the right values
in the object files.
- Includes two new tests that check that cc1as uses the provided values
in -darwin-target-variant-sdk and that the Clang driver creates the
jobs with the correct arguments.
Differential Revision: https://reviews.llvm.org/D135729
Seems there's a narrow case - where a packed type doesn't pack its base
subobjects (only fields), but when packing a field of the derived type,
GCC does pack the resulting total object - effectively packing the base
subobject.
So ensure that this non-pod type (owing to it having a base class) that
is packed, gets packed when placed in /another/ type that is also
packed.
This is a (smallish?) ABI fix to a regression introduced by D117616 -
but that regression/ABI break hasn't been released in LLVM as-yet (it's
been reverted on the release branch from the last two LLVM releases - I
probably should've just reverted the whole patch while we hashed out
this and other issues) so this change isn't itself an ABI break, as far
as LLVM releases are concerned (for folks releasing their own copies of
LLVM from ToT/without the LLVM release branch, and didn't opt into the
clang-abi-compat 14 or below (soon to be 15 or below, I guess I should
say) then this would be an ABI break against clang from the last 9
months or so)
Differential Revision: https://reviews.llvm.org/D135916
The documentation specifies that the input and ouput for the builtin
__builtin_crypto_vsbox should be vector unsigned char.
This patch fixes this type for the builtin.
Reviewed By: amyk
Differential Revision: https://reviews.llvm.org/D135834
functions in getter/setter functions of non-trivial C struct properties
This fixes a bug where the getter/setter functions were doing a trivial
copy instead of calling the synthesized functions that copy non-trivial
C struct types.
This fixes https://github.com/llvm/llvm-project/issues/56680.
Differential Revision: https://reviews.llvm.org/D131701
A given arch feature might enabled by a pragma or a function attribute so in this cases would be nice to use intrinsics.
Today GCC offers the intrinsics without the march flag[1].
PR[2] for ACLE to clarify the intention and remove the need for -march flag for a given intrinsics.
This is going to be more useful when D127812 lands.
[1] https://godbolt.org/z/bxcMhav3z
[2] https://github.com/ARM-software/acle/pull/214
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D133359
There are currently two options that are used to tell the compiler to perform
unsafe floating-point optimizations:
'-ffast-math' and '-funsafe-math-optimizations'.
'-ffast-math' is enabled by default. It automatically enables the driver option
'-menable-unsafe-fp-math'.
Below is a table illustrating the special operations enabled automatically by
'-ffast-math', '-funsafe-math-optimizations' and '-menable-unsafe-fp-math'
respectively.
Special Operations -ffast-math -funsafe-math-optimizations -menable-unsafe-fp-math
MathErrno 0 1 1
FiniteMathOnly 1 0 0
AllowFPReassoc 1 1 1
NoSignedZero 1 1 1
AllowRecip 1 1 1
ApproxFunc 1 1 1
RoundingMath 0 0 0
UnsafeFPMath 1 0 1
FPContract fast on on
'-ffast-math' enables '-fno-math-errno', '-ffinite-math-only',
'-funsafe-math-optimzations' and sets 'FpContract' to 'fast'. The driver option
'-menable-unsafe-fp-math' enables the same special options than
'-funsafe-math-optimizations'. This is redundant.
We propose to remove the driver option '-menable-unsafe-fp-math' and use
instead, the setting of the special operations to set the function attribute
'unsafe-fp-math'. This attribute will be enabled only if those special
operations are enabled and if 'FPContract' is either 'fast' or set to the
default value.
Differential Revision: https://reviews.llvm.org/D135097
This introduces support for nullptr and nullptr_t in C2x mode. The
proposal accepted by WG14 is:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3042.htm
Note, there are quite a few incompatibilities with the C++ feature in
some of the edge cases of this feature. Therefore, there are some FIXME
comments in tests for testing behavior that might change after WG14 has
resolved national body comments (a process we've not yet started). So
this implementation might change slightly depending on the resolution
of comments. This is called out explicitly in the release notes as
well.
Differential Revision: https://reviews.llvm.org/D135099
This patch improves the LIT tests on the following :
1. The test on `uses_allocators` clause in the `target` region by
adding the respective CHECK lines. Allocator `omp_thread_mem_alloc`
is also added in the test.
2. The `defaultmap` clause wasn't being tested for the variable-
category `scalar` and the implicit-behavior `tofrom` with respect
to the OpenMP default version.
These improvements are inspired from SOLLVE tests.
SOLLVE repo: https://github.com/SOLLVE/sollve_vv
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D132855