A one-bit signed bit-field can only hold the values 0 and -1; this
corrects the diagnostic behavior accordingly.
Fixes#53253
Differential Revision: https://reviews.llvm.org/D131255
As was observed in https://reviews.llvm.org/D123627#3707635, it's
confusing that a user can write:
```
float rintf(void) {}
```
and get a warning, but writing:
```
float rintf() {}
```
gives an error. This patch changes the behavior so that both are
warnings, so that users who have functions which conflict with a
builtin identifier can still use that identifier as they wish.
Differential Revision: https://reviews.llvm.org/D131499
Earlier, if the QualType was sugared, then we would error out
as it was not a pointer type, for example,
typedef int *int_star;
int_star __ptr32 p;
Now, if ptr32 is given we apply it if the raw Canonical Type
(i.e., the desugared type) is a PointerType, instead of only
checking whether the sugared type is a pointer type.
As before, we still disallow ptr32 usage if the pointer is used
as a pointer to a member.
Differential Revision: https://reviews.llvm.org/D130123
Driver options taking a value typically use `=` as the separator, instead of a
space. Unfortunately many older driver options do not stick with the rule, but I
find -Xclang used a lot and will be convenient if -Xclang= exists.
For build systems using a string array instead of a string to indicate compiler options,
`["-Xclang=-foo"]` is more convenient than `["-Xclang", "-foo"]`.
If a tool wants to filter out -Xclang=-foo, it is trivial for the `=` form, but
complex for the space separated form.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D131455
In D130058 we diagnose the undefined behavior of setting the value outside the
range of the enumerations values for an enum without a fixed underlying type.
Based on feedback we will provide users to the ability to downgrade this
diagnostic to a waring to allow for a transition period. We expect to turn this
diagnostic to an error in the next release.
Differential Revision: https://reviews.llvm.org/D131307
Sharing the FileManager across implicit module builds currently leaks
paths looked up in an importer into the built module itself. This can
cause non-deterministic results across scans. It is especially bad for
modules since the path can be saved into the pcm file itself, leading to
stateful behaviour if the cache is shared.
This should not impact the number of real filesystem accesses in the
scanner, since it is already caching in the
DependencyScanningWorkerFilesystem.
Note: this change does not affect whether or not the FileManager is
shared across TUs in the scanner, which is a separate issue.
Differential Revision: https://reviews.llvm.org/D131412
Currently if an OpenMP program uses `linear` clause, and is compiled with
optimization, `llvm.lifetime.end` for variables listed in `linear` clause are
emitted too early such that there could still be uses after that. Let's take the
following code as example:
```
// loop.c
int j;
int *u;
void loop(int n) {
int i;
for (i = 0; i < n; ++i) {
++j;
u = &j;
}
}
```
We compile using the command:
```
clang -cc1 -fopenmp-simd -O3 -x c -triple x86_64-apple-darwin10 -emit-llvm loop.c -o loop.ll
```
The following IR (simplified) will be generated:
```
@j = local_unnamed_addr global i32 0, align 4
@u = local_unnamed_addr global ptr null, align 8
define void @loop(i32 noundef %n) local_unnamed_addr {
entry:
%j = alloca i32, align 4
%cmp = icmp sgt i32 %n, 0
br i1 %cmp, label %simd.if.then, label %simd.if.end
simd.if.then: ; preds = %entry
call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %j)
store ptr %j, ptr @u, align 8
call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j)
%0 = load i32, ptr %j, align 4
store i32 %0, ptr @j, align 4
br label %simd.if.end
simd.if.end: ; preds = %simd.if.then, %entry
ret void
}
```
The most important part is:
```
call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j)
%0 = load i32, ptr %j, align 4
store i32 %0, ptr @j, align 4
```
`%j` is still loaded after `@llvm.lifetime.end.p0(i64 4, ptr nonnull %j)`. This
could cause the backend incorrectly optimizes the code and further generates
incorrect code. The root cause is, when we emit a construct that could have
`linear` clause, it usually has the following pattern:
```
EmitOMPLinearClauseInit(S)
{
OMPPrivateScope LoopScope(*this);
...
EmitOMPLinearClause(S, LoopScope);
...
(void)LoopScope.Privatize();
...
}
EmitOMPLinearClauseFinal(S, [](CodeGenFunction &) { return nullptr; });
```
Variables that need to be privatized are added into `LoopScope`, which also
serves as a RAII object. When `LoopScope` is destructed and if optimization is
enabled, a `@llvm.lifetime.end` is also emitted for each privatized variable.
However, the writing back to original variables in `linear` clause happens after
the scope in `EmitOMPLinearClauseFinal`, causing the issue we see above.
A quick "fix" seems to be, moving `EmitOMPLinearClauseFinal` inside the scope.
However, it doesn't work. That's because the local variable map has been updated
by `LoopScope` such that a variable declaration is mapped to the privatized
variable, instead of the actual one. In that way, the following code will be
generated:
```
%0 = load i32, ptr %j, align 4
store i32 %0, ptr %j, align 4
call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j)
```
Well, now the life time is correct, but apparently the writing back is broken.
In this patch, a new function `OMPPrivateScope::restoreMap` is added and called
before calling `EmitOMPLinearClauseFinal`. This can make sure that
`EmitOMPLinearClauseFinal` can find the orignal varaibls to write back.
Fixes#56913.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D131272
When not set output, set default output to stdout.
When set output with -Fo and no -fcgl, set -emit-obj to generate dx container.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D130858
When performing device only compilation, there was an issue where
`cubin` outputs were being renamed to `cubin` despite the user's name.
This is required in a normal compilation flow as the Nvidia tools only
understand specific filenames instead of checking magic bytes for some
unknown reason. We do not want to perform this transformation when the
user is performing device only compilation.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D131278
Use of `ORIGINAL_PCH_DIR` record has been superseeded by making PCH/PCM files with relocatable paths at write time.
Removing this record is useful for producing an output-path-independent PCH file and enable sharing of the same PCH file even
when it was intended for a different output path.
Differential Revision: https://reviews.llvm.org/D131124
Sharing the FileManager between the importer and the module build should
only be an optimization. Add a cc1 option -fno-modules-share-filemanager
to allow us to test this. Fix the path to modulemap files, which
previously depended on the shared FileManager when using path mapped to
an external file in a VFS.
Differential Revision: https://reviews.llvm.org/D131076
Support for functions wmempcpy, wmemmove, wmemcmp is added to the checker.
The same tests are copied that exist for the non-wide versions, with
non-wide functions and character types changed to the wide version.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D130470
Closing https://github.com/llvm/llvm-project/issues/56919
It is meaningless to preserve the lifetime markers for the spilled
allocas in the coroutine frames and it would block some optimizations
too.
In D130807 we added the `skipprofile` attribute. This commit
changes the format so we can either `forbid` or `skip` profiling
functions by adding the `noprofile` or `skipprofile` attributes,
respectively. The behavior of the original format remains
unchanged.
Also, add the `skipprofile` attribute when using
`-fprofile-function-groups`.
This was originally landed as https://reviews.llvm.org/D130808 but was
reverted due to a Windows test failure.
Differential Revision: https://reviews.llvm.org/D131195
-E option will set entry function for hlsl.
The format is -E entry_name.
To avoid conflict with existing option with name 'E', add an extra prefix '--'.
A new field HLSLEntry is added to TargetOption.
To share code with HLSLShaderAttr, entry function will be add HLSLShaderAttr attribute too.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D124751
We would like to make the ACLE NEON and SVE intrinsics more useable by
gating them on the target, not by ifdef preprocessor macros. In order to
do this the types they use need to be available. This patches makes
__bf16 always available under AArch64 not just when the bf16
architecture feature is present. This bringing it in-line with GCC. In
subsequent patches the NEON bfloat16x8_t and SVE svbfloat16_t types
(along with bfloat16_t used in arm_sve.h) will be made unconditional
too.
The operations valid on the types are still very limited. They can be
used as a storage type, but the intrinsics used for convertions are
still behind an ifdef guard in arm_neon.h/arm_bf16.h.
Differential Revision: https://reviews.llvm.org/D130973
Close#56885: WG14 N2630 added %b to fprintf/fscanf and recommended %B for
fprintf. This patch teaches -Wformat %b for the printf/scanf family of functions
and %B for the printf family of functions.
glibc 2.35 and latest Android bionic added %b/%B printf support. From
https://www.openwall.com/lists/libc-coord/2022/07/ no scanf support is available
yet.
Like GCC, we don't test library support.
GCC 12 -Wformat -pedantic emits a warning:
> warning: ISO C17 does not support the ‘%b’ gnu_printf format [-Wformat=]
The behavior is not ported.
Note: `freebsd_kernel_printf` uses %b differently.
Reviewed By: aaron.ballman, dim, enh
Differential Revision: https://reviews.llvm.org/D131057
In D130807 we added the `skipprofile` attribute. This commit
changes the format so we can either `forbid` or `skip` profiling
functions by adding the `noprofile` or `skipprofile` attributes,
respectively. The behavior of the original format remains
unchanged.
Also, add the `skipprofile` attribute when using
`-fprofile-function-groups`.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D130808
As discussed in [0], this diff adds the `skipprofile` attribute to
prevent the function from being profiled while allowing profiled
functions to be inlined into it. The `noprofile` attribute remains
unchanged.
The `noprofile` attribute is used for functions where it is
dangerous to add instrumentation to while the `skipprofile` attribute is
used to reduce code size or performance overhead.
[0] https://discourse.llvm.org/t/why-does-the-noprofile-attribute-restrict-inlining/64108
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D130807
This patch enhances clang's ability to check compile-time determinable
string literals as format strings, and can give FixIt hints at literals
(unlike gcc). Issue https://github.com/llvm/llvm-project/issues/55805
mentiond two compile-time string cases. And this patch partially fixes
one.
```
constexpr const char* foo() {
return "%s %d";
}
int main() {
printf(foo(), "abc", "def");
return 0;
}
```
This patch enables clang check format string for this:
```
<source>:4:24: warning: format specifies type 'int' but the argument has type 'const char *' [-Wformat]
printf(foo(), "abc", "def");
~~~~~ ^~~~~
<source>:2:42: note: format string is defined here
constexpr const char *foo() { return "%s %d"; }
^~
%s
1 warning generated.
```
Reviewed By: aaron.ballman
Signed-off-by: YingChi Long <me@inclyc.cn>
Differential Revision: https://reviews.llvm.org/D130906
On targets with non-default program address space (e.g., Harvard
architectures), clang crashes when emitting Objective-C method metadata,
because the address of the method IMP cannot be bitcast to i8*. It similarly
crashes at messenger callsite with a failed bitcast.
Define the _imp field instead as i8 addrspace(1)* (or whatever the target's
program address space is). And in getMessageSendInfo(), create signatureType by
specifying the program address space.
Add a regression test using the AVR target. Test failed previously and passes
now. Checked codegen of the test for x86_64-apple-darwin19.6.0 and saw no
difference, as expected.
Reviewed By: rjmccall, dylanmckay
Differential Revision: https://reviews.llvm.org/D112113
This completes the implementation of P1091R3 and P1381R1.
This patch allow the capture of structured bindings
both for C++20+ and C++17, with extension/compat warning.
In addition, capturing an anonymous union member,
a bitfield, or a structured binding thereof now has a
better diagnostic.
We only support structured bindings - as opposed to other kinds
of structured statements/blocks. We still emit an error for those.
In addition, support for structured bindings capture is entirely disabled in
OpenMP mode as this needs more investigation - a specific diagnostic indicate the feature is not yet supported there.
Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented.
at the request of @shafik, i can confirm the correct behavior of lldb wit this change.
Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/52720
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D122768
Newer SDKs don't even provide libstdc++ headers, so it's effectively
never valid to build for libstdc++ unless the user explicitly asks
for it (in which case they will need to provide include paths and more).
We already correctly rejected:
typedef __attribute__((vector_size(16))) _BitInt(4) Ty;
but we would assert with:
typedef __attribute__((ext_vector_type(4))) _BitInt(4) Ty;
Now we issue the same error in both cases.
This completes the implementation of P1091R3 and P1381R1.
This patch allow the capture of structured bindings
both for C++20+ and C++17, with extension/compat warning.
In addition, capturing an anonymous union member,
a bitfield, or a structured binding thereof now has a
better diagnostic.
We only support structured bindings - as opposed to other kinds
of structured statements/blocks. We still emit an error for those.
In addition, support for structured bindings capture is entirely disabled in
OpenMP mode as this needs more investigation - a specific diagnostic indicate the feature is not yet supported there.
Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented.
at the request of @shafik, i can confirm the correct behavior of lldb wit this change.
Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/52720
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D122768
In ExprEngine::bindReturnValue() we cast an SVal to DefinedOrUnknownSVal,
however this SVal can also be Undefined, which leads to an assertion failure.
Fixes: #56873
Differential Revision: https://reviews.llvm.org/D130974
https://github.com/llvm/llvm-project/issues/56884
The root problem is in isOpenMPRebuildMemberExpr, it is only need to rebuild
for field expression. No need for member function call.
The fix is to check field for member expression and skip rebuild for member
function call.
Differential Revision: https://reviews.llvm.org/D131024
The getKeywordStatus function is a horrible mess of inter-dependent 'if'
statements that depend significantly on the ORDER of the checks. This
patch removes the dependency on order by checking each set-flag only
once.
It does this by looping through each of the set bits, and checks each
individual flag for its effect, then combines them at the end.
This might slow down startup performance slightly, as there are only a
few hundred keywords, and a vast majority will only get checked 1x
still.
This patch ALSO removes the KEYWORD_CONCEPTS flag, because it has since
become synonymous with C++20.
Differential Revision: https://reviews.llvm.org/D131007
The SystemZ ABI says that 128 bit integers should be aligned to only 8 bytes.
Reviewed By: Ulrich Weigand, Nikita Popov
Differential Revision: https://reviews.llvm.org/D130900
We didn't check that a destructor's name matches the directly enclosing class if the class was dependent.
I enabled the check we already had for non-dependent types, which seems to work. Added appropriate tests.
Fixes GitHub issue #56772
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D130936
Instructions.
We will switch all UndefValue to PoisonValue in follow up patches.
Thanks for Kito to help on verification with their interanl testsuite.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D126748
Instructions.
We will switch all UndefValue to PoisonValue in follow up patches.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D126746
vcompress.
We will switch all UndefValue to PoisonValue in follow up patches.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D126745
vwcvtu, vfabs and vfneg.
We will switch all UndefValue to PoisonValue in follow up patches.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D126744
The Clang compiler generates internal functions for OpenMP. Current
patch marks these functions as artificial.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D111521
This reverts commit 0cc3c184c7.
The changes did not account for templated code where one instantiation
may trigger the diagnostic but other instantiations will not, as in:
```
template <int I, class T>
void foo(int x) {
bool b1 = (x & sizeof(T)) == 8;
bool b2 = (x & I) == 8;
bool b3 = (x & 4) == 8;
}
void run(int x) {
foo<4, int>(8);
}
```
Added support for incremental mode 8 and 28 ie. `frontend::EmitBC:` and `frontend::PrintPreprocessedInput:`
Added supporting clang tests to test in clang-repl mode
Reviewed By: v.g.vassilev
Differential Revision: https://reviews.llvm.org/D125946
Previously when we add module initializer, we forget to handle header
units. This results that we couldn't compile a Hello World Example with
Header Units. This patch tries to fix this.
Reviewed By: iains
Differential Revision: https://reviews.llvm.org/D130871
Previously when we add module initializer, we forget to handle header
units. This results that we couldn't compile a Hello World Example with
Header Units. This patch tries to fix this.
Reviewed By: iains
Differential Revision: https://reviews.llvm.org/D130871
This test had to be disabled because ps4 targets don't support
-fuse-ld. Preferably, this should just be unsupported for ps4
targets. However no such lit feature exists so I have just gone
ahead and set the target explicitly. Moreover, this needs
to create a terminal link step, either an executable or shared
object to get the link error. With the change to the explicit
target I've had to also add -nostartfiles -nostdlib so that
clang doesn't pull crt files into the link which may not be
present. Again, this would likely be solved if this test
was unsupported for the one platform that disables -fuse-ld
1. Add policy functions support and tests for vadd, vmv, vfmv and all load
instructions except segment load. I didn't add all combination of policy
functions in test because it seem not to make sense.
2. Rename HasUnMaskedOverloaded to SupportOverloading.
3. vmv.s.x for ta policy could not have overloaded API.
4. This patch does not support all operations, I will have other follow-up
patches support all.
[RFC] https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/137
Reviewed By: kito-cheng, fakepaper56, fakepaper56
Differential Revision: https://reviews.llvm.org/D126742
I went over the output of the following mess of a command:
(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Differential Revision: https://reviews.llvm.org/D130827
HLSL Resource types need special annotations that the backend will use
to build out metadata and resource annotations that are required by
DirectX and Vulkan drivers in order to provide correct data bindings
for shader exeuction.
This patch adds some of the required data for unordered-access-views
(UAV) resource binding into the module flags. This data will evolve
over time to cover all the required use cases, but this should get
things started.
Depends on D130018.
Differential Revision: https://reviews.llvm.org/D130019
Closing https://github.com/llvm/llvm-project/issues/56803. The root
cause for this bug is that we lack a good method to detect the language
mdoe when parsing the command line. There is a FIXME too. Dut to we lack
a good solution now, keep the workaround.
This is successor for D125291. This revision would try to use
@llvm.threadlocal.address in clang to access TLS variable. The reason
why the OpenMP tests contains a lot of change is that they uses
utils/update_cc_test_checks.py to update their tests.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D129833
Without this patch, clang-repl incorrectly pass some tests when there's
error occured.
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D130422
We have seen random symbol not found "__cxa_throw" error in fuschia build bots and out-of-tree users. The understanding have been that they are built without exception support, but it turned out that these platforms have LLVM_STATIC_LINK_CXX_STDLIB ON so that they link libstdc++ to llvm statically. The reason why this is problematic for clang-repl is that by default clang-repl tries to find symbols from symbol table of executable and dynamic libraries loaded by current process. It needs to load another libstdc++, but the platform that had LLVM_STATIC_LINK_CXX_STDLIB turned on is usally those with missing or obsolate shared libstdc++ in the first place -- trying to load it again would be destined to fail eventually with a risk to introuduce mixed libstdc++ versions.
A proper solution that doesn't take a workaround is statically link the same libstdc++ by clang-repl side, but this is not possible with old JIT linker runtimedyld. New just-in-time linker JITLink handles this relatively well, but it's not availalbe in majority of platforms. For now, this patch just disables the building of clang-repl when LLVM_STATIC_LINK_CXX_STDLIB is ON and removes the "__cxa_throw" check in exception unittest as well as reverting previous exception check flag patch.
Reviewed By: v.g.vassilev
Differential Revision: https://reviews.llvm.org/D130788
This is a follow-up to D130058 to fix how we handle the Max value we obtain from
getValueRange(...) in IntExprEvaluator::VisitCastExpr(...) which in the case of
an enum that contains an enumerator with the max integer value will overflow by
one.
The fix is to decrement the value of Max and use slt and ult for comparison Vs
sle and ule.`
Differential Revision: https://reviews.llvm.org/D130811
This is useful to enable sharing of the same PCH file even when it's intended for a different output path.
The only information this option disables writing is for `ORIGINAL_PCH_DIR` record which is treated as optional and (when present) used as fallback for resolving input file paths relative to it.
Differential Revision: https://reviews.llvm.org/D130710
C99 6.7.4p2 clarifies that a function specifier can only be used in the
declaration of a function. _Noreturn is a function specifier, so it is
a constraint violation to write it on a structure or union field, but
we missed that case.
Fixes#56800
To make use of SPARC support in `getHostCPUName` as implemented by D130272
<https://reviews.llvm.org/D130272>, this patch uses it to handle
`-mcpu=native` and `-mtune=native`. To match GCC, this patch rejects
`-march` instead of silently treating it as a no-op.
Tested on `sparcv9-sun-solaris2.11` and checking that those options are
passed on as `-target-cpu` resp. `-tune-cpu` as expected.
Differential Revision: https://reviews.llvm.org/D130273
This is the Linux/sparc64 equivalent to D118021
<https://reviews.llvm.org/D118021>, necessary to provide an external
implementation of atomics on 32-bit SPARC which LLVM cannot inline even
with `-mcpu=v9` or an equivalent default.
Tested on `sparc64-unknown-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D130569
According to [basic.def.odr]p14, the same redeclarations in different TU
but not attached to a named module are allowed. But we didn't take care
of concept decl for this condition. This patch tries to fix this
problem.
Reviewed By: ilya-biryukov
Differention Revision: https://reviews.llvm.org/D130614
Add the ability to put __attribute__((maybe_undef)) on function arguments.
Clang codegen introduces a freeze instruction on the argument.
Differential Revision: https://reviews.llvm.org/D130224
HLSL Resource objects will have restrictions on use and codegen
requirements. This patch is fairly minimal just adding the attribute
with no spellings since it will only be attached by the
HLSLExternalSemaSource.
Depends on D1300017.
Differential Revision: https://reviews.llvm.org/D130018
DR2338 clarified that it was undefined behavior to set the value outside the
range of the enumerations values for an enum without a fixed underlying type.
We should diagnose this with a constant expression context.
Differential Revision: https://reviews.llvm.org/D130058
The "strict context hash" is insufficient to identify module
dependencies during scanning, leading to different module build commands
being produced for a single module, and non-deterministically choosing
between them. This commit switches to hashing the canonicalized
`CompilerInvocation` of the module. By hashing the invocation we are
converting these from correctness issues to performance issues, and we
can then incrementally improve our ability to canonicalize
command-lines.
This change can cause a regression in the number of modules needed. Of
the 4 projects I tested, 3 had no regression, but 1, which was
clang+llvm itself, had a 66% regression in number of modules (4%
regression in total invocations). This is almost entirely due to
differences between -W options across targets. Of this, 25% of the
additional modules are system modules, which we could avoid if we
canonicalized -W options when -Wsystem-headers is not present --
unfortunately this is non-trivial due to some warnings being enabled in
system headers by default. The rest of the additional modules are mostly
real differences in potential warnings, reflecting incorrect behaviour
in the current scanner.
There were also a couple of differences due to `-DFOO`
`-fmodule-ignore-macro=FOO`, which I fixed here.
Since the output paths for the module depend on its context hash, we
hash the invocation before filling in outputs, and rely on the build
system to always return the same output paths for a given module.
Note: since the scanner itself uses an implicit modules build, there can
still be non-determinism, but it will now present as different
module+hashes rather than different command-lines for the same
module+hash.
Differential Revision: https://reviews.llvm.org/D129884
This fills out the default constructor for RWBuffer to assign the
handle with the result of __builtin_hlsl_create_handle which we can
then treat as a pointer to the resource data through the mid-level of
the compiler.
Depends on D130016
Differential Revision: https://reviews.llvm.org/D130017
This builtin allows the creation of custom scheduling pipelines on a per-region
basis. Like the sched_barrier builtin this is intended to be used either for
testing, in situations where the default scheduler heuristics cannot be
improved, or in critical kernels where users are trying to get performance that
is close to handwritten assembly. Obviously using these builtins will require
extra work from the kernel writer to maintain the desired behavior.
The builtin can be used to create groups of instructions called "scheduling
groups" where ordering between the groups is enforced by the scheduler.
__builtin_amdgcn_sched_group_barrier takes three parameters. The first parameter
is a mask that determines the types of instructions that you would like to
synchronize around and add to a scheduling group. These instructions will be
selected from the bottom up starting from the sched_group_barrier's location
during instruction scheduling. The second parameter is the number of matching
instructions that will be associated with this sched_group_barrier. The third
parameter is an identifier which is used to describe what other
sched_group_barriers should be synchronized with. Note that multiple
sched_group_barriers must be added in order for them to be useful since they
only synchronize with other sched_group_barriers. Only "scheduling groups" with
a matching third parameter will have any enforced ordering between them.
As an example, the code below tries to create a pipeline of 1 VMEM_READ
instruction followed by 1 VALU instruction followed by 5 MFMA instructions...
// 1 VMEM_READ
__builtin_amdgcn_sched_group_barrier(32, 1, 0)
// 1 VALU
__builtin_amdgcn_sched_group_barrier(2, 1, 0)
// 5 MFMA
__builtin_amdgcn_sched_group_barrier(8, 5, 0)
// 1 VMEM_READ
__builtin_amdgcn_sched_group_barrier(32, 1, 0)
// 3 VALU
__builtin_amdgcn_sched_group_barrier(2, 3, 0)
// 2 VMEM_WRITE
__builtin_amdgcn_sched_group_barrier(64, 2, 0)
Reviewed By: jrbyrnes
Differential Revision: https://reviews.llvm.org/D128158
This is pretty straightforward, it just adds a builtin to return a
pointer to a resource handle. This maps to a dx intrinsic.
The shape of this builtin and the underlying intrinsic will likely
shift a bit as this implementation becomes more feature complete, but
this is a good basis to get started.
Depends on D128569.
Differential Revision: https://reviews.llvm.org/D130016
Most of the change here is fleshing out the HLSLExternalSemaSource with
builder implementations to build the builtin types. Eventually, I may
move some of this code into tablegen or a more managable declarative
file but I want to get the AST generation logic ready first.
This code adds two new types into the HLSL AST, `hlsl::Resource` and
`hlsl::RWBuffer`. The `Resource` type is just a wrapper around a handle
identifier, and is largely unused in source. It will morph a bit over
time as I work on getting the source compatability correct, but for now
it is a reasonable stand-in. The `RWBuffer` type is not ready for use.
I'm posting this change for review because it adds a lot of
infrastructure code and is testable.
There is one change to clang code outside the HLSL-specific logic here,
which addresses a behavior change introduced a long time ago in
967d438439. That change resulted in unintentionally breaking
situations where an incomplete template declaration was provided from
an AST source, and needed to be completed later by the external AST.
That situation doesn't happen in the normal AST importer flow, but can
happen when an AST source provides incomplete declarations of
templates. The solution is to annotate template specializations of
incomplete types with the HasExternalLexicalSource bit from the base
template.
Depends on D128012.
Differential Revision: https://reviews.llvm.org/D128569
These vdup and vmov float16 intrinsics are being defined in both the
general section and then again in fp16 under a !aarch64 flag. The
vdup_lane intrinsics were being defined in both aarch64 and !aarch64
sections, so have been commoned. They are defined as macros, so do not
give duplicate warnings, but removing the duplicates shouldn't alter the
available intrinsics.
Add host exception support check utility flag. This is needed to not run tests that require exception support in few buildbots that lacks related symbols for some reason.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D129242
The patch mainly focuses on the lack of warnings for
-Wtautological-compare. It works fine for positive numbers but doesn't
for negative numbers. This is because the warning explicitly checks for
an IntegerLiteral AST node, but -1 is represented by a UnaryOperator
with an IntegerLiteral sub-Expr.
For the below code we have warnings:
if (0 == (5 | x)) {}
but not for
if (0 == (-5 | x)) {}
This patch changes the analysis to not look at the AST node directly to
see if it is an IntegerLiteral, but instead attempts to evaluate the
expression to see if it is an integer constant expression. This handles
unary negation signs, but also handles all the other possible operators
as well.
Fixes#42918
Differential Revision: https://reviews.llvm.org/D130510