This patch is in a series of patches to provide builtins for compatibility with the XL compiler.
This patch adds semachecking for an already implemented builtin, `__icbt`. `__icbt` is only
valid for Power8 and up.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D105834
For example, in OpenMP offload codegen tests, global variables like
`.offload_maptypes*` are much easier to read in hex.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D104743
`--check-globals` activates checks for all global values, and
`--global-value-regex` filters them. For example, I'd like to use it
in OpenMP offload codegen tests to check only global variables like
`.offload_maptypes*`.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D104742
This patch fixes `__builtin_ppc_recipdivf`, `__builtin_ppc_recipdivd`,
`__builtin_ppc_rsqrtf`, and `__builtin_ppc_rsqrtd`. FastMathFlags are
set to fast immediately before emitting these builtins. Now the flags
are restored to their previous values after the builtins are emitted.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D105984
Summary:
Test and produce warning for subtracting a pointer from null or subtracting
null from a pointer.
This reland adds the functionality that the warning is no longer reusing an
existing warning, it has different wording for C vs C++ to refect the fact
that nullptr-nullptr has defined behaviour in C++, it is suppressed
when the warning is triggered by a system header and adds
-Wnull-pointer-subtraction to allow the warning to be controlled. -Wextra
implies -Wnull-pointer-subtraction.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: efriedma (Eli Friedman), nickdesaulniers (Nick Desaulniers)
Differential Revision: https://reviews.llvm.org/D98798
Added a number of different builtins that exist in the XL compiler. Most of
these builtins already exist in clang under a different name.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D104386
This patch avoid minimizing input files that contributed to a PCH or its modules. This prevents the implicit modular build to fail on unexpected file size. Depends on D106146.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D104536
This patch is in a series of patches to provide builtins for
compatibility with the XL compiler. This patch adds software divide
builtins with no checking. These builtins are each emitted as a fast
fdiv.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D106150
This diff changes llvm-ifs to use unified IFS file format
and perform other renaming changes in preparation for the
merging between elfabi/ifs.
Differential Revision: https://reviews.llvm.org/D99810
This patch adds the `-faltivec-src-compat=mixed` option to the
`builtins-ppc-altivec.c` test.
Currently, the default for `-faltivec-src-compat` is `mixed`. The reason we
explicitly specify `mixed` to the RUN lines of this test is because eventually,
the default will set to `xl`.
Having the default as `xl` changes the CHECKs of this test slightly, as it
reorders some of the `vector bool` and `vector pixel` CHECKs (since under the
`xl` option, `vector bool` and `vector pixel` are treated in the same way as
other vector scalars). Explicitly specifying `mixed` ensures that we are testing
pre-existing Clang behaviour.
Differential Revision: https://reviews.llvm.org/D106282
Use _Float16 as the half-precision floating point type. Define a new
type specifier 'x' for the _Float16 type.
Differential Revision: https://reviews.llvm.org/D105001
This patch implements the initialization of vectors under the
-faltivec-src-compat=xl option introduced in https://reviews.llvm.org/D103615.
Under this option, the initialization of scalar vectors, vector bool, and vector
pixel are treated the same, where the initialization value is splatted across
the whole vector.
This patch does not change the behaviour of the -faltivec-src-compat=mixed option,
which is the current default for Clang.
Differential Revision: https://reviews.llvm.org/D106120
Summary:
The AIX linker will produce errors on unresolved weak symbols. Change the
generated code to not check for the initialization function but just call
it and ensure that it always exists. Also, the AIX atexit routine has a
different name (and signature) so call it correctly. Update the lit tests
to test on AIX appropriately.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: hubert.reinterpretcast (Hubert Tong)
Differential Revision: https://reviews.llvm.org/D104420
This patch handles the `std::swap` function specialization
for `std::unique_ptr`. Implemented to be very similar to
how `swap` method is handled
Differential Revision: https://reviews.llvm.org/D104300
It's noteworthy that GCC has the same bug here, which is a bit
surprising. Both Clang and GCC's bug is only for function template
arguments that are themselves templates with default template arguments
(f1<t1<int[, missing_default_here]>>). Probably because function name
matching isn't generally necessary - whereas type matching is necessary
for DWARF consumers to associate declarations and definitions across
translation units, so the bug's been addressed there already - but
continued to exist for function templates since it's fairly benign
there.
I came across this while working on a change that could reconstitute
these pretty printed names based on the rest of the DWARF, reducing the
size of the DWARF by not having to encode all the template parameters in
the name string. That reconstitution code can't tell the difference
between a defaulted argument or not, so couldn't create the current
buggy-ish output.
Making the names more consistent between direct and indirect references,
and between function and class templates seems all to the good.
(I fixed the function template version of this a few years back in
9fdd09a4cc - clearly I should've looked
more closely and generalized the code better so it only had to be fixed
once - well, doing that here now)
Use the elementtype attribute introduced in D105407 for the
llvm.preserve.array/struct.index intrinsics. It carries the
element type of the GEP these intrinsics effectively encode.
This patch:
* Adds a verifier check that the attribute is required.
* Adds it in the IRBuilder methods for these intrinsics.
* Autoupgrades old bitcode without the attribute.
* Updates the lowering code to use the attribute rather than
the pointer element type.
* Updates lots of tests to specify the attribute.
* Adds -force-opaque-pointers to the intrinsic-array.ll test
to demonstrate they work now.
https://reviews.llvm.org/D106184
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102107
Turning on -funique-internal-linkage-names when -fpseudo-probe-for-profiling is on, unless -fno-unique-internal-linkage-names is specified.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D106193
This provides intrinsics for emitting instructions that set the FPSCR (`mtfsf/mtfsfi`).
The patch also conservatively marks the rounding mode as an implicit def for both since they both may set the rounding mode depending on the operands.
Reviewed By: #powerpc, qiucf
Differential Revision: https://reviews.llvm.org/D105957
Implement a subset of builtins required for compatiblilty with AIX XL compiler.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D105930
This patch adds unique idenfitiers to the existing OpenMP remarks. This makes
it easier to identify the corresponding documentation for each remark that will
be hosted in the OpenMP webpage.
Depends on D105898
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105939
This patch rewrites and reworks a few of the existing remarks to make the mmore
concise and consistent prior to writing the documentation for them.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105898
On Power PC some legacy compilers included a number of builtins in a
builtins.h header file. While this header file is not required to hold
builtins for clang some legacy code does try to include this file and so
this patch provides an empty version of that file.
Differential Revision: https://reviews.llvm.org/D106065
This change addresses this assertion that occurs in a downstream
compiler with a custom target.
```APInt.h:1151: bool llvm::APInt::operator==(const llvm::APInt &) const: Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"'```
No covering test case is susbmitted with this change since this crash
cannot be reproduced using any upstream supported target. The test case
that exposes this issue is as simple as:
```lang=c++
void test(int * p) {
int * q = p-1;
if (q) {}
if (q) {} // crash
(void)q;
}
```
The custom target that exposes this problem supports two address spaces,
16-bit `char`s, and a `_Bool` type that maps to 16-bits. There are no upstream
supported targets with similar attributes.
The assertion appears to be happening as a result of evaluating the
`SymIntExpr` `(reg_$0<int * p>) != 0U` in `VisitSymIntExpr` located in
`SimpleSValBuilder.cpp`. The `LHS` is evaluated to `32b` and the `RHS` is
evaluated to `16b`. This eventually leads to the assertion in `APInt.h`.
While this change addresses the crash and passes LITs, two follow-ups
are required:
1) The remainder of `getZeroWithPtrWidth()` and `getIntWithPtrWidth()`
should be cleaned up following this model to prevent future
confusion.
2) We're not sure why references are found along with the modified
code path, that should not be the case. A more principled
fix may be found after some further comprehension of why this
is the case.
Acks: Thanks to @steakhal and @martong for the discussions leading to this
fix.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D105974
This patch handles the `<<` operator defined for `std::unique_ptr` in
the std namespace (ignores custom overloads of the operator).
Differential Revision: https://reviews.llvm.org/D105421
This patch handles all the comparision methods (defined via overloaded
operators) on std::unique_ptr. These operators compare the underlying
pointers, which is modelled by comparing the corresponding inner-pointer
SVal. There is also a special case for comparing the same pointer.
Differential Revision: https://reviews.llvm.org/D104616
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and instrisics for population
count, reversed load and store related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D106021
x86_64-linux-gnu and x86_64-linux-gnux32 use different ABIs and objects
built for one cannot be used for the other. In order to build and use
compiler-rt for x32, we need to treat x32 as a new arch there. This
updates the driver to search using the new arch name.
Reviewed By: glaubitz
Differential Revision: https://reviews.llvm.org/D100148
This patch implements the `__popcntb` XL compatibility builtin for 32bit in the frontend and backend. This patch also updates tests for `__popcntb` and other XL Compat sync related builtins.
Reviewed By: #powerpc, nemanjai, amyk
Differential Revision: https://reviews.llvm.org/D105360
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and emit target independent
code for rotate related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D104744
LLVM changed to not emit L... labels for things marked "do_not_dead_strip"
because the linker can sometimes drop the flag if there's no proper symbol.
This Clang test checked for the old behaviour, but doesn't actually care about
that bit.
If the instantiation of a member variable makes it possible to
compute a previously undeduced type, we should use that piece of
information.
Fix bug#50590
Differential Revision: https://reviews.llvm.org/D103849
This patch make coroutine passes run by default in LLVM pipeline. Now
the clang and opt could handle IR inputs containing coroutine intrinsics
without special options.
It should be fine. On the one hand, the coroutine passes seems to be stable
since there are already many projects using coroutine feature.
On the other hand, the coroutine passes should do nothing for IR who doesn't
contain coroutine intrinsic.
Test Plan: check-llvm
Reviewed by: lxfind, aeubanks
Differential Revision: https://reviews.llvm.org/D105877
Replace the experimental clang builtins and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50435.
Differential Revision: https://reviews.llvm.org/D106019
Summary This option can be used to reduce the size of the
binary. The trade-off in this case would be the run-time
performance.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D105726
Replace the experimental clang builtin and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50433.
Differential Revision: https://reviews.llvm.org/D105950
The anonymous and non-anonymous bit-field diagnostics are easily
combined into one diagnostic. However, the diagnostic was missing a
"the" that is present in the almost-identically worded
warn_bitfield_width_exceeds_type_width diagnostic, hence the changes to
test cases.
`-u` is a linker option used to pretend a symbol is undefined,
this option are common used for forcing archive member extraction.
This option should pass to `ld`, and many other toolchain in Clang
like `tools::gnutools` has pass that too.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D105091
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and instrisics for compare
and multiply related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D102875
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.
Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>
Differential revision: https://reviews.llvm.org/D105501
Otherwise, if someone specifies a valid AMD arch, we may end up triggering an
assertion on unexpected arch later on.
Differential Revision: https://reviews.llvm.org/D105295
This patch simplifies the way we deal with (dis)equalities.
Due to the symmetry between constraint handler and range inferrer,
we can have very similar implementations of logic handling
questions about (dis)equality and assumptions involving (dis)equality.
It also helps us to remove one more visitor, and removes uncertainty
that we got all the right places to put `trackNE` and `trackEQ`.
Differential Revision: https://reviews.llvm.org/D105693
After taking C++98 implicit moves out in D104500,
we put it back in, but now in a new form which preserves
compatibility with pure C++98 programs, while at the same time
giving almost all the goodies from P1825.
* We use the exact same rules as C++20 with regards to which
id-expressions are move eligible. The previous
incarnation would only benefit from the proper subset which is
copy ellidable. This means we can implicit move, in addition:
* Parameters.
* RValue references.
* Exception variables.
* Variables with higher-than-natural required alignment.
* Objects with different type from the function return type.
* We preserve the two-overload resolution, with one small tweak to the
first one: If we either pick a (possibly converting) constructor which
does not take an rvalue reference, or a user conversion operator which
is not ref-qualified, we abort into the second overload resolution.
This gives C++98 almost all the implicit move patterns which we had created test
cases for, while at the same time preserving the meaning of these
three patterns, which are found in pure C++98 programs:
* Classes with both const and non-const copy constructors, but no move
constructors, continue to have their non-const copy constructor
selected.
* We continue to reject as ambiguous the following pattern:
```
struct A { A(B &); };
struct B { operator A(); };
A foo(B x) { return x; }
```
* We continue to pick the copy constructor in the following pattern:
```
class AutoPtrRef { };
struct AutoPtr {
AutoPtr(AutoPtr &);
AutoPtr();
AutoPtr(AutoPtrRef);
operator AutoPtrRef();
};
AutoPtr test_auto_ptr() {
AutoPtr p;
return p;
}
```
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D105756
Similar to D46745, "S" represents an absolute symbolic operand, which
can be used to specify the access models, e.g.
extern int var;
void *addr_via_asm() {
void *ret;
asm("lui %0, %%hi(%1)\naddi %0,%0,%%lo(%1)" : "=r"(ret) : "S"(&var));
return ret;
}
'S' is documented in trunk GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101275
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D105254
LDARX and LWARX sometimes gets optimized out by the compiler
when it is critical to the correctness of the code. This inline asm generation
ensures that it preserved.
Differential Revision: https://reviews.llvm.org/D105754
Properties that were declared `@property(copy, nonatomic) id foo` make an
unnecessary call to objc_get_property(). This call can be replaced with a
direct access to the backing variable identical to how a `@property(nonatomic)
id foo` would do it.
This reduces codegen by 4 bytes (x86_64/arm64) and removes a cross linkage unit
function call per property declared as copy/nonatomic.
Differential Revision: https://reviews.llvm.org/D105311
This feature requires support of __opencl_c_images, so diagnostics for that is provided as well
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D104915
Summary: This patch is a part of an attempt to obtain more
timer data from the analyzer. In this patch, we try to use
LLVM::TimeRecord to save time before starting the analysis
and to print the time that a specific function takes while
getting analyzed.
The timer data is printed along with the
-analyzer-display-progress outputs.
ANALYZE (Syntax): test.c functionName : 0.4 ms
ANALYZE (Path, Inline_Regular): test.c functionName : 2.6 ms
Authored By: RithikSharma
Reviewer: NoQ, xazax.hun, teemperor, vsavchenko
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D105565
While GNU as only allows the directory form of the .file directive for DWARF v5,
the integrated assembler prefers the directory form on all DWARF versions
(-fdwarf-directory-asm).
We currently set CC1 -fno-dwarf-directory-asm for -fno-integrated-as -gdwarf-5
which may cause the directory entry 0 and the filename entry 0 to be incorrect
(see D105662 and the example below). This patch makes -fno-integrated-as -gdwarf-5 use
-fdwarf-directory-asm as well.
```
cd /tmp/c
before
% clang -g -gdwarf-5 -fno-integrated-as e/a.c -S -o - | grep '\.file.*0'
.file 0 "/tmp/c/e/a.c" md5 0x97e31cee64b4e58a4af8787512d735b6
% clang -g -gdwarf-5 -fno-integrated-as e/a.c -c
% llvm-dwarfdump a.o | grep include_directories
include_directories[ 0] = "/tmp/c/e"
after
% clang -g -gdwarf-5 -fno-integrated-as e/a.c -S -o - | grep '\.file.*0'
.file 0 "/tmp/c" "e/a.c" md5 0x97e31cee64b4e58a4af8787512d735b6
% clang -g -gdwarf-5 -fno-integrated-as e/a.c -c
% llvm-dwarfdump a.o | grep include_directories
include_directories[ 0] = "/tmp/c"
```
Reviewed By: #debug-info, dblaikie, osandov
Differential Revision: https://reviews.llvm.org/D105835
On AIX when there is a pragma pack, or pragma align in effect then zero-width bitfields should pad out to the end of the bitfield container but not increase the alignment requirements of the struct greater then the max field align.
Reviewed By: ZarkoCA
Differential Revision: https://reviews.llvm.org/D105635
Replace the clang builtin function and LLVM intrinsic for
f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing
combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern.
Differential Revision: https://reviews.llvm.org/D105755
This patch implements trap and FP to and from double conversions. The builtins
generate code that mirror what is generated from the XL compiler. Intrinsics
are named conventionally with builtin_ppc, but are aliased to provide the same
builtin names as the XL compiler.
Differential Revision: https://reviews.llvm.org/D103668
We are currently being inconsistent in using signed vs unsigned comparisons for
vec_all_* and vec_any_* interfaces that use vector bool types. For example we
use signed comparison for vec_all_ge(vector signed char, vector bool char) but
unsigned comparison for when the arguments are swapped. GCC and XL use signed
comparison instead. This patch makes clang consistent with itself and with XL
and GCC.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D105666
Support Narrowing conversions to bool in if constexpr condition
under C++23 language mode.
Only if constexpr is implemented as the behavior of static_assert
is already conforming. Still need to work on explicit(bool) to
complete support.
The function is supposed to be the equivalent of rint() (as in
round to nearest, ties to even) rather than round() (round to
nearest, ties away from zero). In fact, the instruction we emit
without VSX is vrfin which is correct. However, with VSX we emit
xvrspi which is the equivalent of round() and therefore incorrect.
Since there is no equivalent VSX instruction, simply use vrfin
regardless of availability of VSX.
OpenMP 5.1 added support for writing OpenMP directives using [[]]
syntax in addition to using #pragma and this introduces support for the
new syntax.
In OpenMP, the attributes take one of two forms:
[[omp::directive(...)]] or [[omp::sequence(...)]]. A directive
attribute contains an OpenMP directive clause that is identical to the
analogous #pragma syntax. A sequence attribute can contain either
sequence or directive arguments and is used to ensure that the
attributes are processed sequentially for situations where the order of
the attributes matter (remember:
https://eel.is/c++draft/dcl.attr.grammar#4.sentence-4).
The approach taken here is somewhat novel and deserves mention. We
could refactor much of the OpenMP parsing logic to work for either
pragma annotation tokens or for attribute clauses. It would be a fair
amount of effort to share the logic for both, but it's certainly
doable. However, the semantic attribute system is not designed to
handle the arbitrarily complex arguments that OpenMP directives
contain. Adding support to thread the novel parsed information until we
can produce a semantic attribute would be considerably more effort.
What's more, existing OpenMP constructs are not (often) represented as
semantic attributes. So doing this through Attr.td would be a massive
undertaking that would likely only benefit OpenMP and comes with
additional risks. Rather than walk down that path, I am taking
advantage of the fact that the syntax of the directives within the
directive clause is identical to that of the #pragma form. Once the
parser recognizes that we're processing an OpenMP attribute, it caches
all of the directive argument tokens and then replays them as though
the user wrote a pragma. This reuses the same OpenMP parsing and
semantic logic directly, but does come with a risk if the OpenMP
committee decides to purposefully diverge their pragma and attribute
syntaxes. So, despite this being a novel approach that does token
replay, I think it's actually a better approach than trying to do this
through the declarative syntax in Attr.td.
A number of functions in the header have guards for 64-bit only
that were presumably added as some of the functions in the blocks
use vector __int128 which is only available in 64-bit mode.
A more appropriate guard (__SIZEOF_INT128__) has been added for
those functions since, making the 64-bit guards redundant.
This patch removes those guards as they inadvertently guard code
that uses vector long long which does not actually require 64-bit
mode.
The `-analyzer-display-progress` displayed the function name of the
currently analyzed function. It differs in C and C++. In C++, it
prints the argument types as well in a comma-separated list.
While in C, only the function name is displayed, without the brackets.
E.g.:
C++: foo(), foo(int, float)
C: foo
In crash traces, the analyzer dumps the location contexts, but the
string is not enough for `-analyze-function` in C++ mode.
This patch addresses the issue by dumping the proper function names
even in stack traces.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D105708
In the spirit of TRegions [0], this patch analyzes a kernel and tracks
if it can be executed in SPMD-mode. If so, we flip the arguments of
the __kmpc_target_init and deinit call to enable the mode. We also
update the `<kernel>_exec_mode` flag to indicate to the runtime we
changed the mode to SPMD.
The code analysis is done interprocedurally by extending the
AAKernelInfo abstract attribute to track SPMD compatibility as well.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D102307
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.
The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
This was in parts extracted from D59319.
Reviewed By: ABataev, JonChesterfield
Differential Revision: https://reviews.llvm.org/D101976
This reverts commit 3ec88ca60b which reverted e386871e1d due to a asan build
failure.
This patch removes the new lines in the test case which seem to introduce the
failure.
Differential revision: https://reviews.llvm.org/D104898
Replace the clang builtin function and LLVM intrinsic previously used to select
the f64x2.promote_low_f32x4 instruction with custom combines from standard
SelectionDAG nodes. Implement the new combines to share code with the similar
combines for f64x2.convert_low_i32x4_{s,u}. Resolves PR50232.
Differential Revision: https://reviews.llvm.org/D105675
Set the device malloc and free functions as weak,
and move the std headers after device malloc/free
to avoid issues with std malloc/free.
Fixes: SWDEV-293590
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D105707
If the base is used in a map clause and later we have a memberexpr with
this base, and the member is a pointer, and this pointer is dereferenced
anyhow (subscript, array section, dereference, etc.), such components
should be considered as overlapped, otherwise it may lead to incorrect
size computations, since we try to map a pointee as a part of the whole
struct, which is not true for the pointer members.
Differential Revision: https://reviews.llvm.org/D105562
Reapply with fixes for clang tests.
-----
This is a simple enum attribute. Test changes are because enum
attributes are sorted before type attributes, so mustprogress is
now in a different position.
This change is intended as initial setup. The plan is to add
more semantic checks later. I plan to update the documentation
as more semantic checks are added (instead of documenting the
details up front). Most of the code closely mirrors that for
the Swift calling convention. Three places are marked as
[FIXME: swiftasynccc]; those will be addressed once the
corresponding convention is introduced in LLVM.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D95561
This reverts commit 52aeacfbf5.
There isn't full agreement on a path forward yet, but there is agreement that
this shouldn't land as-is. See discussion on https://reviews.llvm.org/D105338
Also reverts unreviewed "[clang] Improve `-Wnull-dereference` diag to be more in-line with reality"
This reverts commit f4877c78c0.
And all the related changes to tests:
This reverts commit 9a0152799f.
This reverts commit 3f7c9cc274.
This reverts commit 329f8197ef.
This reverts commit aa9f58cc2c.
This reverts commit 2df37d5ddd.
This reverts commit a72a441812.
No need to emit private copyfor firstprivate constants in target
regions, we can use the original copy instead.
Differential Revision: https://reviews.llvm.org/D105647
When building the member call to a user conversion function during an
implicit cast, the expression was not being checked for immediate
invocation, so we were never adding the ConstantExpr node to AST.
This would cause the call to the user conversion operator to be emitted
even if it was constantexpr evaluated, and this would even trip an
assert when said user conversion was declared consteval:
`Assertion failed: !cast<FunctionDecl>(GD.getDecl())->isConsteval() && "consteval function should never be emitted", file clang\lib\CodeGen\CodeGenModule.cpp, line 3530`
Fixes PR48855.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D105446
- ``externally_initialized`` variables would be initialized or modified
elsewhere. Particularly, CUDA or HIP may have host code to initialize
or modify ``externally_initialized`` device variables, which may not
be explicitly referenced on the device side but may still be used
through the host side interfaces. Not preserving them triggers the
elimination of them in the GlobalDCE and breaks the user code.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D105135
D105314 added the abibility choose to use AsmParser for parsing inline
asm. -no-intergrated-as will override this default if specified
explicitly.
If toolchain choose to use MCAsmParser for inline asm, don't pass
the option to disable integrated-as explictly unless set by user.
Reviewed By: #powerpc, shchenz
Differential Revision: https://reviews.llvm.org/D105512
The Microsoft STL currently has some issues with P2266.
We disable it for now in that mode, but we might come back later with a
more targetted approach.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D105518
When there is unknown type in a struct in code compiled with
-Wcast-align, the compiler crashes due to
clang::ASTContext::getASTRecordLayout() failing an assert.
Added check that the RecordDecl is valid before calling
getASTRecordLayout().
Named return of a variable with aligned attribute would
trip an assert in case alignment was dependent.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D105380
This fixes a gap in the `overloadable` attribute support (K&R declared
functions would get mangled symbol names, but that name wouldn't be
represented in the debug info linkage name field for the function) and
in -funique-internal-linkage-names (this came up in review discussion on
D98799) where K&R static declarations would not get the uniqued linkage
names.
%%%
Transfer the predefined macro, __TOS_AIX__, from the AIX XL C/C++ compilers.
__TOS_AIX__ indicates that the target operating system is AIX.
%%%
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D103587
As v1.0-rc specs say Zvamo is removed from standard extension,
Zvamo has to be specified explicitly.
Reviewed By: evandro
Differential Revision: https://reviews.llvm.org/D105396
Prior to this patch, we always gave priority to constraints that we
actually know about symbols in question. However, these can get
outdated and we can get better results if we look at all possible
sources of knowledge, including sub-expressions.
Differential Revision: https://reviews.llvm.org/D105436
This patch implaments the load and reserve and store conditional
builtins for the PowerPC target, in order to have feature parody with
xlC on AIX.
Differential revision: https://reviews.llvm.org/D105236
Named return of a variable with aligned attribute would
trip an assert in case alignment was dependent.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D105380
This option implies -fdump-record-layouts but dumps record layout information with canonical field types, which can be more useful in certain cases when comparing structure layouts.
Reviewed By: stevewan
Differential Revision: https://reviews.llvm.org/D105112
Ignore top-level qualifiers in casts, which fixes issues in reinterpret_cast.
This rule comes from [expr.type]/8.2.2 which explains that casting to a
pr-qualified type should actually cast to the unqualified type. In C++
this is only done for types that aren't classes or arrays.
Fixes: PR49221
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D102689
Fix offset calculation routines in padding checker to avoid assertion
errors described in bugzilla issue 50426. The fields that are subojbects
of zero size, marked with [[no_unique_address]] or empty bitfields will
be excluded from padding calculation routines.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D104097
This adds a very basic test in `cuda_with_openmp.cu` that just checks whether the CUDA & OpenMP integrated headers do compile, when a CUDA file is compiled with OpenMP (CPU) enabled.
Thus this basically adds the missing test for https://reviews.llvm.org/D90415.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105322
Reland of 31859f896.
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.
Differential Revision: https://reviews.llvm.org/D104797
A user reported an issue to me via email that Clang was accepting some
code that GCC was rejecting. After investigation, it turned out to be a
general problem of us failing to properly reject attributes written in
the type position in C when they don't apply to types. The root cause
was a terminology issue -- we sometimes use "CXX11Attr" to mean [[]] in
C++11 mode and sometimes [[]] in general -- and this came back to bite
us because in this particular case, it really meant [[]] in C++ mode.
I fixed the issue by introducing a new function
AttributeCommonInfo::isStandardAttributeSyntax() to represent [[]] in
either C or C++ mode.
This fix pointed out that we've had the issue in some of our existing
tests, which have all been corrected. This resolves
https://bugs.llvm.org/show_bug.cgi?id=50954.
Before this patch, the dependence of CallExpr was only computed in the
constructor, the dependence bits might not reflect truth -- some arguments might
be not set (nullptr) during this time, e.g. CXXDefaultArgExpr will be set via
the setArg method in the later parsing stage, so we need to recompute the
dependence bits.
This extends the effects of [[ http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1825r0.html | P1825 ]] to all C++ standards from C++11 up to C++20.
According to Motion 23 from Cologne 2019, P1825R0 was accepted as a Defect Report, so we retroactively apply this all the way back to C++11.
Note that we also remove implicit moves from C++98 as an extension
altogether, since the expanded first overload resolution from P1825
can cause some meaning changes in C++98.
For example it can change which copy constructor is picked when both const
and non-const ones are available.
This also rips out warn_return_std_move since there are no cases where it would be worthwhile to suggest it.
This also fixes a bug with bailing into the second overload resolution
when encountering a non-rvref qualified conversion operator.
This was unnoticed until now, so two new test cases cover these.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D104500
It seems like ExprEngine::handleLVectorSplat() was used at only 2
places. It might be better to directly inline them for readability.
It seems like these cases were not covered by tests according to my
coverage measurement, so I'm adding tests as well, demonstrating that no
behavior changed.
Besides that, I'm handling CK_MatrixCast similarly to how the rest of
the unhandled casts are evaluated.
Differential Revision: https://reviews.llvm.org/D105125
Reviewed by: NoQ
Previously `LValueToRValueBitCast`s were modeled in the same way how
a regular `BitCast` was. However, this should not produce an l-value.
Modeling bitcasts accurately is tricky, so it's probably better to
model this expression by binding a fresh conjured value.
The following code should not result in a diagnostic:
```lang=C++
__attribute__((always_inline))
static inline constexpr unsigned int_castf32_u32(float __A) {
return __builtin_bit_cast(unsigned int, __A); // no-warning
}
```
Previously, it reported
`Address of stack memory associated with local variable '__A' returned
to caller [core.StackAddressEscape]`.
Differential Revision: https://reviews.llvm.org/D105017
Reviewed by: NoQ, vsavchenko
We should not error out on non-x86 targets if `-fbasic-block-sections=none` is in effect.
Also, filter it out for GPU-side compilations, as we do with other options not
supported on the GPU.
Differential Revision: https://reviews.llvm.org/D105226
Relevant discussion can be found at: https://lists.llvm.org/pipermail/llvm-dev/2021-January/148197.html
In the existing design, An SCC that contains a coroutine will go through the folloing passes:
Inliner -> CoroSplitPass (fake) -> FunctionSimplificationPipeline -> Inliner -> CoroSplitPass (real) -> FunctionSimplificationPipeline
The first CoroSplitPass doesn't do anything other than putting the SCC back to the queue so that the entire pipeline can repeat.
As you can see, we run Inliner twice on the SCC consecutively without doing any real split, which is unnecessary and likely unintended.
What we really wanted is this:
Inliner -> FunctionSimplificationPipeline -> CoroSplitPass -> FunctionSimplificationPipeline
(note that we don't really need to run Inliner again on the ramp function after split).
Hence the way we do it here is to move CoroSplitPass to the end of the CGSCC pipeline, make it once for real, insert the newly generated SCCs (the clones) back to the pipeline so that they can be optimized, and also add a function simplification pipeline after CoroSplit to optimize the post-split ramp function.
This approach also conforms to how the new pass manager works instead of relying on an adhoc post split cleanup, making it ready for full switch to new pass manager eventually.
By looking at some of the changes to the tests, we can already observe that this changes allows for more optimizations applied to coroutines.
Reviewed By: aeubanks, ChuanqiXu
Differential Revision: https://reviews.llvm.org/D95807
This patch adds a new clang builtin, __arithmetic_fence. The purpose of the
builtin is to provide the user fine control, at the expression level, over
floating point optimization when -ffast-math (-ffp-model=fast) is enabled.
The builtin prevents the optimizer from rearranging floating point expression
evaluation. The new option fprotect-parens has the same effect on
parenthesized expressions, forcing the optimizer to respect the parentheses.
Reviewed By: aaron.ballman, kpn
Differential Revision: https://reviews.llvm.org/D100118
This patch adds unbundling support of an archive file. It takes an
archive file along with a set of offload targets as input.
Output is a device specific archive for each given offload target.
Input archive contains bundled code objects bundled using
clang-offload-bundler. Each generated device specific archive contains
a set of device code object files which are named as
<Parent Bundle Name>-<CodeObject-GPUArch>.
Entries in input archive can be of any binary type which is
supported by clang-offload-bundler, like *.bc. Output archives will
contain files in same type.
Example Usuage:
clang-offload-bundler --unbundle --inputs=lib-generic.a -type=a
-targets=openmp-amdgcn-amdhsa--gfx906,openmp-amdgcn-amdhsa--gfx908
-outputs=devicelib-gfx906.a,deviceLib-gfx908.a
Reviewed By: jdoerfert, yaxunl
Differential Revision: https://reviews.llvm.org/D93525
No need to try to create the default constructor for private copy, it
will be called automatically in the initializer of the declare
reduction. Fixes balance between constructors/destructors calls.
Differential Revision: https://reviews.llvm.org/D105143
We allow branches to join where one holds a managed lock but the other
doesn't, but we can't do so for back edges: because there we can't drop
them from the lockset, as we have already analyzed the loop with the
larger lockset. So we can't allow dropping managed locks on back edges.
We move the managed() check from handleRemovalFromIntersection up to
intersectAndWarn, where we additionally check if we're on a back edge if
we're removing from the first lock set (the entry set of the next block)
but not if we're removing from the second lock set (the exit set of the
previous block). Now that the order of arguments matters, I had to swap
them in one invocation, which also causes some minor differences in the
tests.
Reviewed By: delesley
Differential Revision: https://reviews.llvm.org/D104261
functions implicitly generated by the compiler
These fake functions would cause clang to crash if the changes proposed
in https://reviews.llvm.org/D98799 were made.
Added the option `-altivec-src-compat=[mixed,gcc,xl]`. The default at this time is `mixed`.
The default behavior for clang is for all vector compares to return a scalar unless the vectors being
compared are vector bool or vector pixel. In that case the compare returns a
vector. With the gcc case all vector compares return vectors and in the xl case
all vector compares return scalars.
This patch does not change the default behavior of clang.
This option will be used in future patches to implement behaviour compatibility for the vector bool/pixel types.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D103615
the call already has the operand bundle
This bug was causing the call to `replaceAllUsesWith` to crash because
the old call instruction and the new call instruction were the same.
rdar://74957948
Differential Revision: https://reviews.llvm.org/D97824
If a default template type argument is manually specified to be of the default
type, then it is committed when printing the template.
Differential revision: https://reviews.llvm.org/D103040
Fix suggested by Yuanfang Chen:
Non-distinct debuginfo is attached to the function due to the undecorated declaration. Later, when seeing the function definition and `nodebug` attribute, the non-distinct debuginfo should be cleared.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104777
This reverts commit c3fe847f9d.
Tests fail in non-asserts builds because they assume named IR, by the
looks of it (testing for the "entry" label, for instance). I don't know
enough about the update_cc_test_checks.py stuff to know how to manually
fix these tests, so reverting for now.
According to AVR ABI (https://gcc.gnu.org/wiki/avr-gcc), returned struct value
within size 1-8 bytes should be returned directly (via register r18-r25), while
larger ones should be returned via an implicit struct pointer argument.
Reviewed By: dylanmckay
Differential Revision: https://reviews.llvm.org/D99237
CoroElide pass works only when a post-split coroutine is inlined into another post-split coroutine.
In O0, there is no inlining after CoroSplit, and hence no CoroElide can happen.
It's useless to put CoroElide pass in the O0 pipeline and it will never be triggered (unless I miss anything).
Differential Revision: https://reviews.llvm.org/D105066
C++ constructors/destructors need to go through a different constructor to construct a GlobalDecl object in order to retrieve their linkage type. This causes an assert failure in the default constructor of GlobalDecl. I'm chaning it to using the exsiting GlobalDecl object.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102356
This patch adds a new clang builtin, __arithmetic_fence. The purpose of the
builtin is to provide the user fine control, at the expression level, over
floating point optimization when -ffast-math (-ffp-model=fast) is enabled.
The builtin prevents the optimizer from rearranging floating point expression
evaluation. The new option fprotect-parens has the same effect on
parenthesized expressions, forcing the optimizer to respect the parentheses.
Reviewed By: aaron.ballman, kpn
Differential Revision: https://reviews.llvm.org/D100118
Added the option `-altivec-src-compat=[mixed,gcc,xl]`. The default at this time is `mixed`.
The default behavior for clang is for all vector compares to return a scalar unless the vectors being
compared are vector bool or vector pixel. In that case the compare returns a
vector. With the gcc case all vector compares return vectors and in the xl case
all vector compares return scalars.
This patch does not change the default behavior of clang.
This option will be used in future patches to implement behaviour compatibility for the vector bool/pixel types.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D103615
This reverts commit 6f3b775c3e.
Test fails flakily, see comments on https://reviews.llvm.org/D103967
Also revert follow-up "[Analyzer] Attempt to fix windows bots test
failure b/c of new-line"
This reverts commit fe0e861a4d.
This test is again failing across multiple bots and passing on others
there is no reliable way to enable it for some of the bots while
disabling for the unsupported ones. Tagging it as unsupported across all
types of Arm 32 bit cores.
We need to mask the immediate to the width of a single vector
rather than 2 vectors. If we use the width of 2 vectors then
any shift larger than the length of 1 vector is going to overflow
the shuffle indices.
Fixes PR50895.
In FreeBSD 14 the project will deprecate the _p special profiling
libraries.
Support for -pg (i.e., mcount) still exists but libraries compiled
with -pg will not be built by default, so stop linking against them.
Reviewed by: Dimitry Andric
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D104753
The feature was implemented in D99005, but we forgot to add the test
macro.
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D104984
Clang can be configured with a different default unwindlib, for example
gcc. In that case, -lunwind will not be present in the output.
Fix this by explicitly specifying libunwind as the unwindlib.
Differential Revision: https://reviews.llvm.org/D104899
Word on the grapevine was that the committee had some discussion that
ended with unanimous agreement on eliminating relational function pointer comparisons.
We wanted to be bold and just ban all of them cold turkey.
But then we chickened out at the last second and are going for
eliminating just the spaceship overload candidate instead, for now.
See D104680 for reference.
This should be fine and "safe", because the only possible semantic change this
would cause is that overload resolution could possibly be ambiguous if
there was another viable candidate equally as good.
But to save face a little we are going to:
* Issue an "error" for three-way comparisons on function pointers.
But all this is doing really is changing one vague error message,
from an "invalid operands to binary expression" into an
"ordered comparison of function pointers", which sounds more like we mean business.
* Otherwise "warn" that comparing function pointers like that is totally
not cool (unless we are told to keep quiet about this).
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D104892
This patch adds a module level metadata flag indicating that the module
was compiled with the `-fopenmp` flag. This will make it easier for
passes like OpenMPOpt to determine if it should be run.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102361
This option is already supported by update_test_checks.py, but it can
also be useful in update_cc_test_checks.py. For example, I'd like to
use it in OpenMP offload codegen tests to check global variables like
`.offload_maptypes*`.
Reviewed By: jdoerfert, arichardson, ggeorgakoudis
Differential Revision: https://reviews.llvm.org/D104714
When clang driver is used with -save-temps to compile OpenCL program,
clang driver first launches clang -cc1 -E to generate preprocessor expansion output,
then launches clang -cc1 with the generated preprocessor expansion output as input
to generate LLVM IR.
Currently clang by default passes "-finclude-default-header" "-fdeclare-opencl-builtins"
in both steps, which causes default header included again in the second step, which
causes error.
This patch let clang not to include default header when input type is preprocessor expansion
output, which fixes the issue.
Reviewed by: Anastasia Stulova
Differential Revision: https://reviews.llvm.org/D104800
Note regarding C++ for OpenCL:
When compiling C++ for OpenCL, DW_LANG_C_plus_plus* is emitted.
There is no DWARF language code defined for C++ for OpenCL as of yet,
but DWARF issue 210514.1 has been raised to request one.
In the mean time, continuing to emit DW_LANG_C_plus_plus* for C++ for
OpenCL allows the potential to distinguish between C++ for OpenCL and
OpenCL C in !DICompileUnit nodes, whereas using DW_LANG_OpenCL for
C++ for OpenCL would prevent this.
This change therefore leaves C++ for OpenCL as-is.
Reviewed By: shchenz, Anastasia
Differential Revision: https://reviews.llvm.org/D104118
Consider the code
```
void f(int a0, int b0, int c)
{
int a1 = a0 - b0;
int b1 = (unsigned)a1 + c;
if (c == 0) {
int d = 7L / b1;
}
}
```
At the point of divisiion by `b1` that is considered to be non-zero,
which results in a new constraint for `$a0 - $b0 + $c`. The type
of this sym is unsigned, however, the simplified sym is `$a0 -
$b0` and its type is signed. This is probably the result of the
inherent improper handling of casts. Anyway, Range assignment
for constraints use this type information. Therefore, we must
make sure that first we simplify the symbol and only then we
assign the range.
Differential Revision: https://reviews.llvm.org/D104844
These allow getting a whole register from a larger lmul. Or
inserting a whole register into a larger lmul register. Fractional
lmuls are not supported as they would require a vslide.
Based on this update to the intrinsic doc
https://github.com/riscv/rvv-intrinsic-doc/pull/99
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D104822
sanitize-coverage-old-pm.c is passing intermittently on different
arm v7 machines. This patch moves it to unsupported on all arm 32
targets reporting armv8l core.
copy/dispose helper functions
We found out that these fake functions would cause clang to crash if the
changes proposed in https://reviews.llvm.org/D98799 were made.
The original patch was reverted in f681fd927e
because debug locations were missing in the body of the block byref
helper functions. This patch fixes the bug by calling CreateArtificial
after the calls to StartFunction.
Differential Revision: https://reviews.llvm.org/D104082
sanitize-coverage-old-pm.c started failing on arm 32 bit where
underlying architecture reported is armv8l fore 32bit arm.
This patch XFAILS sanitize-coverage-old-pm.c on armv8l similar
to armv7 and thumbv7.
Although clang is able to defer overloading resolution
diagnostics for common functions. It does not defer
overloading resolution caused diagnostics for overloaded
operators.
This patch extends the existing deferred
diagnostic mechanism and defers a diagnostic caused
by overloaded operator.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D104505
This ensures that the mangled type names match between C and C++,
which is significant when using -fsanitize=cfi-icall. Ideally we
wouldn't have created this namespace at all, but it's now part of
the ABI (e.g. in mangled names), so we can't change it.
Differential Revision: https://reviews.llvm.org/D104830
Only LLVM-based instrumentation profile is supported on AIX.
And it currently must be used with full LTO.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D104803
Non-throwing allocators currently will always get null-check code. However, if the non-throwing allocator is explicitly annotated with returns_nonnull the null check should be elided.
Testing:
ninja check-all
added test case correctly elides
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D102820
This fixes issues with various return types(bool/int) and was already
in place for nvptx headers, adjusted to work for amdgcn. This does
not affect hip as the change is guarded with OPENMP_AMDGCN.
Similar to D85879.
Reviewed By: jdoerfert, JonChesterfield, yaxunl
Differential Revision: https://reviews.llvm.org/D104677
According to https://eel.is/c++draft/over.literal
> double operator""_Bq(long double); // OK: does not use the reserved identifier _Bq ([lex.name])
> double operator"" _Bq(long double); // ill-formed, no diagnostic required: uses the reserved identifier _Bq ([lex.name])
Obey that rule by keeping track of the operator literal name status wrt. leading whitespace.
Fix: https://bugs.llvm.org/show_bug.cgi?id=50644
Differential Revision: https://reviews.llvm.org/D104299
The default Altivec ABI was implemented but the clang error for specifying
its use still remains. Users could get around this but not specifying the
type of Altivec ABI but we need to remove the error.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D102094
NFCI, although the test change shows that ConstantExpr::getAsInstruction
is better than the old implementation of createReplacementInstr because
it propagates things like the sdiv "exact" flag.
Differential Revision: https://reviews.llvm.org/D104124
During template instantiation involving templated lambdas, clang
could hit an assertion in `TemplateDeclInstantiator::SubstFunctionType`
since the functions are not associated with any `TypeSourceInfo`:
`assert(OldTInfo && "substituting function without type source info");`
This path is triggered when using templated lambdas like the one added as
a test to this patch. To fix this:
- Create `TypeSourceInfo`s for special members and make sure the template
instantiator can get through all patterns.
- Introduce a `SpecialMemberTypeInfoRebuilder` tree transform to rewrite
such member function arguments. Without this, we get errors like:
`error: only special member functions and comparison operators may be defaulted`
since `getDefaultedFunctionKind` can't properly recognize these functions
as special members as part of `SetDeclDefaulted`.
Fixes PR45828 and PR44848
Differential Revision: https://reviews.llvm.org/D88327
This change adds an option which, in addition to dumping the record
layout as is done by -fdump-record-layouts, causes us to compute the
layout for all complete record types (rather than the as-needed basis
which is usually done by clang), so that we will dump them as well.
This is useful if we are looking for layout differences across large
code bases without needing to instantiate every type we are interested in.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D104484
Fixes bug 50263
When "unused-private-field" flag is on if you have a struct with private
members and only defaulted comparison operators clang will warn about
unused private fields.
If you where to write the comparison operators by hand no warning is
produced.
This is a bug since defaulting a comparison operator uses all private
members .
The fix is simple, in CheckExplicitlyDefaultedFunction just clear the
list of unused private fields if the defaulted function is a comparison
function.
Differential revision: https://reviews.llvm.org/D102186
Summary:
Currently the attributor needs to give up if a function has external linkage.
This means that the optimization introduced in D97818 will only apply to static
functions. This change uses the Attributor to internalize OpenMP device
routines by making a copy of each function with private linkage and replacing
the uses in the module with it. This allows for the optimization to be applied
to any regular function.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102824
Summary:
Memory globalization is required to maintain OpenMP standard semantics for data sharing between
worker and master threads. The GPU cannot share data between its threads so must allocate global or
shared memory to store the data in. Currently this is implemented fully in the frontend using the
`__kmpc_data_sharing_push_stack` and __kmpc_data_sharing_pop_stack` functions to emulate standard
CPU stack sharing. The front-end scans the target region for variables that escape the region and
must be shared between the threads. Each variable then has a field created for it in a global record
type.
This patch replaces this functinality with a single allocation command, effectively mimicing an
alloca instruction for the variables that must be shared between the threads. This will be much
slower than the current solution, but makes it much easier to optimize as we can analyze each
variable independently and determine if it is not captured. In the future, we can replace these
calls with an `alloca` and small allocations can be pushed to shared memory.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D97680
The codegen for simd constructs was affected by the presence (or
absence) of the 'monotonic' schedule modifier for worksharing
loops. The modifier is only intended to apply to the scheduling of
chunks for a thread, not iterations of a loop inside a chunk.
In addition, the monotonic modifier was applied to worksharing loops
by default if no schedule clause was present; the referenced part of
the OpenMP 4.5 spec in the code (section 2.7.1) only applies if the
user specified a schedule clause with a static kind but no modifier.
Without a user-specified schedule clause we should default to
nonmonotonic scheduling.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D103793
The checker contains check for passing a NULL stream argument.
This change should make more easy to identify where the passed pointer
becomes NULL.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D104640
Without this patch, llvm/utils/update_cc_test_checks.py fails to
perform `--replace-value-regex` replacements when two RUN lines
produce the same output and use the same single FileCheck prefix. The
problem is that replacements in a RUN line's output are not performed
until after comparing against previous RUN lines' output, where
replacements have already been performed. This patch fixes that.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D104566
GCC has had this function attribute since GCC 7.1 for this purpose. I
added "no_profile" last week in D104475; rename this to
"no_profile_instrument_function" to improve compatibility with GCC.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80223#c11
Reviewed By: MaskRay, aaron.ballman
Differential Revision: https://reviews.llvm.org/D104658
Add runtime functions to detect invalid calls to pure or deleted virtual
functions.
Patch by: Siu Chi Chan
Reviewed by: Yaxun Liu
Differential Revision: https://reviews.llvm.org/D104392
This patch does three things:
- Map the /external:I flag to -isystem
- Add support for the /external:env:<var> flag which reads system
include paths from the <var> environment variable
- Pick up system include dirs EXTERNAL_INCLUDE in addition to the old
INCLUDE environment variable.
Differential revision: https://reviews.llvm.org/D104387
This reverts commit a1449a10db.
Seems like my changes to LNT had no effect -- puzzled.
The 21 tests pass on my sandbox with the clang patch but are
failing in exec time in the bot
This patch changes the ffp-model=precise to enables -ffp-contract=on
(previously -ffp-model=precise enabled -ffp-contract=fast). This is a
follow-up to Andy Kaylor's comments in the llvm-dev discussion
"Floating Point semantic modes". From the same email thread, I put
Andy's distillation of floating point options and floating point modes
into UsersManual.rst
Differential Revision: https://reviews.llvm.org/D74436
This fixes a crash in MallocChecker for the situation when operator new (delete) is invoked via NTTP and makes the behavior of CallContext.getCalleeDecl(Expr) identical to CallEvent.getDecl().
Reviewed By: vsavchenko
Differential Revision: https://reviews.llvm.org/D103025
This implements a more comprehensive fix than was done at D95409.
Instead of excluding just function pointer subobjects, we also
exclude any user-defined function pointer conversion operators.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D103855
Support -Wno-frame-larger-than (with no =) and make it properly
interoperate with -Wframe-larger-than. Reject -Wframe-larger-than with
no argument.
We continue to support Clang's old spelling, -Wframe-larger-than=, for
compatibility with existing users of that facility.
In passing, stop the driver from accepting and ignoring
-fwarn-stack-size and make it a cc1-only flag as intended.
Summary:
AIX does not support --as-needed linker options. Remove that option from
aix linker when -lunwind is needed.
For unwinder library, nothing special is needed because by default aix
linker has the as-needed effect for library that's an archive (which is
the case for libunwind on AIX).
Reviewed By: daltenty
Differential Revision: https://reviews.llvm.org/D104314
Fixed crash when doing pointer math on a void pointer.
Also, reworked test to use -verify rather than FileCheck.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D104424
D66572 separated BugReport and BugReporter into basic and path sensitive
versions. As a result, checker silencing, which worked deep in the path
sensitive report generation facilities became specific to it. DeadStoresChecker,
for instance, despite being in the static analyzer, emits non-pathsensitive
reports, and was impossible to silence.
This patch moves the corresponding code before the call to the virtual function
generateDiagnosticForConsumerMap (which is overriden by the specific kinds of
bug reporters). Although we see bug reporting as relatively lightweight compared
to the analysis, this will get rid of several steps we used to throw away.
Quoting from D65379:
At a very high level, this consists of 3 steps:
For all BugReports in the same BugReportEquivClass, collect all their error
nodes in a set. With that set, create a new, trimmed ExplodedGraph whose leafs
are all error nodes.
Until a valid report is found, construct a bug path, which is yet another
ExplodedGraph, that is linear from a given error node to the root of the graph.
Run all visitors on the constructed bug path. If in this process the report got
invalidated, start over from step 2.
Checker silencing used to kick in after all of these. Now it does before any of
them :^)
Differential Revision: https://reviews.llvm.org/D102914
Change-Id: Ice42939304516f2bebd05a1ea19878b89c96a25d
This can be seen as a follow up to commit 0ee439b705,
that changed the second argument of __powidf2, __powisf2 and
__powitf2 in compiler-rt from si_int to int. That was to align with
how those runtimes are defined in libgcc.
One thing that seem to have been missing in that patch was to make
sure that the rest of LLVM also handle that the argument now depends
on the size of int (not using the si_int machine mode for 32-bit).
When using __builtin_powi for a target with 16-bit int clang crashed.
And when emitting libcalls to those rtlib functions, typically when
lowering @llvm.powi), the backend would always prepare the exponent
argument as an i32 which caused miscompiles when the rtlib was
compiled with 16-bit int.
The solution used here is to use an overloaded type for the second
argument in @llvm.powi. This way clang can use the "correct" type
when lowering __builtin_powi, and then later when emitting the libcall
it is assumed that the type used in @llvm.powi matches the rtlib
function.
One thing that needed some extra attention was that when vectorizing
calls several passes did not support that several arguments could
be overloaded in the intrinsics. This patch allows overload of a
scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with
an entry for powi.
Differential Revision: https://reviews.llvm.org/D99439
Fixes PR50591.
When analyzing classes with members which have user-defined conversion
operators to builtin types, the defaulted comparison analyzer was
picking the member type instead of the type for the builtin operator
which was selected as the best match.
This could either result in wrong comparison category being selected,
or a crash when runtime checks are enabled.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D103760
This expands NRVO propagation for more cases:
Parse analysis improvement:
* Lambdas and Blocks with dependent return type can have their variables
marked as NRVO Candidates.
Variable instantiation improvements:
* Fixes crash when instantiating NRVO variables in Blocks.
* Functions, Lambdas, and Blocks which have auto return type have their
variables' NRVO status propagated. For Blocks with non-auto return type,
as a limitation, this propagation does not consider the actual return
type.
This also implements exclusion of VarDecls which are references to
dependent types.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99696
Addition of this pass has been botched.
There is no particular reason why it had to be sold as an inseparable part
of new-pm transition. It was added when old-pm was still the default,
and very *very* few users were actually tracking new-pm,
so it's effects weren't measured.
Which means, some of the turnoil of the new-pm transition
are actually likely regressions due to this pass.
Likewise, there has been a number of post-commit feedback
(post new-pm switch), namely
* https://reviews.llvm.org/D37467#2787157 (regresses HW-loops)
* https://reviews.llvm.org/D37467#2787259 (should not be in middle-end, should run after LSR, not before)
* https://reviews.llvm.org/D95789 (an attempt to fix bad loop backedge metadata)
and in the half year past, the pass authors (google) still haven't found time to respond to any of that.
Hereby it is proposed to backout the pass from the pipeline,
until someone who cares about it can address the issues reported,
and properly start the process of adding a new pass into the pipeline,
with proper performance evaluation.
Furthermore, neither google nor facebook reports any perf changes
from this change, so i'm dropping the pass completely.
It can always be re-reverted should/if anyone want to pick it up again.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D104099
This has already been implemented in be2277fbf2 which adds
pragma fp support. This patch just adds test coverage for
regular fast-math flags (PR46165).
Need to capture locals in aligned clauses for the combined directives to
be fix the crash in the codegen.
Differential Revision: https://reviews.llvm.org/D104258
A lambda in a function template may be recursively instantiated. The recursive
lambda will cause a lambda function instantiated multiple times, one inside another.
The inner LocalInstantiationScope should not be marked as MergeWithParentScope
since it already has references to locals properly substituted, otherwise it causes
assertion due to the check for duplicate locals in merged LocalInstantiationScope.
Reviewed by: Richard Smith
Differential Revision: https://reviews.llvm.org/D98068
Added support for CXXRewrittenBinaryOperator as a condition in ranged
for loops. This is a new kind of expression, need to extend support for
C++20 constructs.
It fixes PR49970: range-based for compilation fails for libstdc++ vector
with -std=c++20.
Differential Revision: https://reviews.llvm.org/D104240
The parser code assumes building with C++ compiler and asserts when using clang (not clang++) on C file. I made the code dependent on input language. This shows up for amdgpu target.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D103899
This change caused build errors related to move-only __block variables,
see discussion on https://reviews.llvm.org/D99696
> This expands NRVO propagation for more cases:
>
> Parse analysis improvement:
> * Lambdas and Blocks with dependent return type can have their variables
> marked as NRVO Candidates.
>
> Variable instantiation improvements:
> * Fixes crash when instantiating NRVO variables in Blocks.
> * Functions, Lambdas, and Blocks which have auto return type have their
> variables' NRVO status propagated. For Blocks with non-auto return type,
> as a limitation, this propagation does not consider the actual return
> type.
>
> This also implements exclusion of VarDecls which are references to
> dependent types.
>
> Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
>
> Reviewed By: Quuxplusone
>
> Differential Revision: https://reviews.llvm.org/D99696
This also reverts the follow-on change which was hard to tease apart
form the one above:
> "[clang] Implement P2266 Simpler implicit move"
>
> This Implements [[http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2266r1.html|P2266 Simpler implicit move]].
>
> Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
>
> Reviewed By: Quuxplusone
>
> Differential Revision: https://reviews.llvm.org/D99005
This reverts commits 1e50c3d785 and
bf20631782.
The `sed` command ensures Windows-specific path separators (single and double backslashes) are replaced by forward slashes in the output file. FileCheck can continue using forward slashes in paths this way.
One of the goals of the dependency scanner is to generate command-lines that can be used to explicitly build modular dependencies of a translation unit. The only modifications to these command-lines should be for the purposes of explicit modular build.
However, the current version of dependency scanner leaks its implementation details into the command-lines.
The first problem is that the `clang-scan-deps` tool adjusts the original textual command-line (adding `-Eonly -M -MT <target> -sys-header-deps -Wno-error -o /dev/null `, removing `--serialize-diagnostics`) in order to set up the `DependencyScanning` library. This has been addressed in D103461, D104012, D104030, D104031, D104033. With these patches, the `DependencyScanning` library receives the unmodified `CompilerInvocation`, sets it up and uses it for the implicit modular build.
Finally, to prevent leaking the implementation details to the resulting command-lines, this patch generates them from the **original** unmodified `CompilerInvocation` rather than from the one that drives the implicit build.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D104036
The `clang-scan-deps` tests for the full output format were written under the assumption that most TUs/modules have the same context hash. This is no longer true, since we're changing the original compilation options. This patch updates the tests, which no longer conflate multiple context hashes into a single FileCheck variable.
Commit d8bab69ead updated the ClangScanDeps/modules.cpp test. The new `{{.*}}` regex is supposed to only match `modules_cdb_input.o`, `a.o` or `b.o`. However, due to non-determinism, this can sometimes also match `modules_cdb_input2.o`, causing match failure on the next line. This commit changes the regex to only match one of the three valid cases.
Buildbot failure: https://lab.llvm.org/buildbot/#/builders/109/builds/16675
The `clang-scan-deps` tool has some logic that parses and modifies the original Clang command-line. The goal is to setup `DependencyOutputOptions` by injecting `-M -MT <target>` and prevent the creation of output files.
This patch moves the logic into the `DependencyScanning` library, and uses the parsed `CompilerInvocation` instead of the raw command-line. The code simpler and can be used from the C++ API as well.
The `-o /dev/null` arguments are not necessary, since the `DependencyScanning` library only runs a preprocessing action, so there's no way it'll produce an actual object file.
Related: The `-M` argument implies `-w`, which would appear on the command-line of modular dependencies even though it was not on the original TU command line (see D104036).
Some related tests were updated.
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D104030
To prevent the creation of diagnostics file, `clang-scan-deps` strips the corresponding command-line argument. This behavior is useful even when using the C++ `DependencyScanner` library.
This patch transforms stripping of command-line in `clang-scan-deps` into stripping of `CompilerInvocation` in `DependencyScanning`.
AFAIK, the `clang-cl` driver doesn't even accept `--serialize-diagnostics`, so I've removed the test. (It would fail with an unknown command-line argument otherwise.)
Note: Since we're generating command-lines for modular dependencies from `CompilerInvocation`, the `--serialize-diagnostics` will be dropped. This was already happening in `clang-scan-deps` before this patch, but it will now happen also when using `DependencyScanning` library directly. This is resolved in D104036.
Reviewed By: dexonsmith, arphaman
Differential Revision: https://reviews.llvm.org/D104012
Update `setConstraint` to simplify existing equivalence classes when a
new constraint is added. In this patch we iterate over all existing
equivalence classes and constraints and try to simplfy them with
simplifySVal. This solves problematic cases where we have two symbols in
the tree, e.g.:
```
int test_rhs_further_constrained(int x, int y) {
if (x + y != 0)
return 0;
if (y != 0)
return 0;
clang_analyzer_eval(x + y == 0); // expected-warning{{TRUE}}
clang_analyzer_eval(y == 0); // expected-warning{{TRUE}}
return 0;
}
```
Differential Revision: https://reviews.llvm.org/D103314
When a translation unit uses a PCH and imports the same modules as the PCH, we'd prefer to resolve to those modules instead of inventing new modules and reporting them as modular dependencies. Since the PCH modules have already been built nudge the compiler to reuse them when deciding whether to build a new module and don't report them as regular modular dependencies.
Depends on D103524 & D103802.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D103526
The `PreprocessOnlyAction` doesn't support loading the AST file of a precompiled header. This is problematic for dependency scanning, since the `#include` manufactured for the PCH is treated as textual. This means the PCH contents get scanned with each TU, which is redundant. Moreover, dependencies of the PCH end up being considered dependency of the TU.
To handle AST file of PCH properly, this patch creates new `FrontendAction` that behaves the same way `PreprocessorOnlyAction` does, but treats the manufactured PCH `#include` as a normal compilation would (by not claiming it only uses a preprocessor and creating the default AST consumer).
The AST file is now reported as a file dependency of the TU.
Depends on D103519.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D103524
It's useful to be able to load explicitly-built PCH files into an implicit build (e.g. during dependency scanning). That's currently impossible, since the explicitly-built PCH has an empty modules cache path, while the current compilation has (and needs to have) a valid path, triggering an error in the `PCHValidator`.
This patch adds a preprocessor option and command-line flag that can be used to omit this check.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D103802
At least LibreOffice has, for mainly historic reasons that would be hard to
change now, a class Any with an overloaded operator >>= that semantically does
not assign to the LHS but rather extracts into the (by-reference) RHS. Which
thus caused false positive -Wunused-but-set-parameter and
-Wunused-but-set-variable after those have been introduced recently.
This change is more conservative about the assumed semantics of overloaded
operators, excluding compound assignment operators but keeping plain operator =
ones. At least for LibreOffice, that strikes a good balance of not producing
false positives but still finding lots of true ones.
(The change to the BinaryOperator case in MaybeDecrementCount is necessary
because e.g. the template f4 test code in warn-unused-but-set-variables-cpp.cpp
turns the += into a BinaryOperator.)
Differential Revision: https://reviews.llvm.org/D103949
This expands NRVO propagation for more cases:
Parse analysis improvement:
* Lambdas and Blocks with dependent return type can have their variables
marked as NRVO Candidates.
Variable instantiation improvements:
* Fixes crash when instantiating NRVO variables in Blocks.
* Functions, Lambdas, and Blocks which have auto return type have their
variables' NRVO status propagated. For Blocks with non-auto return type,
as a limitation, this propagation does not consider the actual return
type.
This also implements exclusion of VarDecls which are references to
dependent types.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99696
Also:
- add driver test (fsanitize-use-after-return.c)
- add basic IR test (asan-use-after-return.cpp)
- (NFC) cleaned up logic for generating table of __asan_stack_malloc
depending on flag.
for issue: https://github.com/google/sanitizers/issues/1394
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D104076
Check applied to unbounded (incomplete) arrays and pointers to spot
cases where the computed address is beyond the largest possible
addressable extent of the array, based on the address space in which the
array is delcared, or which the pointer refers to.
Check helps to avoid cases of nonsense pointer math and array indexing
which could lead to linker failures or runtime exceptions. Of
particular interest when building for embedded systems with small
address spaces.
This is version 2 of this patch -- version 1 had some testing issues
due to a sign error in existing code. That error is corrected and
lit test for this chagne is extended to verify the fix.
Originally reviewed/accepted by: aaron.ballman
Original revision: https://reviews.llvm.org/D86796
Reviewed By: aaron.ballman, ebevhan
Differential Revision: https://reviews.llvm.org/D88174
Allow the usage of minor version 0, for hip versions
such as 4.0. Change the default values when performing
version checks.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D104062
Check applied to unbounded (incomplete) arrays and pointers to spot
cases where the computed address is beyond the largest possible
addressable extent of the array, based on the address space in which the
array is delcared, or which the pointer refers to.
Check helps to avoid cases of nonsense pointer math and array indexing
which could lead to linker failures or runtime exceptions. Of
particular interest when building for embedded systems with small
address spaces.
This is version 2 of this patch -- version 1 had some testing issues
due to a sign error in existing code. That error is corrected and
lit test for this chagne is extended to verify the fix.
Originally reviewed/accepted by: aaron.ballman
Original revision: https://reviews.llvm.org/D86796
Reviewed By: aaron.ballman, ebevhan
Differential Revision: https://reviews.llvm.org/D88174
Adds the basic instrumentation needed for stack tagging.
Currently does not support stack short granules or TLS stack histories,
since a different code path is followed for the callback instrumentation
we use.
We may simply wait to support these two features until we switch to
a custom calling convention.
Patch By: xiangzhangllvm, morehouse
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102901
This fixes the prioritization of address spaces when choosing a
constructor, stopping them from being considered equally good,
which made the construction of types that could be constructed
by more than one of the constructors.
It does this by preferring the most specific address space,
which is decided by seeing if one of the address spaces is
a superset of the other, and preferring the other.
Fixes: PR50329
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D102850
-Wframe-larger-than= is an interesting warning; we can't know the frame
size until PrologueEpilogueInsertion (PEI); very late in the compilation
pipeline.
-Wframe-larger-than= was propagated through CC1 as an -mllvm flag, then
was a cl::opt in LLVM's PEI pass; this meant it was dropped during LTO
and needed to be re-specified via -plugin-opt.
Instead, make it part of the IR proper as a module level attribute,
similar to D103048. Introduce -fwarn-stack-size CC1 option.
Reviewed By: rsmith, qcolombet
Differential Revision: https://reviews.llvm.org/D103928
This expands NRVO propagation for more cases:
Parse analysis improvement:
* Lambdas and Blocks with dependent return type can have their variables
marked as NRVO Candidates.
Variable instantiation improvements:
* Fixes crash when instantiating NRVO variables in Blocks.
* Functions, Lambdas, and Blocks which have auto return type have their
variables' NRVO status propagated. For Blocks with non-auto return type,
as a limitation, this propagation does not consider the actual return
type.
This also implements exclusion of VarDecls which are references to
dependent types.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99696
Implementation of the unroll directive introduced in OpenMP 5.1. Follows the approach from D76342 for the tile directive (i.e. AST-based, not using the OpenMPIRBuilder). Tries to use `llvm.loop.unroll.*` metadata where possible, but has to fall back to an AST representation of the outer loop if the partially unrolled generated loop is associated with another directive (because it needs to compute the number of iterations).
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D99459
This patch adds the command line options /permissive and /permissive- to clang-cl. These flags are used in MSVC to enable various /Zc language conformance options at once. In particular, /permissive is used to enable the various non standard behaviour of MSVC, while /permissive- is the opposite.
When either of two command lines are specified they are simply expanded to the various underlying /Zc options. In particular when /permissive is passed it currently expands to:
/Zc:twoPhase- (disable two phase lookup)
-fno-operator-names (disable C++ operator keywords)
/permissive- expands to the opposites of these flags + /Zc:strictStrings (/Zc:strictStrings- does not currently exist). In the future, if any more MSVC workarounds are ever added they can easily be added to the expansion. One is also able to override settings done by permissive. Specifying /permissive- /Zc:twoPhase- will apply the settings from permissive minus, but disables two phase lookup.
Motivation for this patch was mainly parity with MSVC as well as compatibility with Windows SDK headers. The /permissive page from MSVC documents various workarounds that have to be done for the Windows SDK headers [1], when MSVC is used with /permissive-. In these, Microsoft often recommends simply compiling with /permissive for the specified source files. Since some of these also apply to clang-cl (which acts like /permissive- by default mostly), and some are currently implemented as "hacks" within clang that I'd like to remove, adding /permissive and /permissive- to be in full parity with MSVC and Microsofts documentation made sense to me.
[1] https://docs.microsoft.com/en-us/cpp/build/reference/permissive-standards-conformance?view=msvc-160#windows-header-issues
Differential Revision: https://reviews.llvm.org/D103773
When using the -fno-rtti option of the GCC style clang++, using typeid results in an error. The MSVC STL however kindly provides a define flag called _HAS_STATIC_RTTI, which either enables or disables uses of typeid throughout the STL. By default, if undefined, it is set to 1, enabling the use of typeid.
With this patch, _HAS_STATIC_RTTI is set to 0 when -fno-rtti is specified. This way various headers of the MSVC STL like functional can be consumed without compilation failures.
Differential Revision: https://reviews.llvm.org/D103771
This patch adds the command line option -foperator-names which acts as the opposite of -fno-operator-names. With this command line option it is possible to reenable C++ operator keywords on the command line if -fno-operator-names had previously been passed.
Differential Revision: https://reviews.llvm.org/D103749
This patch changes the ffp-model=precise to enables -ffp-contract=on
(previously -ffp-model=precise enabled -ffp-contract=fast). This is a
follow-up to Andy Kaylor's comments in the llvm-dev discussion
"Floating Point semantic modes". From the same email thread, I put
Andy's distillation of floating point options and floating point modes
into UsersManual.rst
Differential Revision: https://reviews.llvm.org/D74436
Clang will create a global value put in constant memory if an aggregate value
is declared firstprivate in the target device. The symbol name only uses the
name of the firstprivate variable, so symbol name conflicts will occur if the
variable is allowed to have different types through templates. An example of
this behvaiour is shown in https://godbolt.org/z/EsMjYh47n. This patch adds the
mangled type name to the symbol to avoid such naming conflicts. This fixes
https://bugs.llvm.org/show_bug.cgi?id=50642.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D103995
Before this change, CXXDefaultArgExpr would always have
ExprDependence::None. This can lead to issues when, for example, the
inner expression is RecoveryExpr and yet containsErrors() on the default
expression is false.
Differential Revision: https://reviews.llvm.org/D103982
Post-commit feedback for d69c4372bf says the output
may vary between pass managers. This is hopefully a
quick fix, but we might want to investigate how to
better solve this type of problem.
Add a test to verify OpenCL builtin declarations using
OpenCLBuiltins.td.
This test consists of parsing a 60k line generated input file. The
entire test takes about 60s with a debug build on a decent machine.
Admittedly this is not the fastest test, but doesn't seem excessive
compared to other tests in clang/test/Headers (with one of the tests
taking 85s for example).
RFC: https://lists.llvm.org/pipermail/cfe-dev/2021-April/067973.html
Differential Revision: https://reviews.llvm.org/D97869
Added --gpu-bundle-output to control bundling/unbundling output of HIP device compilation.
By default preprocessor expansion, llvm bitcode and assembly are unbundled, code objects are
bundled.
Reviewed by: Artem Belevich, Jan Svoboda
Differential Revision: https://reviews.llvm.org/D101630
1. There is no tests for mabi=ilp32e, and my patch covers that.
2. The tests in riscv-abi.c will show default ABI changes for special archs
in the future, especially the arch with the F but without the D extension.
3. The tests in riscv-arch.c will show default arch changes for abi=ilp32,
which is rv32imacfd currently, but it is better to be rv32imac.
And it is also better for abi=ilp32f defaults to arch=imacf.
Reviewed By: MaskRay, luismarques
Differential Revision: https://reviews.llvm.org/D103878
This completes the series implementing p1099, by adding the feature
macro and updating the web page.
Differential Revision: https://reviews.llvm.org/D102242
Clang checks whether the type given to va_arg will automatically cause
undefined behavior, but this check was issuing false positives for
enumerations in C++. The issue turned out to be because
typesAreCompatible() in C++ checks whether the types are *the same*, so
this uses custom logic if the type compatibility check fails.
This issue was found by a user on code like:
typedef enum {
CURLINFO_NONE,
CURLINFO_EFFECTIVE_URL,
CURLINFO_LASTONE = 60
} CURLINFO;
...
__builtin_va_arg(list, CURLINFO); // false positive warning
Given that C++ defers to C for the rules around va_arg, the behavior
should be the same in both C and C++ and not diagnose because int and
CURLINFO are "compatible enough" types for va_arg.
This renames the expression value categories from rvalue to prvalue,
keeping nomenclature consistent with C++11 onwards.
C++ has the most complicated taxonomy here, and every other language
only uses a subset of it, so it's less confusing to use the C++ names
consistently, and mentally remap to the C names when working on that
context (prvalue -> rvalue, no xvalues, etc).
Renames:
* VK_RValue -> VK_PRValue
* Expr::isRValue -> Expr::isPRValue
* SK_QualificationConversionRValue -> SK_QualificationConversionPRValue
* JSON AST Dumper Expression nodes value category: "rvalue" -> "prvalue"
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D103720