This patch makes the builtin operand order match the C operand order
for all intrinsics. With this we can use clang_builtin_alias for
all overloaded intrinsics.
This should further reduce the test time for vector intrinsics.
Differential Revision: https://reviews.llvm.org/D101700
The code example:
```
constexpr const char kEta[] = "Eta";
template <const char*, typename T> class Column {};
using quick = Column<kEta,double>;
void lookup() {
quick c1;
c1.ls();
}
```
emits error: no member named 'ls' in 'Column<&kEta, double>'. The patch fixes
the printed type name by not printing the ampersand for array types.
Differential Revision: https://reviews.llvm.org/D36368
AMDGPU backend need to know whether floating point opcodes that support exception
flag gathering quiet and propagate signaling NaN inputs per IEEE754-2008, which is
conveyed by a function attribute "amdgpu-ieee". "amdgpu-ieee"="false" turns this off.
Without this function attribute backend assumes it is on for compute functions.
-mamdgpu-ieee and -mno-amdgpu-ieee are added to Clang to control this function attribute.
By default it is on. -mno-amdgpu-ieee requires -fno-honor-nans or equivalent.
Reviewed by: Matt Arsenault
Differential Revision: https://reviews.llvm.org/D77013
when passing -platform_version to the linker
The use of a valid SDK version is preferred over an empty SDK version
(0.0.0) as the system's runtime might expect the linked binary to contain
a valid SDK version in order for the binary to work correctly
rdar://66795188
This adds the long overdue implementations of these functions
that have been part of the ABI document and are now part of
the "Power Vector Intrinsic Programming Reference" (PVIPR).
The approach is to add new builtins and to emit code with
the fast flag regardless of whether fastmath was specified
on the command line.
Differential revision: https://reviews.llvm.org/D101209
This patch fixes a bug from D89802. For example, without it, Clang
generates x as the debug map name for both x and y in the following
example:
```
#pragma omp target map(to: x, y)
x = y = 1;
```
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D101564
There already was a check for undeduced and incomplete types, but it
failed to trigger when outer type (SubstTemplateTypeParm in test) looked
fine, but inner type was not.
Differential Revision: https://reviews.llvm.org/D100667
Removed extension begin/end pragma as it has no effect and
it is added unconditionally for all targets.
Differential Revision: https://reviews.llvm.org/D92244
Currently Clang does not add mustprogress to inifinite loops with a
known constant condition, matching C11 behavior. The forward progress
guarantee in C++11 and later should allow us to add mustprogress to any
loop (http://eel.is/c++draft/intro.progress#1).
This allows us to simplify the code dealing with adding mustprogress a
bit.
Reviewed By: aaron.ballman, lebedev.ri
Differential Revision: https://reviews.llvm.org/D96418
Use of bitcast resulted in lanes being swapped for vcreateq with big
endian. Fix this by using vreinterpret. No code change for little
endian. Adds IR lit test.
Differential Revision: https://reviews.llvm.org/D101606
This patch copies implementation from cpuid.h, which preserve base register %rbx around cpuid. It fixes PR50133.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D101338
the function the block is passed to isn't a block pointer type
This patch fixes a bug where a block passed to a function taking a
parameter that doesn't have a block pointer type (e.g., id or reference
to a block pointer) was marked as noescape.
This partially fixes PR50043.
rdar://77030453
Differential Revision: https://reviews.llvm.org/D101097
isn't an ExprWithCleanups
This patch fixes a bug where a temporary ObjC pointer is released before
the end of the full expression.
This fixes PR50043.
rdar://77030453
Differential Revision: https://reviews.llvm.org/D101502
This ensures that the Darwin driver uses a consistent target triple
representation when the triple is printed out to the user.
This reverts the revert commit ab0df6c034.
Differential Revision: https://reviews.llvm.org/D100807
Renaming the option is based on discussions in https://reviews.llvm.org/D101122.
It is normally not a good idea to rename driver flags but this flag is
new enough and obscure enough that it is very unlikely to have adopters.
While we're here also drop the `<kind>` metavar. It's not necessary and
is actually inconsistent with the documentation in
`clang/docs/ClangCommandLineReference.rst`.
Differential Revision: https://reviews.llvm.org/D101491
This patch is child of D89671, contains the clang
implementation to use the OpenMP IRBuilder's section
construct.
Co-author: @anchu-rajendran
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D91054
When using the per-target runtime build, it may be desirable to have
different __config_site headers for each target where all targets cannot
share a single configuration.
The layout used for libc++ headers after this change is:
```
include/
c++/
v1/
<libc++ headers except for __config_site>
<target1>/
c++/
v1/
__config_site
<target2>/
c++/
v1/
__config_site
<other targets>
```
This is the most optimal layout since it avoids duplication, the only
headers that's per-target is __config_site, all other headers are
shared across targets. This also means that we no need two
-isystem flags: one for the target-agnostic headers and one for
the target specific headers.
Differential Revision: https://reviews.llvm.org/D89013
The Neon vadd intrinsics were added to the ARMSIMD intrinsic map,
however due to being defined under an AArch64 guard in arm_neon.td,
were not previously useable on ARM. This change rectifies that.
It is important to note that poly128 is not valid on ARM, thus it was
extracted out of the original arm_neon.td definition and separated
for the sake of AArch64.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D100772
When searching for stores and creating corresponding notes, the
analyzer is more specific about the target region of the store
as opposed to the stored value. While this description was tweaked
for constant and undefined values, it lacked in the most general
case of symbolic values.
This patch tries to find a memory region, where this value is stored,
to use it as a better alias for the value.
rdar://76645710
Differential Revision: https://reviews.llvm.org/D101041
Since we can report memory leaks on one variable, while the originally
allocated object was stored into another one, we should explain
how did it get there.
rdar://76645710
Differential Revision: https://reviews.llvm.org/D100852
When reporting leaks, we try to attach the leaking object to some
variable, so it's easier to understand. Before the patch, we always
tried to use the first variable that stored the object in question.
This can get very confusing for the user, if that variable doesn't
contain that object at the moment of the actual leak. In many cases,
the warning is dismissed as false positive and it is effectively a
false positive when we fail to properly explain the warning to the
user.
This patch addresses the bigest issue in cases like this. Now we
check if the variable still contains the leaking symbolic object.
If not, we look for the last variable to actually hold it and use
that variable instead.
rdar://76645710
Differential Revision: https://reviews.llvm.org/D100839
This patch changes the AArch32 crypto instructions (sha2 and aes) to
require the specific sha2 or aes features. These features have
already been implemented and can be controlled through the command
line, but do not have the expected result (i.e. `+noaes` will not
disable aes instructions). The crypto feature retains its existing
meaning of both sha2 and aes.
Several small changes are included due to the knock-on effect this has:
- The AArch32 driver has been modified to ensure sha2/aes is correctly
set based on arch/cpu/fpu selection and feature ordering.
- Crypto extensions are permitted for AArch32 v8-R profile, but not
enabled by default.
- ACLE feature macros have been updated with the fine grained crypto
algorithms. These are also used by AArch64.
- Various tests updated due to the change in feature lists and macros.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D99079
Language options are not available when a target is being created,
thus, a new method is introduced. Also, some refactoring is done,
such as removing OpenCL feature macros setting from TargetInfo.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D101087
Need to respect mapping/privatization of declare target variables in the
target regions if explicitly specified by the user.
Differential Revision: https://reviews.llvm.org/D99530
This is a partial revert of b4537c3f51
based on the discussion in https://reviews.llvm.org/D101194. Rather
than using the getMultiarchTriple, we use the getTripleString.
This is useful in runtimes build for example which currently try to
guess the correct triple where to place libraries in the multiarch
layout. Using this flag, the build system can get the correct triple
directly by querying Clang.
Differential Revision: https://reviews.llvm.org/D101400
- Unsupported Windows to drop backslashes code
- Upgrade to current gcc 10 version
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101347
This is a follow-up of e92d2b80c6 ("[Driver] Detect libstdc++ include
paths for native gcc (-m32 and -m64) on Debian i386") for the Debian Hurd
case, which has the same multiarch name reduction from i686 to i386.
i386-linux-gnu is actually Linux-only, so this moves the code of that commit
to Linux.cpp, and adds the same to Hurd.cpp
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101331
f263418402 ("[Driver] Gnu.cpp: remove obsoleted i386 triple detection
from end-of-life distribution versions") dropped the i686-gnu gcc path, but
GNU/Hurd's gcc is actually using it, and not i386.
This fixes the gcc path and update the tests to reflect it.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101317
When we report an argument constraint violation, we should track those
other arguments that participate in the evaluation of the violation. By
default, we depend only on the argument that is constrained, however,
there are some special cases like the buffer size constraint that might
be encoded in another argument(s).
Differential Revision: https://reviews.llvm.org/D101358
Refactored diagnostics for OpenCL types to allow their
reuse for templates.
Patch by olestrohm (Ole Strohm)!
Differential Revision: https://reviews.llvm.org/D100860
Commit e3d8ee35e4 ("reland "[DebugInfo] Support to emit debugInfo
for extern variables"") added support to emit debugInfo for
extern variables if requested by the target. Currently, only
BPF target enables this feature by default.
As BPF ecosystem grows, callback function started to get
support, e.g., recently bpf_for_each_map_elem() is introduced
(https://lwn.net/Articles/846504/) with a callback function as an
argument. In the future we may have something like below as
a demonstration of use case :
extern int do_work(int);
long bpf_helper(void *callback_fn, void *callback_ctx, ...);
long prog_main() {
struct { ... } ctx = { ... };
return bpf_helper(&do_work, &ctx, ...);
}
Basically bpf helper may have a callback function and the
callback function is defined in another file or in the kernel.
In this case, we would like to know the debuginfo types for
do_work(), so the verifier can proper verify the safety of
bpf_helper() call.
For the following example,
extern int do_work(int);
long bpf_helper(void *callback_fn);
long prog() {
return bpf_helper(&do_work);
}
Currently, there is no debuginfo generated for extern function do_work().
In the IR, we have,
...
define dso_local i64 @prog() local_unnamed_addr #0 !dbg !7 {
entry:
%call = tail call i64 @bpf_helper(i8* bitcast (i32 (i32)* @do_work to i8*)) #2, !dbg !11
ret i64 %call, !dbg !12
}
...
declare dso_local i32 @do_work(i32) #1
...
This patch added support for the above callback function use case, and
the generated IR looks like below:
...
declare !dbg !17 dso_local i32 @do_work(i32) #1
...
!17 = !DISubprogram(name: "do_work", scope: !1, file: !1, line: 1, type: !18, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, retainedNodes: !2)
!18 = !DISubroutineType(types: !19)
!19 = !{!20, !20}
!20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
The TargetInfo.allowDebugInfoForExternalVar is renamed to
TargetInfo.allowDebugInfoForExternalRef as now it guards
both extern variable and extern function debuginfo generation.
Differential Revision: https://reviews.llvm.org/D100567
These are intended to mimic warnings available in gcc.
-Wunused-but-set-variable is triggered in the case of a variable which
appears on the LHS of an assignment but not otherwise used.
For instance:
void f() {
int x;
x = 0;
}
-Wunused-but-set-parameter works similarly, but for function parameters
instead of variables.
In C++, they are triggered only for scalar types; otherwise, they are
triggered for all types. This is gcc's behavior.
-Wunused-but-set-parameter is controlled by -Wextra, while
-Wunused-but-set-variable is controlled by -Wunused. This is slightly
different from gcc's behavior, but seems most consistent with clang's
behavior for -Wunused-parameter and -Wunused-variable.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D100581
This primarily parses a different set of options and invokes the same
resource compiler as llvm-rc normally. Additionally, it can convert
directly to an object file (which in MSVC style setups is done with the
separate cvtres tool, or by the linker).
(GNU windres also supports other conversions; from coff object file back
to .res, and from .res or object file back to .rc form; that's not yet
implemented.)
The other bigger complication lies in being able to imply or pass the
intended target triple, to let clang find the corresponding mingw sysroot
for finding include files, and for specifying the default output object
machine format.
It can be implied from the tool triple prefix, like
`<triple>-[llvm-]windres` or picked up from the windres option e.g.
`-F pe-x86-64`. In GNU windres, that option takes BFD style format names
such as pe-i386 or pe-x86-64. As libbfd in binutils doesn't support
Windows on ARM, there's no such canonical name for the ARM targets.
Therefore, as an LLVM specific extension, this option is extended to
allow passing full triples, too.
Differential Revision: https://reviews.llvm.org/D100756
This ensures that the Darwin driver uses a consistent target triple
representation when the triple is printed out to the user.
Differential Revision: https://reviews.llvm.org/D100807
The headers shipped with the XMOS XCore compiler expect __xcore__ to be defined.
The __XS1B__ macro, already defined, is for the default subtarget.
No other targets affected.
Default address space (applies when no explicit address space was
specified) maps to generic (4) address space.
Added SYCL named address spaces `sycl_global`, `sycl_local` and
`sycl_private` defined as sub-sets of the default address space.
Static variables without address space now reside in global address
space when compile for SPIR target, unless they have an explicit address
space qualifier in source code.
Differential Revision: https://reviews.llvm.org/D89909
The test added in D97533 (and modified by this patch) has some overly
strict printed metadata ordering requirements, specifically the
interleaving of DILocalVariable nodes and DILocation nodes. Slight changes
in metadata emission can easily break this unfortunately.
This patch stops after clang codegen rather than allowing the coro splitter
to run, and reduces the need for ordering: it picks out the
DILocalVariable nodes being sought, in any order (CHECK-DAG), and doesn't
examine any DILocations. The implicit CHECK-NOT is what's important: the
test seeks to ensure a duplicate set of DILocalVariables aren't emitted in
the same scope.
Differential Revision: https://reviews.llvm.org/D100298
Add option to `clang-scan-deps` to enable/disable generation of command-line arguments with absolute paths. This is essentially a revert of D100533, but with improved naming and added test.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D101051
We only apply `clang_builtin_alias` to non-masked builtins.
Masked builtins could not use `clang_builtin_alias` because the
operand order is different between overloaded intrinsics and builtins.
A bunch of test cases need to be updated.
Differential Revision: https://reviews.llvm.org/D100658
In some cases, we want to provide the alias name for the clang builtins.
For example, the arguments must be constant integers for some RISC-V builtins.
If we use wrapper functions, we could not constrain the arguments be constant
integer. This attribute is used to achieve the purpose.
Besides this, use `clang_builtin_alias` is more efficient than using
wrapper functions. We use this attribute to deal with test time issue
reported in https://bugs.llvm.org/show_bug.cgi?id=49962.
In our downstream testing, it could decrease the testing time from 6.3
seconds to 3.7 seconds for vloxei.c test.
Differential Revision: https://reviews.llvm.org/D100611
The following program winds up with
D->getDefaultArgStorage().getInheritedFrom() == nullptr
during dumping the TemplateTemplateParmDecl corresponding to the
template parameter of i.
template <typename>
struct R;
template <template <typename> class = R>
void i();
This patch fixes the null pointer dereference.
[clang][amdgpu] Use implicit code object version
At present, clang always passes amdhsa-code-object-version on to -cc1. That is
great for certainty over what object version is being used when debugging.
Unfortunately, the command line argument is in AMDGPUBaseInfo.cpp in the amdgpu
target. If clang is used with an llvm compiled with DLLVM_TARGETS_TO_BUILD
that excludes amdgpu, this will be diagnosed (as discovered via D98658):
- Unknown command line argument '--amdhsa-code-object-version=4'
This means that clang, built only for X86, can be used to compile the nvptx
devicertl for openmp but not the amdgpu one. That would shortly spawn fragile
logic in the devicertl cmake to try to guess whether the clang used will work.
This change omits the amdhsa-code-object-version parameter when it matches the
default that AMDGPUBaseInfo.cpp specifies, with a comment to indicate why. As
this is the only part of clang's codegen for amdgpu that depends on the target
in the back end it suffices to build the openmp runtime on most (all?) systems.
It is a non-functional change, though observable in the updated tests and when
compiling with -###. It may cause minor disruption to the amd-stg-open branch.
Revision of D98746, builds on refactor in D101077
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D101095
Adds new intrinsics for instructions that are in the final SIMD spec but did not
previously have intrinsics. Also updates the names of existing intrinsics to
reflect the final names of the underlying instructions in the spec. Keeps the
old names as deprecated functions to ease the transition to the new names.
Differential Revision: https://reviews.llvm.org/D101112
There are some interfaces in altivec.h that are not compatible
between Clang and XL (although Clang is compatible with GCC).
Currently, we have found 3 but there may be others.
Clang/GCC signatures:
vector double vec_ctf(vector signed long long)
vector double vec_ctf(vector unsigned long long)
vector signed long long vec_cts(vector double)
vector unsigned long long vec_ctu(vector double)
XL signatures:
vector float vec_ctf(vector signed long long)
vector float vec_ctf(vector unsigned long long)
vector signed int vec_cts(vector double)
vector unsigned int vec_ctu(vector double)
This patch provides the XL behaviour under the __XL_COMPAT_ALTIVEC__
macro for users that rely on XL behaviour.
Differential revision: https://reviews.llvm.org/D101130
When an object is allocated in a non-default address space we do not
need to check for a constructor if it is not initialized and has a
trivial constructor (which we won't call then).
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D100929
These are added for compatibility with XLC. They are similar to
vec_cts and vec_ctu except that the result is a doubleword vector
regardless of the parameter type.
In this patch, I provide a detailed explanation for each argument
constraint. This explanation is added in an extra 'note' tag, which is
displayed alongside the warning.
Since these new notes describe clearly the constraint, there is no need
to provide the number of the argument (e.g. 'Arg3') within the warning.
However, I decided to keep the name of the constraint in the warning (but
this could be a subject of discussion) in order to be able to identify
the different kind of constraint violations easily in a bug database
(e.g. CodeChecker).
Differential Revision: https://reviews.llvm.org/D101060
There was a missing isInvalid() check leading to an attempt to
instantiate template with an empty instantiation stack.
Differential Revision: https://reviews.llvm.org/D100675
LLVM should be smarter about *known* malloc's alignment and this knowledge may enable other optimizations.
Originally started as LLVM patch - https://reviews.llvm.org/D100862 but this logic should be really in Clang.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D100879
The Linux kernel objtool diagnostic `call without frame pointer save/setup`
arise in multiple instrumentation passes (asan/tsan/gcov). With the mechanism
introduced in D100251, it's trivial to respect the command line
-m[no-]omit-leaf-frame-pointer/-f[no-]omit-frame-pointer, so let's do it.
Fix: https://github.com/ClangBuiltLinux/linux/issues/1236 (tsan)
Fix: https://github.com/ClangBuiltLinux/linux/issues/1238 (asan)
Also document the function attribute "frame-pointer" which is long overdue.
Differential Revision: https://reviews.llvm.org/D101016
Remove the dependence on standard C++ header
for overloaded math functions in HIP header
since standard C++ header is not available for hipRTC.
Reviewed by: Artem Belevich, Justin Lebar
Differential Revision: https://reviews.llvm.org/D100794
This avoids test failures where extra files exist in the tree, such
as the standard library built using the runtimes build.
Differential Revision: https://reviews.llvm.org/D101023
Add restrictions on type layout (PR48099):
- Types passed by pointer or reference must be standard layout types.
- Types passed by value must be POD types.
Patch by olestrohm (Ole Strohm)!
Differential Revision: https://reviews.llvm.org/D100471
https://reviews.llvm.org/D62335 added some C++ for OpenCL specific
builtins to opencl-c.h, but these were not mirrored to the TableGen
builtin functions yet.
The TableGen builtins machinery does not have dedicated version
handling for C++ for OpenCL at the moment: all builtin versioning is
tied to `LangOpts.OpenCLVersion` (i.e., the OpenCL C version). As a
workaround, to add builtins that are only available in C++ for OpenCL,
we define a function extension guarded by the __cplusplus macro.
Differential Revision: https://reviews.llvm.org/D100935
Fixes PR50041.
We found issues with a number of intrinsics when building them with
C++, so it makes sense to guard these tests with some extra RUN lines
to build the tests in C++ mode.
From https://bugs.llvm.org/show_bug.cgi?id=49739:
Currently, `#pragma clang fp` are ignored for matrix types.
For the code below, the `contract` fast-math flag should be added to the generated call to `llvm.matrix.multiply` and `fadd`
```
typedef float fx2x2_t __attribute__((matrix_type(2, 2)));
void foo(fx2x2_t &A, fx2x2_t &C, fx2x2_t &B) {
#pragma clang fp contract(fast)
C = A*B + C;
}
```
Reviewed By: fhahn, mibintc
Differential Revision: https://reviews.llvm.org/D100834
This patch adds new clang tool named amdgpu-arch which uses
HSA to detect installed AMDGPU and report back latter's march.
This tool is built only if system has HSA installed.
The value printed by amdgpu-arch is used to fill -march when
latter is not explicitly provided in -Xopenmp-target.
Reviewed By: JonChesterfield, gregrodgers
Differential Revision: https://reviews.llvm.org/D99949
This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target, host-side parallel regions), data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. Also, the revision contains changes to OpenMPOpt for remark creation on target offloading regions.
Reviewed By: jdoerfert, Meinersbur
Differential Revision: https://reviews.llvm.org/D95976
On ELF targets, if a function has uwtable or personality, or does not have
nounwind (`needsUnwindTableEntry`), it marks that `.eh_frame` is needed in the module.
Then, a function gets `.eh_frame` if `needsUnwindTableEntry` or `-g[123]` is specified.
(i.e. If -g[123], every function gets `.eh_frame`.
This behavior is strange but that is the status quo on GCC and Clang.)
Let's take asan as an example. Other sanitizers are similar.
`asan.module_[cd]tor` has no attribute. `needsUnwindTableEntry` returns true,
so every function gets `.eh_frame` if `-g[123]` is specified.
This is the root cause that
`-fno-exceptions -fno-asynchronous-unwind-tables -g` produces .debug_frame
while
`-fno-exceptions -fno-asynchronous-unwind-tables -g -fsanitize=address` produces .eh_frame.
This patch
* sets the nounwind attribute on sanitizer module ctor/dtor.
* let Clang emit a module flag metadata "uwtable" for -fasynchronous-unwind-tables. If "uwtable" is set, sanitizer module ctor/dtor additionally get the uwtable attribute.
The "uwtable" mechanism is generic: synthesized functions not cloned/specialized
from existing ones should consider `Function::createWithDefaultAttr` instead of
`Function::create` if they want to get some default attributes which
have more of module semantics.
Other candidates: "frame-pointer" (https://github.com/ClangBuiltLinux/linux/issues/955https://github.com/ClangBuiltLinux/linux/issues/1238), dso_local, etc.
Differential Revision: https://reviews.llvm.org/D100251
The new layout more closely matches the layout used by other compilers.
This is only used when LLVM_ENABLE_PER_TARGET_RUNTIME_DIR is enabled.
Differential Revision: https://reviews.llvm.org/D100869
This reverts commit 05eeed9691 and after
fixing the impacted lldb tests in 5d1c43f333.
[Driver] Support default libc++ library location on Darwin
Darwin driver currently uses libc++ headers that are part of Clang
toolchain when available (by default ../include/c++/v1 relative to
executable), but it completely ignores the libc++ library itself
because it doesn't pass the location of libc++ library that's part
of Clang (by default ../lib relative to the exceutable) to the linker
always using the system copy of libc++.
This may lead to subtle issues when the compilation fails because the
headers that are part of Clang toolchain are incompatible with the
system library. Either the driver should ignore both headers as well as
the library, or it should always try to use both when available.
This patch changes the driver behavior to do the latter which seems more
reasonable, it makes it easy to test and use custom libc++ build on
Darwin while still allowing the use of system version. This also matches
the Clang driver behavior on other systems.
Differential Revision: https://reviews.llvm.org/D45639
The implicitly generated mappings for allocation/deallocation in mappers
runtime should be mapped as implicit, also no need to clear member_of
flag to avoid ref counter increment. Also, the ref counter should not be
incremented for the very first element that comes from the mapper
function.
Differential Revision: https://reviews.llvm.org/D100673
Clang only defines __VFP_FP__ when the FPU is enabled. However, gcc
defines it unconditionally.
This patch aligns Clang with gcc.
Reviewed By: peter.smith, rengolin
Differential Revision: https://reviews.llvm.org/D100372
The `ppc32` cpu model was introduced a while ago in a9321059b9 as an independent copy of the `ppc` one but was never wired into clang.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D100933
FileCheck now gives an error when there's a check for an undefined
variable, which this test does in one of its NOT checks. Fix this by
being a bit looser in what the test checks.
This reverts commit 199c397482.
This time, clang-scan-deps's search for output argument in clang-cl command line will now ignore arguments preceded by "-Xclang".
That way, it won't detect a /o argument in "-Xclang -ivfsoverlay -Xclang /opt/subpath"
Initial patch description:
clang-scan-deps contains some command line parsing and modifications.
This patch adds support for clang-cl command options.
Differential Revision: https://reviews.llvm.org/D92191
Add functionality to assign extensions to types in OpenCLBuiltins.td
and use that information to filter candidates that should not be
exposed if a type is not available.
Differential Revision: https://reviews.llvm.org/D100209
If you gave clang the options `--target=arm-pc-windows-msvc` and
`-march=armv8-a+crypto` together, the crypto extension would not be
enabled in the compilation, and you'd see the following warning
message suggesting that the 'armv8-a' had been ignored:
clang: warning: ignoring extension 'crypto' because the 'armv7-a' architecture does not support it [-Winvalid-command-line-argument]
This happens because Triple::getARMCPUForArch(), for the Win32 OS,
unconditionally returns "cortex-a9" (an Armv7 CPU) regardless of
MArch, which overrides the architecture setting on the command line.
I don't think that the combination of Windows and AArch32 _should_
unconditionally outlaw the use of the crypto extension. MSVC itself
doesn't think so: you can perfectly well compile Thumb crypto code
using its AArch32-targeted compiler.
All the other default CPUs in the same switch statement are
conditional on a particular MArch setting; this is the only one that
returns a particular CPU _regardless_ of MArch. So I've fixed this one
by adding a condition, so that if you ask for an architecture *above*
v7, the default of Cortex-A9 no longer overrides it.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D100937
When llvm-rc invokes clang for preprocessing, it uses a target
triple derived from the default target. The test verifies that
e.g. _WIN32 is defined when preprocessing.
If running clang with e.g. -target ppc64le-windows-msvc, that
particular arch/OS combination isn't hooked up, so _WIN32 doesn't
get defined in that configuration. Therefore, the preprocessing
test fails.
Instead make llvm-rc inspect the architecture of the default target.
If it's one of the known supported architectures, use it as such,
otherwise set a default one (x86_64). (Clang can run preprocessing
with an x86_64 target triple, even if the x86 backend isn't
enabled.)
Also remove superfluous llvm:: specifications on enums in llvm-rc.cpp.
Allow opting out from preprocessing with a command line argument.
Update tests to pass -no-preprocess to make it not try to use clang
(which isn't a build level dependency of llvm-rc), but add a test that
does preprocessing under clang/test/Preprocessor.
Update a few options to allow them both joined (as -DFOO) and separate
(-D BR), as rc.exe allows both forms of them.
With the verbose flag set, this prints the preprocessing command
used (which differs from what rc.exe does).
Tests under llvm/test/tools/llvm-rc only test constructing the
preprocessor commands, while tests under clang/test/Preprocessor test
actually running the preprocessor.
Differential Revision: https://reviews.llvm.org/D100755
This patch adds new clang tool named amdgpu-arch which uses
HSA to detect installed AMDGPU and report back latter's march.
This tool is built only if system has HSA installed.
The value printed by amdgpu-arch is used to fill -march when
latter is not explicitly provided in -Xopenmp-target.
Reviewed By: JonChesterfield, gregrodgers
Differential Revision: https://reviews.llvm.org/D99949
This is another attempt to address the issue introduced in
ae8b2cab67.
We cannot capture InstalledDir because FileCheck doesn't handle
the backslashes correctly, so instead we just consume the entire
path prefix which is what other tests are doing.
Darwin driver currently uses libc++ headers that are part of Clang
toolchain when available (by default ../include/c++/v1 relative to
executable), but it completely ignores the libc++ library itself
because it doesn't pass the location of libc++ library that's part
of Clang (by default ../lib relative to the exceutable) to the linker
always using the system copy of libc++.
This may lead to subtle issues when the compilation fails because the
headers that are part of Clang toolchain are incompatible with the
system library. Either the driver should ignore both headers as well as
the library, or it should always try to use both when available.
This patch changes the driver behavior to do the latter which seems more
reasonable, it makes it easy to test and use custom libc++ build on
Darwin while still allowing the use of system version. This also matches
the Clang driver behavior on other systems.
Differential Revision: https://reviews.llvm.org/D45639
This demotes the apple-a12 CPU selection for arm64e to just be the
last-resort default. Concretely, this means:
- an explicitly-specified -mcpu will override the arm64e default;
a user could potentially pick an invalid CPU that doesn't have
v8.3a support, but that's not a major problem anymore
- arm64e-apple-macos (and variants) will pick apple-m1 instead of
being forced to apple-a12.
apple-m1 has the same level of ISA support as apple-a14,
so this is a straightforward mechanical change. However, that
also means this inherits apple-a14's v8.5a+nobti quirkiness.
rdar://68287159
As reported in PR50025, sometimes we would end up not emitting functions
needed by inline multiversioned variants. This is because we typically
use the 'deferred decl' mechanism to emit these. However, the variants
are emitted after that typically happens. This fixes that by ensuring
we re-run deferred decls after this happens. Also, the multiversion
emission is done recursively to ensure that MV functions that require
other MV functions to be emitted get emitted.
The NSS FileCheck variables at the end of the
CodeGenCXX/split-stacks.cpp clang testcase are off by 1, resulting in
the use of an undefined variable (NSS3). One of the CHECK-NOT is also
redundant because _Z8tnosplitIiEiv uses the same attribute as _Z3foov
without split stack. This commit fixes that.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D99839
Re-land the patch with a fix of clang test.
Cost of spill location is computed basing on relative branch frequency
where corresponding spill/reload/copy are located.
While the number itself is highly depends on incoming IR,
the total cost can be used when do some changes in RA.
Revert "Revert "[GreedyRA ORE] Add Cost of spill locations into remark""
This reverts commit 680f3d6de7.
This patch uses the new `CompilerInvocation::generateCC1CommandLine` to generate the full canonical command line for modular dependencies, instead of only appending additional arguments.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D100534
The comment here was introduced in
a3e01cf822 and suggests that we should
handle declaration statements and non-declaration statements the same,
but don't because ProhibitAttributes() can't handle GNU attributes. That
has recently changed, so remove the comment and handle all statements
the same.
Differential Revision: https://reviews.llvm.org/D99936
This patch removes the `-full-command-line` option from `clang-scan-deps`. It's only used with `-format=experimental-full`, where omitting the command lines doesn't make much sense. There are no tests without `-full-command-line`.
Depends on D100531.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D100533
This patch simplifies (and renames) the `appendCommonModuleArguments` function.
It no longer tries to construct the command line for explicitly building modules. Instead, it only performs the DFS traversal of modular dependencies and queries the callbacks to collect paths to `.pcm` and `.modulemap` files.
This makes it more flexible and usable in two contexts:
* Generating additional command line arguments for the main TU in modular build. The `std::vector<std::string>` output parameters can be used to manually generate appropriate command line flags.
* Generate full command line for a module. The output parameters can be the corresponding parts of `CompilerInvocation`. (In a follow-up patch.)
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D100531
In baremetal::Linker::ConstructJob, LinkerInput is handled prior to T_Group options,
but on the other side in RISCV::Linker::ConstructJob, it is opposite.
We want it to be consistent whether users are using RISCV::Linker or baremetal::Linker.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D100615
This reverts commit fa6b54c44a.
The commited patch broke mlir tests. It seems that mlir tests depend on coroutine function properties set in CoroEarly pass.
Presplit coroutines cannot be inlined. During AlwaysInliner we check if a function is a presplit coroutine, if so we skip inlining.
The presplit coroutine attributes are set in CoroEarly pass.
However in O0 pipeline, AlwaysInliner runs before CoroEarly, so the attribute isn't set yet and will still inline the coroutine.
This causes Clang to crash: https://bugs.llvm.org/show_bug.cgi?id=49920
To fix this, we set the attributes in the Clang front-end instead of in CoroEarly pass.
Reviewed By: rjmccall, ChuanqiXu
Differential Revision: https://reviews.llvm.org/D100282
Presplit coroutines cannot be inlined. During AlwaysInliner we check if a function is a presplit coroutine, if so we skip inlining.
The presplit coroutine attributes are set in CoroEarly pass.
However in O0 pipeline, AlwaysInliner runs before CoroEarly, so the attribute isn't set yet and will still inline the coroutine.
This causes Clang to crash: https://bugs.llvm.org/show_bug.cgi?id=49920
Differential Revision: https://reviews.llvm.org/D100282
clang-scan-deps contains some command line parsing and modifications.
This patch adds support for clang-cl command options.
Differential Revision: https://reviews.llvm.org/D92191
This fixes argument injection in clang command lines, by adding them before "--".
Previously, the arguments were injected at the end of the command line and could be added after "--", which would be wrongly interpreted as input file paths.
This fix is needed for a subsequent patch, see D92191.
Differential Revision: https://reviews.llvm.org/D95099
hipRTC compiles HIP device code at run time. Since the system may not
have development tools installed, when a HIP program is compiled through
hipRTC, there is no standard C or C++ header available. As such, the HIP
headers should not depend on standard C or C++ headers when used
with hipRTC. Basically when hipRTC is used, HIP headers only provides
definitions of HIP device API functions. This is in line with what nvRTC does.
This patch adds support of hipRTC to HIP headers in clang. Basically hipRTC
defines a macro __HIPCC_RTC__ when compile HIP code at run time. When
this macro is defined, HIP headers do not include standard C/C++ headers.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D100652
GCC 8 introduced these new pragmas to control loop unrolling. We should support them for compatibility reasons and the implementation itself requires few lines of code, since everything needed is already implemented for #pragma unroll/nounroll.
Add device variables to llvm.compiler.used if they are
ODR-used by either host or device functions.
This is necessary to prevent them from being
eliminated by whole-program optimization
where the compiler has no way to know a device
variable is used by some host code.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D98814
The internalization pass only internalizes global variables
with no users. If the global variable has some dead user,
the internalization pass will not internalize it.
To be able to internalize global variables with dead
users, a global dce pass is needed before the
internalization pass.
This patch adds that.
Reviewed by: Artem Belevich, Matt Arsenault
Differential Revision: https://reviews.llvm.org/D98783
If a module contains errors (ie. it was built with
-fallow-pcm-with-compiler-errors and had errors) and was from the module
cache, it is marked as out of date - see
a2c1054c30.
When a module is imported multiple times in the one compile, this caused
it to be recompiled each time - removing the existing buffer from the
module cache and replacing it. This results in various errors further
down the line.
Instead, only mark the module as out of date if it isn't already
finalized in the module cache.
Reviewed By: akyrtzi
Differential Revision: https://reviews.llvm.org/D100619
Have funcattrs expand all implied attributes into the IR. This expands the infrastructure from D100400, but for definitions not declarations this time.
Somewhat subtly, this mostly isn't semantic. Because the accessors did the inference, any client which used the accessor was already getting the stronger result. Clients that directly checked presence of attributes (there are some), will see a stronger result now.
The old behavior can end up quite confusing for two reasons:
* Without this change, we have situations where function-attrs appears to fail when inferring an attribute (as seen by a human reading IR), but that consuming code will see that it should have been implied. As a human trying to sanity check test results and study IR for optimization possibilities, this is exceeding error prone and confusing. (I'll note that I wasted several hours recently because of this.)
* We can have transforms which trigger without the IR appearing (on inspection) to meet the preconditions. This change doesn't prevent this from happening (as the accessors still involve multiple checks), but it should make it less frequent.
I'd argue in favor of deleting the extra checks out of the accessors after this lands, but I want that in it's own review as a) it's purely stylistic, and b) I already know there's some disagreement.
Once this lands, I'm also going to do a cleanup change which will delete some now redundant duplicate predicates in the inference code, but again, that deserves to be a change of it's own.
Differential Revision: https://reviews.llvm.org/D100226
Use the target-independent @llvm.fptosi and @llvm.fptoui intrinsics instead.
This includes removing the instrinsics for i32x4.trunc_sat_zero_f64x2_{s,u},
which are now represented in IR as a saturating truncation to a v2i32 followed by
a concatenation with a zero vector.
Differential Revision: https://reviews.llvm.org/D100596
We had verified the correctness of all intrinsics in downstream, so
dropping the assembly tests to decrease the check-clang time.
It would remove 1/3 of the RUN lines.
https://reviews.llvm.org/D99151#2654154 mentions why we need to have
the ASM tests before.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D100617
For combined worksharing directives need to emit the temp arrays outside
of the parallel region and update them in the master thread only.
Differential Revision: https://reviews.llvm.org/D100187
This patch adds new clang tool named amdgpu-arch which uses
HSA to detect installed AMDGPU and report back latter's march.
This tool is built only if system has HSA installed.
The value printed by amdgpu-arch is used to fill -march when
latter is not explicitly provided in -Xopenmp-target.
Reviewed By: JonChesterfield, gregrodgers
Differential Revision: https://reviews.llvm.org/D99949
This is a Clang-only change and depends on the existing "musttail"
support already implemented in LLVM.
The [[clang::musttail]] attribute goes on a return statement, not
a function definition. There are several constraints that the user
must follow when using [[clang::musttail]], and these constraints
are verified by Sema.
Tail calls are supported on regular function calls, calls through a
function pointer, member function calls, and even pointer to member.
Future work would be to throw a warning if a users tries to pass
a pointer or reference to a local variable through a musttail call.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D99517
When we pass a AArch64 Homogeneous Floating-Point
Aggregate (HFA) argument with increased alignment
requirements, for example
struct S {
__attribute__ ((__aligned__(16))) double v[4];
};
Clang uses `[4 x double]` for the parameter, which is passed
on the stack at alignment 8, whereas it should be at
alignment 16, following Rule C.4 in
AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)
Currently we don't have a way to express in LLVM IR the
alignment requirements of the function arguments. The align
attribute is applicable to pointers only, and only for some
special ways of passing arguments (e..g byval). When
implementing AAPCS32/AAPCS64, clang resorts to dubious hacks
of coercing to types, which naturally have the needed
alignment. We don't have enough types to cover all the
cases, though.
This patch introduces a new use of the stackalign attribute
to control stack slot alignment, when and if an argument is
passed in memory.
The attribute align is left as an optimizer hint - it still
applies to pointer types only and pertains to the content of
the pointer, whereas the alignment of the pointer itself is
determined by the stackalign attribute.
For byval arguments, the stackalign attribute assumes the
role, previously perfomed by align, falling back to align if
stackalign` is absent.
On the clang side, when passing arguments using the "direct"
style (cf. `ABIArgInfo::Kind`), now we can optionally
specify an alignment, which is emitted as the new
`stackalign` attribute.
Patch by Momchil Velikov and Lucas Prates.
Differential Revision: https://reviews.llvm.org/D98794
The documentation says that for variadic functions, all composites
are treated similarly, no special handling of HFAs/HVAs, not even
for the fixed arguments of a variadic function.
Differential Revision: https://reviews.llvm.org/D100467
Being lazy with printing the banner seems hard to reason with, we should print it
unconditionally first (it could also lead to duplicate banners if we
have multiple functions in -filter-print-funcs).
The printIR() functions were doing too many things. I separated out the
call from PrintPassInstrumentation since we were essentially doing two
completely separate things in printIR() from different callers.
There were multiple ways to generate the name of some IR. That's all
been moved to getIRName(). The printing of the IR name was also
inconsistent, now it's always "IR Dump on $foo" where "$foo" is the
name. For a function, it's the function name. For a loop, it's what's
printed by Loop::print(), which is more detailed. For an SCC, it's the
list of functions in parentheses. For a module it's "[module]", to
differentiate between a possible SCC with a function called "module".
To preserve D74814, we have to check if we're going to print anything at
all first. This is unfortunate, but I would consider this a special
case that shouldn't be handled in the core logic.
Reviewed By: jamieschmeiser
Differential Revision: https://reviews.llvm.org/D100231
Test Plan: using kernel ASAN and MSAN implementations in FreeBSD
Reviewed By: emaste, dim, arichardson
Differential Revision: https://reviews.llvm.org/D98286
Double square bracket attribute arguments can be arbitrarily complex,
and the attribute argument parsing logic recovers by skipping tokens.
As a fallback recovery mechanism, parse recovery stops before reading a
semicolon. This could lead to an infinite loop in the attribute list
parsing logic.
Similar to variables with an initializer, this is never valid in
standard C, so we can safely constant-fold as an extension. I ran into
this construct in a couple proprietary codebases.
While I'm here, drive-by fix for 090dd647: we should only fold variables
with VLA types, not arbitrary variably modified types.
Differential Revision: https://reviews.llvm.org/D98363
This reverts commit ab98f2c712 and 98eea392cd.
It includes a fix for the clang test which triggered the revert. I failed to notice this one because there was another AMDGPU llvm test with a similiar name and the exact same text in the error message. Odd. Since only one build bot reported the clang test, I didn't notice that one.
Removes the builtins and intrinsics used to opt in to using these instructions
and replaces them with normal ISel patterns now that they are no longer
prototypes.
Differential Revision: https://reviews.llvm.org/D100402
After https://reviews.llvm.org/D90484 libclang is unable to read a serialized diagnostic file
which contains a diagnostic which came from a file with an empty filename. The reason being is
that the serialized diagnostic reader is creating a virtual file for the "" filename, which now
fails after the changes in https://reviews.llvm.org/D90484. This patch restores the previous
behavior in getVirtualFileRef by allowing it to construct a file entry ref with an empty name by
pretending its name is "." so that the directory entry can be created.
Differential Revision: https://reviews.llvm.org/D100428
Add a custom DAG combine and ISD opcode for detecting patterns like
(uint_to_fp (extract_subvector ...))
before the extract_subvector is expanded to ensure that they will ultimately
lower to f64x2.convert_low_i32x4_{s,u} instructions. Since these instructions
are no longer prototypes and can now be produced via standard IR, this commit
also removes the target intrinsics and builtins that had been used to prototype
the instructions.
Differential Revision: https://reviews.llvm.org/D100425
Now that these instructions are no longer prototypes, we do not need to be
careful about keeping them opt-in and can use the standard LLVM infrastructure
for them. This commit removes the bespoke intrinsics we were using to represent
these operations in favor of the corresponding target-independent intrinsics.
The clang builtins are preserved because there is no standard way to easily
represent these operations in C/C++.
For consistency with the scalar codegen in the Wasm backend, the intrinsic used
to represent {f32x4,f64x2}.nearest is @llvm.nearbyint even though
@llvm.roundeven better captures the semantics of the underlying Wasm
instruction. Replacing our use of @llvm.nearbyint with use of @llvm.roundeven is
left to a potential future patch.
Differential Revision: https://reviews.llvm.org/D100411
Consider the following set of files:
a.cc:
#include "a.h"
a.h:
#ifndef A_H
#define A_H
#include "b.h"
#include "c.h" // This gets "skipped".
#endif
b.h:
#ifndef B_H
#define B_H
#include "c.h"
#endif
c.h:
#ifndef C_H
#define C_H
void c();
#endif
And the output of the -H option:
$ clang -c -H a.cc
. ./a.h
.. ./b.h
... ./c.h
Note that the include of c.h in a.h is not shown in the output (GCC does the
same). This is because of the include guard optimization: clang knows c.h is
covered by an include guard which is already defined, so when it sees the
include in a.h, it skips it. The same would have happened if #pragma once were
used instead of include guards.
However, a.h *does* include c.h, and it may be useful to show that in the -H
output. This patch adds a flag for doing that.
Differential revision: https://reviews.llvm.org/D100480
ICC permits this, and after some extensive testing it looks like we can
support this with very little trouble. We intentionally don't choose to
do this with attribute-target (despite it likely working as well!)
because GCC does not support that, and introducing said
incompatibility doesn't seem worth it.
Aggregate types over 16 bytes are passed by reference.
Contrary to the x86_64 ABI, smaller structs with an odd (non power
of two) are padded and passed in registers.
Differential Revision: https://reviews.llvm.org/D100374
According to i386 System V ABI:
1. when __m256 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 32 byte boundary at the time of the call.
2. when __m512 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 64 byte boundary at the time of the call.
The current method of clang passing __m512 parameter are as follow:
1. when target supports avx512, passing it with 64 byte alignment;
2. when target supports avx, passing it with 32 byte alignment;
3. Otherwise, passing it with 16 byte alignment.
Passing __m256 parameter are as follow:
1. when target supports avx or avx512, passing it with 32 byte alignment;
2. Otherwise, passing it with 16 byte alignment.
This pach will passing __m128/__m256/__m512 following i386 System V ABI and
apply it to Linux only since other System V OS (e.g Darwin, PS4 and FreeBSD) don't
want to spend any effort dealing with the ramifications of ABI breaks at present.
Differential Revision: https://reviews.llvm.org/D78564
This change splits '-Wtautological-unsigned-zero-compare' by reporting
char-expressions-interpreted-as-unsigned under a separate diagnostic
'-Wtautological-unsigned-char-zero-compare'. This is beneficial for
projects that want to enable '-Wtautological-unsigned-zero-compare' but at
the same time want to keep code portable for platforms with char being
signed or unsigned, such as Chromium.
Differential Revision: https://reviews.llvm.org/D99808
This fixes a subtle issue where:
svprf(pg, ptr, SV_ALL /*is sv_pattern instead of sv_prfop*/)
would be quietly accepted. With this change, the function declaration
guards that the third parameter is a `enum sv_prfop`. Previously `svprf`
would map directly to `__builtin_sve_svprfb`, which accepts the enum
operand as a signed integer and only checks that the incoming range is
valid, meaning that SV_ALL would be discarded as being outside the valid
immediate range, but would have allowed SV_VL1 without issuing a warning
(C) or error (C++).
Reviewed By: c-rhodes
Differential Revision: https://reviews.llvm.org/D100297
I recently ran into issues with aggregates and inheritance, I'm using
it for creating a type-safe library where most of the types are build
over "tagged" std::array. After bit of cleaning and enabling -Wall
-Wextra -pedantic I noticed clang only in my pipeline gives me warning.
After a bit of focusing on it I found it's not helpful, and contemplate
disabling the warning all together. After a discussion with other
library authors I found it's bothering more people and decided to fix
it.
Removes this warning:
template<typename T, int N> struct StdArray {
T contents[N];
};
template<typename T, int N> struct AggregateAndEmpty : StdArray<T,N> { };
AggregateAndEmpty<int, 3> p = {1, 2, 3}; // <-- warning here about omitted braces
The previous implementation was insufficient for checking statement
attribute mutual exclusion because attributed statements do not collect
their attributes one-at-a-time in the same way that declarations do. So
the design that was attempting to check for mutual exclusion as each
attribute was processed would not ever catch a mutual exclusion in a
statement. This was missed due to insufficient test coverage, which has
now been added for the [[likely]] and [[unlikely]] attributes.
The new approach is to check all of attributes that are to be applied
to the attributed statement in a group. This required generating
another DiagnoseMutualExclusions() function into AttrParsedAttrImpl.inc.
Adds the __clang_literal_encoding__ and __clang_wide_literal_encoding__
predefined macros to expose the encoding used for string literals to
the preprocessor.
These proposals make the same changes to both C++ and C and remove a
restriction on standard attributes appearing multiple times in the same
attribute list.
We could warn on the duplicate attributes, but do not. This is for
consistency as we do not warn on attributes duplicated within the
attribute specifier sequence. If we want to warn on duplicated
standard attributes, we should do so both for both situations:
[[foo, foo]] and [[foo]][[foo]].
Clang currently has a bug where it allows you to write [[foo bar]] and
both attributes are silently accepted. This patch corrects the comma
parsing rules for such attributes and handles the test case fallout, as
a few tests were accidentally doing this.
The existing Windows Itanium patches for dllimport/export
behaviour w.r.t vtables/rtti can't be adopted for PS4 due to
backwards compatibility reasons (see comments on
https://reviews.llvm.org/D90299).
This commit adds our PS4 scheme for this to Clang.
Differential Revision: https://reviews.llvm.org/D93203
This patch changes the builtin prototype to use 'b' (boolean) instead
of the default integer element type. That fixes the dup/dupq intrinsics
when compiling with C++.
This patch also fixes one of the defines for __ARM_FEATURE_SVE2_BITPERM.
Reviewed By: kmclaughlin
Differential Revision: https://reviews.llvm.org/D100294
Retry of 330619a3a6 that includes a clang test update.
Original commit message:
If we run passes before lowering llvm.expect intrinsics to metadata,
then those passes have no way to act on the hints provided by llvm.expect.
SimplifyCFG is the known offender, and we made it smarter about profile
metadata in D98898 <https://reviews.llvm.org/D98898>.
In the motivating example from https://llvm.org/PR49336 , this means we
were ignoring the recommended method for a programmer to tell the compiler
that a compare+branch is expensive. This change appears to solve that case -
the metadata survives to the backend, the compare order is as expected in IR,
and the backend does not do anything to reverse it.
We make the same change to the old pass manager to keep things synchronized.
Differential Revision: https://reviews.llvm.org/D100213
Most text processing commands (eg. grep, awk) have a maximum line length limit on z/OS. The current method of using cc -E & grep fails on z/OS because of this limit. I'm changing the command to create the long line in the response file to use python. This avoids the possibility of any tools blocking the generation of the large response file. This also eliminates the need for the extra file.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D100197
This patch fixes an issue with the SVE prefetch and qinc/qdec intrinsics
that take an `enum` argument, but where the builtin prototype encodes
these as `int`. Some code in SemaDecl found the mismatch and chose
to forget about the builtin altogether, which meant that any future
code using that builtin would fail. The code that forgets about the
builtin was actually obsolete after D77491 and should have been removed.
This patch now removes that code.
This patch also fixes another issue with the SVE prefetch intrinsic
when built with C++, where the builtin didn't accept the correct
pointer type, which should be `const void *`.
Reviewed By: tambre
Differential Revision: https://reviews.llvm.org/D100046
The .rgba vector component accessors are supported in OpenCL C 3.0.
Previously, the diagnostic would check `OpenCLVersion` for version 2.2
(value 220) and report those accessors are an OpenCL 2.2 feature.
However, there is no "OpenCL C version 2.2", so change the check and
diagnostic text to 3.0 only.
A spurious `OpenCLVersion` argument was passed into the diagnostic;
remove that.
Differential Revision: https://reviews.llvm.org/D99969
Summary: The tags DW_LANG_C_plus_plus_14 and DW_LANG_C_plus_plus_11, introduced in Dwarf-5, are unexpected in previous versions. Fixing the mismathing doesn't have any drawbacks for any other debuggers, but helps dbx.
Reviewed By: aprantl, shchenz
Differential Revision: https://reviews.llvm.org/D99250
The first one is the real parameters of the coroutine function, the
other one just for copying parameters to the coroutine frame.
Considering the following c++ code:
```
struct coro {
...
};
coro foo(struct test & t) {
...
co_await suspend_always();
...
co_await suspend_always();
...
co_await suspend_always();
}
int main(int argc, char *argv[]) {
auto c = foo(...);
c.handle.resume();
...
}
```
Function foo is the standard coroutine function, and it has only
one parameter named t (ignoring this at first),
when we use the llvm code to compile this function, we can get the
following ir:
```
!2921 = distinct !DISubprogram(name: "foo", linkageName:
"_ZN6Object3fooE4test", scope: !2211, file: !45, li\
ne: 48, type: !2329, scopeLine: 48, flags: DIFlagPrototyped |
DIFlagAllCallsDescribed, spFlags: DISPFlagDefi\
nition | DISPFlagOptimized, unit: !44, declaration: !2328,
retainedNodes: !2922)
!2924 = !DILocalVariable(name: "t", arg: 2, scope: !2921, file: !45,
line: 48, type: !838)
...
!2926 = !DILocalVariable(name: "t", scope: !2921, type: !838, flags:
DIFlagArtificial)
```
We can find there are two `the same` DIVariable named t in the same
dwarf scope for foo.resume.
And when we try to use llvm-dwarfdump to dump the dwarf info of this
elf, we get the following output:
```
0x00006684: DW_TAG_subprogram
DW_AT_low_pc (0x00000000004013a0)
DW_AT_high_pc (0x00000000004013a8)
DW_AT_frame_base (DW_OP_reg7 RSP)
DW_AT_object_pointer (0x0000669c)
DW_AT_GNU_all_call_sites (true)
DW_AT_specification (0x00005b5c "_ZN6Object3fooE4test")
0x000066a5: DW_TAG_formal_parameter
DW_AT_name ("t")
DW_AT_decl_file ("/disk1/yifeng.dongyifeng/my_code/llvm/build/bin/coro-debug-1.cpp")
DW_AT_decl_line (48)
DW_AT_type (0x00004146 "test")
0x000066ba: DW_TAG_variable
DW_AT_name ("t")
DW_AT_type (0x00004146 "test")
DW_AT_artificial (true)
```
The elf also has two 't' in the same scope.
But unluckily, it might let the debugger
confused. And failed to print parameters for O0 or above.
This patch will make coroutine parameters and move
parameters use the same DIVar and try to fix the problems
that I mentioned before.
Test Plan: check-clang
Reviewed By: aprantl, jmorse
Differential Revision: https://reviews.llvm.org/D97533
1. Redefine vpopc and vfirst IR intrinsic so it could adapt on
clang tablegen generator which always appends a type for vl
in IntrinsicType of clang codegen.
2. Remove `c` type transformer and add `u` and `l` for unsigned long
and long type.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D100120
Support the following instructions.
1. Mask load and store
2. Vector Strided Instructions
3. Vector Indexed Store Instructions
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99965
This implements C-style type conversions for matrix types, as specified
in clang/docs/MatrixTypes.rst.
Fixes PR47141.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D99037
I have been trying to statically find and analyze all calls to heap
allocation functions to determine how many of them use sizes known at
compile time vs only at runtime. While doing so I saw that quite a few
projects use replaceable function pointers for heap allocation and noticed
that clang was not able to annotate functions pointers with alloc_size.
I have changed the Sema checks to allow alloc_size on all function pointers
and typedefs for function pointers now and added checks that these
attributes are propagated to the LLVM IR correctly.
With this patch we can also compute __builtin_object_size() for calls to
allocation function pointers with the alloc_size attribute.
Reviewed By: aaron.ballman, erik.pilkington
Differential Revision: https://reviews.llvm.org/D55212
This reworks a small set of tests, as preparatory work for implementing
P2266.
* Run for more standard versions, including c++2b.
* Normalize file names and run commands.
* Adds some extra tests.
New Coroutine tests taken from Aaron Puchert's D68845.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D99225
This ensures these types have distinct names if they are distinct types
(eg: if one is an instantiation with a type in one inline namespace, and
another from a type with the same simple name, but in a different inline
namespace).
The backend can't handle this and will throw a fatal error from
type legalization. It's easy enough to fix that for this intrinsic
by just splitting the IR intrinsic since it works on individual bytes.
There will be other intrinsics in the future that would be harder
to support through splitting, for example grev, gorc, and shfl. Those
would require a compare and a select be inserted to check the MSB of
their control input.
This patch adds support for preventing this in the frontend with
a nice diagnostic.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D99984
It is common to zero-initialize not only scalar variables,
but also structs. This is also defensive programming and
we shouldn't complain about that.
rdar://34122265
Differential Revision: https://reviews.llvm.org/D99262
6fe7de90b9 changed GNU toolchain,
and added new RUN line to test expected behavior.
The change is for GNU toolchain only, so this will fail other toolchain,
eg: AIX.
Update the test with `-target` to test GNU tool chain only.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99901
The ROCm installation directory may be another
directory, llvm/ inside the build directory.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D100045
After D98856 these tests will by default break (fatal_error) if any of
the wrong interfaces are used, so there's no longer a need to have a
RUN line that checks for a warning message emitted by the compiler.
If we allocate memory, the extent of the MemRegion will be the symbolic
value of the size parameter. This way, if that symbol gets constrained,
the extent will be also constrained.
This test demonstrates that the extent is indeed the same symbol.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D99959
When property is declared in a superclass (or in a protocol),
it still can be of CXXRecord type and Sema could've already
generated a body for us. This patch joins two branches and
two ways of acquiring IVar in order to reuse the existing code.
And prevent us from generating l-value to r-value casts for
C++ types.
rdar://67416721
Differential Revision: https://reviews.llvm.org/D99194
size_t and friends are built-in scalar data types and s6.4.4.2 of the
OpenCL C Specification says the as_type() operator must be available
for these data types.
Differential Revision: https://reviews.llvm.org/D98959
This reverts commit a547b4e26b,
relanding commit 31d219d299,
which was reverted because there was a conflicting inverse transform,
which was causing an endless combine loop, which has now been adjusted.
Original commit message:
https://alive2.llvm.org/ce/z/67w-wQ
We prefer `add`s over `sub`, and this particular xform
allows further folds to happen:
Fixes https://bugs.llvm.org/show_bug.cgi?id=49858
Clang test CodeGen/libcalls.c contains CHECK-NOT directives using a
variable defined in a CHECK directive with a different prefix never
enabled together, therefore causing the variable to be undefined in that
CHECK-NOT.
The intent of the test is to check that some declaration do not have the
same attribute as when compiling the test without -fmath-errno. This
commits instead changes all CHECK-NOT to CHECK directive, checking that
they all use the same attribute. It also adds an extra CHECK for that
prefix to check the expected attributes these functions should have when
compiling with -fmath-errno.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D99898
As it is being noted in D99249, lack of alignment information on `this`
has been preventing LICM from happening.
For some time now, lack of alignment attribute does *not* imply
natural alignment, but an alignment of `1`.
Also, we used to treat dereferenceable as implying alignment,
but we no longer do, so it's a bugfix.
Differential Revision: https://reviews.llvm.org/D99790
omp_is_initial_device() is marked as a built-in function in the current
compiler, and user code guarded by this call may be optimized away,
resulting in undesired behavior in some cases. This patch provides a
possible fix for such cases by defining the routine as a variant
function and removing it from builtin list.
Differential Revision: https://reviews.llvm.org/D99447
We already did so for scoped locks acquired in the constructor, this
change extends the treatment to deferred locks and scoped unlocking, so
locks acquired outside of the constructor. Obviously this makes things
more consistent.
Originally I thought this was a bad idea, because obviously it
introduces false negatives when it comes to double locking, but these
are typically easily found in tests, and the primary goal of the Thread
safety analysis is not to find double locks but race conditions.
Since the scoped lock will release the mutex anyway when the scope ends,
the inconsistent state is just temporary and probably fine.
Reviewed By: delesley
Differential Revision: https://reviews.llvm.org/D98747
Recently atomicrmw started to support fadd/fsub:
https://reviews.llvm.org/D53965
However clang atomic builtins fetch add/sub still does not support
emitting atomicrmw fadd/fsub.
This patch adds that.
Reviewed by: John McCall, Artem Belevich, Matt Arsenault, JF Bastien,
James Y Knight, Louis Dionne, Olivier Giroux
Differential Revision: https://reviews.llvm.org/D71726
This allows frontend and backend diagnostic files to all go into the
same place. Have it control the Windows (mini-)dump location.
Differential Revision: https://reviews.llvm.org/D99199
The major change here is to index macro occurrences in more places than
before, specifically
* In non-expansion references such as `#if`, `#ifdef`, etc.
* When the macro is a reference to a builtin macro such as __LINE__.
* When using the preprocessor state instead of callbacks, we now include
all definition locations and undefinitions instead of just the latest
one (which may also have had the wrong location previously).
* When indexing an existing module file (.pcm), we now include module
macros, and we no longer report unrelated preprocessor macros during
indexing the module, which could have caused duplication.
Additionally, we now correctly obey the system symbol filter for macros,
so by default in system headers only definition/undefinition occurrences
are reported, but it can be configured to report references as well if
desired.
Extends FileIndexRecord to support occurrences of macros. Since the
design of this type is to keep a single list of entities organized by
source location, we incorporate macros into the existing DeclOccurrence
struct.
Differential Revision: https://reviews.llvm.org/D99758
Programmers would like to be able to test direct methods by calling them from a
different linkage unit or mocking them, both of which are impossible. This
patch adds a flag that effectively disables the attribute, which will fix this
when enabled in testable builds. rdar://71190891
Differential revision: https://reviews.llvm.org/D95845
1. Rename RVVBinBuiltin to RVVOutputOp1Builtin because it is not related
to the number of operand.
2. Add RVV Integer instuctions which use RVVOutputOp1Builtin.
Reviewed By: craig.topper
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D99524
It is possible that an entry in 'DestroyRetVal' lives longer
than an entry in 'LockMap' if not removed at checkDeadSymbols.
The added test case demonstrates this.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D98504
Clang test CodeGenOpenCL/fpmath.cl uses a variable defined in an earlier
CHECK-NOT directive. However, by definition the pattern in that
directive is not supposed to occur so no variable will be defined. This
commit solves the issue by using a regex match with the same regex as in
the definition. It also changes the definition into a regex match since
no variable is going to be defined.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D99857
This patch adds two debug functions to ExprInspectionChecker to dump out
the dynamic extent and element count of symbolic values:
dumpExtent(), dumpElementCount().
Clang used to emit a bad -Wbridge-cast diagnostic on the cast in the attached
test. This was because, after 09abecef7, struct __CFString was not added to
lookup, so the objc_bridge attribute wasn't getting duplicated onto the most
recent declaration, causing us to fail to find it in getObjCBridgeAttr. This
patch fixes this by instead walking through the redeclarations to find an
appropriate bridge attribute. rdar://72823399
Differential revision: https://reviews.llvm.org/D99661
Clang test CodeGen/debug-info-extern-call.c tries to check for the
absence of a sequence of instructions with several CHECK-NOT with one of
those directives using a variable defined in another. However CHECK-NOT
are checked independently so that is using a variable defined in a
pattern that should not occur in the input.
This commit removes the CHECK-NOT for the retained line attribute
definition since the CHECK-NOT on the compile unit will already check
that there is no retained lines.
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D99830
Clang test CodeGenCUDA/kernel-stub-name.cu uses never defined DKERN
variable in a CHECK-NOT directive. This commit replace the variable by a
regex, thereby avoiding the issue.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D99832
Commit 8129521318 changed a line defining
PREFIX in clang test CodeGenCUDA/device-stub.cu into a CHECK-NOT
directive. All following lines using PREFIX are therefore using an
undefined variable since the pattern defining PREFIX is not supposed to
occur and CHECK-NOT are checked independently.
This commit replaces all uses of PREFIX by the regex used to define it,
thereby avoiding the problem.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D99831
Take gcc-8 on Debian i386 as an example. The target-specific libstdc++ search
path (`GPLUSPLUS_TOOL_INCLUDE_DIR`) uses the multiarch name `i386-linux-gnu`,
instead of the triple of the GCC installation `i686-linux-gnu` (the directory
under `usr/lib/gcc/`):
```
/usr/include/c++/8
/usr/include/i386-linux-gnu/c++/8
/usr/include/c++/8/backward
```
Clang currently detects `/usr/lib/gcc/i686-linux-gnu/8/../../../include/i686-linux-gnu/c++/8`.
This patch changes the second i686-linux-gnu to i386-linux-gnu so that
`/usr/include/i386-linux-gnu/c++/8` can be found.
Fix PR49827 - this was somehow regressed by my previous libstdc++ include path
cleanups and fixes for gcc-cross, but it seems that the paths were never properly tested before.
Differential Revision: https://reviews.llvm.org/D99852
Set the source ranges for parsed GNU-style attributes in
ParseGNUAttributes(), the same way that ParseCXX11Attributes() does it.
Differential Revision: https://reviews.llvm.org/D75844
Commit f495de43bd forgot two lines when
removing checks for strong and weak equality, resulting in the use of an
undefined FileCheck variable.
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99838
Currently, when one or more attributes are mutually exclusive, the
developer adding the attribute has to manually emit diagnostics. In
practice, this is highly error prone, especially for declaration
attributes, because such checking is not trivial. Redeclarations
require you to write a "merge" function to diagnose mutually exclusive
attributes and most attributes get this wrong.
This patch introduces a table-generated way to specify that a group of
two or more attributes are mutually exclusive:
def : MutualExclusions<[Attr1, Attr2, Attr3]>;
This works for both statement and declaration attributes (but not type
attributes) and the checking is done either from the common attribute
diagnostic checking code or from within mergeDeclAttribute() when
merging redeclarations.
Head files are included in a separate patch in case the name needs to be changed.
RV32 / 64:
clmul
clmulh
clmulr
Differential Revision: https://reviews.llvm.org/D99711
Forgot to amend the Author.
Original commit message:
Header files are included in a separate patch in case the name needs to be changed.
RV32 / 64:
orc.b
Differential Revision: https://reviews.llvm.org/D99320
Implementation for RISC-V Zbr extension intrinsic.
Header files are included in separate patch in case the name needs to be changed
RV32 / 64:
crc32b
crc32h
crc32w
crc32cb
crc32ch
crc32cw
RV64 Only:
crc32d
crc32cd
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99009
As of MSVC 19.28 (2019 Update 8), integral conversion is no longer preferred over floating-to-integral, and so MSVC is more standard conformant and will generate a compiler error on ambiguous call.
Cf. https://godbolt.org/z/E8xsdqKsb.
Initially found during the review of D99641.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D99663
For DBX, it does not handle column info well. Set -gno-column-info
by default for DBX.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D99703
Remove the CHECK-NOT directive referring to as-of-yet undefined VAR_PRIV
variable since the pattern of the following CHECK-NOT in the same
CHECK-NOT block covers a superset of the case caught by the first
CHECK-NOT.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D99775
OpenMP test target_data_use_device_ptr_if_codegen contains a CHECK-NOT
directive using an undefined DECL FileCheck variable. It seems copied
from target_data_use_device_ptr_codegen where there's a CHECK for a load
that defined the variable. Since there is no corresponding load in this
testcase, the simplest is to simply forbid any store and get rid of the
variable altogether.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D99771
Fix the many cases of use of undefined SIVAR/SVAR/SFVAR in OpenMP
*private_codegen tests, due to a missing BLOCK directive to capture the
IR variable when it is declared. It also fixes a few typo in its use.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99770
Summary:
Currently the mapping names are not passed to the mapper components that set up
the array region. This means array mappings will not have their names availible
in the runtime. This patch fixes this by passing the argument name to the region
correctly. This means that the mapped variable's name will be the declared
mapper that placed it on the device.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D99681
Set the source ranges for parsed GNU-style attributes in
ParseGNUAttributes(), the same way that ParseCXX11Attributes() does it.
Differential Revision: https://reviews.llvm.org/D75844
Currently, support for the x32 ABI is handled as a multilib to the
x86_64 target only. However, full self-hosting x32 systems treating it
as a separate architecture with its own architecture triplets as well as
search paths exist as well, in Debian's x32 port and elsewhere.
This adds the missing architecture triplets and search paths so that
clang can work as a native compiler on x32, and updates the tests so
that they pass when using an x32 libdir suffix.
Additionally, we would previously also assume that objects from any
x86_64-linux-gnu GCC installation could be used to target x32. This
changes the logic so that only GCC installations that include x32
support are used when targetting x32, meaning x86_64-linux-gnux32 GCC
installations, and x86_64-linux-gnu and i686-linux-gnu GCC installations
that include x32 multilib support.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D52050
Based on this debugger type, for now, we plan to:
1: use inline string by default for XCOFF DWARF
2: generate no column info for debug line table.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D99400
Need to bitcast the function pointer passed as a parameter to the real
type to avoid possible problem with calling conventions.
Differential Revision: https://reviews.llvm.org/D99521
This helper method is useful even outside of Gnu toolchains, so move
it to ToolChain so it can be reused in other toolchains such as Fuchsia.
Differential Revision: https://reviews.llvm.org/D88452
Removes the prototype builtin and intrinsic for i64x2.eq and implements that
instruction as well as the other i64x2 comparison instructions in the final SIMD
spec. Unsigned comparisons were not included in the final spec, so they still
need to be scalarized via a custom lowering.
Differential Revision: https://reviews.llvm.org/D99623
... instantiations
They are currently not being diagnosed because ProhibitAttributes() does
not handle attribute lists with an invalid source range. But once it
does, we need to allow GNU attributes in this place.
Additionally, start optionally diagnosing empty attr lists in
ProhibitCXX11Attributes(), since ProhibitAttribute() does it.
Differential Revision: https://reviews.llvm.org/D97362
1. Undefined macro test for rv32i and rv64i.
a. Reorder it with canonical order.
b. Add missing undefined macro check.
c. Append defined value to `__riscv_a`, `__riscv_f` and `__riscv_c` to distinguish with
`__riscv_arch_test`, `__riscv_cmodel_medlow` and `__riscv_float_abi_soft`. They have the same prefix.
2. Move abi macro test below f and d.
3. Unify coding style for newline.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D99631
This fixes https://bugs.llvm.org/show_bug.cgi?id=49534, where the call to the constructor
of the anonymous union is checked and triggers assertion failure when trying to retrieve
the alignment of the `this` argument (which is a union with virtual function).
The extra check for alignment was introduced in D97187.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D98548
another one for distributed mode.
Currently during module importing, ThinLTO opens all the source modules,
collect functions to be imported and append them to the destination module,
then leave all the modules open through out the lto backend pipeline. This
patch refactors it in the way that one source module will be closed before
another source module is opened. All the source modules will be closed after
importing phase is done. It will save some amount of memory when there are
many source modules to be imported.
Note that this patch only changes the distributed thinlto mode. For in
process thinlto mode, one source module is shared acorss different thinlto
backend threads so it is not changed in this patch.
Differential Revision: https://reviews.llvm.org/D99554
Need to cast the argument for the debug wrapper function call to the
corresponding parameter type to avoid crash.
Differential Revision: https://reviews.llvm.org/D99617
See PR45088.
Compound requirement type constraints were using decltype(E) instead of
decltype((E)), as per `[expr.prim.req]p1.3.3`.
Since neither instantiation nor type dependence should matter for
the constraints, this uses an approach where a `decltype` type is not built,
and just the canonical type of the expression after template instantiation
is used on the requirement.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D98160
Ensure that the cl_khr_3d_image_writes pragma is enabled by making
cl_khr_3d_image_writes an optional core feature in CL 3.0 in addition
to being an available extension in 1.0 onwards and a core feature in
CL 2.0.
https://reviews.llvm.org/D99425
Signed-off-by: Kevin Petit <kevin.petit@arm.com>
`allocClassWithName` allocates an object with the given type.
The type is actually provided as a string argument (type's name).
This creates a possibility for not particularly useful warnings
from the analyzer.
In order to combat with those, this patch checks for casts of the
`allocClassWithName` results to types mentioned directly as its
argument. All other uses of this method should be reasoned about
as before.
rdar://72165694
Differential Revision: https://reviews.llvm.org/D99500
It makes sense to track rvalue expressions in the case of special
concrete integer values. The most notable special value is zero (later
we may find other values). By tracking the origin of 0, we can provide a
better explanation for users e.g. in case of division by 0 warnings.
When the divisor is a product of a multiplication then now we can show
which operand (or both) was (were) zero and why.
Differential Revision: https://reviews.llvm.org/D99344
If no initializer-clause is specified, the private variables will be
initialized following the rules for initialization of objects with static
storage duration.
Need to adjust the implementation to the current version of the
standard.
Differential Revision: https://reviews.llvm.org/D99539
Since the introduction of class properties in Objective-C it is possible to declare a class and an instance
property with the same identifier in an interface/protocol.
Right now Clang just generates debug information for whatever property comes first in the source file.
The second property is ignored as it's filtered out by the set of already emitted properties (which is just
using the identifier of the property to check for equivalence). I don't think generating debug info in this case
was never supported as the identifier filter is in place since 7123bca7fb
(which precedes the introduction of class properties).
This patch expands the filter to take in account identifier + whether the property is class/instance. This
ensures that both properties are emitted in this special situation.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D99512
The `noinline` for non-SPMD parallel functions is probably not necessary
but as long as we use it we should put it on the outermost parallel
function, which is the wrapper, not the actual outlined function.
Resolves PR49752
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D99506
After c773d0f973 the remark is only emitted if the loop is profitable
to vectorize, but cannot be vectorized. Hence, it depends on
X86-specific cost-modeling.
The original issue is caused by the fact that the variable is allocated
with incorrect type i1 instead of i8. This causes the bitcasting of the
declaration to i8 type and the bitcast expression does not match the
original variable.
To fix the problem, the UndefValue initializer and the original
variable should be emitted with type i8, not i1.
Differential Revision: https://reviews.llvm.org/D99297
On z/OS there is a hard limitation on on the maximum requestable alignment in aligned attribute for static variables. We need to truncate values greater than that.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D98864
Need to insert a basic block during generation of the target region to
avoid crash for the GPU to be able always calling a cleanup action.
This cleanup action is required for the correct emission of the target
region for the GPU.
Differential Revision: https://reviews.llvm.org/D99445
This follows GCC and simplifies code. /usr/local/include and TOOL_INCLUDE_DIR
should not conflict with the resource directory include so users should not
observe any difference.
This follows GCC. Having libstdc++/libc++ include paths is not useful
anyway because libstdc++/libc++ header files cannot find features.h.
While here, suppress -stdlib++-isystem with -nostdlibinc.
RVV intrinsics has new overloading rule, please see
82aac7dad4
Changed:
1. Rename `generic` to `overloaded` because the new rule is not using C11 generic.
2. Change HasGeneric to HasNoMaskedOverloaded because all masked operations
support overloading api.
3. Add more overloaded tests due to overloading rule changed.
Differential Revision: https://reviews.llvm.org/D99189
I think byval/sret and the others are close to being able to rip out
the code to support the missing type case. A lot of this code is
shared with inalloca, so catch this up to the others so that can
happen.
We should have a test verifying / \ for Windows but have such a long
test specifically for Linux cross compilation suffer from Windows \
is too troublesome.
IR values convert to check prefix FileCheck variables for IR checks. For example, nameless values, e.g., %0, convert to check prefix TMP FileCheck variables, e.g., [[TMP0:%.*]]. This check prefix may clash with named values that have the same name and that causes auto-generated tests to fail. Currently a warning is emitted to change the names of the IR values but this is not always possible, if for example they are generated by clang. Manual intervention to fix the FileCheck variable names is too tedious. This patch add a parameter to prefix conflicting FileCheck variable names with a user-provided string to automate the process.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D99415
Zero length bitfield alignment is not respected if they are leading members on z/OS target.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D98890
When emitting a function body there needs to be a instr profiling counter emitted. Otherwise instr profiling won't work for this function.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D98135
tl;dr Correct implementation of Corouintes requires having lifetime intrinsics available.
Coroutine functions are functions that can be suspended and resumed latter. To do so, data that need to stay alive after suspension must be put on the heap (i.e. the coroutine frame).
The optimizer is responsible for analyzing each AllocaInst and figure out whether it should be put on the stack or the frame.
In most cases, for data that we are unable to accurately analyze lifetime, we can just conservatively put them on the heap.
Unfortunately, there exists a few cases where certain data MUST be put on the stack, not on the heap. Without lifetime intrinsics, we are unable to correctly analyze those data's lifetime.
To dig into more details, there exists cases where at certain code points, the current coroutine frame may have already been destroyed. Hence no frame access would be allowed beyond that point.
The following is a common code pattern called "Symmetric Transfer" in coroutine:
```
auto tmp = await_suspend();
__builtin_coro_resume(tmp.address());
return;
```
In the above code example, `await_suspend()` returns a new coroutine handle, which we will obtain the address and then resume that coroutine. This essentially "transfered" from the current coroutine to a different coroutine.
During the call to `await_suspend()`, the current coroutine may be destroyed, which should be fine because we are not accessing any data afterwards.
However when LLVM is emitting IR for the above code, it needs to emit an AllocaInst for `tmp`. It will then call the `address` function on tmp. `address` function is a member function of coroutine, and there is no way for the LLVM optimizer to know that it does not capture the `tmp` pointer. So when the optimizer looks at it, it has to conservatively assume that `tmp` may escape and hence put it on the heap. Furthermore, in some cases `address` call would be inlined, which will generate a bunch of store/load instructions that move the `tmp` pointer around. Those stores will also make the compiler to think that `tmp` might escape.
To summarize, it's really difficult for the mid-end to figure out that the `tmp` data is short-lived.
I made some attempt in D98638, but it appears to be way too complex and is basically doing the same thing as inserting lifetime intrinsics in coroutines.
Also, for reference, we already force emitting lifetime intrinsics in O0 for AlwaysInliner: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Passes/PassBuilder.cpp#L1893
Differential Revision: https://reviews.llvm.org/D99227
These contain clang driver changes for supporting HWASan on Fuchsia.
This includes hwasan multilibs and the dylib path change.
Differential Revision: https://reviews.llvm.org/D99361
Add builtin function __builtin_get_device_side_mangled_name
to get device side manged name for functions and global
variables, which can be used to get symbol address of kernels
or variables by mangled name in dynamically loaded
bundled code objects at run time.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D99301
Currently, we infer 0 if the divisible of the modulo op is 0:
int a = x < 0; // a can be 0
int b = a % y; // b is either 1 % sym or 0
However, we don't when the op is / :
int a = x < 0; // a can be 0
int b = a / y; // b is either 1 / sym or 0 / sym
This commit fixes the discrepancy.
Differential Revision: https://reviews.llvm.org/D99343
In order to test the preservation of the original Debug Info metadata
in your projects, a front end option could be very useful, since users
usually report that a concrete entity (e.g. variable x, or function fn2())
is missing debug info. The [0] is an example of running the utility
on GDB Project.
This depends on: D82546 and D82545.
Differential Revision: https://reviews.llvm.org/D82547
Summary: Add -fno-split-stack and rename CC1 option from `-split-stacks`
to `-fsplit-stack`.
Test Plan: check-all
Differential Revision: https://reviews.llvm.org/D99245
```
Warn when a function pointer is cast to an incompatible function
pointer. In a cast involving function types with a variable argument
list only the types of initial arguments that are provided are
considered. Any parameter of pointer-type matches any other
pointer-type. Any benign differences in integral types are ignored, like
int vs. long on ILP32 targets. Likewise type qualifiers are ignored. The
function type void (*) (void) is special and matches everything, which
can be used to suppress this warning. In a cast involving pointer to
member types this warning warns whenever the type cast is changing the
pointer to member type. This warning is enabled by -Wextra.
```
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D97831
Not only can this save unneeded filesystem stats, it can make `clang
--sysroot=/path/to/debian-sysroot -c a.cc` work (get `-internal-isystem
$sysroot/usr/include/x86_64-linux-gnu`) even without `lib/x86_64-linux-gnu/`.
This should make thakis happy.
Functions specified in `-emscripten-cxx-exceptions-allowed`, which is
set by Emscripten's `EXCEPTION_CATCHING_ALLOWED` setting, can be inlined
in LLVM middle ends before we reach WebAssemblyLowerEmscriptenEHSjLj
pass in the wasm backend and thus don't get transformed for exception
catching.
This fixes the issue by adding `--force-attribute=FUNC_NAME:noinline`
for each function name in `-emscripten-cxx-exceptions-allowed`, which
adds `noinline` attribute to the specified function and thus excludes
the function from inlining candidates in optimization passes.
Fixes the remaining half of
https://github.com/emscripten-core/emscripten/issues/10721.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D99259
Create fix-it hints to fix the order of constructors.
To make this a lot simpler, I've grouped all the warnings for each out of order initializer into 1.
This is necessary as fixing one initializer would often interfere with other initializers.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D98745
If emit inlined region for master/critical directives, no need to clear
lambda/block context data, otherwise the variables cannot be found and
it causes a crash at compile time.
Differential Revision: https://reviews.llvm.org/D99280
The original implementation didn't fire on non-template classes when a
base class was an instantiation of a template with a dependent base.
In that case the base of the base is dependent as seen from the base,
but not from the class we're interested in, which isn't a template.
Also it simplifies the code a lot.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D98724
Add an option to tell the compiler that it can use privileged instructions.
This patch only adds the option. Backend implementation will be added in a
future patch.
Reviewed By: lei, amyk
Differential Revision: https://reviews.llvm.org/D99193
Files compiled with C++ for OpenCL mode can now have a distinct
file extension - clcpp, then clang driver picks the compilation
mode automatically (-x clcpp) without the use of -cl-std=clc++.
Differential Revision: https://reviews.llvm.org/D96771
In order to have the same option on power PC LLVM and power PC gcc
the option will be changed from -mrop-protection to -mrop-protect.
The feature will be off by default and turned on when the option is used.
Reviewed By: lei, amyk
Differential Revision: https://reviews.llvm.org/D99185
Required by D83660.
Test cases may want to use the host compiler to compile some mocks for the
test case.
This patch adds two substitutions `%host_cc` and `%host_cxx` to use the host
compilers set via variable `CMAKE_C_COMPILER` and `CMAKE_CXX_COMPILER`.
Patch by Ella Ma!
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D98918
There are a number of functions in altivec.h that use
vector __int128 which isn't supported on AIX. Those functions
need to be guarded for targets that don't support the type.
Furthermore, the functions that produce quadword instructions
without using the type need a builtin. This patch adds the
macro guards to altivec.h using the __SIZEOF_INT128__ which
is only defined on targets that support the __int128 type.
Support Complex type transformer to define more complexity legal type.
Overall our downstream implementation there are only four instructions need to
use complex type transformer, it's not a common case.
I still feel using a string for prototypes is simple and clear.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D98848
Review D88220 turns out to have some pretty severe bugs, but I *think*
this patch fixes them.
Paper P1825 is supposed to enable implicit move from "non-volatile objects
and rvalue references to non-volatile object types." Instead, what was committed
seems to have enabled implicit move from "non-volatile things of all kinds,
except that if they're rvalue references then they must also refer to non-volatile
things." In other words, D88220 accidentally enabled implicit move from
lvalue object references (super yikes!) and also from non-object references
(such as references to functions).
These two cases are now fixed and regression-tested.
Differential Revision: https://reviews.llvm.org/D98971
This patch is to fix lit test case failure relate to alignment, on z/OS, maximum alignment value for 64 bit mode is 16 and also fixed clang/test/Layout/itanium-union-bitfield.cpp, attribute ((aligned(4))) is needed for bit-field member in Union for z/OS because single bit-field has one byte alignment, this will make sure size and alignment will be correct value on z/OS.
Differential Revision: https://reviews.llvm.org/D98793
This line has a TODO comment, but the answer to it seems to be "no"
given that clang itself uses attributes on @try statements in its tests.
This ProhibitAttributes() statement is also dead code since
ProhibitAttributs() does not handle GNU attributes at the moment but
those are the only attributes valid in objc.
Differential Revision: https://reviews.llvm.org/D97371
1. Skip the temporary file
2. Test cc1 with -S to verify codegen work well. Add '-target-feature
+m' because the backend requires it to calculate the vscaled size/offset.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99082
Add overloads that perform subtraction on v1i128 that take and
produce vector unsigned char to avoid needing to use __int128.
The overloads are suffixed with _u128 and are needed for targets
where __int128 isn't supported (AIX).
Add overloads that perform addition on v1i128 that take and produce
vector unsigned char to avoid needing to use __int128. The overloads
are suffixed with _u128 and are needed for targets where __int128
isn't supported (AIX).
1. Skip the temporary file
2. Test cc1 with -S to verify codegen work well. Add '-target-feature
+m' because the backend requires it to calculate the vscaled size/offset.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99082
GlobalISel is currently not enabled when using -flto since the front-end
-mvllm flags don't get passed through. This change fixes this for Darwin
platforms. We have to do this in the driver because the code generator choice
isn't embedded into the bitcode file.
Differential Revision: https://reviews.llvm.org/D99126
ROCm has changed installation path to /opt/rocm-{release}. Add detection
for that. Also support ROCM_PATH environment variable.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D98867
This attribute represents the minimum and maximum values vscale can
take. For now this attribute is not hooked up to anything during
codegen, this will be added in the future when such codegen is
considered stable.
Additionally hook up the -msve-vector-bits=<x> clang option to emit this
attribute.
Differential Revision: https://reviews.llvm.org/D98030
Implement the TreeTransform for AsTypeExpr. Split `BuildAsTypeExpr`
out of `ActOnAsTypeExpr`, such that we can call the Build method from
the TreeTransform.
Fixes PR47979.
Differential Revision: https://reviews.llvm.org/D98855
Additionally, this patch puts an assertion checking for feasible
constraints in every place where constraints are assigned to states.
Differential Revision: https://reviews.llvm.org/D98948
Debian multiarch additionally adds /usr/include/<triplet> and somehow
Android borrowed the idea. (Note /usr/<triplet>/include is already an
include dir...). On Debian, we should just assume a GCC installation is
available and use its triple.
08196e0b2e exposed LowerExpectIntrinsic's
internal implementation detail in the form of
LikelyBranchWeight/UnlikelyBranchWeight options to the outside.
While this isn't incorrect from the results viewpoint,
this is suboptimal from the layering viewpoint,
and causes confusion - should transforms also use those weights,
or should they use something else, D98898?
So go back to status quo by making LikelyBranchWeight/UnlikelyBranchWeight
internal again, and fixing all the code that used it directly,
which currently is only clang codegen, thankfully,
to emit proper @llvm.expect intrinsics instead.
With this change, for `#include <ar.h>`, `clang --target=aarch64-linux-gnu`
will read `/usr/lib/gcc/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include/ar.h`
(on Debian gcc->gcc-cross)
instead of `/usr/include/ar.h`. Some glibc headers (e.g. gnu/stubs.h) are different across architectures.
After path resolution, it duplicates a subsequent -L entry. The entry below
(lib/gcc/$triple/$version/../../../../$OSLibDir) usually does not exist (e.g.
Arch Linux; Debian cross gcc). When it exists, it typically just has ld.so (e.g.
Debian native gcc) which cannot cause collision. Removing the -L (similar to
reordering it) is therefore justified.
so that when --sysroot is specified, the detected GCC installation will not be
overridden by another from /usr which happens to have a larger version.
This behavior is particularly inconvenient when the system has a larger version
GCC while the user wants to try out an older sysroot.
Delete some tests from linux-ld.c which overlap with cross-linux.c
In GCC, if `-B $prefix` is specified, `$prefix` is used to find executable files and startup files.
`$prefix/include` is added as an include search directory.
Clang overloads -B with GCC installation detection semantics which make the
behavior less predictable (due to the "largest GCC version wins" rule) and
interact poorly with --gcc-toolchain (--gcc-toolchain can be overridden by -B).
* `clang++ foo.cpp` detects GCC installation under `/usr`.
* `clang++ --gcc-toolchain=Inputs foo.cpp` detects GCC installation under `Inputs`.
* `clang++ -BA --gcc-toolchain=B foo.cpp` detects GCC installation under A and B and the larger version wins. With this patch, only B is used for detection.
* `clang++ -BA foo.cpp` detects GCC installation under `A` and `/usr`, and the larger GCC version wins. With this patch `A` is not used for detection.
This patch changes -B to drop the GCC detection semantics. Its executable
searching semantics are preserved. --gcc-toolchain is the recommended option to
specify the GCC installation detection directory.
(
Note: Clang detects GCC installation in various target dependent directories.
`$sysroot/usr` (sysroot defaults to "") is a common directory used by most targets.
Such a directory is expected to contain something like `lib{,32,64}/gcc{,-cross}/$triple`.
Clang will then construct library/include paths from the directory.
)
Differential Revision: https://reviews.llvm.org/D97993
This patch adds a new command line option to clang which outputs the directory containing clangs runtime libraries to stdout.
The primary use case for this command line flag is for build systems using clang-cl. Build systems when using clang-cl invoke the linker, that is either link or lld-link in this case, directly instead of invoking the compiler for the linking process as is common with the other drivers. This leads to issues when runtime libraries of clang, such as sanitizers or profiling, have to be linked in as the compiler cannot communicate the link directory to the linker.
Using this flag, build systems would be capable of getting the directory containing all of clang's runtime libraries and add it to the linker path.
Differential Revision: https://reviews.llvm.org/D98868
At the moment "link.exe" is hard-coded as default linker in MSVC.cpp,
so there's no way to use LLD as default linker for MSVC driver.
This patch adds checking of CLANG_DEFAULT_LINKER to MSVC.cpp and
updates unit-tests that expect link.exe linker to explicitly select it
via -fuse-ld=link, so that buildbots and other builds that set
-DCLANG_DEFAULT_LINKER=foobar don't fail these tests.
This is a squash of
- https://reviews.llvm.org/D98493 (MSVC.cpp change) and
- https://reviews.llvm.org/D98862 (unit-tests change)
Reviewed By: maxim-kuvyrkov
Differential Revision: https://reviews.llvm.org/D98935
Clang currently automates a fair amount of diagnostic checking for
declaration attributes based on the declarations in Attr.td. It checks
for things like subject appertainment, number of arguments, language
options, etc. This patch uses the same machinery to perform diagnostic
checking on statement attributes.
C functions may be declared and defined in different prototypes like below. This patch unifies the checks for mangling names in symbol linkage name emission and debug linkage name emission so that the two names are consistent.
static int go(int);
static int go(a) int a;
{
return a;
}
Test Plan:
Differential Revision: https://reviews.llvm.org/D98799
Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all
SIMD instructions to match the finalized SIMD spec. Deliberately does not change
the public interface in wasm_simd128.h yet; that will require more care.
Depends on D98466.
Differential Revision: https://reviews.llvm.org/D98676
Removes the instruction definitions, intrinsics, and builtins for qfma/qfms,
signselect, and prefetch instructions, which were not included in the final
WebAssembly SIMD spec.
Depends on D98457.
Differential Revision: https://reviews.llvm.org/D98466
Now that the WebAssembly SIMD specification is finalized and engines are
generally up-to-date, there is no need for a separate target feature for gating
SIMD instructions that engines have not implemented. With this change,
v128.const is now enabled by default with the simd128 target feature.
Differential Revision: https://reviews.llvm.org/D98457
This reverts commit 11b70b9e3a.
The bot failure was due to ArgumentPromotion deleting functions
without deleting their analyses. This was separately fixed in 4b1c807.
Added basic parsing/sema/serialization support to extend the
existing 'destroy' clause for use with the 'interop' directive.
Differential Revision: https://reviews.llvm.org/D98834
The `int` and `long` versions of these builtins already provide the
necessary overloads for `intptr_t` and `uintptr_t` arguments, as
`ASTContext` defines `atomic_(u)intptr_t` in terms of the `int` or
`long` types.
Prior to this patch, calls to those builtins with particular argument
types resulted in call-is-ambiguous errors.
Differential Revision: https://reviews.llvm.org/D98520
Clang test acle_sve_ld1.sh is missing the colon in one of the string
variable definition separating the variable name from the regex. This
leads the substitution block to be parsed as a numeric variable use.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D98852
This patch is a second attempt at fixing a link error for MSVC
entry points when calling conventions are specified using a flag.
Calling conventions specified using flags should not be applied to MSVC
entry points. The default calling convention is set in this case. The
default calling convention for MSVC entry points main and wmain is cdecl.
For WinMain, wWinMain and DllMain, the default calling convention is
stdcall on 32 bit Windows.
Explicitly specified calling conventions are applied to MSVC entry points.
For MinGW, the default calling convention for all MSVC entry points is
cdecl.
First attempt: 4cff1b40da
Revert of first attempt: bebfc3b92d
Differential Revision: https://reviews.llvm.org/D97941
Cleanup attribute allows users to attach a destructor-like functions
to variable declarations to be called whenever they leave the scope.
The logic of such functions is not supported by the Clang's CFG and
is too hard to be reasoned about. In order to avoid false positives
in this situation, we assume that we didn't see ALL of the executtion
paths of the function and, thus, can warn only about multiple call
violation.
rdar://74441906
Differential Revision: https://reviews.llvm.org/D98694
This patch introduces a very simple inter-procedural analysis
between blocks and enclosing functions.
We always analyze blocks first (analysis is done as part of semantic
analysis that goes side-by-side with the parsing process), and at the
moment of reporting we don't know how that block will be actually
used.
This patch introduces new logic delaying reports of the "never called"
warnings on blocks. If we are not sure that the block will be called
exactly once, we shouldn't warn our users about that. Double calls,
however, don't require such delays. While analyzing the enclosing
function, we can actually decide what we should do with those
warnings.
Additionally, as a side effect, we can be more confident about blocks
in such context and can treat them not as escapes, but as direct
calls.
rdar://74090107
Differential Revision: https://reviews.llvm.org/D98688
This category is generic enough to hold a variety of checkers.
Currently it contains the Dead Stores checker and an alpha unreachable
code checker.
Differential Revision: https://reviews.llvm.org/D98741
Add new field PermuteOperands to mapping different operand order between
C/C++ API and clang builtin.
Reviewed By: craig.topper, rogfer01
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D98388
This reverts commit 809a1e0ffd.
Mach-O doesn't support dso_local and this change broke XNU because of the use of dso_local.
Differential Revision: https://reviews.llvm.org/D98458
This would assert if we hit the evaluation step limit between starting
to delay the call and finishing. In any case, delaying the call was
largely pointless as it doesn't really matter when we mark the
evaluation as having had side effects.
The condition variable is in scope in the loop increment, so we need to
emit the jump destination from wthin the scope of the condition
variable.
For GCC compatibility (and compatibility with real-world 'FOR_EACH'
macros), 'continue' is permitted in a statement expression within the
condition of a for loop, though, so there are two cases here:
* If the for loop has no condition variable, we can emit the jump
destination before emitting the condition.
* If the for loop has a condition variable, we must defer emitting the
jump destination until after emitting the variable. We diagnose a
'continue' appearing in the initializer of the condition variable,
because it would jump past the initializer into the scope of that
variable.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D98816
Fix use of undefined variable in CHECK-NOT directive in clang test
CodeGen/attr-speculative-load-hardening.c.
Reviewed By: kristof.beyls
Differential Revision: https://reviews.llvm.org/D93347
Added basic parsing/sema/serialization support for interop directive.
Support for the 'init' clause.
Differential Revision: https://reviews.llvm.org/D98558
SYCL compilations initiated by the driver will spawn off one or more
frontend compilation jobs (one for device and one for host). This patch
reworks the driver options to make upstreaming this from the downstream
SYCL fork easier.
This patch introduces a language option to identify host executions
(SYCLIsHost) and a -cc1 frontend option to enable this mode. -fsycl and
-fno-sycl become driver-only options that are rejected when passed to
-cc1. This is because the frontend and beyond should be looking at
whether the user is doing a device or host compilation specifically.
Because the frontend should only ever be in one mode or the other,
-fsycl-is-device and -fsycl-is-host are mutually exclusive options.
As a follow-up to D95691, add new diagnostic groups named
pre-c++N-compat to replace the old diagnostic groups with the standards
listed out explicitly. The old group names are retained for backwards
compatibility.
Previously NEON used a target specific intrinsic for frintn, given that
the FROUNDEVEN ISD node now exists, move over to that instead and add
codegen support for that node for both NEON and fixed length SVE.
Differential Revision: https://reviews.llvm.org/D98487
Split out some of the instructions predicated on the dot2-insts target
feature into a new dot7-insts, in preparation for subtargets that have
some but not all of these instructions. NFCI.
Differential Revision: https://reviews.llvm.org/D98717
The idiom:
```
DeclContext::lookup_result R = DeclContext::lookup(Name);
for (auto *D : R) {...}
```
is not safe when in the loop body we trigger deserialization from an AST file.
The deserialization can insert new declarations in the StoredDeclsList whose
underlying type is a vector. When the vector decides to reallocate its storage
the pointer we hold becomes invalid.
This patch replaces a SmallVector with an singly-linked list. The current
approach stores a SmallVector<NamedDecl*, 4> which is around 8 pointers.
The linked list is 3, 5, or 7. We do better in terms of memory usage for small
cases (and worse in terms of locality -- the linked list entries won't be near
each other, but will be near their corresponding declarations, and we were going
to fetch those memory pages anyway). For larger cases: the vector uses a
doubling strategy for reallocation, so will generally be between half-full and
full. Let's say it's 75% full on average, so there's N * 4/3 + 4 pointers' worth
of space allocated currently and will be 2N pointers with the linked list. So we
break even when there are N=6 entries and slightly lose in terms of memory usage
after that. We suspect that's still a win on average.
Thanks to @rsmith!
Differential revision: https://reviews.llvm.org/D91524
This commit makes escapes symmetrical, meaning that having escape
before and after the branching, where parameter is not called on
one of the paths, will have the same effect.
Differential Revision: https://reviews.llvm.org/D98622
Somewhat surprisingly, signature help is emitted as a side-effect of
computing the expected type of a function argument.
The reason is that both actions require enumerating the possible
function signatures and running partial overload resolution, and doing
this twice would be wasteful and complicated.
Change #1: document this, it's subtle :-)
However, sometimes we need to compute the expected type without having
reached the code completion cursor yet - in particular to allow
completion of designators.
eb4ab3358c did this but introduced a
regression - it emits signature help in the wrong location as a side-effect.
Change #2: only emit signature help if the code completion cursor was reached.
Currently there is PP.isCodeCompletionReached(), but we can't use it
because it's set *after* running code completion.
It'd be nice to set this implicitly when the completion token is lexed,
but ConsumeCodeCompletionToken() makes this complicated.
Change #3: call cutOffParsing() *first* when seeing a completion token.
After this, the fact that the Sema::Produce*SignatureHelp() functions
are even more confusing, as they only sometimes do that.
I don't want to rename them in this patch as it's another large
mechanical change, but we should soon.
Change #4: prepare to rename ProduceSignatureHelp() to GuessArgumentType() etc.
Differential Revision: https://reviews.llvm.org/D98488
Remove emit-llvm-bc from addClangTargetOptions as it conflicts with -E for save-temps.
AMDGCN does not yet support linking object files so backend and assemble actions are
skipped, leaving LLVM IR as the output format.
Reviewed By: JonChesterfield, ronlieb
Differential Revision: https://reviews.llvm.org/D96769
The MSVC runtime library doesn't have a definition for wmemchr,
so provide an inline implementation.
Differential Revision: https://reviews.llvm.org/D98472
Summary:
because we enable -mignore-xcoff-visibility by default when there is no -fvisibility option in the clang in AIX OS
it will cause some test case fail at aix os. in order to let the -mignore-xcoff-visibility to be disable, we need to add the -fvisibility=default for those test case.
Reviewers: hubert.reinterpretcast daltenty
Differential Revision: https://reviews.llvm.org/D98660
Recognize BI__builtin_isinf and BI__builtin_isfinite (and a few other opcodes
for finite) in testFPKind() and handle with TDC.
Review: Ulrich Weigand.
Differential Revision: https://reviews.llvm.org/D97901
This patch adds the `__PCREL__` define when PC Relative addressing is enabled.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D98546
Visual Studios implementation of the C++ Standard Library does not use strerror to produce a message for std::error_code unlike other standard libraries such as libstdc++ or libc++ that might be used.
This patch adds a cmake script that through running a C++ program gets the error messages for the POSIX error codes and passes them onto lit through an optional config parameter.
If the config parameter is not set, or getting the messages failed, due to say a cross compiling configuration without an emulator, it will fall back to using pythons strerror functions.
Differential Revision: https://reviews.llvm.org/D98278
This patch implements the __rndr and __rndrrs intrinsics to provide access to the random
number instructions introduced in Armv8.5-A. They are only defined for the AArch64
execution state and are available when __ARM_FEATURE_RNG is defined.
These intrinsics store the random number in their pointer argument and return a status
code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter
intrinsic reseeds the random number generator.
The instructions write the NZCV flags indicating the success of the operation that we can
then read with a CSET.
[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
[2] https://bugs.llvm.org/show_bug.cgi?id=47838
Differential Revision: https://reviews.llvm.org/D98264
Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc
`__translate_sampler_initializer` has a calling convention of
`spir_func`, but clang generated calls to it using the default CC.
Instruction Combining was lowering these mismatching calling conventions
to `store i1* undef` which itself was subsequently lowered to a trap
instruction by simplifyCFG resulting in runtime `SIGILL`
There are arguably two bugs here: but whether there's any wisdom in
converting an obviously invalid call into a runtime crash over aborting
with a sensible error message will require further discussion. So for
now it's enough to set the right calling convention on the runtime
helper.
Reviewed By: svenh, bader
Differential Revision: https://reviews.llvm.org/D98411
Previously, the CurFPFeatures state was set to command line settings before
semantic analysis of the nested member functions and initialization
expressions, that's not correct, it should use the pragma state which
is in effect at the lexical position.
Reviewed By: Erich Keane, Aaron Ballman
Differential Revision: https://reviews.llvm.org/D98211
__builtin_isinf currently generates a floating-point compare operation
which triggers a trap when faced with a signaling NaN in StrictFP mode.
This commit uses integer operations instead to not generate any trap in
such a case.
Reviewed By: mibintc
Differential Revision: https://reviews.llvm.org/D97125
This test was apparently disabled in 6fcd4e080f, without any
sign of how it was going to be reenabled. This patch rewrites the test
to use update_cc_test_checks, with midend optimizations other that
mem2reg disabled.
The first attempt of this patch in 5ae949a927 failed on bots even
though it worked locally. I've attempted to adjust the RUN lines and
made the test AArch64/ARM specific.
Differential Revision: https://reviews.llvm.org/D98510
The patch adds an argument to update test scripts, such as update_cc_test_checks, for replacing a function name matching a regex. This functionality is needed to match generated function signatures that include file hashes. Example:
The function signature for the following function:
`__omp_offloading_50_b84c41e__Z9ftemplateIiET_i_l30_worker`
with `--replace-function-regex "__omp_offloading_[0-9]+_[a-z0-9]+_(.*)"` will become:
`CHECK-LABEL: @{{__omp_offloading_[0-9]+_[a-z0-9]+__Z9ftemplateIiET_i_l30_worker}}(`
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97107
The patch adds an argument to update test scripts, such as update_cc_test_checks, for replacing a function name matching a regex. This functionality is needed to match generated function signatures that include file hashes. Example:
The function signature for the following function:
`__omp_offloading_50_b84c41e__Z9ftemplateIiET_i_l30_worker`
with `--replace-function-regex "__omp_offloading_[0-9]+_[a-z0-9]+_(.*)"` will become:
`CHECK-LABEL: @{{__omp_offloading_[0-9]+_[a-z0-9]+__Z9ftemplateIiET_i_l30_worker}}(`
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97107
See PR48593.
Constraints with invalid type parameters were causing a null pointer
dereference.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D98095
This patch just makes the error message clearer by reinforcing the cause
was a lack of viable **three-way** comparison function for the
**complete object**.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D97990
This was motivated by the fact that constructor type homing (debug info
optimization that we want to turn on by default) drops some libc++ types,
so an attribute would allow us to override constructor homing and emit
them anyway. I'm currently looking into the particular libc++ issue, but
even if we do fix that, this issue might come up elsewhere and it might be
nice to have this.
As I've implemented it now, the attribute isn't specific to the
constructor homing optimization and overrides all of the debug info
optimizations.
Open to discussion about naming, specifics on what the attribute should do, etc.
Differential Revision: https://reviews.llvm.org/D97411
This test was apparently disabled in 6fcd4e080f, without any
sign of how it was going to be reenabled. This patch rewrites the test
to use update_cc_test_checks, with midend optimizations other that
mem2reg disabled.
This patch fixes the situation when our knowledge of disequalities
can help us figuring out that some assumption is infeasible, but
the solver still produces a state with inconsistent constraints.
Additionally, this patch adds a couple of assertions to catch this
type of problems easier.
Differential Revision: https://reviews.llvm.org/D98341
This broke the check-profile tests on Mac, see comment on the code
review.
> This is no longer needed, we can add __llvm_profile_runtime directly
> to llvm.compiler.used or llvm.used to achieve the same effect.
>
> Differential Revision: https://reviews.llvm.org/D98325
This reverts commit c7712087cb.
Also reverting the dependent follow-up commit:
Revert "[InstrProfiling] Generate runtime hook for ELF platforms"
> When using -fprofile-list to selectively apply instrumentation only
> to certain files or functions, we may end up with a binary that doesn't
> have any counters in the case where no files were selected. However,
> because on Linux and Fuchsia, we pass -u__llvm_profile_runtime, the
> runtime would still be pulled in and incur some non-trivial overhead,
> especially in the case when the continuous or runtime counter relocation
> mode is being used. A better way would be to pull in the profile runtime
> only when needed by declaring the __llvm_profile_runtime symbol in the
> translation unit only when needed.
>
> This approach was already used prior to 9a041a7522, but we changed it
> to always generate the __llvm_profile_runtime due to a TAPI limitation.
> Since TAPI is only used on Mach-O platforms, we could use the early
> emission of __llvm_profile_runtime there, and on other platforms we
> could change back to the earlier approach where the symbol is generated
> later only when needed. We can stop passing -u__llvm_profile_runtime to
> the linker on Linux and Fuchsia since the generated undefined symbol in
> each translation unit that needed it serves the same purpose.
>
> Differential Revision: https://reviews.llvm.org/D98061
This reverts commit 87fd09b25f.
WG14 adopted N2626 at the meetings this week. This commit adds support
for using ' as a digit separator in a numeric literal which is
compatible with the C++ feature.
There is no need to check for enabled pragma for core or optional core features,
thus this check is removed
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D97058
Nested `omp [begin|end] declare variant` inherit the selectors from
surrounding `omp (begin|end) declare variant` constructs. To stop such
propagation the user can add the `disable_selector_propagation` to the
`extension` set in the `implementation` selector.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D95765
If we have nested declare variant context, it doesn't make sense to
inherit the match extension from the parent. Instead, just skip it.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D95764
This allows to check for various globals (metadata/attributes/...) and
also resolves problems with globals (metadata/attributes/...) being
reused across different prefixes.
Reviewed By: sstefan1
Differential Revision: https://reviews.llvm.org/D94741
D96109 added support for unique internal linkage names for both internal
linkage functions and global variables. There was a lot of discussion on how to
get the demangling right for functions but I completely missed the point that
demanglers do not support suffixes for global vars. For example:
$ c++filt _ZL3foo
foo
$ c++filt _ZL3foo.uniq.123
_ZL3foo.uniq.123
The demangling for functions works as expected.
I am not sure of the impact of this. I don't understand how debuggers and other
tools depend on the correctness of global variable demangling so I am
pre-emptively disabling it until we can get the demangling support added.
Importantly, uniquefying global variables is not needed right now as we do not
do profile attribution to global vars based on sampling. It was added for
completeness and so this feature is not exactly missed.
Differential Revision: https://reviews.llvm.org/D98392
This patch extends the matrix spec to allow matrix-by-scalar division.
Originally support for `/` was left out to avoid ambiguity for the
matrix-matrix version of `/`, which could either be elementwise or
specified as matrix multiplication M1 * (1/M2).
For the matrix-scalar version, no ambiguity exists; `*` is also
an elementwise operation in that case. Matrix-by-scalar division
is commonly supported by systems including Matlab, Mathematica
or NumPy.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D97857
When using -fprofile-list to selectively apply instrumentation only
to certain files or functions, we may end up with a binary that doesn't
have any counters in the case where no files were selected. However,
because on Linux and Fuchsia, we pass -u__llvm_profile_runtime, the
runtime would still be pulled in and incur some non-trivial overhead,
especially in the case when the continuous or runtime counter relocation
mode is being used. A better way would be to pull in the profile runtime
only when needed by declaring the __llvm_profile_runtime symbol in the
translation unit only when needed.
This approach was already used prior to 9a041a7522, but we changed it
to always generate the __llvm_profile_runtime due to a TAPI limitation.
Since TAPI is only used on Mach-O platforms, we could use the early
emission of __llvm_profile_runtime there, and on other platforms we
could change back to the earlier approach where the symbol is generated
later only when needed. We can stop passing -u__llvm_profile_runtime to
the linker on Linux and Fuchsia since the generated undefined symbol in
each translation unit that needed it serves the same purpose.
Differential Revision: https://reviews.llvm.org/D98061
Summary:
The changes introduced in D87946 changed the API for libomptarget
functions. `__kmpc_push_target_tripcount` was a function in Clang 11.x
but was not given a backward-compatible interface. This change will
require people using Clang 13.x or 12.x to recompile their offloading
programs.
Reviewed By: jdoerfert cchen
Differential Revision: https://reviews.llvm.org/D98358
Commit 1b04bdc2f3 added support for
capturing the 'this' pointer in a SEH context (__finally or __except),
But the case in which the 'this' pointer is part of a lambda capture
was not handled properly
Differential Revision: https://reviews.llvm.org/D97687
Demonstrate how to generate vadd/vfadd intrinsic functions
1. add -gen-riscv-vector-builtins for clang builtins.
2. add -gen-riscv-vector-builtin-codegen for clang codegen.
3. add -gen-riscv-vector-header for riscv_vector.h. It also generates
ifdef directives with extension checking, base on D94403.
4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h.
Generate overloading version Header for generic api.
https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface
5. update tblgen doc for riscv related options.
riscv_vector.td also defines some unused type transformers for vadd,
because I think it could demonstrate how tranfer type work and we need
them for the whole intrinsic functions implementation in the future.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos
Differential Revision: https://reviews.llvm.org/D95016
Initially, this flag was meant to only be used through cc1 and not directly
through the clang driver. However, we accidentally ended up using this flag
as a driver flag already for selecting multilibs within the fuchsia toolchain.
We're currently in an awkward state where it's only accepted as a driver flag
when targeting Fuchsia, and all other instances it can only be added via
-Xclang. Since we're ready to use this in Fuchsia, we can just expose this to
the driver for simplicity.
Differential Revision: https://reviews.llvm.org/D98375
Updates __is_unsigned to have the same behavior as the standard
specifies. This is in line with 511dbd8, which applied the same change
to __is_signed.
Refs D67897.
Differential Revision: https://reviews.llvm.org/D98104
The patch adds an argument to update_cc_test_checks for replacing a function name matching a regex. This functionality is needed to match generated function signatures that include file hashes. Example:
The function signature for the following function:
`__omp_offloading_50_b84c41e__Z9ftemplateIiET_i_l30_worker`
with `--replace-function-regex "__omp_offloading_[0-9]+_[a-z0-9]+_(.*)"` will become:
`CHECK-LABEL: @{{__omp_offloading_[0-9]+_[a-z0-9]+__Z9ftemplateIiET_i_l30_worker}}(`
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97107
Some tests in clang require running non-filechecked commands to generate the actual filecheck input. For example, tests for openmp offloading require generating the host bc without any checking, before running the clang command to actually generate the filechecked IR of the target device. This patch enables `update_cc_test_checks.py` to run non-filechecked run lines in-place.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97068
If the non-iterator side of an iterator operation
`+`, `+=`, `-` or `-=` is `UndefinedVal` an assertions happens.
This small fix prevents this.
Patch by Adam Balogh.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D85424
Before `bc713f6a004723d1325bc16e1efc32d0ac82f939` landed, the analyzer
crashed on this reduced example.
It seems important to have bot `ctu` and `-analyzer-opt-analyze-headers`
enabled in the example.
This test file ensures that no regression happens in the future in this regard.
Reviewed By: martong, NoQ
Differential Revision: https://reviews.llvm.org/D96586
According to a Bugzilla ticket (https://bugs.llvm.org/show_bug.cgi?id=45148),
ArrayBoundCheckerV2 produces a false-positive report.
This patch adds a test demonstrating the current //flawed// behavior.
Also adds several similar test cases just to be on the safe side.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D86870
IR produced using TableGen builtin function declarations
(`fdeclare-opencl-builtins.cl`) did not have the target's calling
convention applied to builtin calls.
Fix this, and update the codegen test to check that IR produced using
opencl-c.h and `-fdeclare-opencl-builtins` is identical with respect
to the builtin calls.
Differential Revision: https://reviews.llvm.org/D98039
In -fno-exceptions -fno-asynchronous-unwind-tables -g0 mode,
GCC does not emit `.cfi_*` directives.
```
% diff <(gcc -fno-asynchronous-unwind-tables -dM -E a.c) <(gcc -dM -E a.c)
130a131
> #define __GCC_HAVE_DWARF2_CFI_ASM 1
```
This macro is useful because code can decide whether inline asm should include `.cfi_*` directives.
`.cfi_*` directives without `.cfi_startproc` can cause assembler errors
(integrated assembler: `this directive must appear between .cfi_startproc and .cfi_endproc directives`).
Differential Revision: https://reviews.llvm.org/D97743
By default, the driver uses the compiler-rt builtins and links with
-l:libunwind.a.
Restore the previous behavior by passing --rtlib=libgcc.
Reviewed By: danalbert
Differential Revision: https://reviews.llvm.org/D96404
The only time we would consider allowing this is inside a call to
std::allocator<T>::deallocate, whose contract does not permit deletion
of null pointers.
In -fno-exceptions -fno-asynchronous-unwind-tables -g0 mode,
GCC does not emit `.cfi_*` directives.
```
% diff <(gcc -fno-asynchronous-unwind-tables -dM -E a.c) <(gcc -dM -E a.c)
130a131
> #define __GCC_HAVE_DWARF2_CFI_ASM 1
```
This macro is useful because code can decide whether inline asm should include `.cfi_*` directives.
`.cfi_*` directives without `.cfi_startproc` can cause assembler errors
(integrated assembler: `this directive must appear between .cfi_startproc and .cfi_endproc directives`).
Differential Revision: https://reviews.llvm.org/D97743
We used to trigger assertion when transforming c-tor with unparsed
default argument. Now we ignore such constructors for this purpose.
Differential Revision: https://reviews.llvm.org/D97965
In case a char-literal of type int (C/ObjectiveC) corresponds to a
format specifier with the %hh length modifier, don't treat the literal
as of type char for issuing diagnostics, as otherwise this results in:
printf("%hhd", 'e');
warning: format specifies type 'char' but the argument has type 'char'.
Differential revision: https://reviews.llvm.org/D97951
SUMMARY:
n the patch https://reviews.llvm.org/D87451 "add new option -mignore-xcoff-visibility"
we did as "The option -mignore-xcoff-visibility has no effect on visibility attribute when compile with -emit-llvm option to generated LLVM IR."
in these patch we let -mignore-xcoff-visibility effect on generating IR too. the new feature only work on AIX OS
Reviewer: Jason Liu,
Differential Revision: https://reviews.llvm.org/D89986
This patch implements the conditional select operator for
ext_vector_types in C++. It does so by using the same semantics as for
C.
D71463 added support for the conditional select operator for VectorType
in C++. Unfortunately the semantics between ext_vector_type in C are
different to VectorType in C++. Select for ext_vector_type is based on
the MSB of the condition vector, whereas for VectorType it is `!= 0`.
This unfortunately means that the behavior is inconsistent between
ExtVectorType and VectorType, but I think using the C semantics for
ExtVectorType in C++ as well should be less surprising for users.
Reviewed By: erichkeane, aaron.ballman
Differential Revision: https://reviews.llvm.org/D98055
Builtins that require multiple extensions, such as certain
`write_imagef` forms, were not exposed because of the Sema check not
splitting the extension string.
Differential Revision: https://reviews.llvm.org/D97930
See https://bugs.llvm.org/show_bug.cgi?id=42154.
GCC's __attribute__((align)) can reduce the alignment of a type when applied to
a typedef. However, functions which take a pointer or reference to the
original type are compiled assuming the original alignment. Therefore when any
such function is passed an object of the new, less-aligned type, an alignment
fault can occur. In particular, this applies to the constructor, which is
defined for the original type and called for the less-aligned object.
This change adds a warning whenever an pointer or reference to an object is
passed to a function that was defined for a more-aligned type.
The calls to ASTContext::getTypeAlignInChars seem change the order in which
record layouts are evaluated, which caused changes to the output of
-fdump-record-layouts. As such some tests needed to be updated:
* Use CHECK-LABEL rather than counting the number of "Dumping AST Record
Layout" headers.
* Check for end of line in labels, so that struct B1 doesn't match struct B
etc.
* Add --strict-whitespace, since the whitespace shows meaningful structure.
* The order in which record layouts are printed has changed in some cases.
* clang-format for regions changed
Differential Revision: https://reviews.llvm.org/D97187
In D97003, CUDA 9.2 is the minimum requirement for OpenMP offloading on
NVPTX target. We don't need to have macros in source code to select right functions
based on CUDA version. we don't need to compile multiple bitcode libraries of
different CUDA versions for each SM. We don't need to worry about future
compatibility with newer CUDA version.
`-target-feature +ptx61` is used in this patch, which corresponds to the highest
PTX version that CUDA 9.2 can support.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97198
Some tests in clang require running non-filechecked commands to generate the actual filecheck input. For example, tests for openmp offloading require generating the host bc without any checking, before running the clang command to actually generate the filechecked IR of the target device. This patch enables `update_cc_test_checks.py` to run non-filechecked run lines in-place.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97068
This changes the target data layout to make stack align to 16 bytes
on Power10. Before this change, stack was being aligned to 32 bytes.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D96265
Spack is a package management tool extensively used by HPC community.
As ROCm packages are built by Spack by HPC community, we need to teach
clang driver to detect ROCm installation built by Spack.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D97340
For MinGW targets, we distinguish between an explicitly shared unwinder
library (requested via -shared-libgcc), an explicitly static one
(requested via -static-libgcc or -static) and the default case (which
just passes -lunwind to the linker, which will pick either shared or
static depending on what's available, with the normal linker logic).
This makes the implicit default case (as added in D79995) actually work as
it was intended, when using the g++ driver (which is the main usecase for
libunwind as far as I know).
Differential Revision: https://reviews.llvm.org/D98023
Prior to this fix, constrained decltype(auto) behaves exactly the same
as constrained regular auto.
This fixes it so it deduces like decltype(auto).
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D98087
Fix duplicate diagnostic for an over-aligned allocation with no matching
function, and add custom diagnostic for the case where the
non-allocating placement new was intended but <new> was not included.
The option -funique-internal-linkage-names was added in D73307 and D78243 as a
LLVM early pass to insert a unique suffix to internal linkage functions and
vars. The unique suffix was the hash of the module path. However, we found
that this can be done more cleanly in clang early and the fixes that need to
be done later can be completely avoided. The fixes in particular are trying
to modify the DW_AT_linkage_name and finding the right place to insert the
pass.
This patch ressurects the original implementation proposed in D73307 which
was reviewed and then ditched in favor of the pass based approach.
Differential Revision: https://reviews.llvm.org/D96109
emission
Ensure that we are in a function declaration context before checking
the diagnostic emission status, to avoid dereferencing a NULL function
declaration.
Differential Revision: https://reviews.llvm.org/D97573
gfx1030 added a new way to implement readcyclecounter using the
SHADER_CYCLES hardware register, but the s_memtime instruction still
exists, so the MC layer should still accept it and the
llvm.amdgcn.s.memtime intrinsic should still work.
Differential Revision: https://reviews.llvm.org/D97928
A bug was introduced when adding -munsafe-fp-atomics.
By default it should be off.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D97967
Like nvptx and some other targets, -mconstructor-aliases does not work well with amdgpu,
therefore we disable it in the same approach.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D97959
`mix` is subtly different from `clamp`: in the overloads where the
last argument is a scalar, the second argument should be a gentype for
`mix`.
As scalars can be implicitly converted to vectors, this cannot be
caught in the Sema test. Hence adding a CodeGen test, where we can
verify the types using the mangled name.
Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to:
* Only the worksharing-loop directive.
* Recognizes only the nowait clause.
* No loop nests with more than one loop.
* Untested with templates, exceptions.
* Semantic checking left to the existing infrastructure.
This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics:
* The distance function: How many loop iterations there will be before entering the loop nest.
* The loop variable function: Conversion from a logical iteration number to the loop variable.
These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang.
The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does.
For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting.
Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset.
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D94973
Background:
Wasm EH, while using Windows EH (catchpad/cleanuppad based) IR, uses
Itanium-based libraries and ABIs with some modifications.
`__clang_call_terminate` is a wrapper generated in Clang's Itanium C++
ABI implementation. It contains this code, in C-style pseudocode:
```
void __clang_call_terminate(void *exn) {
__cxa_begin_catch(exn);
std::terminate();
}
```
So this function is a wrapper to call `__cxa_begin_catch` on the
exception pointer before termination.
In Itanium ABI, this function is called when another exception is thrown
while processing an exception. The pointer for this second, violating
exception is passed as the argument of this `__clang_call_terminate`,
which calls `__cxa_begin_catch` with that pointer and calls
`std::terminate` to terminate the program.
The spec (https://libcxxabi.llvm.org/spec.html) for `__cxa_begin_catch`
says,
```
When the personality routine encounters a termination condition, it
will call __cxa_begin_catch() to mark the exception as handled and then
call terminate(), which shall not return to its caller.
```
In wasm EH's Clang implementation, this function is called from
cleanuppads that terminates the program, which we also call terminate
pads. Cleanuppads normally don't access the thrown exception and the
wasm backend converts them to `catch_all` blocks. But because we need
the exception pointer in this cleanuppad, we generate
`wasm.get.exception` intrinsic (which will eventually be lowered to
`catch` instruction) as we do in the catchpads. But because terminate
pads are cleanup pads and should run even when a foreign exception is
thrown, so what we have been doing is:
1. In `WebAssemblyLateEHPrepare::ensureSingleBBTermPads()`, we make sure
terminate pads are in this simple shape:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
```
2. In `WebAssemblyHandleEHTerminatePads` pass at the end of the
pipeline, we attach a `catch_all` to terminate pads, so they will be in
this form:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
catch_all
call @std::terminate()
unreachable
```
In `catch_all` part, we don't have the exception pointer, so we call
`std::terminate()` directly. The reason we ran HandleEHTerminatePads at
the end of the pipeline, separate from LateEHPrepare, was it was
convenient to assume there was only a single `catch` part per `try`
during CFGSort and CFGStackify.
---
Problem:
While it thinks terminate pads could have been possibly split or calls
to `__clang_call_terminate` could have been duplicated,
`WebAssemblyLateEHPrepare::ensureSingleBBTermPads()` assumes terminate
pads contain no more than calls to `__clang_call_terminate` and
`unreachable` instruction. I assumed that because in LLVM very limited
forms of transformations are done to catchpads and cleanuppads to
maintain the scoping structure. But it turned out to be incorrect;
passes can merge cleanuppads into one, including terminate pads, as long
as the new code has a correct scoping structure. One pass that does this
I observed was `SimplifyCFG`, but there can be more. After this
transformation, a single cleanuppad can contain any number of other
instructions with the call to `__clang_call_terminate` and can span many
BBs. It wouldn't be practical to duplicate all these BBs within the
cleanuppad to generate the equivalent `catch_all` blocks, only with
calls to `__clang_call_terminate` replaced by calls to `std::terminate`.
Unless we do more complicated transformation to split those calls to
`__clang_call_terminate` into a separate cleanuppad, it is tricky to
solve.
---
Solution (?):
This CL just disables the generation and use of `__clang_call_terminate`
and calls `std::terminate()` directly in its place.
The possible downside of this approach can be, because the Itanium ABI
intended to "mark" the violating exception handled, we don't do that
anymore. What `__cxa_begin_catch` actually does is increment the
exception's handler count and decrement the uncaught exception count,
which in my opinion do not matter much given that we are about to
terminate the program anyway. Also it does not affect info like stack
traces that can be possibly shown to developers.
And while we use a variant of Itanium EH ABI, we can make some
deviations if we choose to; we are already different in that in the
current version of the EH spec we don't support two-phase unwinding. We
can possibly consider a more complicated transformation later to
reenable this, but I don't think that has high priority.
Changes in this CL contains:
- In Clang, we don't generate a call to `wasm.get.exception()` intrinsic
and `__clang_call_terminate` function in terminate pads anymore; we
simply generate calls to `std::terminate()`, which is the default
implementation of `CGCXXABI::emitTerminateForUnexpectedException`.
- Remove `WebAssembly::ensureSingleBBTermPads() function and
`WebAssemblyHandleEHTerminatePads` pass, because terminate pads are
already `catch_all` now (because they don't need the exception
pointer) and we don't need these transformations anymore.
- Change tests to use `std::terminate` directly. Also removes tests that
tested `LateEHPrepare::ensureSingleBBTermPads` and
`HandleEHTerminatePads` pass.
- Drive-by fix: Add some function attributes to EH intrinsic
declarations
Fixes https://github.com/emscripten-core/emscripten/issues/13582.
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D97834
Use a WeakTrackingVH to cope with the stmt emission logic that cleans up
unreachable blocks. This invalidates the reference to the deferred
replacement placeholder. Cope with it.
Fixes PR25102 (from 2015!)
This change adds a new IR noundef attribute, which denotes when a function call argument or return val may never contain uninitialized bits.
In MemorySanitizer, this attribute enables optimizations which decrease instrumented code size by up to 17% (measured with an instrumented build of clang) . I'll introduce the change allowing msan to take advantage of this information in a separate patch.
Differential Revision: https://reviews.llvm.org/D81678
explicitly emitting retainRV or claimRV calls in the IR
This reapplies ed4718eccb, which was reverted
because it was causing a miscompile. The bug that was causing the miscompile
has been fixed in 75805dce5f.
Original commit message:
Background:
This fixes a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.attachedcall" to calls,
which indicates the call is implicitly followed by a marker
instruction and an implicit retainRV/claimRV call that consumes the
call result. In addition, it emits a call to
@llvm.objc.clang.arc.noop.use, which consumes the call result, to
prevent the middle-end passes from changing the return type of the
called function. This is currently done only when the target is arm64
and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
claimRV is attached to the call since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since the ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if retainRV is attached to the call and
does nothing if claimRV is attached to it.
- SCCP refrains from replacing the return value of a call with a
constant value if the call has the operand bundle. This ensures the
call always has at least one user (the call to
@llvm.objc.clang.arc.noop.use).
- This patch also fixes a bug in replaceUsesOfNonProtoConstant where
multiple operand bundles of the same kind were being added to a call.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
... unless it's a literal
D94640 was a bit too aggressive in its analysis, considering integers
representing valid addresses as invalid. This change rolls back some of
the check, so that only the most obvious case is still flagged.
Before:
```cpp
free((void*)1000); // literal converted to `void*`: warning good
free((void*)an_int); // `int` object converted to `void*`: warning might
// be a false positive
```
After
```cpp
free((void*)1000); // literal converted to `void*`: warning good
free((void*)an_int); // doesn't warn
```
Differential Revision: https://reviews.llvm.org/D97512
If the file in line directive does not exist on the system we need, to
use the original file to get its file id.
Differential Revision: https://reviews.llvm.org/D97945
The SwitchStmt::FirstCase member is not initialized when the AST is
built by the ASTStmtReader. See the below code of
ASTStmtReader::VisitSwitchStmt in the case where the for loop does not
have any iterations:
```
// ... more code ...
SwitchCase *PrevSC = nullptr;
for (auto E = Record.size(); Record.getIdx() != E; ) {
SwitchCase *SC = Record.getSwitchCaseWithID(Record.readInt());
if (PrevSC)
PrevSC->setNextSwitchCase(SC);
else
S->setSwitchCaseList(SC); // Sets FirstCase !!!
PrevSC = SC;
}
} // return
```
Later, in ASTNodeImporter::VisitSwitchStmt,
we have a condition that depends on this uninited value:
```
for (SwitchCase *SC = S->getSwitchCaseList(); SC != nullptr;
SC = SC->getNextSwitchCase()) {
// ... more code ...
}
```
This is clearly an UB. This causes non-deterministic crashes when
ClangSA analyzes some code with CTU. See the below report by valgrind
(the whole valgrind output is attached):
```
==31019== Conditional jump or move depends on uninitialised value(s)
==31019== at 0x12ED1983: clang::ASTNodeImporter::VisitSwitchStmt(clang::SwitchStmt*) (ASTImporter.cpp:6195)
==31019== by 0x12F1D509: clang::StmtVisitorBase<std::add_pointer, clang::ASTNodeImporter, llvm::Expected<clang::Stmt*>>::Visit(clang::Stmt*) (StmtNodes.inc:591)
==31019== by 0x12EE4FDF: clang::ASTImporter::Import(clang::Stmt*) (ASTImporter.cpp:8484)
==31019== by 0x12F09498: llvm::Expected<clang::Stmt*> clang::ASTNodeImporter::import<clang::Stmt>(clang::Stmt*) (ASTImporter.cpp:164)
==31019== by 0x12F3A1F5: llvm::Error clang::ASTNodeImporter::ImportArrayChecked<clang::Stmt**, clang::Stmt**>(clang::Stmt**, clang::Stmt**, clang::Stmt**) (ASTImporter.cpp:653)
==31019== by 0x12F13152: llvm::Error clang::ASTNodeImporter::ImportContainerChecked<llvm::iterator_range<clang::Stmt**>, llvm::SmallVector<clang::Stmt*, 8u> >(llvm::iterator_range<clang::Stmt**> const&, llvm::SmallVector<clang::Stmt*, 8u>&) (ASTImporter.cpp:669)
==31019== by 0x12ED099F: clang::ASTNodeImporter::VisitCompoundStmt(clang::CompoundStmt*) (ASTImporter.cpp:6077)
==31019== by 0x12F1CC2D: clang::StmtVisitorBase<std::add_pointer, clang::ASTNodeImporter, llvm::Expected<clang::Stmt*>>::Visit(clang::Stmt*) (StmtNodes.inc:73)
==31019== by 0x12EE4FDF: clang::ASTImporter::Import(clang::Stmt*) (ASTImporter.cpp:8484)
==31019== by 0x12F09498: llvm::Expected<clang::Stmt*> clang::ASTNodeImporter::import<clang::Stmt>(clang::Stmt*) (ASTImporter.cpp:164)
==31019== by 0x12F13275: clang::Stmt* clang::ASTNodeImporter::importChecked<clang::Stmt*>(llvm::Error&, clang::Stmt* const&) (ASTImporter.cpp:197)
==31019== by 0x12ED0CE6: clang::ASTNodeImporter::VisitCaseStmt(clang::CaseStmt*) (ASTImporter.cpp:6098)
```
Differential Revision: https://reviews.llvm.org/D97849
IR symbol table does not parse inline asm. A symbol only referenced by inline
asm is not in the IR symbol table, so LTO does not know that the definition (in
another translation unit) is referenced and may internalize it, even if that
definition has `__attribute__((used))` (which lowers to `llvm.compiler.used` on
ELF targets since D97446).
```
// cabac.c
__attribute__((used)) const uint8_t ff_h264_cabac_tables[...] = {...};
// h264_cabac.c
asm("lea ff_h264_cabac_tables(%rip), %0" : ...);
```
`__attribute__((used))` is the recommended way to tell the compiler there may
be inline asm references, so the usage is perfectly fine. This patch
conservatively sets the `FB_used` bit on `llvm.compiler.used` symbols to work
around the IR symbol table limitation. Note: before D97446, Clang never emitted
symbols in the `llvm.compiler.used` list, so this change does not punish any
Clang emitted global object.
Without the patch, `ff_h264_cabac_tables` may be assigned to a non-external
partition and get internalized. Then we will get a linker error because the
`cabac.c` definition is not exposed.
Differential Revision: https://reviews.llvm.org/D97755
Adding it to the general filepaths results in it being added to the
linker arguments. The AIX linker always looks in this path anyway
and adds it as a default library path component. Adding this duplicate
explicitly results in duplicate entries in path in the loader section
of executables and messes up tools like CMake that parse the default
library flags.
Reviewed By: ZarkoCA
Differential Revision: https://reviews.llvm.org/D97574
Currently the test on line 3 is identical to the test on line 1.
Looking at the rest of the file (particularily the use of FOVERRIDE
as the check-prefix), I think it's pretty clear that this line
was supposed to use `-ftrigraphs` instead of `-trigraphs`.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D97796
https://wg21.link/P2173 is making its way through WG21 currently and
has not been formally adopted yet. This feature provides very useful
functionality in that you can specify attributes on the various
function *declarations* generated by a lambda expression, where the
current C++ grammar only allows attributes which apply to the various
function *types* so generated.
This patch implements P2173 on the assumption that it will be adopted
by WG21 with this syntax for C++23.
This commit refactors extension support to allow
specifying whether pragma is needed or not explicitly.
For backward compatibility pragmas are set to required
for all extensions that were added prior to this but
not for OpenCL 3.0 features.
Differential Revision: https://reviews.llvm.org/D97052
This caused miscompiles of Chromium tests for iOS due clobbering of live
registers. See discussion on the code review for details.
> Background:
>
> This fixes a longstanding problem where llvm breaks ARC's autorelease
> optimization (see the link below) by separating calls from the marker
> instructions or retainRV/claimRV calls. The backend changes are in
> https://reviews.llvm.org/D92569.
>
> https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
>
> What this patch does to fix the problem:
>
> - The front-end adds operand bundle "clang.arc.attachedcall" to calls,
> which indicates the call is implicitly followed by a marker
> instruction and an implicit retainRV/claimRV call that consumes the
> call result. In addition, it emits a call to
> @llvm.objc.clang.arc.noop.use, which consumes the call result, to
> prevent the middle-end passes from changing the return type of the
> called function. This is currently done only when the target is arm64
> and the optimization level is higher than -O0.
>
> - ARC optimizer temporarily emits retainRV/claimRV calls after the calls
> with the operand bundle in the IR and removes the inserted calls after
> processing the function.
>
> - ARC contract pass emits retainRV/claimRV calls after the call with the
> operand bundle. It doesn't remove the operand bundle on the call since
> the backend needs it to emit the marker instruction. The retainRV and
> claimRV calls are emitted late in the pipeline to prevent optimization
> passes from transforming the IR in a way that makes it harder for the
> ARC middle-end passes to figure out the def-use relationship between
> the call and the retainRV/claimRV calls (which is the cause of
> PR31925).
>
> - The function inliner removes an autoreleaseRV call in the callee if
> nothing in the callee prevents it from being paired up with the
> retainRV/claimRV call in the caller. It then inserts a release call if
> claimRV is attached to the call since autoreleaseRV+claimRV is
> equivalent to a release. If it cannot find an autoreleaseRV call, it
> tries to transfer the operand bundle to a function call in the callee.
> This is important since the ARC optimizer can remove the autoreleaseRV
> returning the callee result, which makes it impossible to pair it up
> with the retainRV/claimRV call in the caller. If that fails, it simply
> emits a retain call in the IR if retainRV is attached to the call and
> does nothing if claimRV is attached to it.
>
> - SCCP refrains from replacing the return value of a call with a
> constant value if the call has the operand bundle. This ensures the
> call always has at least one user (the call to
> @llvm.objc.clang.arc.noop.use).
>
> - This patch also fixes a bug in replaceUsesOfNonProtoConstant where
> multiple operand bundles of the same kind were being added to a call.
>
> Future work:
>
> - Use the operand bundle on x86-64.
>
> - Fix the auto upgrader to convert call+retainRV/claimRV pairs into
> calls with the operand bundles.
>
> rdar://71443534
>
> Differential Revision: https://reviews.llvm.org/D92808
This reverts commit ed4718eccb.
Our diagnostics relating to static assertions were a bit confused. For
instance, when in MS compatibility mode in C (where we accept
static_assert even without including <assert.h>), we would fail
to warn the user that they were using the wrong spelling (even in
pedantic mode), we were missing a compatibility warning about using
_Static_assert in earlier standards modes, diagnostics for the optional
message were not reflected in C as they were in C++, etc.
Fix regression where we aren't passing `-platform_version` to new ld64.lld after {D95204}.
Most of the changes were originally in D95204, but I backed them out due to
test failures on builds which have `CLANG_DEFAULT_LINKER=lld`. The tests are
properly updated in this diff.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D97741
__builtin_isinf currently generates a floating-point compare operation
which triggers a trap when faced with a signaling NaN in StrictFP mode.
This commit uses integer operations instead to not generate any trap in
such a case.
Reviewed By: mibintc
Differential Revision: https://reviews.llvm.org/D97125
If the mapped structure has data members, which have 'default' mappers,
need to map these members individually using their 'default' mappers.
Differential Revision: https://reviews.llvm.org/D92195
ccb4124a41 fixed translating -gz=zlib to --compress-debug-sections for
linker invocation for several ToolChains, but omitted FreeBSD.
Differential Revision: https://reviews.llvm.org/D97752
`GetAddrOfGlobalTemporary` previously tried to emit the initializer of
a global temporary before updating the global temporary map. Emitting the
initializer could recurse back into `GetAddrOfGlobalTemporary` for the same
temporary, resulting in an infinite recursion.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D97733
The VSX-only overloads (for 8-byte element vectors) are missing.
Add the missing overloads and convert element numbering to
modulo arithmetic to match GCC and XLC.
Currently clang uses stub function to launch kernel. This is inconvenient
to interop with C++ programs since the stub function has different name
as kernel, which is required by ROCm debugger.
This patch emits a variable symbol which has the same name as the kernel
and uses it to register and launch the kernel. This allows C++ program to
launch a kernel by using the original kernel name.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D86376
This makes CC1 and driver defaults consistent.
In addition, for more common cases (-g is specified without -gsplit-dwarf), users will not see -fno-split-dwarf-inlining in CC1 options.
Verified that the below is still true:
* `clang -g` => `splitDebugInlining: false` in DICompileUnit
* `clang -g -gsplit-dwarf` => `splitDebugInlining: false` in DICompileUnit
* `clang -g -gsplit-dwarf -fsplit-dwarf-inlining` => no `splitDebugInlining: false`
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D97706
Lorenz Bauer reported that the following code will have
compilation error for bpf target:
enum e { TWO };
bpf_core_enum_value_exists(enum e, TWO);
The clang emitted the following error message:
__builtin_preserve_enum_value argument 1 invalid
In SemaChecking, an expression like "*(enum NAME)1" will have
cast kind CK_IntegralToPointer, but "*(enum NAME)0" will have
cast kind CK_NullToPointer. Current implementation only permits
CK_IntegralToPointer, missing enum value 0 case.
This patch permits CK_NullToPointer cast kind and
the above test case can pass now.
Differential Revision: https://reviews.llvm.org/D97659
This seems to be more of a Clang thing rather than a generic LLVM thing,
so this moves it out of LLVM pipelines and as Clang extension hooks into
LLVM pipelines.
Move the post-inline EEInstrumentation out of the backend pipeline and
into a late pass, similar to other sanitizer passes. It doesn't fit
into the codegen pipeline.
Also fix up EntryExitInstrumentation not running at -O0 under the new
PM. PR49143
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D97608
The new Darwin backend for LLD is now able to link reasonably large
real-world programs on x86_64. For instance, we have achieved
self-hosting for the X86_64 target, where all LLD tests pass when
building lld with itself on macOS. As such, we would like to make it the
default back-end.
The new port is now named `ld64.lld`, and the old port remains
accessible as `ld64.lld.darwinold`
This [annoucement email][1] has some context. (But note that, unlike
what the email says, we are no longer doing this as part of the LLVM 12
branch cut -- instead we will go into LLVM 13.)
Numerous mechanical test changes were required to make this change; in
the interest of creating something that's reviewable on Phabricator,
I've split out the boring changes into a separate diff (D95905). I plan to
merge its contents with those in this diff before landing.
(@gkm made the original draft of this diff, and he has agreed to let me
take over.)
[1]: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147665.html
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D95204
On clang emits the compiler version string into debug information
by default for both dwarf and codeview. That makes compiler output
needlessly compiler-version-dependent which makes e.g. comparing
object file outputs during a bisect hard. So it's nice if there's
an easy way to turn this off.
(On ELF, this flag also controls the .comment section, but that
part is ELF-only. The debug-info bit isn't.)
Differential Revision: https://reviews.llvm.org/D97695
Simply make sure that the CodeGenFunction::CXXThisValue and CXXABIThisValue
are correctly initialized to the recovered value.
For lambda capture, we also need to make sure to fill the LambdaCaptureFields
Differential Revision: https://reviews.llvm.org/D97534
For ELF targets, GCC 11 will set SHF_GNU_RETAIN on the section of a
`__attribute__((retain))` function/variable to prevent linker garbage
collection. (See AttrDocs.td for the linker support).
This patch adds `retain` functions/variables to the `llvm.used` list, which has
the desired linker GC semantics. Note: `retain` does not imply `used`,
so an unused function/variable can be dropped by Sema.
Before 'retain' was introduced, previous ELF solutions require inline asm or
linker tricks, e.g. `asm volatile(".reloc 0, R_X86_64_NONE, target");`
(architecture dependent) or define a non-local symbol in the section and use
`ld -u`. There was no elegant source-level solution.
With D97448, `__attribute__((retain))` will set `SHF_GNU_RETAIN` on ELF targets.
Differential Revision: https://reviews.llvm.org/D97447
Added supporting CC_PRINT_PROC_STAT and CC_PRINT_PROC_STAT_FILE
environment variables to trigger clang driver reporting the process
statistics into specified file (alternate for -fproc-stat-report
option).
Differential Revision: https://reviews.llvm.org/D97094
See bug #48856
Definitions of classes with member function pointers and default
spaceship operator were getting accepted with no diagnostic on
release build, and triggering assert on builds with runtime checks
enabled. Diagnostics were only produced when actually comparing
instances of such classes.
This patch makes it so Spaceship and Less operators are not considered
as builtin operator candidates for function pointers, producing
equivalent diagnostics for the cases where pointers to member function
and pointers to data members are used instead.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D95409
An global value in the `llvm.used` list does not have GC root semantics on ELF targets.
This will be changed in a subsequent backend patch.
Change some `llvm.used` in the ELF code path to use `llvm.compiler.used` to
prevent undesired GC root semantics.
Change one extern "C" alias (due to `__attribute__((used))` in extern "C") to use `llvm.compiler.used` on all targets.
GNU ld has a rule "`__start_/__stop_` references from a live input section retain the associated C identifier name sections",
which LLD may drop entirely (currently refined to exclude SHF_LINK_ORDER/SHF_GROUP) in a future release (the rule makes it clumsy to GC metadata sections; D96914 added a way to try the potential future behavior).
For `llvm.used` global values defined in a C identifier name section, keep using `llvm.used` so that
the future LLD change will not affect them.
rnk kindly categorized the changes:
```
ObjC/blocks: this wants GC root semantics, since ObjC mainly runs on Mac.
MS C++ ABI stuff: wants GC root semantics, no change
OpenMP: unsure, but GC root semantics probably don't hurt
CodeGenModule: affected in this patch to *not* use GC root semantics so that __attribute__((used)) behavior remains the same on ELF, plus two other minor use cases that don't want GC semantics
Coverage: Probably want GC root semantics
CGExpr.cpp: refers to LTO, wants GC root
CGDeclCXX.cpp: one is MS ABI specific, so yes GC root, one is some other C++ init functionality, which should form GC roots (C++ initializers can have side effects and must run)
CGDecl.cpp: Changed in this patch for __attribute__((used))
```
Differential Revision: https://reviews.llvm.org/D97446
These flags affect coverage mapping (-fcoverage-mapping), not
-fprofile-[instr-]generate so it makes more sense to use the
-fcoverage-* prefix.
Differential Revision: https://reviews.llvm.org/D97434
We introduce -ffile-compilation-dir shorthand to avoid having to set
-fdebug-compilation-dir and -fprofile-compilation-dir separately. This
is similar to -ffile-prefix-map.
Differential Revision: https://reviews.llvm.org/D97433
Previously, -fshow-overloads=best always showed 4 candidates. The
problem is, when this isn't enough, you're kind of up a creek; the only
option available is to recompile with different flags. This can be
quite expensive!
With this change, we try to strike a compromise. The *first* error with
more than 4 candidates will show up to 32 candidates. All further
errors continue to show only 4 candidates.
The hope is that this way, users will have *some chance* of making
forward progress, without facing unbounded amounts of error spam.
Differential Revision: https://reviews.llvm.org/D95754
It would be beneficial to allow not_tail_called attribute to be applied to
virtual functions. I don't see any drawback of allowing this.
Differential Revision: https://reviews.llvm.org/D96832
This patch adds each pass' pass argument in the header for IR dumps.
For example:
Before:
```
*** IR Dump Before InstructionSelect ***
```
After:
```
*** IR Dump Before InstructionSelect (instruction-select) ***
```
The goal is to make it easier to know what argument to pass to
command line options like `debug-only` or `run-pass` to further
investigate a given pass.
This change simplifies `clang/lib/Frontend/CompilerInvocation.cpp`
because we no longer need to manually parse the flag and set codegen
options in the frontend. However, we still need to manually parse the
flag in the driver because:
* The marshalling infrastructure doesn't operate there.
* We need to do some platform specific checks in the driver
that will likely never be supported by any kind of marshalling
infrastructure.
rdar://71609176
Differential Revision: https://reviews.llvm.org/D97327
The new `-fsanitize-address-destructor-kind=` option allows control over how module
destructors are emitted by ASan.
The new option is consumed by both the driver and the frontend and is propagated into
codegen options by the frontend.
Both the legacy and new pass manager code have been updated to consume the new option
from the codegen options.
It would be nice if the new utility functions (`AsanDtorKindToString` and
`AsanDtorKindFromString`) could live in LLVM instead of Clang so they could be
consumed by other language frontends. Unfortunately that doesn't work because
the clang driver doesn't link against the LLVM instrumentation library.
rdar://71609176
Differential Revision: https://reviews.llvm.org/D96572
This commit adds checks for the following:
* labels
* block expressions
* random integers cast to `void*`
* function pointers cast to `void*`
Differential Revision: https://reviews.llvm.org/D94640
This test case was causing a PowerPC buildbot to fail as it happened to
be named lld-multistage,
which matches with the original regex and therefore fails the check-not.
This should better represent the desired check.
Differential Revision: https://reviews.llvm.org/D97423
This happens in codebases a lot, which use xor where both sides are
macros. Using xor in that case is not the common error-prone 2^6 code
that the warning was introduced for.
Don't diagnose such a use of xor.
Differential Revision: https://reviews.llvm.org/D97445
Allow users to use a non-system version of perl, python and awk, which is useful
in certain package managers.
Reviewed By: JDevlieghere, MaskRay
Differential Revision: https://reviews.llvm.org/D95119
Finally, this patch moves from round-tripping one `CompilerInvocation` at a time to round-tripping the invocation as a whole.
This patch includes only the code required to make round-tripping the whole invocation work. More cleanups will be done in a follow-up patch.
Depends on D96847, D97041 & D97042.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D96280
There are two preconditions to reproduce the issue,
1. Use -save-temps option
2. Provide the -o option with name equal to the input file name
without the file extension. For e.g. clang a.c -o a
With the -o specified, the AssembleJobAction after OffloadWrapperJobAction
will produce the object file with same name as host code object file.
Due to this clash, the OffloadWrapperAction overwrites the initial host
object file, which results in lld error. This also fixes the `multiple definition of __dummy.omp_offloading.entry'` issue in D96769 .
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97273
Adding support for intrinsics of AMX-BF16.
This patch alse fix a bug that AMX-INT8 instructions will be selected with wrong
predicate.
Differential Revision: https://reviews.llvm.org/D97358
For -fgpu-rdc mode, static device vars in different TU's may have the same name.
To support accessing file-scope static device variables in host code, we need to give them
a distinct name and external linkage. This can be done by postfixing each static device variable with
a distinct CUID (Compilation Unit ID) hash.
Since the static device variables have different name across compilation units, now we let
them have external linkage so that they can be looked up by the runtime.
Reviewed by: Artem Belevich, and Jon Chesterfield
Differential Revision: https://reviews.llvm.org/D85223
When '__cl_clang_function_pointers' extension is enabled
the parser should allow obtaining the function address.
This fixes PR49264!
Differential Revision: https://reviews.llvm.org/D97203
Add the remaining missing builtin function declarations that have enum
or typedef argument or return types.
Differential Revision: https://reviews.llvm.org/D96860
-O1 and above do dont call real optimizer pipeline in ThinLTO PreLink.
Also clang can't add PostLink OptimizerLastEPCallbacks for in-process ThinLTO.
This results in missing sanitizer passes with ThinLTO.
Simple working solution is just call OptimizerLastEPCallbacks
at the end of buildThinLTOPreLinkDefaultPipeline.
Differential Revision: https://reviews.llvm.org/D96320
Patch takes advantage of the implicit default behavior to reduce the number of attributes, which in turns reduces compilation time.
Reviewed By: serge-sans-paille
Differential Revision: https://reviews.llvm.org/D97116
Currently managed variables are emitted as undefined symbols, which
causes difficulty for diagnosing undefined symbols for non-managed
variables.
This patch transforms managed variables in device compilation so that
they can be emitted as normal variables.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D96195
Demonstrate how to add RISC-V V builtins and lower them to IR intrinsics for V extension.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93446
https://bugs.llvm.org/show_bug.cgi?id=40858
CheckShadow is now called for each binding in the structured binding to make sure it does not shadow any other variable in scope. This does use a custom implementation of getShadowedDeclaration though because a BindingDecl is not a VarDecl
Added a few unit tests for this. In theory though all the other shadow unit tests should be duplicated for the structured binding variables too but whether it is probably not worth it as they use common code. The MyTuple and std interface code has been copied from live-bindings-test.cpp
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D96147
When targeting a MSVC triple, --dependant-libs with the name of the clang runtime library for profiling is added to the command line args. In it's current implementations clang_rt.profile-<ARCH> is chosen as the name. When building a distribution using LLVM_ENABLE_PER_TARGET_RUNTIME_DIR this fails, due to the runtime file names not having an architecture suffix in the filename.
This patch refactors getCompilerRT and getCompilerRTBasename to always consider per-target runtime directories. getCompilerRTBasename now simply returns the filename component of the path found by getCompilerRT
Differential Revision: https://reviews.llvm.org/D96638
This commit prevents warnings from -Wconversion when a clang vector type
is implicitly converted to a sizeless builtin type -- for example, when
implicitly converting a fixed-predicate to a scalable predicate.
The code below:
1 #include <arm_sve.h>
2
3 #define N __ARM_FEATURE_SVE_BITS
4 #define FIXED_ATTR __attribute__((arm_sve_vector_bits (N)))
5 typedef svbool_t fixed_svbool_t FIXED_ATTR;
6
7 inline fixed_svbool_t foo(fixed_svbool_t p) {
8 return svnot_z(svptrue_b64(), p);
9 }
would previously raise this warning:
warning: implicit conversion turns vector to scalar: \
'fixed_svbool_t' (vector of 8 'unsigned char' values) to 'svbool_t' \
(aka '__SVBool_t') [-Wconversion]
Note that many cases of these implicit conversions were already
permitted because many functions inside arm_sve.h are spawned via
preprocessor macros, and the call to isInSystemMacro would cover us in
this case. This commit fixes the remaining cases.
Differential Revision: https://reviews.llvm.org/D97053
Follow-up to fe2dcd89ac.
Update test per review comments, restoring the "D" type to its
original state, and adding new "L" type. (Sorry, this was intended to
be included in the prior commit)
Differential Revision: https://reviews.llvm.org/D96044
Previously, the definition was so-marked, but the declaration was
not. This resulted in LLVM's dwarf emission treating the function as
being external, and incorrectly emitting DW_AT_external.
Differential Revision: https://reviews.llvm.org/D96044
Currently TypePrinter lumps anonymous classes and unnamed classes in one group "anonymous" this is not correct and can be confusing in some contexts.
Differential Revision: https://reviews.llvm.org/D96807
If a static assert has a message as the right side of an and condition, suggest a fix it of replacing the '&&' to ','.
`static_assert(cond && "Failed Cond")` -> `static_assert(cond, "Failed cond")`
This use case comes up when lazily replacing asserts with static asserts.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D89065
In current implementation of `deviceRTLs`, we're using some functions
that are CUDA version dependent (if CUDA_VERSION < 9, it is one; otheriwse, it
is another one). As a result, we have to compile one bitcode library for each
CUDA version supported. A worse problem is forward compatibility. If a new CUDA
version is released, we have to update CMake file as well.
CUDA 9.2 has been released for three years. Instead of using various weird tricks
to make `deviceRTLs` work with different CUDA versions and still have forward
compatibility, we can simply drop support for CUDA 9.1 or lower version. It has at
least two benifits:
- We don't need to generate bitcode libraries for each CUDA version;
- Clang driver doesn't need to search for the bitcode lib based on CUDA version.
We can claim that starting from LLVM 12, OpenMP offloading on NVPTX target requires
CUDA 9.2+.
Reviewed By: jdoerfert, JonChesterfield
Differential Revision: https://reviews.llvm.org/D97003
This change enables the builtin function declarations
in clang driver by default using the Tablegen solution
along with the implicit include of 'opencl-c-base.h'
header.
A new flag '-cl-no-stdinc' disabling all default
declarations and header includes is added. If any other
mechanisms were used to include the declarations (e.g.
with -Xclang -finclude-default-header) and the new default
approach is not sufficient the, `-cl-no-stdinc` flag has
to be used with clang to activate the old behavior.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D96515
This patch adds the following SHA3 Intrinsics:
vsha512hq_u64,
vsha512h2q_u64,
vsha512su0q_u64,
vsha512su1q_u64
veor3q_u8
veor3q_u16
veor3q_u32
veor3q_u64
veor3q_s8
veor3q_s16
veor3q_s32
veor3q_s64
vrax1q_u64
vxarq_u64
vbcaxq_u8
vbcaxq_u16
vbcaxq_u32
vbcaxq_u64
vbcaxq_s8
vbcaxq_s16
vbcaxq_s32
vbcaxq_s64
Note need to include +sha3 and +crypto when building from the front-end
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D96381
Removes `CrossTranslationUnitContext::getImportedFromSourceLocation`
Removes the corresponding unit-test segment.
Introduces the `CrossTranslationUnitContext::getMacroExpansionContextForSourceLocation`
which will return the macro expansion context for an imported TU. Also adds a
few implementation FIXME notes where applicable, since this feature is
not implemented yet. This fact is also noted as Doxygen comments.
Uplifts a few CTU LIT test to match the current **incomplete** behavior.
It is a regression to some extent since now we don't expand any
macros in imported TUs. At least we don't crash anymore.
Note that the introduced function is already covered by LIT tests.
Eg.: Analysis/plist-macros-with-expansion-ctu.c
Reviewed By: balazske, Szelethus
Differential Revision: https://reviews.llvm.org/D94673
Removes the obsolete ad-hoc macro expansions during bugreport constructions.
It will skip the macro expansion if the expansion happened in an imported TU.
Also removes the expected plist file, while expanding matching context for
the tests.
Adds a previously crashing `plist-macros-with-expansion.c` testfile.
Temporarily marks `plist-macros-with-expansion-ctu.c ` to `XFAIL`.
Reviewed By: xazax.hun, Szelethus
Differential Revision: https://reviews.llvm.org/D93224
This patch moves the creation of the '-Wspir-compat' argument from cc1 to the driver.
Without this change, generating command line arguments from `CompilerInvocation` cannot be done reliably: there's no way to distinguish whether '-Wspir-compat' was passed to cc1 on the command line (should be generated), or if it was created within `CompilerInvocation::CreateFromArgs` (should not be generated).
This is also in line with how other '-W' flags are handled.
(This was introduced in D21567.)
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D97041
`sm_35` is the minimum requirement for OpenMP offloading on NVPTX device.
Current driver test case is using `sm_20`. D97003 is going to switch the minimum
CUDA version to 9.2, which only supports `sm_30+`. This patch makes step for the
change.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D97120
When compiling UEFI applications, the main function is named
efi_main() instead of main(). Let's exclude efi_main() from
-Wmissing-prototypes as well to avoid warnings when working
on UEFI applications.
Differential Revision: https://reviews.llvm.org/D95746
When WPD is enabled, via WholeProgramVTables, emit type metadata for
available_externally vtables. Additionally, add the vtables to the
llvm.compiler.used global so that they are not prematurely eliminated
(before *LTO analysis).
This is needed to avoid devirtualizing calls to a function overriding a
class defined in a header file but with a strong definition in a shared
library. Without type metadata on the available_externally vtables from
the header, the WPD analysis never sees what a derived class is
overriding. Even if the available_externally base class functions are
pure virtual, because shared library definitions are already treated
conservatively (committed patches D91583, D96721, and D96722) we will
not devirtualize, which would be unsafe since the library might contain
overrides that aren't visible to the LTO unit.
An example is std::error_category, which is overridden in LLVM
and causing failures after a self build with WPD enabled, because
libstdc++ contains hidden overrides of the virtual base class methods.
Differential Revision: https://reviews.llvm.org/D96919
This enables use of MemorySSA instead of MemDep in MemCpyOpt. To
allow this without significant compile-time impact, the MemCpyOpt
pass is moved directly before DSE (in the cases where this was not
already the case), which allows us to reuse the existing MemorySSA
analysis.
Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt
can also perform simple optimizations across basic blocks.
Differential Revision: https://reviews.llvm.org/D94376
This enables AES fusion and the post RA scheduler for the Neoverse cores.
And while we are it also for the A55 that we had missed earlier.
Differential Revision: https://reviews.llvm.org/D96866
Add option -fgpu-sanitize to enable sanitizer for AMDGPU target.
Since it is experimental, it is off by default.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D96835
mode.
We use that mode when evaluating ICEs in C, and those shortcuts could
result in ICE evaluation producing the wrong answer, specifically if we
evaluate a statement-expression as part of evaluating the ICE.
Currently TypePrinter lumps anonymous classes and unnamed classes in one group "anonymous" this is not correct and can be confusing in some contexts.
Differential Revision: https://reviews.llvm.org/D96807
We currently always store absolute filenames in coverage mapping. This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines. We are also
duplicating the path prefix potentially wasting space.
This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.
Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.
Differential Revision: https://reviews.llvm.org/D95753
We currently always store absolute filenames in coverage mapping. This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines. We are also
duplicating the path prefix potentially wasting space.
This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.
Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.
Differential Revision: https://reviews.llvm.org/D95753
The recent commit 00a6254 "Stop traping on sNaN in builtin_isnan" changed the
lowering in constrained FP mode of builtin_isnan from an FP comparison to
integer operations to avoid trapping.
SystemZ has a special instruction "Test Data Class" which is the preferred
way to do this check. This patch adds a new target hook "testFPKind()" that
lets SystemZ emit the s390_tdc intrinsic instead.
testFPKind() takes the BuiltinID as an argument and is expected to soon
handle more opcodes than just 'builtin_isnan'.
Review: Thomas Preud'homme, Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D96568
would otherwise include template specialization types
This helps reduce the size of the encoded C++ type strings in the binary.
This is enabled by default only on Darwin, but can be enabled/disabled
via command line options.
rdar://63288571
Differential Revision: https://reviews.llvm.org/D96816
The following commits added commandline arguments to control following the Arm
Procedure Call Standard for certain volatile bitfield operations:
- https://reviews.llvm.org/D67399
- https://reviews.llvm.org/D72932
This commit fixes the oversight that these args weren't passed from the driver
to cc1 if appropriate.
Where *appropriate* means:
- `-faapcs-bitfield-width`: is the default, so won't be passed
- `-fno-aapcs-bitfield-width`: should be passed
- `-faapcs-bitfield-load`: should be passed
Differential Revision: https://reviews.llvm.org/D96784
Added -mrop-protection for Power PC to turn on codegen that provides some
protection from ROP attacks.
The option is off by default and can be turned on for Power 8, Power 9 and
Power 10.
This patch is for the option only. The feature will be implemented by a later
patch.
Reviewed By: amyk
Differential Revision: https://reviews.llvm.org/D96512
This fixes an issue when "-gdwarf-N" switch was ignored if it was given
before another debug option.
Differential Revision: https://reviews.llvm.org/D96865
Add the types for the RISC-V V extension builtins.
These types will be used by the RISC-V V intrinsics which require
types of the form <vscale x 1 x i64>(LMUL=1 element size=64) or
<vscale x 4 x i32>(LMUL=2 element size=32), etc. The vector_size
attribute does not work for us as it doesn't create a scalable
vector type. We want these types to be opaque and have no operators
defined for them. We want them to be sizeless. This makes them
similar to the ARM SVE builtin types. But we will have quite a bit
more types. This patch adds around 60. Later patches will add
another 230 or so types representing tuples of these types similar
to the x2/x3/x4 types in ARM SVE. But with extra complexity that
these types are combined with the LMUL concept that is unique to
RISCV.
For more background see this RFC
http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html
Authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D92715
This matches the platform default for GCC. It primarily matters when the
integrated assembler is not used as there is no default CPU defined for
ARMv7-A and GNU as is upset with -mcpu=generic.
The new spec does not have `exnref` so EH does not have dependency of
the reference types proposal anymore.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D96903
Drop the `Separate` form of `-fmodule-name X`, `-fprofile-remapping-file X`, and `-frewrite-map-file X`.
To the best of my knowledge they are not used. Their conventional Joined forms (`-fFOO=`) should be used instead.
`-fdebug-compilation-dir X` is used in several places, e.g. chromium/infra/goma.
It is also advertised in http://blog.llvm.org/2019/11/deterministic-builds-with-clang-and-lld.html
So we keep it but make the EQ form canonical and the Separate form an alias.
Differential Revision: https://reviews.llvm.org/D96886
Basic block sections enables function sections implicitly, this is not needed
and is inefficient with "=list" option.
We had basic block sections enable function sections implicitly in clang. This
is particularly inefficient with "=list" option as it places functions that do
not have any basic block sections in separate sections. This causes unnecessary
object file overhead for large applications.
This patch disables this implicit behavior. It only creates function sections
for those functions that require basic block sections.
This patch is the second of two patches and this patch removes the implicit
enabling of function sections with basic block sections in clang.
Differential Revision: https://reviews.llvm.org/D93876
Add enum and typedef argument support to `-fdeclare-opencl-builtins`,
which was the last major missing feature.
Adding the remaining missing builtins is left as future work.
Differential Revision: https://reviews.llvm.org/D96051
The option was added in D90507 for C/C++ source files. This patch adds
support for assembly files.
Differential Revision: https://reviews.llvm.org/D96783
This allows the option to affect the LTO output. Module::Max helps to
generate debug info for all modules in the same format.
Differential Revision: https://reviews.llvm.org/D96597
This change affects 'SemaOpenCLCXX/newdelete.cl' test,
thus the patch contains adjustments in types validation of
operators new and delete
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D96178
Displaying the problem range could crash if the begin and end of a
range is in different files or macros. After the change such range
is displayed only as the beginning location.
There is a bug for this problem:
https://bugs.llvm.org/show_bug.cgi?id=46540
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D95860
OpenMP 5.0 removed a lot of restriction for overlapped mapped items
comparing to OpenMP 4.5. Patch restricts the checks for overlapped data
mappings only for OpenMP 4.5 and less and reorders mapping of the
arguments so, that present and alloc mappings are processed first and
then all others.
Differential Revision: https://reviews.llvm.org/D86119
Implement all of P1825R0:
- implicitly movable entity can be an rvalue reference to non-volatile
automatic object.
- operand of throw-expression can be a function or catch-clause parameter
(support for function parameter has already been implemented).
- in the first overload resolution, the selected function no need to be
a constructor.
- in the first overload resolution, the first parameter of the selected
function no need to be an rvalue reference to the object's type.
This patch also removes the diagnostic `-Wreturn-std-move-in-c++11`.
Differential Revision: https://reviews.llvm.org/D88220
The tile directive is in OpenMP's Technical Report 8 and foreseeably will be part of the upcoming OpenMP 5.1 standard.
This implementation is based on an AST transformation providing a de-sugared loop nest. This makes it simple to forward the de-sugared transformation to loop associated directives taking the tiled loops. In contrast to other loop associated directives, the OMPTileDirective does not use CapturedStmts. Letting loop associated directives consume loops from different capture context would be difficult.
A significant amount of code generation logic is taking place in the Sema class. Eventually, I would prefer if these would move into the CodeGen component such that we could make use of the OpenMPIRBuilder, together with flang. Only expressions converting between the language's iteration variable and the logical iteration space need to take place in the semantic analyzer: Getting the of iterations (e.g. the overload resolution of `std::distance`) and converting the logical iteration number to the iteration variable (e.g. overload resolution of `iteration + .omp.iv`). In clang, only CXXForRangeStmt is also represented by its de-sugared components. However, OpenMP loop are not defined as syntatic sugar. Starting with an AST-based approach allows us to gradually move generated AST statements into CodeGen, instead all at once.
I would also like to refactor `checkOpenMPLoop` into its functionalities in a follow-up. In this patch it is used twice. Once for checking proper nesting and emitting diagnostics, and additionally for deriving the logical iteration space per-loop (instead of for the loop nest).
Differential Revision: https://reviews.llvm.org/D76342
This takes advantage of the implicit default behavior to reduce the number of
attributes, which in turns reduces compilation time. I've observed -3% in
instruction count when compiling sqlite3 amalgamation with -O0
Differential Revision: https://reviews.llvm.org/D96400
This patch generates the `-f[no-]finite-loops` arguments from `CompilerInvocation` (added in D96419), fixing test failures of Clang built with `-DCLANG_ROUND_TRIP_CC1_ARGS=ON`.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D96761
Even code in target and declare target regions might not be emitted.
With this patch we delay more diagnostics and use laziness and linkage
to determine if a function is emitted (for the device). Note that we
still eagerly emit diagnostics for target regions, unfortunately, see
the TODO for the reason.
This hopefully fixes PR48933.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D95928
Type errors in function declarations were not (always) diagnosed prior
to this patch. Furthermore, certain remarks did not get associated
properly which caused them to be emitted multiple times.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D95912
This commit fixes bug #48739. The bug was caused by the way static_casts
on pointer-to-member caused the CXXBaseSpecifier list of a
MemberToPointer to grow instead of shrink.
The list is now grown by implicit casts and corresponding entries are
removed by static_casts. No-op static_casts cause no effect.
Reviewed By: vsavchenko
Differential Revision: https://reviews.llvm.org/D95877
This is a follow up of D92940.
We have successfully converted fadd/fmul _mm_reduce_* intrinsics to
llvm.reduction + reassoc flag. We can do the same approach for fmin/fmax
too, i.e. llvm.reduction + nnan flag.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D93179
This patch ensures that vector predication and vectorization width
pragmas work together correctly/as expected. Specifically, this patch
fixes the issue that when vectorization_width > 1, the vector
predication behaviour (this would matter if it has NOT been disabled
explicitly by a pragma) was getting ignored, which was incorrect.
The fix here removes the dependence of vector predication on the
vectorization width. The loop metadata corresponding to clang loop
pragma vectorize_predicate is always emitted, if the pragma is
specified, even if vectorization is disabled by vectorize_width(1)
or vectorize(disable) since the option is also used for interleaving
by the LoopVectorize pass.
Reviewed By: dmgreen, Meinersbur
Differential Revision: https://reviews.llvm.org/D94779
vec_xl() and vec_xst() should not emit alignment hints since they take a
scalar pointer and also add a byte offset if passed.
This patch uses memcpy to achieve the desired result.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D96471
This patch adds 2 new options to control when Clang adds `mustprogress`:
1. -ffinite-loops: assume all loops are finite; mustprogress is added
to all loops, regardless of the selected language standard.
2. -fno-finite-loops: assume no loop is finite; mustprogress is not
added to any loop or function. We could add mustprogress to
functions without loops, but we would have to detect that in Clang,
which is probably not worth it.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D96419
class types.
The goal is to provide a way to bypass constructor homing when emitting
class definitions and force class definitions in the debug info.
Not sure about the wording of the attribute, or whether it should be
specific to classes with constructors
explicitly emitting retainRV or claimRV calls in the IR
Background:
This fixes a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.attachedcall" to calls,
which indicates the call is implicitly followed by a marker
instruction and an implicit retainRV/claimRV call that consumes the
call result. In addition, it emits a call to
@llvm.objc.clang.arc.noop.use, which consumes the call result, to
prevent the middle-end passes from changing the return type of the
called function. This is currently done only when the target is arm64
and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
claimRV is attached to the call since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since the ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if retainRV is attached to the call and
does nothing if claimRV is attached to it.
- SCCP refrains from replacing the return value of a call with a
constant value if the call has the operand bundle. This ensures the
call always has at least one user (the call to
@llvm.objc.clang.arc.noop.use).
- This patch also fixes a bug in replaceUsesOfNonProtoConstant where
multiple operand bundles of the same kind were being added to a call.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
This patch uses the existing logic of CUDA for searching libomptarget
and extracts it to a common method.
Reviewed By: JonChesterfield, tianshilei1992
Differential Revision: https://reviews.llvm.org/D96248
EarlyCSEPass called after msan redices code size by about 10%.
Similar optimization exists for legacy pass manager in
addGeneralOptsForMemorySanitizer.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D96406
The ability to specify alignment was recently added, and it's an
important property which we should ensure is set as expected by
Clang. (Especially before making further changes to Clang's code in
this area.) But, because it's on the end of the lines, the existing
tests all ignore it.
Therefore, update all the tests to also verify the expected alignment
for atomicrmw and cmpxchg. While I was in there, I also updated uses
of 'load atomic' and 'store atomic', and added the memory ordering,
where that was missing.
In D91442, @MaskRay commented about a failure. This commit does the following to
address his comments:
1. Replace %T with %t as former is deprecated.
2. Add an explicit --sysroot argument in a test.
Some tests were failing when gcc-10-riscv64-linux-gnu is installed on test machine.
This was happening because the test was checking a case when --gcc-toolchain is not
provided. But if --sysroot was also not provided then code could pick a toolchain
installed in /usr. So to make the test more robust, I have provided an explicit --sysroot
argument. Its value has been chosen to match the existing patterns.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D93023
This change removes the XFAIL from the original test and duplicates the test into sanitize-coverage-old-pm.c
which uses the old pass manager and has the corresponding XFAIL.
This should fix the XPASS from this and similar runs:
http://lab.llvm.org:8011/#/builders/60/builds/1875
Multi-configuration generators (such as Visual Studio and Xcode) allow the specification of a build flavor at build time instead of config time, so the lit configuration files need to support that - and they do for the most part. There are several places that had one of two issues (or both!):
1) Paths had %(build_mode)s set up, but then not configured, resulting in values that would not work correctly e.g. D:/llvm-build/%(build_mode)s/bin/dsymutil.exe
2) Paths did not have %(build_mode)s set up, but instead contained $(Configuration) (which is the value for Visual Studio at configuration time, for Xcode they would have had the equivalent) e.g. "D:/llvm-build/$(Configuration)/lib".
This seems to indicate that we still have a lot of fragility in the configurations, but also that a number of these paths are never used (at least on Windows) since the errors appear to have been there a while.
This patch fixes the configurations and it has been tested with Ninja and Visual Studio to generate the correct paths. We should consider removing some of these settings altogether.
Reviewed By: JDevlieghere, mehdi_amini
Differential Revision: https://reviews.llvm.org/D96427
With https://reviews.llvm.org/D63376, we began storing the APValue
directly into the ConstantExpr object so that we could reuse the
calculated value later. However, it missed a case when not in C++11
mode but the expression is known to be constant.
Before this commit, expression statements could not be annotated
with statement attributes. Whenever parser found attribute, it
unconditionally assumed that it was followed by a declaration.
This not only doesn't allow expression attributes to have attributes,
but also produces spurious error diagnostics.
In order to maintain all previously compiled code, we still assume
that GNU attributes are followed by declarations unless ALL of those
are statement attributes. And even in this case we are not forcing
the parser to think that it should parse a statement, but rather
let it proceed as if no attributes were found.
Differential Revision: https://reviews.llvm.org/D93630
The swift_bridge attribute warns when the attribute is applied multiple
times to the same declaration. However, it warns about the arguments
being different to the attribute without ever checking if the arguments
actually are different. If the arguments are different, diagnose,
otherwise silently accept the code. Either way, drop the duplicated
attribute.
Today, inside a template, you can get completion for:
Foo<T> t;
t.^
t has dependent type Foo<T>, and we use the primary template to find its members.
However we also want this to work:
t.foo.bar().^
The type of t.foo.bar() is DependentTy, so we attempt to resolve using similar
heuristics (e.g. primary template).
Differential Revision: https://reviews.llvm.org/D96376
Add the builtin functions brought by the
cl_khr_subgroup_extended_types extension to
`-fdeclare-opencl-builtins`.
Differential Revision: https://reviews.llvm.org/D96279
After D93264, using both -fdebug-info-for-profiling and
-fpseudo-probe-for-profiling will cause the compiler to crash.
Diagnose these conflicting options in the driver.
Also, the existing CodeGen test was using the driver when it should be
running cc1.
Differential Revision: https://reviews.llvm.org/D96354
Not all platforms accept -stdlib or -rtlib. Instead of complaining about
the wrong argument to these options, clang complains about the option
itself being present.
Pass an appropriate -target to the clang invocations.
Add the builtin functions brought by the
cl_khr_subgroup_non_uniform_arithmetic extension to
`-fdeclare-opencl-builtins`.
Differential Revision: https://reviews.llvm.org/D95951
This reverts commit 3500cc8d89.
This old commit was made over a completely false premise. OSSymbols
aren't different from other OSObjects and we shouldn't treat them
differently for the purposes of static analysis.
Since ToolChain::GetCXXStdlibType() is a simple getter that might emit
the "invalid library name in argument" warning, it can conceivably be
called several times while initializing the build pipeline.
Before this patch, a simple 'clang++ -stdlib=foo ./test.cpp' would print
the warning twice, -rt=lib=foo would print 6 times.
Change this and always only print the warning once. Keep the rest of the
semantics of the functions.
Differential Revision: https://reviews.llvm.org/D95915
Signed prefix is removed and the single word spelling is
printed for the scalar types.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D96161
Intrinsics *reduce_add/mul_ps/pd have assumption that the elements in
the vector are reassociable. So we need to always assign the reassoc
flag when we call _mm_reduce_* intrinsics.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D96231
It is very common to check callbacks and completion handlers for null.
This patch supports such checks using built-in functions:
* __builtin_expect
* __builtin_expect_with_probablity
* __builtin_unpredictable
rdar://73455388
Differential Revision: https://reviews.llvm.org/D96268
This patch added a distinct CUID for each input file, which is represented by InputAction.
clang initially creates an InputAction for each input file for the host compilation. In CUDA/HIP action
builder, each InputAction is given a CUID and cloned for each GPU arch, and the CUID is also cloned. In this way,
we guarantee the corresponding device and host compilation for the same file shared the
same CUID. On the other hand, different compilation units have different CUID.
-fuse-cuid=random|hash|none is added to control the method to generate CUID. The default
is hash. -cuid=X is also added to specify CUID explicitly, which overrides -fuse-cuid.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D95007
during the same evaluation.
It looks like the only case for which this matters is determining
whether mutable subobjects of a heap allocation can be modified during
constant evaluation.
variable's destruction if it didn't do so during construction.
The standard doesn't give any guidance as to what to do here, but this
approach seems reasonable and conservative, and has been proposed to the
standard committee.
A module with errors would be marked as out-of-date, then the `compilerModule` action would produce it, but due to the error it would be treated as failure and the resulting PCM would not get used.
rdar://74087062
Differential Revision: https://reviews.llvm.org/D96246
Currently -fgpu-rdc is not passed to host clang -cc1.
This causes issue because -fgpu-rdc affects shadow
variable linkage in host compilation.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D96105
Link: https://lists.llvm.org/pipermail/llvm-dev/2020-October/146162.html "[RFC] FileCheck: (dis)allowing unused prefixes"
If a downstream project using lit needs time for transition,
add the following to `lit.local.cfg`:
```
from lit.llvm.subst import ToolSubst
fc = ToolSubst('FileCheck', unresolved='fatal')
config.substitutions.insert(0, (fc.regex, 'FileCheck --allow-unused-prefixes'))
```
Differential Revision: https://reviews.llvm.org/D95849
As Itanium ABI[http://itanium-cxx-abi.github.io/cxx-abi/abi.html#once-ctor]
points out:
"The size of the guard variable is 64 bits. The first byte (i.e. the byte at
the address of the full variable) shall contain the value 0 prior to
initialization of the associated variable, and 1 after initialization is complete."
Differential Revision: https://reviews.llvm.org/D95822
Pipe element type spelling for arg info metadata
should follow the same behavior as normal type spelling.
We should only use the canonical type spelling in the
base type field.
This patch also removed duplication in type handling.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D96151
This reverts commit 6039f821 and reapplies bff6d9bb.
Clang's Index/implicit-attrs.m test invokes c-index-test with -fobjc-arc. This flag is not compatible with -fobjc-runtime=gcc, which gets implied on Linux.
The original commit uncovered this by correctly reporting issues when parsing -cc1 command line.
This commit fixes the test to explicitly provide ObjectiveC runtime compatible with ARC.
`CompilerInvocation::CreateFromArgs` doesn't always report command line parsing failures through the return value. Sometimes, errors are only reported via diagnostics.
Some clients like `c-index-test` only check the return value and don't check the state of `DiagnosticsEngine`.
If we were to start returning the correct return value from `CreateFromArgs`, this index test starts to fail, because it specifies `-std=c++11` for a C input, which is invalid.
This patch fixes that issue by adding forgotten `-x c++` argument.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D95879
Currently the emscripten frontend driver injects this when building
with thread support. Moving this into the clang driver itself makes
the emscripten python driver less magical.
Differential Revision: https://reviews.llvm.org/D96171
When a function or a file is excluded using -fprofile-list= option,
don't emit coverage mapping as doing so confuses users since those
functions would always have zero count. This also reduces the binary
size considerably in cases where only a few functions or files are
being instrumented.
Differential Revision: https://reviews.llvm.org/D96000
For -fgpu-rdc, shadow variables should not be internalized, otherwise
they cannot be accessed by other TUs. This is necessary because
the shadow variable of external device variables are always
emitted as undefined symbols, which need to resolve to a global
symbols.
Managed variables need to be emitted as undefined symbols
in device compilations.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D95901
__builtin_isnan currently generates a floating-point compare operation
which triggers a trap when faced with a signaling NaN in StrictFP mode.
This commit uses integer operations instead to not generate any trap in
such a case.
Reviewed By: kpn
Differential Revision: https://reviews.llvm.org/D95948
- The failures are all cc1-based tests due to the missing `-aux-triple` options,
which is always prepared by the driver in CUDA/HIP compilation.
- Add extra check on the missing aux-targetinfo to prevent crashing.
[hip][cuda] Enable extended lambda support on Windows.
- On Windows, extended lambda has extra issues due to the numbering
schemes are different between the host compilation (Microsoft C++ ABI)
and the device compilation (Itanium C++ ABI. Additional device side
lambda number is required per lambda for the host compilation to
correctly mangle the device-side lambda name.
- A hybrid numbering context `MSHIPNumberingContext` is introduced to
number a lambda for both host- and device-compilations.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D69322
This reverts commit 4874ff0241.
This patch adds possibility to define OpenCL C 3.0 feature macros
via command line option or target setting.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D95776
emitting retainRV or claimRV calls in the IR
This reapplies 3fe3946d9a without the
changes made to lib/IR/AutoUpgrade.cpp, which was violating layering.
Original commit message:
Background:
This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.rv" to calls, which
indicates the call is implicitly followed by a marker instruction and
an implicit retainRV/claimRV call that consumes the call result. In
addition, it emits a call to @llvm.objc.clang.arc.noop.use, which
consumes the call result, to prevent the middle-end passes from changing
the return type of the called function. This is currently done only when
the target is arm64 and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
the call is annotated with claimRV since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if the implicit call is a call to
retainRV and does nothing if it's a call to claimRV.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls annotated with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
emitting retainRV or claimRV calls in the IR
Background:
This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.rv" to calls, which
indicates the call is implicitly followed by a marker instruction and
an implicit retainRV/claimRV call that consumes the call result. In
addition, it emits a call to @llvm.objc.clang.arc.noop.use, which
consumes the call result, to prevent the middle-end passes from changing
the return type of the called function. This is currently done only when
the target is arm64 and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
the call is annotated with claimRV since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if the implicit call is a call to
retainRV and does nothing if it's a call to claimRV.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls annotated with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
Commit 6bf29dbb enables float128 feature by default for Power9 targets.
But float128 may cause build failure in libcxx testing. Revert this
commit first to unblock LLVM 12 release.
The attribute definition claimed the attribute was inheritable (which
only applies to declaration attributes) and not a statement attribute.
Further, it treats subject appertainment errors as being parse errors
rather than semantic errors, which leads to us accepting invalid code.
For instance, we currently fail to reject:
void foo() {
int i = 1000;
__attribute__((nomerge, opencl_unroll_hint(8)))
if (i) { foo(); }
}
This addresses the issues by clarifying that opencl_unroll_hint is a
statement attribute and handles its appertainment checks in the
semantic layer instead of the parsing layer. This changes the output of
the diagnostic text to be more consistent with other appertainment
errors.
When the -matomics feature is not enabled, disable POSIXThreads
mode and set the thread model to Single, so that we don't predefine
macros like `__STDCPP_THREADS__`.
Differential Revision: https://reviews.llvm.org/D96091
Defer constant checking of dependent initializer to template instantiation
since it cannot be done for dependent values.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D95840
explicitly qualified as members of the current instantiation.
Despite the nested name specifier being fully-dependent in this case,
the elaborated type might only be instantiation-dependent, because the
type is a member of the current instantiation.
This fixes Bugzilla #48894 for Arm, where it
was reported that -Wa,-march was not being handled
by the integrated assembler.
This was previously fixed for -Wa,-mthumb by
parsing the argument in ToolChain::ComputeLLVMTriple
instead of CollectArgsForIntegratedAssembler.
It has to be done in the former because the Triple
is read only by the time we get to the latter.
Previously only mcpu would work via -Wa but only because
"-target-cpu" is it's own option to cc1, which we were
able to modify. Target architecture is part of "-target-triple".
This change applies the same workaround to -march and cleans up
handling of -Wa,-mcpu at the same time. There were some
places where we were not using the last instance of an argument.
The existing -Wa,-mthumb code was doing this correctly,
so I've just added tests to confirm that.
Now the same rules will apply to -Wa,-march/-mcpu as would
if you just passed them to the compiler:
* -Wa/-Xassembler options only apply to assembly files.
* Architecture derived from mcpu beats any march options.
* When there are multiple mcpu or multiple march, the last
one wins.
* If there is a compiler option and an assembler option of
the same type, we prefer the one that fits the input type.
* If there is an applicable mcpu option but it is overruled
by an march, the cpu value is still used for the "-target-cpu"
cc1 option.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D95872
When deducing a reference type for forwarding references prevent
adding default address space of a template argument if it is given.
This got reported in PR48896 because in OpenCL all parameters are
in private address space and therefore when we initialize a
forwarding reference with a parameter we should just inherit the
address space from it i.e. keep __private instead of __generic.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D95624
This patch implements generation of remaining header search arguments.
It's done manually in C++ as opposed to TableGen, because we need the flexibility and don't anticipate reuse.
This patch also tests the generation of header search options via a round-trip. This way, the code gets exercised whenever Clang is built and tested in asserts mode. All `check-clang` tests pass.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D94472
doubly-nested implicit CXXConstructExprs.
Ensure that we transform the parameter initializer using
TransformInitializer rather than TransformExpr so that we properly strip
down and rebuild the initialization, including any necessary
CXXBindTemporaryExprs. Otherwise we can end up forgetting to destroy
temporary objects used to construct a constructor parameter.
- On Windows, extended lambda has extra issues due to the numbering
schemes are different between the host compilation (Microsoft C++ ABI)
and the device compilation (Itanium C++ ABI. Additional device side
lambda number is required per lambda for the host compilation to
correctly mangle the device-side lambda name.
- A hybrid numbering context `MSHIPNumberingContext` is introduced to
number a lambda for both host- and device-compilations.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D69322
A module in the cache with an error should just be a cache miss. If
allowing errors (with -fallow-pcm-with-compiler-errors), a rebuild is
needed so that the appropriate diagnostics are output and in case search
paths have changed. If not allowing errors, the module was built
*allowing* errors and thus should be rebuilt regardless.
Reviewed By: akyrtzi
Differential Revision: https://reviews.llvm.org/D95989
when rewriting 'a < b' as '(a <=> b) < 0'.
It's pretty common for comparison category types to use a pointer or
pointer-to-member type as their '0' parameter.
This is a corner of the differences between C99 designators and C++20
designators that we'd previously overlooked. As with other such cases,
this continues to be permitted as an extension and allowed by default,
behind the -Wc99-designators warning flag, except in cases where it
leads to a conformance difference (such as in overload resolution and in
a SFINAE context).
Clang usually propagates counter mapping region for conditions of `if`, `while`,
`for`, etc from parent counter. We should do the same for condition of conditional operator.
Differential Revision: https://reviews.llvm.org/D95918
Currently clang is not correctly retrieving from the AST the metadata for
constrained FP builtins. This patch fixes that for the X86 specific builtins.
Differential Revision: https://reviews.llvm.org/D94614
On z/OS, other error messages are not matched correctly in lit tests.
```
EDC5121I Invalid argument.
EDC5111I Permission denied.
```
This patch adds a lit substitution to fix it.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D95808
Prevent materializing temporaries in the address space of the references
they are bind to. The temporaries should always be in the same address
space - private for OpenCL.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D95608
Add the builtin functions brought by the cl_khr_subgroup_ballot
extension to `-fdeclare-opencl-builtins`.
Also add placeholder comments for the other Extended Subgroup
Functions from the OpenCL Extension Specification.
Add a comment clarifying the scope of the test.
Differential Revision: https://reviews.llvm.org/D95523
This patch adds AMDGPUOpenMPToolChain for supporting OpenMP
offloading to AMD GPU's.
Originally authored by Greg Rodgers
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D94961
Sample re-annotation is required in LTO time to achieve a reasonable post-inline profile quality. However, we have seen that such LTO-time re-annotation degrades profile quality. This is mainly caused by preLTO code duplication that is done by passes such as loop unrolling, jump threading, indirect call promotion etc, where samples corresponding to a source location are aggregated multiple times due to the duplicates. In this change we are introducing a concept of distribution factor for pseudo probes so that samples can be distributed for duplicated probes scaled by a factor. We hope that optimizations duplicating code well-maintain the branch frequency information (BFI) based on which probe distribution factors are calculated. Distribution factors are updated at the end of preLTO pipeline to reflect an estimated portion of the real execution count.
This change also introduces a pseudo probe verifier that can be run after each IR passes to detect duplicated pseudo probes.
A saturated distribution factor stands for 1.0. A pesudo probe will carry a factor with the value ranged from 0.0 to 1.0. A 64-bit integral distribution factor field that represents [0.0, 1.0] is associated to each block probe. Unfortunately this cannot be done for callsite probes due to the size limitation of a 32-bit Dwarf discriminator. A 7-bit distribution factor is used instead.
Changes are also needed to the sample profile inliner to deal with prorated callsite counts. Call sites duplicated by PreLTO passes, when later on inlined in LTO time, should have the callees’s probe prorated based on the Prelink-computed distribution factors. The distribution factors should also be taken into account when computing hotness for inline candidates. Also, Indirect call promotion results in multiple callisites. The original samples should be distributed across them. This is fixed by adjusting the callisites' distribution factors.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D93264
The number of iterations calculation was failing in some cases with more
than two collpased loops. Now the LoopIterationSpace selected matches
InitDependOnLC and CondDependOnLC.
Differential Revision: https://reviews.llvm.org/D95834
Restrict use of references to functions as they can
result in non-conforming behavior.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D95442
Until now, the `-fdeclare-opencl-builtins` option behaved differently
compared to inclusion of `opencl-c.h`: builtins that are part of an
extension were only available if the extension was enabled using the
corresponding pragma.
Builtins that belong to an extension are guarded using a preprocessor
macro (that is named after the extension) in `opencl-c.h`. Align the
behaviour of `-fdeclare-opencl-builtins` with this.
Co-authored-by: Anastasia Stulova
Differential Revision: https://reviews.llvm.org/D95616
Normally, Clang will not make dllimport functions available for inlining
if they reference non-imported symbols, as this can lead to confusing
link errors. But if the function is marked always_inline, the user
presumably knows what they're doing and the attribute should be honored.
Differential revision: https://reviews.llvm.org/D95673
Previously file entries in the -ivfsoverlay yaml could map to a file in the
external file system, but directories had to list their contents in the form of
other file entries or directories. Allowing directory entries to map to a
directory in the external file system makes it possible to present an external
directory's contents in a different location and (in combination with the
'fallthrough' option) overlay one directory's contents on top of another.
rdar://problem/72485443
Differential Revision: https://reviews.llvm.org/D94844
Generate outline atomics if compiling for armv8-a non-LSE AArch64 Linux
(including Android) targets to use LSE instructions, if they are available,
at runtime. Library support is checked by clang driver which doesn't enable
outline atomics if no proper libraries (libgcc >= 9.3.1 or compiler-rt) found.
Differential Revision: https://reviews.llvm.org/D93585
clang-cl already defaults to C17 for .c files, but no harm
in accepting these flags. Fixes PR48185.
Differential Revision: https://reviews.llvm.org/D95575
On non-Windows platforms, --sysroot can be used to make the compiler use
a single, hermetic directory for all header and library files.
This is useful, but difficult to do on Windows. After D95472 it's
possible to achieve this with two flags:
out/gn/bin/clang-cl win.c -fuse-ld=lld \
/vctoolsdir path/to/VC/Tools/MSVC/14.26.28801 \
/winsdkdir path/to/win_sdk
But that's still cumbersome: It requires two flags instead of one, and
it requires writing down the (changing) VC/Tools/MSVC version.
This adds a new `/winsysroot <dir>` flag that's effectively an alias to
these two flags. With this, building against a hermetic Windows
toolchain only needs:
out/gn/bin/clang-cl win.c -fuse-ld=lld /winsysroot path
`/winsysroot <dir>` is the same as adding
/vctoolsdir <dir>/VC/Tools/MSVC/<vctoolsver>
/winsdkdir <dir>/Windows Kits/<winsdkmajorversion>
`<vctoolsver>` is taken from `/vctoolsversion` if passed, or else it's
the name of the directory in `<dir>/VC/Tools/MSVC` that's the highest
numeric tuple.
`<winsdkmajorversion>` is the major version in /winsdkversion if passed,
else it's the name of the directory in `<dir>/Windows Kits` that's the
highest number.
So `/winsysroot <path>` requires this subfolder structure:
path/
VC/
Tools/
MSVC/
14.26.28801 (or another number)
include/
...
Windows Kits/
10/
Include/
10.0.19041.0/ (or another number)
um/
...
Lib/
10.0.19041.0/ (or another number)
um/
x64/
...
...
Differential Revision: https://reviews.llvm.org/D95534
On z/OS, the following error message is not matched correctly in lit tests.
```
EDC5129I No such file or directory.
```
This patch uses a lit config substitution to check for platform specific error messages.
Reviewed By: muiez, jhenderson
Differential Revision: https://reviews.llvm.org/D95246
Clang test Driver/macos-apple-silicon-slice-link-libs-darwin-only.cpp
assumes the target is darwin when the host is darwin which is not
necessarily the case, causing the test to fail when it is not. This
commit adds a -triple argument to the clang invocation to ensure the
target is darwin.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D94396
with fix to test case and stringrefs.
Currently (for codeview) lambdas have a string like `<lambda_0>` in
their mangled name, and don't have any display name. This change uses the
`<lambda_0>` as the display name, which helps distinguish between lambdas
in -gline-tables-only, since there are no linkage names there.
It also changes how we display lambda names; previously we used
`<unnamed-tag>`; now it will show `<lambda_0>`.
I added a function to the mangling context code to create this string;
for Itanium it just returns an empty string.
Bug: https://bugs.llvm.org/show_bug.cgi?id=48432
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D95187
This reverts 9b21d4b943
Currently (for codeview) lambdas have a string like `<lambda_0>` in
their mangled name, and don't have any display name. This change uses the
`<lambda_0>` as the display name, which helps distinguish between lambdas
in -gline-tables-only, since there are no linkage names there.
It also changes how we display lambda names; previously we used
`<unnamed-tag>`; now it will show `<lambda_0>`.
I added a function to the mangling context code to create this string;
for Itanium it just returns an empty string.
Bug: https://bugs.llvm.org/show_bug.cgi?id=48432
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D95187
The only test that needed change had 'QUAL' as an unused prefix. The
rest of the changes are to simplify the prefix lists.
Differential Revision: https://reviews.llvm.org/D95499
If an initial value is given for a bitfield that does not fit in the
bitfield, the value should be truncated. Constant folding for
expressions did not account for this truncation in the case of union
member functions, despite a warning being emitted. In some contexts,
evaluation of expressions was not enabled unless C++11, ROPI or RWPI
was enabled.
Differential Revision: https://reviews.llvm.org/D93101
The Clang enable_if extension is mangled as an <extended-qualifier>,
which is supposed to contain <template-args>. However, we were
unconditionally emitting X/E around its arguments, neglecting the fact
that <expr-primary> should be emitted directly without the surrounding
X/E.
Differential Revision: https://reviews.llvm.org/D95488
Previously, we were emitting an extraneous X .. E in <template-arg>
around an <expr-primary> if the template argument was constructed from
an expression (rather than an already-evaluated literal value). In
such a case, we would then e.g. emit 'XLi0EE' instead of 'Li0E'.
We had one special-case for DeclRefExpr expressions, in particular, to
omit them the mangled-name without the surrounding X/E. However,
unfortunately, that special case also triggered for ParmVarDecl (a
subtype of VarDecl), and _incorrectly_ emitted 'L_Z .. E' instead of
the proper 'Xfp_E'.
This change causes mangleExpression itself to be responsible for
emitting X/E around non-primary expressions, which removes the
special-case, and corrects both these problems.
Differential Revision: https://reviews.llvm.org/D95487
The two operations have acted differently since Clang 8, but were
unfortunately mangled the same. The new mangling uses new "vendor
extended expression" syntax proposed in
https://github.com/itanium-cxx-abi/cxx-abi/issues/112
GCC had the same mangling problem, https://gcc.gnu.org/PR88115, and
will hopefully be switching to the same mangling as implemented here.
Additionally, fix the mangling of `__uuidof` to use the new extension
syntax, instead of its previous nonstandard special-case.
Adjusts the demangler accordingly.
Differential Revision: https://reviews.llvm.org/D93922
More study has discovered this to not actually be useful: because
current C++20 implementations reject `#ifdef __VA_OPT__`, this can't
really be used as a feature-test mechanism. And it's not too hard to
detect __VA_OPT__ without this, for example:
#define THIRD_ARG(a, b, c, ...) c
#define HAS_VA_OPT(...) THIRD_ARG(__VA_OPT__(,), 1, 0, )
#if HAS_VA_OPT(?)
Partially reverts 0436ec2128.
These changes are intended to give code a path to move away from the GNU
,##__VA_ARGS__ extension, which is non-conforming in some situations and
which we'd like to disable in our conforming mode in those cases.
In Clang today, we parse the different attribute syntaxes
(__attribute__, __declspec, and [[]]) in a fairly rigid order. This
leads to confusion for users when they guess the order incorrectly,
and leads to bug reports like PR24559 or necessitates changes like
D94788.
This patch adds a helper function to allow us to more easily parse
attributes in arbitrary order, and then updates all of the places
where we would parse two or more different syntaxes in a rigid order to
use the helper method. The patch does not attempt to handle Microsoft
attributes ([]) because those are ambiguous with other code constructs
and we don't have any attributes that use the syntax.
This reverts commit f4537935dc.
This reverts commit b43c26d036.
This GNU and MSVC extension turns out to be very popular. Most projects
are not using C++20, so cannot use the new __VA_OPT__ feature to be
standards conformant. The other workaround, using -std=gnu*, enables too
many language extensions and isn't viable.
Until there is a way for users to get the behavior provided by the
`, ## __VA_ARGS__` extension in the -std=c++17 and earlier language
modes, we need to revert this.
llvmArchToWindowsSDKArch() returns "" for non-intel non-arm archs.
We're checking for "/fake/lib/" which is followed by the result
of that function -- but if that returns an empty string, then that
trailing slash isn't there. As fix, just explicitly pass a triple
that's intel or arm (I randomly chose aarch64). Since the test runs
with -###, that arch doesn't have to be in LLVM_TARGETS_TO_BUILD.
These do for the Windows SDK path what D85998 did for
%VCToolsInstallDir% with /vctoolsdir: Offer a way to set them with an
explicit commandline switch.
With this (and /vctoolsdir), it's possible to compile and link
against hermetic vctools and winsdk directories with:
out/gn/bin/clang-cl win.c -fuse-ld=lld \
/vctoolsdir path/to/VC/Tools/MSVC/14.26.28801 \
/winsdkdir path/to/win_sdk
compared to a long list of -imsvc and /link /libpath: flags.
While here:
- Change the case of the "Include" folder inside the windows sdk
from "include" to "Include" to match on-disk case. Since the
Windows file system is case-insensitive this isn't a behavior
change, it's just a bit cleaner.
- Add libpath tests to the /vctoolsdir
- Add a FIXME about reading env vars for win sdk and ucrt sdk
if these flags aren't present, to match the VCToolsInstallDir
logic
We should also cache all these computed paths in the driver instead
of computing them every time they're queried, but that's for a future
patch.
It'd also be nice to invent a /winsysroot: flag that sets both
/vctoolsdir: and /winsdkdir: to some well-known subdirectory.
That's for a future patch as well.
Differential Revision: https://reviews.llvm.org/D95472
The included test case triggered a sign assertion on the result in
`Success()`. This was caused by the APSInt created for a bitcast
having its signedness bit inverted. The second APSInt constructor
argument is `isUnsigned`, so invert the result of
`isSignedIntegerType`.
Relanding this patch after reverting. The test case had to be updated
to be insensitive to 32/64-bit extractelement indices.
Differential Revision: https://reviews.llvm.org/D95135
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.
Differential Revision: https://reviews.llvm.org/D94820
The unwinder used by the crash handler on versions of Android prior to
API 29 did not correctly handle binaries built with rosegment, which is
enabled by default for LLD. Android only supports LLD, so it's not an
issue that this flag is not accepted by other linkers.
Reviewed By: srhines
Differential Revision: https://reviews.llvm.org/D95166
There are two use cases.
Assembler
We have accrued some code gated on MCAsmInfo::useIntegratedAssembler(). Some
features are supported by latest GNU as, but we have to use
MCAsmInfo::useIntegratedAs() because the newer versions have not been widely
adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26).
Linker
We want to use features supported only by LLD or very new GNU ld, or don't want
to work around older GNU ld. We currently can't represent that "we don't care
about old GNU ld". You can find such workarounds in a few other places, e.g.
Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp
AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276),
R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727https://sourceware.org/bugzilla/show_bug.cgi?id=22969)
Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001;
GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available).
This feature allows to garbage collect some unused sections (e.g. fragmented .gcc_except_table).
This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc.
It changes one codegen place in SHF_MERGE to demonstrate its usage.
`-fbinutils-version=2.35` means the produced object file does not care about GNU
ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced
assembly can be consumed by GNU as>=2.35, but older versions may not work.
`-fbinutils-version=none` means that we can use all ELF features, regardless of
GNU as/ld support.
Both clang and llc need `parseBinutilsVersion`. Such command line parsing is
usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen),
however, ClangCodeGen does not depend on LLVMCodeGen. So I add
`parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget).
Differential Revision: https://reviews.llvm.org/D85474
For Clang synthesized `__va_list_tag` (`CreateX86_64ABIBuiltinVaListDecl`),
its DW_AT_decl_file/DW_AT_decl_line are arbitrarily set from `CurLoc`.
In a stage 2 `-DCMAKE_BUILD_TYPE=Debug` clang build, I observe that
in driver.cpp, DW_AT_decl_file/DW_AT_decl_line may be set to an `#include` line
(the transitively included file uses va_arg (`__builtin_va_arg`)).
This seems arbitrary. Drop that.
Reviewed By: #debug-info, dblaikie
Differential Revision: https://reviews.llvm.org/D94735
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.
Differential Revision: https://reviews.llvm.org/D94820
From this patch (plus some landed patches), `deviceRTLs` is taken as a regular OpenMP program with just `declare target` regions. In this way, ideally, `deviceRTLs` can be written in OpenMP directly. No CUDA, no HIP anymore. (Well, AMD is still working on getting it work. For now AMDGCN still uses original way to compile) However, some target specific functions are still required, but they're no longer written in target specific language. For example, CUDA parts have all refined by replacing CUDA intrinsic and builtins with LLVM/Clang/NVVM intrinsics.
Here're a list of changes in this patch.
1. For NVPTX, `DEVICE` is defined empty in order to make the common parts still work with AMDGCN. Later once AMDGCN is also available, we will completely remove `DEVICE` or probably some other macros.
2. Shared variable is implemented with OpenMP allocator, which is defined in `allocator.h`. Again, this feature is not available on AMDGCN, so two macros are redefined properly.
3. CUDA header `cuda.h` is dropped in the source code. In order to deal with code difference in various CUDA versions, we build one bitcode library for each supported CUDA version. For each CUDA version, the highest PTX version it supports will be used, just as what we currently use for CUDA compilation.
4. Correspondingly, compiler driver is also updated to support CUDA version encoded in the name of bitcode library. Now the bitcode library for NVPTX is named as `libomptarget-nvptx-cuda_[cuda_version]-sm_[sm_number].bc`, such as `libomptarget-nvptx-cuda_80-sm_20.bc`.
With this change, there are also multiple features to be expected in the near future:
1. CUDA will be completely dropped when compiling OpenMP. By the time, we also build bitcode libraries for all supported SM, multiplied by all supported CUDA version.
2. Atomic operations used in `deviceRTLs` can be replaced by `omp atomic` if OpenMP 5.1 feature is fully supported. For now, the IR generated is totally wrong.
3. Target specific parts will be wrapped into `declare variant` with `isa` selector if it can work properly. No target specific macro is needed anymore.
4. (Maybe more...)
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D94745
The previous implementation required that `-maltivec` be specified when using either `-mabi=vec-extabi` or `-mabi=vec-default`, this patch removes that requirement.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D94986
Whenever we enter a new OpenMP data environment we want to enter a
function to simplify reasoning. Later we probably want to remove the
entire specialization wrt. the if clause and pass the result to the
runtime, for now this should fix PR48686.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D94315
We're choosing to take an opt-in approach for landing Relative VTables, so we'll
need asan-equivalent multilibs with relative vtables enabled. Afterwards, we can
just flip the switch in our build.
Differential Revision: https://reviews.llvm.org/D95253
As noted in D91913, MSVC implements the GNU behavior for
, ## __VA_ARGS__ as well. Do the same when `-fms-compatibility` is used.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D95392
Add new option called InsertEmptyLineBeforeAccessModifier. Empty line
before access modifier is inerted if this option is set to true (which
is the default value, because clang-format always inserts empty lines
before access modifiers), otherwise empty lines are removed.
Fixes issue #16518.
Differential Revision: https://reviews.llvm.org/D93846
or claimRV calls in the IR
Background:
This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end annotates calls with attribute "clang.arc.rv"="retain"
or "clang.arc.rv"="claim", which indicates the call is implicitly
followed by a marker instruction and a retainRV/claimRV call that
consumes the call result. This is currently done only when the target
is arm64 and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the
annotated calls in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the annotated
calls. It doesn't remove the attribute on the call since the backend
needs it to emit the marker instruction. The retainRV/claimRV calls
are emitted late in the pipeline to prevent optimization passes from
transforming the IR in a way that makes it harder for the ARC
middle-end passes to figure out the def-use relationship between the
call and the retainRV/claimRV calls (which is the cause of PR31925).
- The function inliner removes the autoreleaseRV call in the callee that
returns the result if nothing in the callee prevents it from being
paired up with the calls annotated with "clang.arc.rv"="retain/claim"
in the caller. If the call is annotated with "claim", a release call
is inserted since autoreleaseRV+claimRV is equivalent to a release. If
it cannot find an autoreleaseRV call, it tries to transfer the
attributes to a function call in the callee. This is important since
ARC optimizer can remove the autoreleaseRV call returning the callee
result, which makes it impossible to pair it up with the retainRV or
claimRV call in the caller. If that fails, it simply emits a retain
call in the IR if the call is annotated with "retain" and does nothing
if it's annotated with "claim".
- This patch teaches dead argument elimination pass not to change the
return type of a function if any of the calls to the function are
annotated with attribute "clang.arc.rv". This is necessary since the
pass can incorrectly determine nothing in the IR uses the function
return, which can happen since the front-end no longer explicitly
emits retainRV/claimRV calls in the IR, and change its return type to
'void'.
Future work:
- Use the attribute on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls annotated with the attributes.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
Currently, there is some refactoring needed in existing interface of OpenCL option
settings to support OpenCL C 3.0. The problem is that OpenCL extensions and features
are not only determined by the target platform but also by the OpenCL version.
Also, there are core extensions/features which are supported unconditionally in
specific OpenCL C version. In fact, these rules are not being followed for all targets.
For example, there are some targets (as nvptx and r600) which don't support
OpenCL C 2.0 core features (nvptx.languageOptsOpenCL.cl, r600.languageOptsOpenCL.cl).
After the change there will be explicit differentiation between optional core and core
OpenCL features which allows giving diagnostics if target doesn't support any of
necessary core features for specific OpenCL version.
This patch also eliminates `OpenCLOptions` instance duplication from `TargetOptions`.
`OpenCLOptions` instance should take place in `Sema` as it's going to be modified
during parsing. Removing this duplication will also allow to generally simplify
`OpenCLOptions` class for parsing purposes.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D92277
When working with invalid code, we would try to dereference a nullptr
while deducing template arguments in some dependend code operating on a
lambda with invalid return type.
Differential Revision: https://reviews.llvm.org/D95145
The included test case triggered a sign assertion on the result in
`Success()`. This was caused by the APSInt created for a bitcast
having its signedness bit inverted. The second APSInt constructor
argument is `isUnsigned`, so invert the result of
`isSignedIntegerType`.
Differential Revision: https://reviews.llvm.org/D95135
The test ms-lookup-template-base-classes.cpp added in d972d4c749
is failing on some builtbot that don't include x86.
This patch should fix that (following the patterns in the test directory).
The GNU token paste extension that removes the comma in , ## __VA_ARGS__
conflicts with C99/C++11's requirements when a variadic macro has no
named parameters: according to the standard, an invocation as FOO()
gives it a single empty argument, and concatenation of anything with an
empty argument is well-defined. For this reason, the GNU extension was
already disabled in C99 standard-conforming mode. It was not yet
disabled in C++11 standard-conforming mode.
The associated comment suggested that GCC keeps this extension enabled
in C90/C++03 standard-conforming mode, but it actually does not, so
rather than adding a check for C++ language version, this change simply
removes the check for C language version.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D91913
D94700 removed the static library so we no longer need to pass
`-llibomptarget-nvptx` to `nvlink`. Since the bitcode library is the only device
runtime for now, instead of emitting a warning when it is not found, an error
should be raised. We also set a new option `libomptarget-nvptx-bc-path` to let
user choose which bitcode library is being used.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D95161
Insert a llvm.experimental.noalias.scope.decl intrinsic that identifies where a noalias argument was inlined.
This patch includes some refactorings from D90104.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D93040
In the PPC32 SVR4 ABI, a va_list has copies of registers from the function call.
va_arg looked in the wrong registers for (the pointer representation of) an
object in Objective-C, and for some types in C++. Fix va_arg to look in the
general-purpose registers, not the floating-point registers. Also fix va_arg
for some C++ types, like a member function pointer, that are aggregates for
the ABI.
Anthony Richardby found the problem in Objective-C. Eli Friedman suggested
part of this fix.
Fixes https://bugs.llvm.org/show_bug.cgi?id=47921
Reviewed By: efriedma, nemanjai
Differential Revision: https://reviews.llvm.org/D90329
default arguments.
When a function is declared with a qualified name, its eventual semantic
DeclContext may differ from the scope specified by the qualifier if it
redeclares a function in an inline namespace. In this case, we need to
update the DeclContext to be that of the previous declaration, and we
need to do so before we decide whether to inherit default arguments from
that previous declaration, because we only inherit default arguments
from declarations in the same scope.
This patch implements codegen for __managed__ variable attribute for HIP.
Diagnostics will be added later.
Differential Revision: https://reviews.llvm.org/D94814
This addresses an issue with how the PCH preable works, specifically:
1. When using a PCH/preamble the module hash changes and a different cache directory is used
2. When the preamble is used, PCH & PCM validation is disabled.
Due to combination of #1 and #2, reparsing with preamble enabled can end up loading a stale module file before a header change and using it without updating it because validation is disabled and it doesn’t check that the header has changed and the module file is out-of-date.
rdar://72611253
Differential Revision: https://reviews.llvm.org/D95159
check
This patch fixes a bug in emitARCOperationAfterCall where it inserts the
fall-back call after a bitcast instruction and then replaces the
bitcast's operand with the result of the fall-back call. The generated
IR without this patch looks like this:
msgSend.call: ; preds = %entry
%call = call i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend
br label %msgSend.cont
msgSend.null-receiver: ; preds = %entry
call void @llvm.objc.release(i8* %4)
br label %msgSend.cont
msgSend.cont:
%8 = phi i8* [ %call, %msgSend.call ], [ null, %msgSend.null-receiver ]
%9 = bitcast i8* %10 to %0*
%10 = call i8* @llvm.objc.retain(i8* %8)
Notice that `%9 = bitcast i8* %10` to %0* is taking operand %10 which is
defined after it.
To fix the bug, this patch modifies the insert point to point to the
bitcast instruction so that the fall-back call is inserted before the
bitcast. In addition, it teaches the function to look at phi
instructions that are generated when there is a check for a null
receiver and insert the retainRV/claimRV instruction right after the
call instead of inserting a fall-back call right after the phi
instruction.
rdar://73360225
Differential Revision: https://reviews.llvm.org/D95181
CodeGenModule::EmitNullConstant() creates constants with their "in memory"
type, not their "in vregs" type. The one place where this difference matters is
when the type is _Bool, as that is an i1 when in vregs and an i8 in memory.
Fixes: rdar://73361264
If a function doesn't contain loops and does not call non-willreturn
functions, then it is willreturn. Loops are detected by checking
for backedges in the function. We don't attempt to handle finite
loops at this point.
Differential Revision: https://reviews.llvm.org/D94633
Defaulted destructor was treated inconsistently, compared to other
compiler-generated functions.
When Sema::IdentifyCUDATarget() got called on just-created dtor which didn't
have implicit __host__ __device__ attributes applied yet, it would treat it as a
host function. That happened to (sometimes) hide the error when dtor referred
to a host-only functions.
Even when we had identified defaulted dtor as a HD function, we still treated it
inconsistently during selection of usual deallocators, where we did not allow
referring to wrong-side functions, while it is allowed for other HD functions.
This change brings handling of defaulted dtors in line with other HD functions.
Differential Revision: https://reviews.llvm.org/D94732
Summary:
The custom mapper API did not previously support the mapping names added previously. This means they were not present if a user requested debugging information while using the mapper functions. This adds basic support for passing the mapped names to the runtime library.
Reviewers: jdoerfert
Differential Revision: https://reviews.llvm.org/D94806