Fix suggested by Yuanfang Chen:
Non-distinct debuginfo is attached to the function due to the undecorated declaration. Later, when seeing the function definition and `nodebug` attribute, the non-distinct debuginfo should be cleared.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104777
This reverts commit c3fe847f9d.
Tests fail in non-asserts builds because they assume named IR, by the
looks of it (testing for the "entry" label, for instance). I don't know
enough about the update_cc_test_checks.py stuff to know how to manually
fix these tests, so reverting for now.
According to AVR ABI (https://gcc.gnu.org/wiki/avr-gcc), returned struct value
within size 1-8 bytes should be returned directly (via register r18-r25), while
larger ones should be returned via an implicit struct pointer argument.
Reviewed By: dylanmckay
Differential Revision: https://reviews.llvm.org/D99237
CoroElide pass works only when a post-split coroutine is inlined into another post-split coroutine.
In O0, there is no inlining after CoroSplit, and hence no CoroElide can happen.
It's useless to put CoroElide pass in the O0 pipeline and it will never be triggered (unless I miss anything).
Differential Revision: https://reviews.llvm.org/D105066
C++ constructors/destructors need to go through a different constructor to construct a GlobalDecl object in order to retrieve their linkage type. This causes an assert failure in the default constructor of GlobalDecl. I'm chaning it to using the exsiting GlobalDecl object.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102356
This patch adds a new clang builtin, __arithmetic_fence. The purpose of the
builtin is to provide the user fine control, at the expression level, over
floating point optimization when -ffast-math (-ffp-model=fast) is enabled.
The builtin prevents the optimizer from rearranging floating point expression
evaluation. The new option fprotect-parens has the same effect on
parenthesized expressions, forcing the optimizer to respect the parentheses.
Reviewed By: aaron.ballman, kpn
Differential Revision: https://reviews.llvm.org/D100118
Added the option `-altivec-src-compat=[mixed,gcc,xl]`. The default at this time is `mixed`.
The default behavior for clang is for all vector compares to return a scalar unless the vectors being
compared are vector bool or vector pixel. In that case the compare returns a
vector. With the gcc case all vector compares return vectors and in the xl case
all vector compares return scalars.
This patch does not change the default behavior of clang.
This option will be used in future patches to implement behaviour compatibility for the vector bool/pixel types.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D103615
This reverts commit 6f3b775c3e.
Test fails flakily, see comments on https://reviews.llvm.org/D103967
Also revert follow-up "[Analyzer] Attempt to fix windows bots test
failure b/c of new-line"
This reverts commit fe0e861a4d.
This test is again failing across multiple bots and passing on others
there is no reliable way to enable it for some of the bots while
disabling for the unsupported ones. Tagging it as unsupported across all
types of Arm 32 bit cores.
We need to mask the immediate to the width of a single vector
rather than 2 vectors. If we use the width of 2 vectors then
any shift larger than the length of 1 vector is going to overflow
the shuffle indices.
Fixes PR50895.
In FreeBSD 14 the project will deprecate the _p special profiling
libraries.
Support for -pg (i.e., mcount) still exists but libraries compiled
with -pg will not be built by default, so stop linking against them.
Reviewed by: Dimitry Andric
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D104753
The feature was implemented in D99005, but we forgot to add the test
macro.
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D104984
Clang can be configured with a different default unwindlib, for example
gcc. In that case, -lunwind will not be present in the output.
Fix this by explicitly specifying libunwind as the unwindlib.
Differential Revision: https://reviews.llvm.org/D104899
Word on the grapevine was that the committee had some discussion that
ended with unanimous agreement on eliminating relational function pointer comparisons.
We wanted to be bold and just ban all of them cold turkey.
But then we chickened out at the last second and are going for
eliminating just the spaceship overload candidate instead, for now.
See D104680 for reference.
This should be fine and "safe", because the only possible semantic change this
would cause is that overload resolution could possibly be ambiguous if
there was another viable candidate equally as good.
But to save face a little we are going to:
* Issue an "error" for three-way comparisons on function pointers.
But all this is doing really is changing one vague error message,
from an "invalid operands to binary expression" into an
"ordered comparison of function pointers", which sounds more like we mean business.
* Otherwise "warn" that comparing function pointers like that is totally
not cool (unless we are told to keep quiet about this).
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D104892
This patch adds a module level metadata flag indicating that the module
was compiled with the `-fopenmp` flag. This will make it easier for
passes like OpenMPOpt to determine if it should be run.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102361
This option is already supported by update_test_checks.py, but it can
also be useful in update_cc_test_checks.py. For example, I'd like to
use it in OpenMP offload codegen tests to check global variables like
`.offload_maptypes*`.
Reviewed By: jdoerfert, arichardson, ggeorgakoudis
Differential Revision: https://reviews.llvm.org/D104714
When clang driver is used with -save-temps to compile OpenCL program,
clang driver first launches clang -cc1 -E to generate preprocessor expansion output,
then launches clang -cc1 with the generated preprocessor expansion output as input
to generate LLVM IR.
Currently clang by default passes "-finclude-default-header" "-fdeclare-opencl-builtins"
in both steps, which causes default header included again in the second step, which
causes error.
This patch let clang not to include default header when input type is preprocessor expansion
output, which fixes the issue.
Reviewed by: Anastasia Stulova
Differential Revision: https://reviews.llvm.org/D104800
Note regarding C++ for OpenCL:
When compiling C++ for OpenCL, DW_LANG_C_plus_plus* is emitted.
There is no DWARF language code defined for C++ for OpenCL as of yet,
but DWARF issue 210514.1 has been raised to request one.
In the mean time, continuing to emit DW_LANG_C_plus_plus* for C++ for
OpenCL allows the potential to distinguish between C++ for OpenCL and
OpenCL C in !DICompileUnit nodes, whereas using DW_LANG_OpenCL for
C++ for OpenCL would prevent this.
This change therefore leaves C++ for OpenCL as-is.
Reviewed By: shchenz, Anastasia
Differential Revision: https://reviews.llvm.org/D104118
Consider the code
```
void f(int a0, int b0, int c)
{
int a1 = a0 - b0;
int b1 = (unsigned)a1 + c;
if (c == 0) {
int d = 7L / b1;
}
}
```
At the point of divisiion by `b1` that is considered to be non-zero,
which results in a new constraint for `$a0 - $b0 + $c`. The type
of this sym is unsigned, however, the simplified sym is `$a0 -
$b0` and its type is signed. This is probably the result of the
inherent improper handling of casts. Anyway, Range assignment
for constraints use this type information. Therefore, we must
make sure that first we simplify the symbol and only then we
assign the range.
Differential Revision: https://reviews.llvm.org/D104844
These allow getting a whole register from a larger lmul. Or
inserting a whole register into a larger lmul register. Fractional
lmuls are not supported as they would require a vslide.
Based on this update to the intrinsic doc
https://github.com/riscv/rvv-intrinsic-doc/pull/99
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D104822
sanitize-coverage-old-pm.c is passing intermittently on different
arm v7 machines. This patch moves it to unsupported on all arm 32
targets reporting armv8l core.
copy/dispose helper functions
We found out that these fake functions would cause clang to crash if the
changes proposed in https://reviews.llvm.org/D98799 were made.
The original patch was reverted in f681fd927e
because debug locations were missing in the body of the block byref
helper functions. This patch fixes the bug by calling CreateArtificial
after the calls to StartFunction.
Differential Revision: https://reviews.llvm.org/D104082
sanitize-coverage-old-pm.c started failing on arm 32 bit where
underlying architecture reported is armv8l fore 32bit arm.
This patch XFAILS sanitize-coverage-old-pm.c on armv8l similar
to armv7 and thumbv7.
Although clang is able to defer overloading resolution
diagnostics for common functions. It does not defer
overloading resolution caused diagnostics for overloaded
operators.
This patch extends the existing deferred
diagnostic mechanism and defers a diagnostic caused
by overloaded operator.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D104505
This ensures that the mangled type names match between C and C++,
which is significant when using -fsanitize=cfi-icall. Ideally we
wouldn't have created this namespace at all, but it's now part of
the ABI (e.g. in mangled names), so we can't change it.
Differential Revision: https://reviews.llvm.org/D104830
Only LLVM-based instrumentation profile is supported on AIX.
And it currently must be used with full LTO.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D104803
Non-throwing allocators currently will always get null-check code. However, if the non-throwing allocator is explicitly annotated with returns_nonnull the null check should be elided.
Testing:
ninja check-all
added test case correctly elides
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D102820
This fixes issues with various return types(bool/int) and was already
in place for nvptx headers, adjusted to work for amdgcn. This does
not affect hip as the change is guarded with OPENMP_AMDGCN.
Similar to D85879.
Reviewed By: jdoerfert, JonChesterfield, yaxunl
Differential Revision: https://reviews.llvm.org/D104677
According to https://eel.is/c++draft/over.literal
> double operator""_Bq(long double); // OK: does not use the reserved identifier _Bq ([lex.name])
> double operator"" _Bq(long double); // ill-formed, no diagnostic required: uses the reserved identifier _Bq ([lex.name])
Obey that rule by keeping track of the operator literal name status wrt. leading whitespace.
Fix: https://bugs.llvm.org/show_bug.cgi?id=50644
Differential Revision: https://reviews.llvm.org/D104299
The default Altivec ABI was implemented but the clang error for specifying
its use still remains. Users could get around this but not specifying the
type of Altivec ABI but we need to remove the error.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D102094
NFCI, although the test change shows that ConstantExpr::getAsInstruction
is better than the old implementation of createReplacementInstr because
it propagates things like the sdiv "exact" flag.
Differential Revision: https://reviews.llvm.org/D104124
During template instantiation involving templated lambdas, clang
could hit an assertion in `TemplateDeclInstantiator::SubstFunctionType`
since the functions are not associated with any `TypeSourceInfo`:
`assert(OldTInfo && "substituting function without type source info");`
This path is triggered when using templated lambdas like the one added as
a test to this patch. To fix this:
- Create `TypeSourceInfo`s for special members and make sure the template
instantiator can get through all patterns.
- Introduce a `SpecialMemberTypeInfoRebuilder` tree transform to rewrite
such member function arguments. Without this, we get errors like:
`error: only special member functions and comparison operators may be defaulted`
since `getDefaultedFunctionKind` can't properly recognize these functions
as special members as part of `SetDeclDefaulted`.
Fixes PR45828 and PR44848
Differential Revision: https://reviews.llvm.org/D88327
This change adds an option which, in addition to dumping the record
layout as is done by -fdump-record-layouts, causes us to compute the
layout for all complete record types (rather than the as-needed basis
which is usually done by clang), so that we will dump them as well.
This is useful if we are looking for layout differences across large
code bases without needing to instantiate every type we are interested in.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D104484
Fixes bug 50263
When "unused-private-field" flag is on if you have a struct with private
members and only defaulted comparison operators clang will warn about
unused private fields.
If you where to write the comparison operators by hand no warning is
produced.
This is a bug since defaulting a comparison operator uses all private
members .
The fix is simple, in CheckExplicitlyDefaultedFunction just clear the
list of unused private fields if the defaulted function is a comparison
function.
Differential revision: https://reviews.llvm.org/D102186
Summary:
Currently the attributor needs to give up if a function has external linkage.
This means that the optimization introduced in D97818 will only apply to static
functions. This change uses the Attributor to internalize OpenMP device
routines by making a copy of each function with private linkage and replacing
the uses in the module with it. This allows for the optimization to be applied
to any regular function.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102824
Summary:
Memory globalization is required to maintain OpenMP standard semantics for data sharing between
worker and master threads. The GPU cannot share data between its threads so must allocate global or
shared memory to store the data in. Currently this is implemented fully in the frontend using the
`__kmpc_data_sharing_push_stack` and __kmpc_data_sharing_pop_stack` functions to emulate standard
CPU stack sharing. The front-end scans the target region for variables that escape the region and
must be shared between the threads. Each variable then has a field created for it in a global record
type.
This patch replaces this functinality with a single allocation command, effectively mimicing an
alloca instruction for the variables that must be shared between the threads. This will be much
slower than the current solution, but makes it much easier to optimize as we can analyze each
variable independently and determine if it is not captured. In the future, we can replace these
calls with an `alloca` and small allocations can be pushed to shared memory.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D97680
The codegen for simd constructs was affected by the presence (or
absence) of the 'monotonic' schedule modifier for worksharing
loops. The modifier is only intended to apply to the scheduling of
chunks for a thread, not iterations of a loop inside a chunk.
In addition, the monotonic modifier was applied to worksharing loops
by default if no schedule clause was present; the referenced part of
the OpenMP 4.5 spec in the code (section 2.7.1) only applies if the
user specified a schedule clause with a static kind but no modifier.
Without a user-specified schedule clause we should default to
nonmonotonic scheduling.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D103793
The checker contains check for passing a NULL stream argument.
This change should make more easy to identify where the passed pointer
becomes NULL.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D104640
Without this patch, llvm/utils/update_cc_test_checks.py fails to
perform `--replace-value-regex` replacements when two RUN lines
produce the same output and use the same single FileCheck prefix. The
problem is that replacements in a RUN line's output are not performed
until after comparing against previous RUN lines' output, where
replacements have already been performed. This patch fixes that.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D104566
GCC has had this function attribute since GCC 7.1 for this purpose. I
added "no_profile" last week in D104475; rename this to
"no_profile_instrument_function" to improve compatibility with GCC.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80223#c11
Reviewed By: MaskRay, aaron.ballman
Differential Revision: https://reviews.llvm.org/D104658
Add runtime functions to detect invalid calls to pure or deleted virtual
functions.
Patch by: Siu Chi Chan
Reviewed by: Yaxun Liu
Differential Revision: https://reviews.llvm.org/D104392
This patch does three things:
- Map the /external:I flag to -isystem
- Add support for the /external:env:<var> flag which reads system
include paths from the <var> environment variable
- Pick up system include dirs EXTERNAL_INCLUDE in addition to the old
INCLUDE environment variable.
Differential revision: https://reviews.llvm.org/D104387
This reverts commit a1449a10db.
Seems like my changes to LNT had no effect -- puzzled.
The 21 tests pass on my sandbox with the clang patch but are
failing in exec time in the bot
This patch changes the ffp-model=precise to enables -ffp-contract=on
(previously -ffp-model=precise enabled -ffp-contract=fast). This is a
follow-up to Andy Kaylor's comments in the llvm-dev discussion
"Floating Point semantic modes". From the same email thread, I put
Andy's distillation of floating point options and floating point modes
into UsersManual.rst
Differential Revision: https://reviews.llvm.org/D74436
This fixes a crash in MallocChecker for the situation when operator new (delete) is invoked via NTTP and makes the behavior of CallContext.getCalleeDecl(Expr) identical to CallEvent.getDecl().
Reviewed By: vsavchenko
Differential Revision: https://reviews.llvm.org/D103025
This implements a more comprehensive fix than was done at D95409.
Instead of excluding just function pointer subobjects, we also
exclude any user-defined function pointer conversion operators.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D103855
Support -Wno-frame-larger-than (with no =) and make it properly
interoperate with -Wframe-larger-than. Reject -Wframe-larger-than with
no argument.
We continue to support Clang's old spelling, -Wframe-larger-than=, for
compatibility with existing users of that facility.
In passing, stop the driver from accepting and ignoring
-fwarn-stack-size and make it a cc1-only flag as intended.
Summary:
AIX does not support --as-needed linker options. Remove that option from
aix linker when -lunwind is needed.
For unwinder library, nothing special is needed because by default aix
linker has the as-needed effect for library that's an archive (which is
the case for libunwind on AIX).
Reviewed By: daltenty
Differential Revision: https://reviews.llvm.org/D104314
Fixed crash when doing pointer math on a void pointer.
Also, reworked test to use -verify rather than FileCheck.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D104424
D66572 separated BugReport and BugReporter into basic and path sensitive
versions. As a result, checker silencing, which worked deep in the path
sensitive report generation facilities became specific to it. DeadStoresChecker,
for instance, despite being in the static analyzer, emits non-pathsensitive
reports, and was impossible to silence.
This patch moves the corresponding code before the call to the virtual function
generateDiagnosticForConsumerMap (which is overriden by the specific kinds of
bug reporters). Although we see bug reporting as relatively lightweight compared
to the analysis, this will get rid of several steps we used to throw away.
Quoting from D65379:
At a very high level, this consists of 3 steps:
For all BugReports in the same BugReportEquivClass, collect all their error
nodes in a set. With that set, create a new, trimmed ExplodedGraph whose leafs
are all error nodes.
Until a valid report is found, construct a bug path, which is yet another
ExplodedGraph, that is linear from a given error node to the root of the graph.
Run all visitors on the constructed bug path. If in this process the report got
invalidated, start over from step 2.
Checker silencing used to kick in after all of these. Now it does before any of
them :^)
Differential Revision: https://reviews.llvm.org/D102914
Change-Id: Ice42939304516f2bebd05a1ea19878b89c96a25d
This can be seen as a follow up to commit 0ee439b705,
that changed the second argument of __powidf2, __powisf2 and
__powitf2 in compiler-rt from si_int to int. That was to align with
how those runtimes are defined in libgcc.
One thing that seem to have been missing in that patch was to make
sure that the rest of LLVM also handle that the argument now depends
on the size of int (not using the si_int machine mode for 32-bit).
When using __builtin_powi for a target with 16-bit int clang crashed.
And when emitting libcalls to those rtlib functions, typically when
lowering @llvm.powi), the backend would always prepare the exponent
argument as an i32 which caused miscompiles when the rtlib was
compiled with 16-bit int.
The solution used here is to use an overloaded type for the second
argument in @llvm.powi. This way clang can use the "correct" type
when lowering __builtin_powi, and then later when emitting the libcall
it is assumed that the type used in @llvm.powi matches the rtlib
function.
One thing that needed some extra attention was that when vectorizing
calls several passes did not support that several arguments could
be overloaded in the intrinsics. This patch allows overload of a
scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with
an entry for powi.
Differential Revision: https://reviews.llvm.org/D99439
Fixes PR50591.
When analyzing classes with members which have user-defined conversion
operators to builtin types, the defaulted comparison analyzer was
picking the member type instead of the type for the builtin operator
which was selected as the best match.
This could either result in wrong comparison category being selected,
or a crash when runtime checks are enabled.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D103760
This expands NRVO propagation for more cases:
Parse analysis improvement:
* Lambdas and Blocks with dependent return type can have their variables
marked as NRVO Candidates.
Variable instantiation improvements:
* Fixes crash when instantiating NRVO variables in Blocks.
* Functions, Lambdas, and Blocks which have auto return type have their
variables' NRVO status propagated. For Blocks with non-auto return type,
as a limitation, this propagation does not consider the actual return
type.
This also implements exclusion of VarDecls which are references to
dependent types.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99696
Addition of this pass has been botched.
There is no particular reason why it had to be sold as an inseparable part
of new-pm transition. It was added when old-pm was still the default,
and very *very* few users were actually tracking new-pm,
so it's effects weren't measured.
Which means, some of the turnoil of the new-pm transition
are actually likely regressions due to this pass.
Likewise, there has been a number of post-commit feedback
(post new-pm switch), namely
* https://reviews.llvm.org/D37467#2787157 (regresses HW-loops)
* https://reviews.llvm.org/D37467#2787259 (should not be in middle-end, should run after LSR, not before)
* https://reviews.llvm.org/D95789 (an attempt to fix bad loop backedge metadata)
and in the half year past, the pass authors (google) still haven't found time to respond to any of that.
Hereby it is proposed to backout the pass from the pipeline,
until someone who cares about it can address the issues reported,
and properly start the process of adding a new pass into the pipeline,
with proper performance evaluation.
Furthermore, neither google nor facebook reports any perf changes
from this change, so i'm dropping the pass completely.
It can always be re-reverted should/if anyone want to pick it up again.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D104099
This has already been implemented in be2277fbf2 which adds
pragma fp support. This patch just adds test coverage for
regular fast-math flags (PR46165).
Need to capture locals in aligned clauses for the combined directives to
be fix the crash in the codegen.
Differential Revision: https://reviews.llvm.org/D104258
A lambda in a function template may be recursively instantiated. The recursive
lambda will cause a lambda function instantiated multiple times, one inside another.
The inner LocalInstantiationScope should not be marked as MergeWithParentScope
since it already has references to locals properly substituted, otherwise it causes
assertion due to the check for duplicate locals in merged LocalInstantiationScope.
Reviewed by: Richard Smith
Differential Revision: https://reviews.llvm.org/D98068
Added support for CXXRewrittenBinaryOperator as a condition in ranged
for loops. This is a new kind of expression, need to extend support for
C++20 constructs.
It fixes PR49970: range-based for compilation fails for libstdc++ vector
with -std=c++20.
Differential Revision: https://reviews.llvm.org/D104240
The parser code assumes building with C++ compiler and asserts when using clang (not clang++) on C file. I made the code dependent on input language. This shows up for amdgpu target.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D103899
This change caused build errors related to move-only __block variables,
see discussion on https://reviews.llvm.org/D99696
> This expands NRVO propagation for more cases:
>
> Parse analysis improvement:
> * Lambdas and Blocks with dependent return type can have their variables
> marked as NRVO Candidates.
>
> Variable instantiation improvements:
> * Fixes crash when instantiating NRVO variables in Blocks.
> * Functions, Lambdas, and Blocks which have auto return type have their
> variables' NRVO status propagated. For Blocks with non-auto return type,
> as a limitation, this propagation does not consider the actual return
> type.
>
> This also implements exclusion of VarDecls which are references to
> dependent types.
>
> Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
>
> Reviewed By: Quuxplusone
>
> Differential Revision: https://reviews.llvm.org/D99696
This also reverts the follow-on change which was hard to tease apart
form the one above:
> "[clang] Implement P2266 Simpler implicit move"
>
> This Implements [[http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2266r1.html|P2266 Simpler implicit move]].
>
> Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
>
> Reviewed By: Quuxplusone
>
> Differential Revision: https://reviews.llvm.org/D99005
This reverts commits 1e50c3d785 and
bf20631782.
The `sed` command ensures Windows-specific path separators (single and double backslashes) are replaced by forward slashes in the output file. FileCheck can continue using forward slashes in paths this way.
One of the goals of the dependency scanner is to generate command-lines that can be used to explicitly build modular dependencies of a translation unit. The only modifications to these command-lines should be for the purposes of explicit modular build.
However, the current version of dependency scanner leaks its implementation details into the command-lines.
The first problem is that the `clang-scan-deps` tool adjusts the original textual command-line (adding `-Eonly -M -MT <target> -sys-header-deps -Wno-error -o /dev/null `, removing `--serialize-diagnostics`) in order to set up the `DependencyScanning` library. This has been addressed in D103461, D104012, D104030, D104031, D104033. With these patches, the `DependencyScanning` library receives the unmodified `CompilerInvocation`, sets it up and uses it for the implicit modular build.
Finally, to prevent leaking the implementation details to the resulting command-lines, this patch generates them from the **original** unmodified `CompilerInvocation` rather than from the one that drives the implicit build.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D104036
The `clang-scan-deps` tests for the full output format were written under the assumption that most TUs/modules have the same context hash. This is no longer true, since we're changing the original compilation options. This patch updates the tests, which no longer conflate multiple context hashes into a single FileCheck variable.
Commit d8bab69ead updated the ClangScanDeps/modules.cpp test. The new `{{.*}}` regex is supposed to only match `modules_cdb_input.o`, `a.o` or `b.o`. However, due to non-determinism, this can sometimes also match `modules_cdb_input2.o`, causing match failure on the next line. This commit changes the regex to only match one of the three valid cases.
Buildbot failure: https://lab.llvm.org/buildbot/#/builders/109/builds/16675
The `clang-scan-deps` tool has some logic that parses and modifies the original Clang command-line. The goal is to setup `DependencyOutputOptions` by injecting `-M -MT <target>` and prevent the creation of output files.
This patch moves the logic into the `DependencyScanning` library, and uses the parsed `CompilerInvocation` instead of the raw command-line. The code simpler and can be used from the C++ API as well.
The `-o /dev/null` arguments are not necessary, since the `DependencyScanning` library only runs a preprocessing action, so there's no way it'll produce an actual object file.
Related: The `-M` argument implies `-w`, which would appear on the command-line of modular dependencies even though it was not on the original TU command line (see D104036).
Some related tests were updated.
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D104030
To prevent the creation of diagnostics file, `clang-scan-deps` strips the corresponding command-line argument. This behavior is useful even when using the C++ `DependencyScanner` library.
This patch transforms stripping of command-line in `clang-scan-deps` into stripping of `CompilerInvocation` in `DependencyScanning`.
AFAIK, the `clang-cl` driver doesn't even accept `--serialize-diagnostics`, so I've removed the test. (It would fail with an unknown command-line argument otherwise.)
Note: Since we're generating command-lines for modular dependencies from `CompilerInvocation`, the `--serialize-diagnostics` will be dropped. This was already happening in `clang-scan-deps` before this patch, but it will now happen also when using `DependencyScanning` library directly. This is resolved in D104036.
Reviewed By: dexonsmith, arphaman
Differential Revision: https://reviews.llvm.org/D104012
Update `setConstraint` to simplify existing equivalence classes when a
new constraint is added. In this patch we iterate over all existing
equivalence classes and constraints and try to simplfy them with
simplifySVal. This solves problematic cases where we have two symbols in
the tree, e.g.:
```
int test_rhs_further_constrained(int x, int y) {
if (x + y != 0)
return 0;
if (y != 0)
return 0;
clang_analyzer_eval(x + y == 0); // expected-warning{{TRUE}}
clang_analyzer_eval(y == 0); // expected-warning{{TRUE}}
return 0;
}
```
Differential Revision: https://reviews.llvm.org/D103314
When a translation unit uses a PCH and imports the same modules as the PCH, we'd prefer to resolve to those modules instead of inventing new modules and reporting them as modular dependencies. Since the PCH modules have already been built nudge the compiler to reuse them when deciding whether to build a new module and don't report them as regular modular dependencies.
Depends on D103524 & D103802.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D103526
The `PreprocessOnlyAction` doesn't support loading the AST file of a precompiled header. This is problematic for dependency scanning, since the `#include` manufactured for the PCH is treated as textual. This means the PCH contents get scanned with each TU, which is redundant. Moreover, dependencies of the PCH end up being considered dependency of the TU.
To handle AST file of PCH properly, this patch creates new `FrontendAction` that behaves the same way `PreprocessorOnlyAction` does, but treats the manufactured PCH `#include` as a normal compilation would (by not claiming it only uses a preprocessor and creating the default AST consumer).
The AST file is now reported as a file dependency of the TU.
Depends on D103519.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D103524
It's useful to be able to load explicitly-built PCH files into an implicit build (e.g. during dependency scanning). That's currently impossible, since the explicitly-built PCH has an empty modules cache path, while the current compilation has (and needs to have) a valid path, triggering an error in the `PCHValidator`.
This patch adds a preprocessor option and command-line flag that can be used to omit this check.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D103802
At least LibreOffice has, for mainly historic reasons that would be hard to
change now, a class Any with an overloaded operator >>= that semantically does
not assign to the LHS but rather extracts into the (by-reference) RHS. Which
thus caused false positive -Wunused-but-set-parameter and
-Wunused-but-set-variable after those have been introduced recently.
This change is more conservative about the assumed semantics of overloaded
operators, excluding compound assignment operators but keeping plain operator =
ones. At least for LibreOffice, that strikes a good balance of not producing
false positives but still finding lots of true ones.
(The change to the BinaryOperator case in MaybeDecrementCount is necessary
because e.g. the template f4 test code in warn-unused-but-set-variables-cpp.cpp
turns the += into a BinaryOperator.)
Differential Revision: https://reviews.llvm.org/D103949
This expands NRVO propagation for more cases:
Parse analysis improvement:
* Lambdas and Blocks with dependent return type can have their variables
marked as NRVO Candidates.
Variable instantiation improvements:
* Fixes crash when instantiating NRVO variables in Blocks.
* Functions, Lambdas, and Blocks which have auto return type have their
variables' NRVO status propagated. For Blocks with non-auto return type,
as a limitation, this propagation does not consider the actual return
type.
This also implements exclusion of VarDecls which are references to
dependent types.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99696
Also:
- add driver test (fsanitize-use-after-return.c)
- add basic IR test (asan-use-after-return.cpp)
- (NFC) cleaned up logic for generating table of __asan_stack_malloc
depending on flag.
for issue: https://github.com/google/sanitizers/issues/1394
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D104076
Check applied to unbounded (incomplete) arrays and pointers to spot
cases where the computed address is beyond the largest possible
addressable extent of the array, based on the address space in which the
array is delcared, or which the pointer refers to.
Check helps to avoid cases of nonsense pointer math and array indexing
which could lead to linker failures or runtime exceptions. Of
particular interest when building for embedded systems with small
address spaces.
This is version 2 of this patch -- version 1 had some testing issues
due to a sign error in existing code. That error is corrected and
lit test for this chagne is extended to verify the fix.
Originally reviewed/accepted by: aaron.ballman
Original revision: https://reviews.llvm.org/D86796
Reviewed By: aaron.ballman, ebevhan
Differential Revision: https://reviews.llvm.org/D88174
Allow the usage of minor version 0, for hip versions
such as 4.0. Change the default values when performing
version checks.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D104062
Check applied to unbounded (incomplete) arrays and pointers to spot
cases where the computed address is beyond the largest possible
addressable extent of the array, based on the address space in which the
array is delcared, or which the pointer refers to.
Check helps to avoid cases of nonsense pointer math and array indexing
which could lead to linker failures or runtime exceptions. Of
particular interest when building for embedded systems with small
address spaces.
This is version 2 of this patch -- version 1 had some testing issues
due to a sign error in existing code. That error is corrected and
lit test for this chagne is extended to verify the fix.
Originally reviewed/accepted by: aaron.ballman
Original revision: https://reviews.llvm.org/D86796
Reviewed By: aaron.ballman, ebevhan
Differential Revision: https://reviews.llvm.org/D88174
Adds the basic instrumentation needed for stack tagging.
Currently does not support stack short granules or TLS stack histories,
since a different code path is followed for the callback instrumentation
we use.
We may simply wait to support these two features until we switch to
a custom calling convention.
Patch By: xiangzhangllvm, morehouse
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D102901
This fixes the prioritization of address spaces when choosing a
constructor, stopping them from being considered equally good,
which made the construction of types that could be constructed
by more than one of the constructors.
It does this by preferring the most specific address space,
which is decided by seeing if one of the address spaces is
a superset of the other, and preferring the other.
Fixes: PR50329
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D102850
-Wframe-larger-than= is an interesting warning; we can't know the frame
size until PrologueEpilogueInsertion (PEI); very late in the compilation
pipeline.
-Wframe-larger-than= was propagated through CC1 as an -mllvm flag, then
was a cl::opt in LLVM's PEI pass; this meant it was dropped during LTO
and needed to be re-specified via -plugin-opt.
Instead, make it part of the IR proper as a module level attribute,
similar to D103048. Introduce -fwarn-stack-size CC1 option.
Reviewed By: rsmith, qcolombet
Differential Revision: https://reviews.llvm.org/D103928
This expands NRVO propagation for more cases:
Parse analysis improvement:
* Lambdas and Blocks with dependent return type can have their variables
marked as NRVO Candidates.
Variable instantiation improvements:
* Fixes crash when instantiating NRVO variables in Blocks.
* Functions, Lambdas, and Blocks which have auto return type have their
variables' NRVO status propagated. For Blocks with non-auto return type,
as a limitation, this propagation does not consider the actual return
type.
This also implements exclusion of VarDecls which are references to
dependent types.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99696
Implementation of the unroll directive introduced in OpenMP 5.1. Follows the approach from D76342 for the tile directive (i.e. AST-based, not using the OpenMPIRBuilder). Tries to use `llvm.loop.unroll.*` metadata where possible, but has to fall back to an AST representation of the outer loop if the partially unrolled generated loop is associated with another directive (because it needs to compute the number of iterations).
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D99459
This patch adds the command line options /permissive and /permissive- to clang-cl. These flags are used in MSVC to enable various /Zc language conformance options at once. In particular, /permissive is used to enable the various non standard behaviour of MSVC, while /permissive- is the opposite.
When either of two command lines are specified they are simply expanded to the various underlying /Zc options. In particular when /permissive is passed it currently expands to:
/Zc:twoPhase- (disable two phase lookup)
-fno-operator-names (disable C++ operator keywords)
/permissive- expands to the opposites of these flags + /Zc:strictStrings (/Zc:strictStrings- does not currently exist). In the future, if any more MSVC workarounds are ever added they can easily be added to the expansion. One is also able to override settings done by permissive. Specifying /permissive- /Zc:twoPhase- will apply the settings from permissive minus, but disables two phase lookup.
Motivation for this patch was mainly parity with MSVC as well as compatibility with Windows SDK headers. The /permissive page from MSVC documents various workarounds that have to be done for the Windows SDK headers [1], when MSVC is used with /permissive-. In these, Microsoft often recommends simply compiling with /permissive for the specified source files. Since some of these also apply to clang-cl (which acts like /permissive- by default mostly), and some are currently implemented as "hacks" within clang that I'd like to remove, adding /permissive and /permissive- to be in full parity with MSVC and Microsofts documentation made sense to me.
[1] https://docs.microsoft.com/en-us/cpp/build/reference/permissive-standards-conformance?view=msvc-160#windows-header-issues
Differential Revision: https://reviews.llvm.org/D103773
When using the -fno-rtti option of the GCC style clang++, using typeid results in an error. The MSVC STL however kindly provides a define flag called _HAS_STATIC_RTTI, which either enables or disables uses of typeid throughout the STL. By default, if undefined, it is set to 1, enabling the use of typeid.
With this patch, _HAS_STATIC_RTTI is set to 0 when -fno-rtti is specified. This way various headers of the MSVC STL like functional can be consumed without compilation failures.
Differential Revision: https://reviews.llvm.org/D103771
This patch adds the command line option -foperator-names which acts as the opposite of -fno-operator-names. With this command line option it is possible to reenable C++ operator keywords on the command line if -fno-operator-names had previously been passed.
Differential Revision: https://reviews.llvm.org/D103749
This patch changes the ffp-model=precise to enables -ffp-contract=on
(previously -ffp-model=precise enabled -ffp-contract=fast). This is a
follow-up to Andy Kaylor's comments in the llvm-dev discussion
"Floating Point semantic modes". From the same email thread, I put
Andy's distillation of floating point options and floating point modes
into UsersManual.rst
Differential Revision: https://reviews.llvm.org/D74436
Clang will create a global value put in constant memory if an aggregate value
is declared firstprivate in the target device. The symbol name only uses the
name of the firstprivate variable, so symbol name conflicts will occur if the
variable is allowed to have different types through templates. An example of
this behvaiour is shown in https://godbolt.org/z/EsMjYh47n. This patch adds the
mangled type name to the symbol to avoid such naming conflicts. This fixes
https://bugs.llvm.org/show_bug.cgi?id=50642.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D103995
Before this change, CXXDefaultArgExpr would always have
ExprDependence::None. This can lead to issues when, for example, the
inner expression is RecoveryExpr and yet containsErrors() on the default
expression is false.
Differential Revision: https://reviews.llvm.org/D103982
Post-commit feedback for d69c4372bf says the output
may vary between pass managers. This is hopefully a
quick fix, but we might want to investigate how to
better solve this type of problem.
Add a test to verify OpenCL builtin declarations using
OpenCLBuiltins.td.
This test consists of parsing a 60k line generated input file. The
entire test takes about 60s with a debug build on a decent machine.
Admittedly this is not the fastest test, but doesn't seem excessive
compared to other tests in clang/test/Headers (with one of the tests
taking 85s for example).
RFC: https://lists.llvm.org/pipermail/cfe-dev/2021-April/067973.html
Differential Revision: https://reviews.llvm.org/D97869
Added --gpu-bundle-output to control bundling/unbundling output of HIP device compilation.
By default preprocessor expansion, llvm bitcode and assembly are unbundled, code objects are
bundled.
Reviewed by: Artem Belevich, Jan Svoboda
Differential Revision: https://reviews.llvm.org/D101630
1. There is no tests for mabi=ilp32e, and my patch covers that.
2. The tests in riscv-abi.c will show default ABI changes for special archs
in the future, especially the arch with the F but without the D extension.
3. The tests in riscv-arch.c will show default arch changes for abi=ilp32,
which is rv32imacfd currently, but it is better to be rv32imac.
And it is also better for abi=ilp32f defaults to arch=imacf.
Reviewed By: MaskRay, luismarques
Differential Revision: https://reviews.llvm.org/D103878
This completes the series implementing p1099, by adding the feature
macro and updating the web page.
Differential Revision: https://reviews.llvm.org/D102242
Clang checks whether the type given to va_arg will automatically cause
undefined behavior, but this check was issuing false positives for
enumerations in C++. The issue turned out to be because
typesAreCompatible() in C++ checks whether the types are *the same*, so
this uses custom logic if the type compatibility check fails.
This issue was found by a user on code like:
typedef enum {
CURLINFO_NONE,
CURLINFO_EFFECTIVE_URL,
CURLINFO_LASTONE = 60
} CURLINFO;
...
__builtin_va_arg(list, CURLINFO); // false positive warning
Given that C++ defers to C for the rules around va_arg, the behavior
should be the same in both C and C++ and not diagnose because int and
CURLINFO are "compatible enough" types for va_arg.
This renames the expression value categories from rvalue to prvalue,
keeping nomenclature consistent with C++11 onwards.
C++ has the most complicated taxonomy here, and every other language
only uses a subset of it, so it's less confusing to use the C++ names
consistently, and mentally remap to the C names when working on that
context (prvalue -> rvalue, no xvalues, etc).
Renames:
* VK_RValue -> VK_PRValue
* Expr::isRValue -> Expr::isPRValue
* SK_QualificationConversionRValue -> SK_QualificationConversionPRValue
* JSON AST Dumper Expression nodes value category: "rvalue" -> "prvalue"
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D103720
The FileCheck lines in these files are auto-generated and complete,
so there's very little upside (less CHECK lines) from running
-instcombine on them and violating the expected test layering
(optimizer developers shouldn't have to be aware of clang tests).
Running opt passes like this makes it harder to make changes such as:
D93817
This implements the 'using enum maybe-qualified-enum-tag ;' part of
1099. It introduces a new 'UsingEnumDecl', subclassed from
'BaseUsingDecl'. Much of the diff is the boilerplate needed to get the
new class set up.
There is one case where we accept ill-formed, but I believe this is
merely an extended case of an existing bug, so consider it
orthogonal. AFAICT in class-scope the c++20 rule is that no 2 using
decls can bring in the same target decl ([namespace.udecl]/8). But we
already accept:
struct A { enum { a }; };
struct B : A { using A::a; };
struct C : B { using A::a;
using B::a; }; // same enumerator
this patch permits mixtures of 'using enum Bob;' and 'using Bob::member;' in the same way.
Differential Revision: https://reviews.llvm.org/D102241
They are still unsupported, but at least this makes clang-cl not mistake
them for being filenames.
As pointed out in the bug, VS 16.10 now uses these flags in new projects
by default.
vtbl itself is in default global address space. When clang emits
ctor, it gets a pointer to the vtbl field based on the this pointer,
then stores vtbl to the pointer.
Since this pointer can point to any address space (e.g. an object
created in stack), this pointer points to default address space, therefore
the pointer to vtbl field in this object should also be in default
address space.
Currently, clang incorrectly casts the pointer to vtbl field in this object
to global address space. This caused assertions in backend.
This patch fixes that by removing the incorrect addr space cast.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D103835
This adds support for p1099's 'using SCOPED_ENUM::MEMBER;'
functionality, bringing a member of an enumerator into the current
scope. The novel feature here, is that there need not be a class
hierarchical relationship between the current scope and the scope of
the SCOPED_ENUM. That's a new thing, the closest equivalent is a
typedef or alias declaration. But this means that
Sema::CheckUsingDeclQualifier needs adjustment. (a) one can't call it
until one knows the set of decls that are being referenced -- if
exactly one is an enumerator, we're in the new territory. Thus it
needs calling later in some cases. Also (b) there are two ways we hold
the set of such decls. During parsing (or instantiating a dependent
scope) we have a lookup result, and during instantiation we have a set
of shadow decls. Thus two optional arguments, at most one of which
should be non-null.
Differential Revision: https://reviews.llvm.org/D100276
Add the `memory_scope_all_devices` enum value, which is restricted to
OpenCL 3.0 or newer and the `__opencl_c_atomic_scope_all_devices`
feature. Also guard `memory_scope_all_svm_devices` accordingly, which
is already available in OpenCL 2.0.
The `__opencl_c_atomic_scope_all_devices` feature is header-only, so
set its define to 1 in `opencl-c-base.h`. This is done
unconditionally at the moment, as the mechanism for disabling
header-only options hasn't been decided yet.
This patch only adds a negative test for now. Ideally adding a CL3.0
run line to atomic-ops.cl should suffice as a positive test, but we
cannot do that yet until (at least) generic address spaces and program
scope variables are supported in OpenCL 3.0 mode.
Differential Revision: https://reviews.llvm.org/D103241
This fixes inconsistencies in the ms_abi.c testcase.
Also add a couple cases of missing double pointers in the windows part
of the testcase; the outcome of building that testcase on windows hasn't
changed, but the previous form of the test was imprecise (checking
for "%[[STRUCT_FOO]]*" when clang actually generates "%[[STRUCT_FOO]]**"),
which still used to match.
Ideally this would share code with the native Windows case, but
X86_64ABIInfo and WinX86_64ABIInfo aren't superclasses/subclasses of
each other so it's impractical, and the code to share currently only
consists of a couple lines.
Differential Revision: https://reviews.llvm.org/D103837
This implements support for using libc++ headers and library in the MSVC
toolchain. We only support libc++ that is a part of the toolchain, and
not headers installed elsewhere on the system.
Differential Revision: https://reviews.llvm.org/D101479
So far, support for x86_64-linux-gnux32 has been handled by explicit
comparisons of Triple.getEnvironment() to GNUX32. This worked as long as
x86_64-linux-gnux32 was the only X32 environment to worry about, but we
now have x86_64-linux-muslx32 as well. To support this, this change adds
an isX32() function and uses it. It replaces all checks for GNUX32 or
MuslX32 by isX32(), except for the following:
- Triple::isGNUEnvironment() and Triple::isMusl() are supposed to treat
GNUX32 and MuslX32 differently.
- computeTargetTriple() needs to be able to transform triples to add or
remove X32 from the environment and needs to map GNU to GNUX32, and
Musl to MuslX32.
- getMultiarchTriple() completely lacks any Musl support and retains the
explicit check for GNUX32 as it can only return x86_64-linux-gnux32.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D103777
On x86_64 mingw, long doubles are always passed indirectly as
arguments (see an existing case in WinX86_64ABIInfo::classify);
generalize the existing code for reading varargs - any non-aggregate
type that is larger than 64 bits (which would be both long double
in mingw, and __int128) are passed indirectly too.
This makes reading varargs consistent with how they're passed,
fixing interop with both gcc and clang callers, for long double
and __int128.
Differential Revision: https://reviews.llvm.org/D103452
Refactored code of dependence processing and added new inoutset dependence type.
Compiler can set dependence flag to 0x8 when call __kmpc_omp_task_with_deps.
Size of type of the dependence flag changed from 1 to 4 bytes in clang.
All dependence flags library gets so far and corresponding dependence types:
1 - IN, 2 - OUT, 3 - INOUT, 4 - MUTEXINOUTSET, 8 - INOUTSET.
Differential Revision: https://reviews.llvm.org/D97085
This fixed PR#48894 for AArch64. The issue has been fixed for Arm in
https://reviews.llvm.org/D95872
The following rules apply to -Wa,-march with this change:
- Only compiler options apply to non assembly files
- Compiler and assembler options apply to assembly files
- For assembly files, we prefer the assembler option(s) if we have both kinds of option
- Of the options that apply (or are preferred), the last value wins (it's not additive)
Reviewed By: DavidSpickett, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D103184
If the memory object is scalable type, we do not know the exact size of
it at compile time. Set the size of lifetime marker to unknown if the
object is scalable one.
Differential Revision: https://reviews.llvm.org/D102822
Use llvm.experimental.vector.insert instead of storing into an alloca
when generating code for these intrinsics. This defers the codegen of
the generated vector to instruction selection, allowing existing
shufflevector style optimizations to apply.
Additionally, introduce a new target transform that can recognise fixed
predicate patterns in the svbool variants of these intrinsics.
Differential Revision: https://reviews.llvm.org/D103082
This fixes PR49198: Wrong usage of __dso_handle in user code leads to
a compiler crash.
When Init is an address of the global itself, we need to track it
across RAUW. Otherwise the initializer can be destroyed if the global
is replaced.
Differential Revision: https://reviews.llvm.org/D101156
This fixes the missing address space on `this` in the implicit move
assignment operator.
The function called here is an abstraction around the lines that have
been removed which also sets the address space correctly.
This is copied from CopyConstructor, CopyAssignment and MoveConstructor,
all of which use this function, and now MoveAssignment does too.
Fixes: PR50259
Reviewed By: svenvh
Differential Revision: https://reviews.llvm.org/D103252
The following was found by a customer and is accepted by the other primary
C++ compilers, but fails to compile in Clang:
namespace sss {
double foo(int, double);
template <class T>
T foo(T); // note: target of using declaration
} // namespace sss
namespace oad {
void foo();
}
namespace oad {
using ::sss::foo;
}
namespace sss {
using oad::foo; // note: using declaration
}
namespace sss {
double foo(int, double) { return 0; }
template <class T>
T foo(T t) { // error: declaration conflicts with target of using
return t;
}
} // namespace sss
I believe the issue is that MergeFunctionDecl() was calling
checkUsingShadowRedecl() but only considering a FunctionDecl as a
possible shadow and not FunctionTemplateDecl. The changes in this patch
largely mirror how variable declarations were being handled by also
catching FunctionTemplateDecl.
Extend debug info handling by adding DWARF address space mapping for
SPIR, with corresponding test case.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D103097
Otherwise it is causing one of our build jobs to fail,
it is using "external" as directory, and NOT is
failing because "external" is found in ModuleID.
Differential Revision: https://reviews.llvm.org/D103658
Need to emit a call for __kmpc_cancel_barrier in the exit block for
__kmpc_cancel function call if cancellation of the parallel block is
requested.
Differential Revision: https://reviews.llvm.org/D103646
Patch allows using of constexpr vars evaluatable to constant calue to be
used in declare mapper construct.
Differential Revision: https://reviews.llvm.org/D103642
spack HIP device library is installed at amdgcn directory under llvm/clang
directory.
This patch fixes detection of HIP device library for spack.
Reviewed by: Artem Belevich, Harmen Stoppels
Differential Revision: https://reviews.llvm.org/D103281
When a project uses PCH with explicit modules, the build will look like this:
1. scan PCH dependencies
2. explicitly build PCH
3. scan TU dependencies
4. explicitly build TU
Step 2 produces an object file for the PCH, which the dependency scanner needs to read in step 3. This patch adds support for this.
The `clang-scan-deps` invocation in the attached test would fail without this change.
Depends on D103516.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D103519
Dependency scanning currently performs an implicit build. When testing that Clang can build modules with the command-lines generated by `clang-scan-deps`, the actual compilation would overwrite artifacts created during the scan, which makes debugging harder than it should be and can lead to errors in multi-step builds.
To prevent this, this patch adds new flag to `clang-scan-deps` that allows developers to customize the directory to use when generating module map paths, instead of always using the module cache. Moreover, the explicit context hash in now part of the PCM path, which will be useful in D102488, where the context hash can change due to command-line pruning.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D103516
This patch solves an error such as:
incompatible operand types ('vbool4_t' (aka '__rvv_bool4_t') and '__rvv_bool4_t')
when one of the value is a TypedefType of the other value in ?:.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D103603
Currently some amdgcn builtins are defined with long int type,
which causes invalid IR on Windows since long int is 32 bit
on Windows whereas these builtins have 64 bit arguments.
long long int type cannot be used since it is 128 bit in OpenCL.
This patch uses 64 bit int type instead of long int to define 64 bit int
arguments or return for amdgcn builtins.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D103563
A recent change (D99683) to support ThinLTO for HIP caused a regression
when compiling cuda code with -flto=thin -fwhole-program-vtables.
Specifically, we now get an error:
error: invalid argument '-fwhole-program-vtables' only allowed with '-flto'
This error is coming from the device offload cc1 action being set up for
the cuda compile, for which -flto=thin doesn't apply and gets dropped.
This is a regression, but points to a potential issue that was silently
occurring before the patch, details below.
Before D99683, the check for fwhole-program-vtables in the driver looked
like:
if (WholeProgramVTables) {
if (!D.isUsingLTO())
D.Diag(diag::err_drv_argument_only_allowed_with)
<< "-fwhole-program-vtables"
<< "-flto";
CmdArgs.push_back("-fwhole-program-vtables");
}
And D.isUsingLTO() returned true since we have -flto=thin. However,
because the cuda cc1 compile is doing device offloading, which didn't
support any LTO, there was other code that suppressed -flto* options
from being passed to the cc1 invocation. So the cc1 invocation silently
had -fwhole-program-vtables without any -flto*. This seems potentially
problematic, since if we had any virtual calls we would get type test
assume sequences without the corresponding LTO pass that handles them.
However, with the patch, which adds support for device offloading LTO
option -foffload-lto=thin, the code has changed so that we set a bool
IsUsingLTO based on either -flto* or -foffload-lto*, depending on
whether this is the device offloading action. For the device offload
action in our compile, since we don't have -foffload-lto, IsUsingLTO is
false, and the check for LTO with -fwhole-program-vtables now fails.
What we should do is only pass through -fwhole-program-vtables to the
cc1 invocation that has LTO enabled (either the device offload action
with -foffload-lto, or the non-device offload action with -flto), and
otherwise drop the -fwhole-program-vtables for the non-LTO action.
Then we should error only if we have -fwhole-program-vtables without any
-f*lto* options.
Differential Revision: https://reviews.llvm.org/D103579
Prior to this patch when you used `clang -module-file-info` clang would
delete the module on completion because the module was treated as an
output file.
This fixes the issue so you don't need to invoke cc1 directly to get
module file information.
Reviewed By: steven_wu, phosek
Differential Revision: https://reviews.llvm.org/D103547
The PreInits of a loop transformation (atm moment only tile) include the computation of the trip count. The trip count is needed by any loop-associated directives that consumes the transformation-generated loop. Hence, we must ensure that the PreInits of consumed loop transformations are emitted with the consuming directive.
This is done by addinging the inner loop transformation's PreInits to the outer loop-directive's PreInits. The outer loop-directive will consume the de-sugared AST such that the inner PreInits are not emitted twice. The PreInits of a loop transformation are still emitted directly if its generated loop(s) are not associated with another loop-associated directive.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D102180
In the case where the device is an itanium target, and the host is a
windows target, we were getting the names wrong, since in the itanium
case we filter by lambda-signature.
The fix is to always filter by the signature rather than just on
non-windows builds. I considered doing the reverse (that is, checking
the aux-triple), but doing so would result in duplicate lambda mangling
numbers (from linux reusing the same number for different signatures).
This attribute applies to a using declaration, and permits importing a
declaration without knowing if that declaration exists. This is useful
for libc++ C wrapper headers that re-export declarations in std::, in
cases where the base C library doesn't provide all declarations.
This attribute was proposed in http://lists.llvm.org/pipermail/cfe-dev/2020-June/066038.html.
rdar://69313357
Differential Revision: https://reviews.llvm.org/D90188
Recently we added diagnosing ODR-use of host variables
in device functions, which includes ODR-use of const
host variables since they are not really emitted on
device side. This caused regressions since we used
to allow ODR-use of const host variables in device
functions.
This patch allows ODR-use of const variables in device
functions if the const variables can be statically initialized
and have an empty dtor. Such variables are marked with
implicit constant attrs and emitted on device side. This is
in line with what clang does for constexpr variables.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D103108
Currently clang and nvcc use c++14 as default std for C++.
gcc 11 even uses c++17 as default std for C++. However,
clang uses c++98 as default std for CUDA/HIP.
As c++14 has been well adopted and became default for
clang, it seems reasonable to use c++14 as default std
for CUDA/HIP.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D103221
All fuchsia targets will now use the relative-vtables ABI by default.
Also remove -fexperimental-relative-c++-abi-vtables from test RUNs targeting fuchsia.
Differential Revision: https://reviews.llvm.org/D102374
This addresses pr50497. The argument of a typeid expression is
unevaluated, *except* when it's a polymorphic type. We handle this by
parsing as unevaluated and then transforming to evaluated if we
discover it should have been an evaluated context.
We do the same in TreeTransform<Derived>::TransformCXXTypeidExpr,
entering unevaluated context before transforming and rebuilding the
typeid. But that's incorrect and can lead us to converting to
evaluated context twice -- and hitting an assert.
During normal template instantiation we're always cloning the
expression, but during generic lambda processing we do not necessarily
AlwaysRebuild, and end up with TransformDeclRefExpr unconditionally
calling MarkDeclRefReferenced around line 10226. That triggers the
assert.
// Mark it referenced in the new context regardless.
// FIXME: this is a bit instantiation-specific.
SemaRef.MarkDeclRefReferenced(E);
This patch makes 2 changes.
a) TreeTransform<Derived>::TransformCXXTypeidExpr only enters
unevaluated context if the typeid's operand is not a polymorphic
glvalue. If it is, it keeps the same evaluation context.
b) Sema::BuildCXXTypeId is altered to only transform to evaluated, if
the current context is unevaluated.
Differential Revision: https://reviews.llvm.org/D103258
We now make up a TypeLoc for the class receiver to simplify visiting,
notably for indexing, availability, and clangd.
Differential Revision: https://reviews.llvm.org/D101645
When applying the changes in 8edd3464af,
it seems that this bit got merged incorrectly and no test coverage
caught the issue. This fixes the diagnostic and adds a test.
This is a re-application of dc67299 which was reverted in f63adf5b because
it broke the build. The issue should now be fixed.
Attribution note: The original author of this patch is Erik Pilkington.
I'm only trying to land it after rebasing.
Differential Revision: https://reviews.llvm.org/D91630
Because half is limited to the `cl_khr_fp16` extension being enabled,
`DefaultLvalueConversion` can fail when it's not enabled.
The original assumption that it will never fail is therefore wrong now.
Fixes: PR47976
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D103175
The tightened checks from commit 722c39fef5 did not work
fully for buildbots using symlinks in repo paths. This patch
is not fully reverting 722c39fef5, as we still match that
there is a "/lib" somewhere in the path before "/clang/".
So this is once again a bit fragile in case someone would put
their repo in a base directory, for example, named
"/scratch/lib/foo/clang/llvm-project/". But it is atleast a
bit better than the original checks (avoiding the problem that
commit 722c39fef5 was solving).
Summary: Make StoreManager::castRegion function usage safier. Replace `const MemRegion *` with `Optional<const MemRegion *>`. Simplified one of related test cases due to suggestions in D101635.
Differential Revision: https://reviews.llvm.org/D103319
At the moment, the matrix support in CheckCXXCStyleCast (added in
D101696) breaks function-style constructor calls that take a
single matrix value, because it is treated as matrix cast.
Instead, unify the C++ matrix cast handling by moving the logic to
TryStaticCast and only handle the case where both types are matrix
types. Otherwise, fall back to the generic mis-match detection.
Suggested by @rjmccall
Reviewed By: SaurabhJha
Differential Revision: https://reviews.llvm.org/D103163
Actually compare each version to the version of the last chosen one.
There's no guarantee that the added test case does showcase the
previous issue (it depends on the order that directory entries
are returned when iterating), but with the issue fixed it should behave
deterministically in any case.
Also improve the match patterns in the mingw-sysroot.cpp test a bit.
Differential Revision: https://reviews.llvm.org/D102873
This is the first in a series of patches to provide builtins for
compatibility with the XL compiler. Most of the builtins already had
intrinsics and only needed to be implemented in the front end.
Intrinsics were created for the three iospace builtins, eieio, and icbt.
Pseudo instructions were created for eieio and iospace_eieio to
ensure that nops were inserted before the eieio instruction.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D102443
These actually can be automatically imported from another DLL. (This
works properly as long as the actual implementation of emutls is
linked dynamically from e.g. libgcc; if the implementation comes from
compiler-rt or a statically linked libgcc, it doesn't work as intended.)
This fixes PR50146 and https://github.com/msys2/MINGW-packages/issues/8706
(fixing calling std::call_once in a dynamically linked libstdc++);
since f731839584 the dso_local attribute
on the TLS variable affected the actual generated code for accessing
the emutls variable.
The dso_local attribute on the emutls variable made those accesses to
use 32 bit relative addressing in code, which requires runtime pseudo
relocations in the text section, and breaks entirely if the actual
other variable ends up loaded too far away in the virtual address
space.
Differential Revision: https://reviews.llvm.org/D102970
I discovered when merging the __builtin_sycl_unique_stable_name into my
downstream that it is actually possible for the cc1 invocation to have
more than 1 Sema instance, if you pass it multiple input files, each
gets its own Sema instance and thus ASTContext instance. The result was
that the call to Filter the SYCL kernels was using an
ItaniumMangleContext stored via a 'magic static', so it had an invalid
reference to ASTContext when processing the 2nd failure.
The failure is unfortunately flakey/transient, but the test that fails
was added anyway.
The magic-static was switched to a unique_ptr member variable in
ASTContext that is initialized when needed.
It is a reference-counted class but it uses different methods for that
and the checker doesn't understand them yet.
Differential Revision: https://reviews.llvm.org/D103081
Like other sanitizers, enable __has_feature(coverage_sanitizer) if clang
has enabled at least one SanitizerCoverage instrumentation type.
Because coverage instrumentation selection is not handled via normal
-fsanitize= (and thus not in SanitizeSet), passing this information
through to LangOptions required propagating the already parsed
-fsanitize-coverage= options from CodeGenOptions through to LangOptions
in FixupInvocation().
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D103159
As discussed in PR50385, strict-fp on PowerPC SPE has not been handled
well. This patch disables it by default for SPE.
Reviewed By: nemanjai, vit9696, jhibbits
Differential Revision: https://reviews.llvm.org/D103235
Summary:
We are going to have libc++abi.a and libunwind.a on AIX.
Add the necessary linking command to pick the libraries up.
Reviewed By: daltenty
Differential Revision: https://reviews.llvm.org/D102813
Similar to how we allow managed and asserted locks to be held and not
held in joining branches, we also allow them to be held shared and
exclusive. The scoped lock should restore the original state at the end
of the scope in any event, and asserted locks need not be released.
We should probably only allow asserted locks to be subsumed by managed,
not by (directly) acquired locks, but that's for another change.
Reviewed By: delesley
Differential Revision: https://reviews.llvm.org/D102026
The original version of this was reverted, and @rjmcall provided some
advice to architect a new solution. This is that solution.
This implements a builtin to provide a unique name that is stable across
compilations of this TU for the purposes of implementing the library
component of the unnamed kernel feature of SYCL. It does this by
running the Itanium mangler with a few modifications.
Because it is somewhat common to wrap non-kernel-related lambdas in
macros that aren't present on the device (such as for logging), this
uniquely generates an ID for all lambdas involved in the naming of a
kernel. It uses the lambda-mangling number to do this, except replaces
this with its own number (starting at 10000 for readabililty reasons)
for lambdas used to name a kernel.
Additionally, this implements itself as constexpr with a slight catch:
if a name would be invalidated by the use of this lambda in a later
kernel invocation, it is diagnosed as an error (see the Sema tests).
Differential Revision: https://reviews.llvm.org/D103112
WG14 adopted N2645 and WG21 EWG has accepted P2334 in principle (still
subject to full EWG vote + CWG review + plenary vote), which add
support for #elifdef as shorthand for #elif defined and #elifndef as
shorthand for #elif !defined. This patch adds support for the new
preprocessor directives.
This reverts commit 6911114d8c.
Broke the QEMU sanitizer bots due to a missing header dependency. This
actually needs to be fixed on the bot-side, but for now reverting this
patch until I can fix up the bot.
This patch moves -fsanitize=scudo to link the standalone scudo library,
rather than the original compiler-rt based library. This is one of the
major remaining roadblocks to deleting the compiler-rt based scudo,
which should not be used any more. The standalone Scudo is better in
pretty much every way and is much more suitable for production usage.
As well as patching the litmus tests for checking that the
scudo_standalone lib is linked instead of the scudo lib, this patch also
ports all the scudo lit tests to run under scudo standalone.
This patch also adds a feature to scudo standalone that was under test
in the original scudo - that arguments passed to an aligned operator new
were checked that the alignment was a power of two.
Some lit tests could not be migrated, due to the following issues:
1. Features that aren't supported in scudo standalone, like the rss
limit.
2. Different quarantine implementation where the test needs some more
thought.
3. Small bugs in scudo standalone that should probably be fixed, like
the Secondary allocator having a full page on the LHS of an allocation
that only contains the chunk header, so underflows by <= a page aren't
caught.
4. Slight differences in behaviour that's technically correct, like
'realloc(malloc(1), 0)' returns nullptr in standalone, but a real
pointer in old scudo.
5. Some tests that might be migratable, but not easily.
Tests that are obviously not applicable to scudo standalone (like
testing that no sanitizer symbols made it into the DSO) have been
deleted.
After this patch, the remaining work is:
1. Update the Scudo documentation. The flags have changed, etc.
2. Delete the old version of scudo.
3. Patch up the tests in lit-unmigrated, or fix Scudo standalone.
Reviewed By: cryptoad, vitalybuka
Differential Revision: https://reviews.llvm.org/D102543
VS 2019 16.11 (just released in Preview) is adding support for the
/std:c++20 option and bumping /std:c++latest to "post-c++20". This
updates clang-cl to match.
Differential revision: https://reviews.llvm.org/D103155
The changes in commit 722c39fef5 caused the test case to fail
when building with -DLLVM_LIBDIR_SUFFIX=64. This patch makes the
checks a bit more relaxed to support libdir suffixes again.
Also adjusting the regular expressions to avoid mathes including
double quotes.
This relands commit 13dd65b3a1.
The original commit contained a test, which failed when compiled
for a MACH-O target.
This patch changes the test to run for x86_64-linux instead of
`%itanium_abi_triple`, to avoid having invalid syntax for MACH-O
sections. The patch itself does not care about section attribute
syntax and a x86 backend does not even need to be included in the
build.
Differential Revision: https://reviews.llvm.org/D102693
Since 4468e5b899 clang will prefer
the last one it finds of "-mimplicit-it" or "-Wa,-mimplicit-it".
Due to a mistake in that patch the compiler argument "-mimplicit-it"
was never marked as used, even if it was the last one and was passed
to llvm.
Move the Claim call back to the start of the loop and update
the testing to check we don't get any unused argument warnings.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D103086
We really ought to support no_sanitize("coverage") in line with other
sanitizers. This came up again in discussions on the Linux-kernel
mailing lists, because we currently do workarounds using objtool to
remove coverage instrumentation. Since that support is only on x86, to
continue support coverage instrumentation on other architectures, we
must support selectively disabling coverage instrumentation via function
attributes.
Unfortunately, for SanitizeCoverage, it has not been implemented as a
sanitizer via fsanitize= and associated options in Sanitizers.def, but
rolls its own option fsanitize-coverage. This meant that we never got
"automatic" no_sanitize attribute support.
Implement no_sanitize attribute support by special-casing the string
"coverage" in the NoSanitizeAttr implementation. To keep the feature as
unintrusive to existing IR generation as possible, define a new negative
function attribute NoSanitizeCoverage to propagate the information
through to the instrumentation pass.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=49035
Reviewed By: vitalybuka, morehouse
Differential Revision: https://reviews.llvm.org/D102772
During CTU, the *on-demand parsing* will read and parse the invocation
list to know how to compile the file being imported. However, it seems
that the invocation list will be parsed again if a previous parsing
has failed.
Then, parse again and fail again. This patch tries to overcome the
problem by storing the error code during the first parsing, and
re-create the stored error during the later parsings.
Reviewed By: steakhal
Patch By: OikawaKirie!
Differential Revision: https://reviews.llvm.org/D101763
GCC allows each target to define a set of non-letter and non-digit
escaped characters for inline assembly that will be replaced by another
string (They call this "punctuation" characters. The existing "%%" and
"%{" -- replaced by '%' and '{' at the end -- can be seen as special
cases shared by all targets).
This patch implements this feature by adding a new hook in `TargetInfo`.
Differential Revision: https://reviews.llvm.org/D103036
This fixes both https://bugs.llvm.org/show_bug.cgi?id=50309 and https://bugs.llvm.org/show_bug.cgi?id=50310.
Previously, lambdas inside functions would mark their own bodies for later analysis when encountering a potentially unavailable decl, without taking into consideration that the entire lambda itself might be correctly guarded inside an @available check. The same applied to inner class member functions. Blocks happened to work as expected already, since Sema::getEnclosingFunction() skips through block scopes.
This patch instead simply and conservatively marks the entire outermost function scope for search, and removes some special-case logic that prevented DiagnoseUnguardedAvailabilityViolations from traversing down into lambdas and nested functions. This correctly accounts for arbitrarily nested lambdas, inner classes, and blocks that may be inside appropriate @available checks at any ancestor level. It also treats all potential availability violations inside functions consistently, without being overly sensitive to the current DeclContext, which previously caused issues where e.g. nested struct members were warned about twice.
DiagnoseUnguardedAvailabilityViolations now has more work to do in some cases, particularly in functions with many (possibly deeply) nested lambdas and classes, but the big-O is the same, and the simplicity of the approach and the fact that it fixes at least two bugs feels like a strong win.
Differential Revision: https://reviews.llvm.org/D102338
When a const-qualified object has a section attribute, that
section is set to read-only and clang outputs a LLVM IR constant
for that object. This is incorrect for dynamically initialised
objects.
For example:
int init() { return 15; }
__attribute__((section("SA")))
const int a = init();
a is allocated to a read-only section and is left
unintialised (zero-initialised).
This patch adds checks if an initialiser is a constant expression
and allocates objects to sections as follows:
* const-qualified objects
- no initialiser or constant initialiser: .rodata
- dynamic initializer: .bss
* non const-qualified objects
- no initialiser or dynamic initialiser: .bss
- constant initialiser: .data
(".rodata", ".data", and ".bss" names used just for explanatory
purpose)
Differential Revision: https://reviews.llvm.org/D102693
Allow use of bit-fields as a clang extension
in OpenCL. The extension can be enabled using
pragma directives.
This fixes PR45339!
Differential Revision: https://reviews.llvm.org/D101843
Previously, information about `ConstructionContextLayer` was not
propagated thru causing the expression like:
Var c = (createVar());
To produce unrelated temporary for the `createVar()` result and conjure
a new symbol for the value of `c` in C++17 mode.
Reviewed By: steakhal
Patch By: tomasz-kaminski-sonarsource!
Differential Revision: https://reviews.llvm.org/D102835
When -gstrict-dwarf is specified, generate DW_TAG_rvalue_reference_type
at DWARF 4 or above
Reviewed By: dblaikie, aprantl
Differential Revision: https://reviews.llvm.org/D100630
This implements support for using libc++ headers and library in the MSVC
toolchain. We only support libc++ that is a part of the toolchain, and
not headers installed elsewhere on the system.
Differential Revision: https://reviews.llvm.org/D101479
Add options -[no-]offload-lto and -foffload-lto=[thin,full] for controlling
LTO for offload compilation. Allow LTO for AMDGPU target.
AMDGPU target does not support codegen of object files containing
call of external functions, therefore the LLVM module passed to
AMDGPU backend needs to contain definitions of all the callees.
An LLVM option is added to allow function importer to import
functions with noinline attribute.
HIP toolchain passes proper LLVM options to lld to make sure
function importer imports definitions of all the callees.
Reviewed by: Teresa Johnson, Artem Belevich
Differential Revision: https://reviews.llvm.org/D99683
D88631 added initial support for:
- -mstack-protector-guard=
- -mstack-protector-guard-reg=
- -mstack-protector-guard-offset=
flags, and D100919 extended these to AArch64. Unfortunately, these flags
aren't retained for LTO. Make them module attributes rather than
TargetOptions.
Link: https://github.com/ClangBuiltLinux/linux/issues/1378
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D102742
If multiple instances of the -arm-implicit-it option is passed to
the backend, it errors out.
Also fix cases where there are multiple -Wa,-mimplicit-it; the existing
tests indicate that the last one specified takes effect, while in
practice it passed double options, which didn't work as intended.
Differential Revision: https://reviews.llvm.org/D102812
There already exists cl_khr_fp64 extension. So OpenCL C 3.0
and higher should use the feature, earlier versions still
use the extension. OpenCL C 3.0 API spec states that extension
will be not described in the option string if corresponding
optional functionality is not supported (see 4.2. Querying Devices).
Due to that fact the usage of features for OpenCL C 3.0 must
be as follows:
```
$ clang -Xclang -cl-ext=+cl_khr_fp64,+__opencl_c_fp64 ...
$ clang -Xclang -cl-ext=-cl_khr_fp64,-__opencl_c_fp64 ...
```
e.g. the feature and the equivalent extension (if exists)
must be set to the same values
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D96524
this patch fixes Bug 27113 by adding support for string literals to the
implementation of the MS extension __identifier.
Differential revision: https://reviews.llvm.org/D100252
Instead of ignoring flto=auto and -flto=jobserver, treat them as -flto
and pass -flto=full along.
Differential Revision: https://reviews.llvm.org/D102479
.byte supports string, so if the whole byte list are printable,
we can actually print the string for readability and LIT tests maintainence.
.byte 'H,'e,'l,'l,'o,',,' ,'w,'o,'r,'l,'d
->
.byte "Hello, world"
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D102814
variables emitted on both host and device side with different addresses
when ODR-used by host function should not cause device side counter-part
to be force emitted.
This fixes the regression caused by https://reviews.llvm.org/D102237
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D102801
This patch adds supports for inline assembly operands and some simple
operand constraints, including register and constant operands.
Differential Revision: https://reviews.llvm.org/D102585
Add `-ffixed-a[0-6]` and `-ffixed-d[0-7]` and the corresponding
subtarget features to prevent certain register from being allocated.
Differential Revision: https://reviews.llvm.org/D102805
It turns out we have not correctly supported exception spec all along in
Emscripten EH. Emscripten EH supports `throw()` but not `throw` with
types. See https://bugs.llvm.org/show_bug.cgi?id=50396.
Wasm EH also only supports `throw()` but not `throw` with types, and we
have been printing a warning message for the latter. This prints the
same warning message for `throw` with types when Emscripten EH is used,
or more precisely, when Wasm EH is not used. (So this will print the
warning messsage even when `-fno-exceptions` is used but I think that
should be fine. It's cumbersome to do a complilcated option checking in
CGException.cpp and options checkings are mostly done in elsewhere.)
Reviewed By: dschuff, kripken
Differential Revision: https://reviews.llvm.org/D102791
Summary:
Call TryAltiVecVectorToken when an identifier is seen in the parser before
annotating the token. This checks the next token where necessary to ensure
that vector is properly handled as the altivec token.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: ZarkoCA (Zarko Todorovski)
Differential Revision: https://reviews.llvm.org/D100991
Follow up to D101357 / 3fa6510f6.
Supersedes D102330.
Goal: Use flags setting rdffrs instead of rdffr + ptest.
Problem: RDFFR_P doesn't have have a flags setting equivalent.
Solution: in instcombine, canonicalize to RDFFR_PP at the IR level, and
rely on RDFFR_PP+PTEST => RDFFRS_PP optimization in
AArch64InstrInfo::optimizePTestInstr.
While here:
* Test that rdffr.z+ptest generates a rdffrs.
* Use update_{test,llc}_checks.py on the tests.
* Use sve attribute on functions.
Differential Revision: https://reviews.llvm.org/D102623
Linker scripts might not handle COMDAT sections. SLSHardeing adds
new section for each __llvm_slsblr_thunk_xN. This new option allows
the generation of the thunks into the normal text section to handle these
exceptional cases.
,comdat or ,noncomdat can be added to harden-sls to control the codegen.
-mharden-sls=[all|retbr|blr],nocomdat.
Reviewed By: kristof.beyls
Differential Revision: https://reviews.llvm.org/D100546
This happens during the error-recovery, and it would esacpe all
dependent-type check guards in getTypeInfo/constexpr-evaluator code
paths, which lead to crashes.
Differential Revision: https://reviews.llvm.org/D102773
If a test does not contain an " simd" but -fopenmp-simd RUN lines we can
just check that we do not create __kmpc|__tgt calls.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101973
Currently 1 byte global object has a ridiculous 63 bytes redzone.
This patch reduces the redzone size to be less than 32 if the size of global object is less than or equal to half of 32 (the minimal size of redzone).
A 12 bytes object has a 20 bytes redzone, a 20 bytes object has a 44 bytes redzone.
Reviewed By: MaskRay, #sanitizers, vitalybuka
Differential Revision: https://reviews.llvm.org/D102469
`clang -fpic -fno-semantic-interposition` may set dso_local on variables for -fpic.
GCC folks consider there are 'address interposition' and 'semantic interposition',
and 'disabling semantic interposition' can optimize function calls but
cannot change variable references to use local aliases
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100483).
This patch removes dso_local for variables in
`clang -fpic -fno-semantic-interposition` mode so that the built shared objects can
work with copy relocations. Building llvm-project tiself with
-fno-semantic-interposition (D102453) should now be safe with trunk Clang.
Example:
```
// a.c
int var;
int *addr() { return var; }
// old: cannot be interposed
movslq .Lvar$local(%rip), %rax
// new: can be interposed
movq var@GOTPCREL(%rip), %rax
movslq (%rax), %rax
```
The local alias lowering for `GlobalVariable`s is kept in case there is a
future option allowing local aliases.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D102583
satisfaction.
Previously we used the rules for constant folding in a non-constant
context, meaning that we'd incorrectly accept foldable non-constant
expressions and that std::is_constant_evaluated() would evaluate to
false.
This reverts commit 2919222d80.
That commit broke backwards compatibility. Additionally, the
replacement, -Wa,-mimplicit-it, isn't yet supported by any stable
release of Clang.
See D102812 for a fix for the error cases when callers specify both
-mimplicit-it and -Wa,-mimplicit-it.
when implementing an optional protocol requirement
When an Objective-C method implements an optional protocol requirement,
allow the method to use a newer introduced or older obsoleted
availability version than what's specified on the method in the protocol
itself. This allows SDK adopters to adopt an optional method from a
protocol later than when the method is introduced in the protocol. The users
that call an optional method on an object that conforms to this protocol
are supposed to check whether the object implements the method or not,
so a lack of appropriate `if (@available)` check for a new OS version
is not a cause of concern as there's already another runtime check that's required.
Differential Revision: https://reviews.llvm.org/D102459
Summary:
Currently, only `OptimizationRemarks` can be emitted using a Function.
Add constructors to allow this for `OptimizationRemarksAnalysis` and
`OptimizationRemarkMissed` as well.
Reviewed By: jdoerfert thegameg
Differential Revision: https://reviews.llvm.org/D102784
This reapplies commit 95033eb3 that reverted commit 1d9e8e13.
The tests were failing on Windows due to spaces and backslashes in paths not being handled carefully.
We might encounter an undeduced type before calling getTypeAlignInChars.
NOTE: this retrieves the fix from
8f80c66bd2, which was removed in Adam's
followup fix fbfcfdbf68. We originally
thought the crash was caused by recovery-ast, but it turns out it can
occur for other cases, e.g. typo-correction.
Differential Revision: https://reviews.llvm.org/D102750
The checks (both positive and negative checks) in the test case
hip-include-path.hip could mistakenly end up matching the string
"clang" from the InstalledDir in case the build dir for example
was named "/home/username/build-clang/". Intention with this
patch is to tighten up the checks a bit to filter our the
part of the paths that match with InstalledDir when doing the
checks, as well as matching "/lib/clang/" rather than
just "clang/".
Problem was found when building with
-DCLANG_DEFAULT_RTLIB=compiler-rt
-DCLANG_DEFAULT_CXX_STDLIB=libc++
and having "clang/" in the path to the build dir.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D102723
The program point created by the checker, even if it is an error node,
might not be the same as the name under which the report is emitted.
Make sure we're checking the name of the checker, because thats what
we're silencing after all.
Differential Revision: https://reviews.llvm.org/D102683
To bring D99599's implementation in line with the existing
PrintPassInstrumentation, and to fix a FIXME, add more customizability
to PrintPassInstrumentation.
Introduce three new options. The first takes over the existing
"-debug-pass-manager-verbose" cl::opt.
The second and third option are specific to -fdebug-pass-structure. They
allow indentation, and also don't print analysis queries.
To avoid more golden file tests than necessary, prune down the
-fdebug-pass-structure tests.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D102196
Revert recent commit to require x86-registered-target (e4b790c5e3).
Remove -O1 from the run lines so they are less dependent on backend passes.
Update the CHECK6 and CHECK10 lines with script.
Differential Revision: https://reviews.llvm.org/D102720
This is a GNU as and Clang cc1as option, not a GCC option.
Users should specify `-Wa,-mimplicit-it=` instead.
Note: mixing the -m option and the -Wa, option doesn't work
`-Wa,-mimplicit-it=never -mimplicit-it=always` =>
`clang (LLVM option parsing): for the --arm-implicit-it option: may only occur zero or one times!`
Reviewed By: nickdesaulniers, raj.khem
Differential Revision: https://reviews.llvm.org/D102568
This was causing some Mac-specific build failures:
http://45.33.8.238/macm1/9739/step_7.txthttp://45.33.8.238/mac/31615/step_7.txt
As best I can tell with psychic debugging, the /Users/blah path to the
source file is being treated as a macro undef with the clang-cl driver.
This splits the filename off explicitly so hopefully the rest of the
command line arguments will be read properly.
llvm-objcopy has been changed to support adding a section and updating section flags
in one run (D90438), so we can now change clang-offload-bundler to run llvm-objcopy
tool only once when creating fat object.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D102670
Currently, we have support for SYCL 1.2.1 (also known as SYCL 2017).
This patch introduces the start of support for SYCL 2020 mode, which is
the latest SYCL standard available at (https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html).
This sets the default SYCL to be 2020 in the driver, and introduces the
notion of a "default" version (set to 2020) when cc1 is in SYCL mode
but there was no explicit -sycl-std= specified on the command line.
We use `CHECK-LABEL: define` to divide input stream into functions,
this works well on most platforms.
But there are cases that some platforms (eg: AIX) may have different
codegen , especially for global constructor and descructors.
On AIX, the codegen will have two more functions: __dtor_b,
__finalize_b, which will fail the test.
The fix is to use specific function name so that we can safely ignore
those unrelated codegen differences.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102654
This fixes the initialization of objects in the __constant
address space that occurs when declaring the object.
Fixes part of PR42566
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D102248
This patch is the Part-1 (FE Clang) implementation of HW Exception handling.
This new feature adds the support of Hardware Exception for Microsoft Windows
SEH (Structured Exception Handling).
This is the first step of this project; only X86_64 target is enabled in this patch.
Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -fasync-exceptions,
plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.
NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.
The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow
three rules:
* First, no exception can move in or out of _try region., i.e., no "potential
faulty instruction can be moved across _try boundary.
* Second, the order of exceptions for instructions 'directly' under a _try
must be preserved (not applied to those in callees).
* Finally, global states (local/global/heap variables) that can be read
outside of _try region must be updated in memory (not just in register)
before the subsequent exception occurs.
The impact to C++ code:
Although SEH is a feature for C code, -EHa does have a profound effect on C++
side. When a C++ function (in the same compilation unit with option -EHa ) is
called by a SEH C function, a hardware exception occurs in C++ code can also
be handled properly by an upstream SEH _try-handler or a C++ catch(...).
As such, when that happens in the middle of an object's life scope, the dtor
must be invoked the same way as C++ Synchronous Exception during unwinding
process.
Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge
added on memory/computation instruction (previous iload/istore idea) so that
exception path is modeled in Flow graph preciously. However, tracking every
single memory instruction and potential faulty instruction can create many
Invokes, complicate flow graph and possibly result in negative performance
impact for downstream optimization and code generation. Making all
optimizations be aware of the new semantic is also substantial.
This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.
One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:
A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing
SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block succeeds various states from its predecessors, the lowest
State triumphs others. If some exits flow to unreachable, propagation on those
paths terminate, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try.
However there is one rare exception: jumping into a lifetime that has Dtor but
has no Ctor is warned, but allowed:
Warning: jump bypasses variable with a non-trivial destructor
In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject a eha_scope_begin() invoke in the side entry block to
ensure a correct State.
Implementation:
Part-1: Clang implementation described below.
Two intrinsic are created to track CPP object scopes; eha_scope_begin() and eha_scope_end().
_scope_begin() is immediately added after ctor() is called and EHStack is pushed.
So it must be an invoke, not a call. With that it's also guaranteed an
EH-cleanup-pad is created regardless whether there exists a call in this scope.
_scope_end is added before dtor(). These two intrinsics make the computation of
Block-State possible in downstream code gen pass, even in the presence of
ctor/dtor inlining.
Two intrinsic, seh_try_begin() and seh_try_end(), are added for C-code to mark
_try boundary and to prevent from exceptions being moved across _try boundary.
All memory instructions inside a _try are considered as 'volatile' to assure
2nd and 3rd rules for C-code above. This is a little sub-optimized. But it's
acceptable as the amount of code directly under _try is very small.
Part-2 (will be in Part-2 patch): LLVM implementation described below.
For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
rely on the existing State tracking code (UnwindMap and InvokeStateMap).
For both C++ & C-code, the state of each block with potential trap instruction
is marked and reported in DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is done.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate LLVM Windows EH
implementation, in which the address in IPToState table is offset by +1.
(note the purpose of that is to ensure the return address of a call is in the
same scope as the call address.
The handler for catch(...) for -EHa must handle HW exception. So it is
'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches
C++ exceptions).
Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW
exceptions can be passed through.
Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.htmlhttps://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html
Differential Revision: https://reviews.llvm.org/D80344/new/
The tests of fdebug-compilation-dir and -ffile-compilation-dir for `-x
assembler` are assuming integrated-as.
If the platform set the no-itegrated-as by default (eg: AIX for now), then this test will
fail.
Add the -integrated-as to aviod relying on the platform defaults.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D102647
I wouldn't recommend writing code like the testcase; a function
parameter isn't atomic, so using an atomic type doesn't really make
sense. But it's valid, so clang shouldn't crash on it.
The code was assuming hasAggregateEvaluationKind(Ty) implies Ty is a
RecordType, which isn't true. Just use isRecordType() instead.
Differential Revision: https://reviews.llvm.org/D102015
Follow up to D88631 but for aarch64; the Linux kernel uses the command
line flags:
1. -mstack-protector-guard=sysreg
2. -mstack-protector-guard-reg=sp_el0
3. -mstack-protector-guard-offset=0
to use the system register sp_el0 for the stack canary, enabling the
kernel to have a unique stack canary per task (like a thread, but not
limited to userspace as the kernel can preempt itself).
Address pr/47341 for aarch64.
Fixes: https://github.com/ClangBuiltLinux/linux/issues/289
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed By: xiangzhangllvm, DavidSpickett, dmgreen
Differential Revision: https://reviews.llvm.org/D100919
Missing or duplicate spack package should not cause error, since
users may only installed llvm/clang package, or users may installed
duplicate HIP package but will use environment variable or compiler
option to choose HIP path.
The message about missing or duplicate spack package is informational,
therefore should be emitted only when -v is specified.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D102556
1.[bool, char, short] bitfields have the same alignment as unsigned int
2.Adjust alignment on typedef field decls/honor align attribute
3.Fix alignment for scoped enum class
4.Long long bitfield has 4bytes alignment and StorageUnitSize under 32 bit
compile mode
Differential Revision: https://reviews.llvm.org/D87029
This patch removes duplicates also encountered in the output of clang-scan-deps when one same header file is encountered with different casing and/or different separators ('/' vs '\').
The case of separators can appear when the same file is included externally by
`#include <folder/file.h>`
whereas a file from the same folder does
`#include "file.h"`
Under Windows, clang computes the paths using '/' from the include directive, the `\` from the -I options, and the concatenations use the native `\`, leading to internal paths containing a mix of both separators.
Differential Revision: https://reviews.llvm.org/D102339
Adding lowering support for bitreverse.
Previously, lowering bitreverse would expand it into a series of other instructions. This patch makes it so this produces a single rbit instruction instead.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D102397
`__block` variables used to be always stored on the head instead of stack.
D51564 allowed `__block` variables to the stored on the stack like normal
variablesif they not captured by any escaping block, but the debug-info
generation code wasn't made aware of it so we still unconditionally emit DWARF
expressions pointing to the heap.
This patch makes CGDebugInfo use the `EscapingByref` introduced in D51564 that
tracks whether the `__block` variable is actually on the heap. If it's stored on
the stack instead we just use the debug info we would generate for normal
variables instead.
Reviewed By: ahatanak, aprantl
Differential Revision: https://reviews.llvm.org/D99946
Fixes issues with vectors in reinterpret_cast in C++ for OpenCL
and adds tests to make sure they both pass without errors and
generate the correct code.
Fixes: PR47977
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D101519
Drop non-conformant extension pragma implementation as
it does not properly disable anything and therefore
enabling non-disabled logic has no meaning.
This simplifies clang code and user interface to the extension
functionality. With this patch extension pragma 'begin'/'end'
and 'enable'/'disable' are only accepted for backward
compatibility and no longer have any default behavior.
Differential Revision: https://reviews.llvm.org/D101043
This patch adds support for inferred modules to the dependency scanner.
Effectively a cherry-pick of https://github.com/apple/llvm-project/pull/699 authored by @Bigcheese with libclang and other changes omitted.
Contains following changes:
1. [Clang][ScanDeps] Ignore __inferred_module.map dependency.
* This shows up with inferred modules, but it doesn't exist on disk, so don't report it as a dependency.
2. [Clang][ScanDeps] Use the module map a module was inferred from for inferred modules.
Also includes a smoke test that uses clang-scan-deps output to perform an explicit build. There's no intention to duplicate whatever `test/Modules` contains, just to verify the produced command-line does "work" (with very loose definition of work).
Split from D100934.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102495
This patch enables explicitly building inferred modules.
Effectively a cherry-pick of https://github.com/apple/llvm-project/pull/699 authored by @Bigcheese with libclang and dependency scanner changes omitted.
Contains the following changes:
1. [Clang] Fix the header paths in clang::Module for inferred modules.
* The UmbrellaAsWritten and NameAsWritten fields in clang::Module are a lie for framework modules. For those they actually are the path to the header or umbrella relative to the clang::Module::Directory.
* The exception to this case is for inferred modules. Here it actually is the name as written, because we print out the module and read it back in when implicitly building modules. This causes a problem when explicitly building an inferred module, as we skip the printing out step.
* In order to fix this issue this patch adds a new field for the path we want to use in getInputBufferForModule. It also makes NameAsWritten actually be the name written in the module map file (or that would be, in the case of an inferred module).
2. [Clang] Allow explicitly building an inferred module.
* Building the actual module still fails, but make sure it fails for the right reason.
Split from D100934.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102491
ScheduleDAGFast.cpp is compiled to object file, but the ScheduleDAGFast
object file isn't linked into clang executable file as no symbol is
referred by outside. Add calling to createXxx of ScheduleDAGFast.cpp,
then the ScheduleDAGFast object file will be linked into clang
executable file. The static RegisterScheduler will register scheduler
fast and linearize at clang boot time.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D101601
Reapply after adjusting the synchronized.m test case, where the
TODO is now resolved. The pointer is only captured on the exception
handling path.
-----
For the CapturesBefore tracker, it is sufficient to check that
I can not reach BeforeHere. This does not necessarily require
that BeforeHere dominates I, it can also occur if the capture
happens on an entirely disjoint path.
This change was previously accepted in D90688, but had to be
reverted due to large compile-time impact in some cases: It
increases the number of reachability queries that are performed.
After recent changes, the compile-time impact is largely mitigated,
so I'm reapplying this patch. The remaining compile-time impact
is largely proportional to changes in code-size.
This patch replaces the `powerpc64` token with the `system-aix` one in
the UNSUPPORTED line of a test. The `powerpc64` token was originally
added temporarily in 71a0609a2b.
If AIX uses integrated-as by default and it works both for 32-bit and
64-bit objects, then the issues encountered so far (see comments in
D96033) would be mostly solved.
As it is, marking the test as expected-to-fail (as opposed to
unsupported) on AIX might cause more trouble in the form of 32-bit
versus 64-bit differences. I am not aware of other situations where LIT
tests are dependent on whether the LLVM build is 64-bit or 32-bit.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D102560
This patch adds support for GCC's -fstack-usage flag. With this flag, a stack
usage file (i.e., .su file) is generated for each input source file. The format
of the stack usage file is also similar to what is used by GCC. For each
function defined in the source file, a line with the following information is
produced in the .su file.
<source_file>:<line_number>:<function_name> <size_in_byte> <static/dynamic>
"Static" means that the function's frame size is static and the size info is an
accurate reflection of the frame size. While "dynamic" means the function's
frame size can only be determined at run-time because the function manipulates
the stack dynamically (e.g., due to variable size objects). The size info only
reflects the size of the fixed size frame objects in this case and therefore is
not a reliable measure of the total frame size.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D100509
Support for Darwin's libsystem_m's vector functions has been added to
LLVM in 93a9a8a8d9.
This patch adds support for -fveclib=Darwin_libsystem_m to Clang.
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D102489
Remove requirements on extension pragma in atomic types
because it has not respected the spec wrt disabling types
and hasn't been useful either. With this change, the
developers can use atomic types from the extensions if they
are supported without enabling the pragma just like the builtin
functions
This patch does not break backward compatibility since the
extension pragma is still supported and it makes the behavior of
the compiler less strict by accepting code without needless and
inconsistent pragma statements.
Differential Revision: https://reviews.llvm.org/D100976
Since 5de2d189e6 this particular warning
hasn't had the location of the source file containing the inline
assembly.
Fix this by reporting via LLVMContext. Which means that we no longer
have the "instantiated into assembly here" lines but they were going to
point to the start of the inline asm string anyway.
This message is already tested via IR in llvm. However we won't have
the required location info there so I've added a C file test in clang
to cover it.
(though strictly, this is testing llvm code)
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D102244
This test is failing on some builders (see [1]) with the following error:
error: Added modules have incompatible data layouts:
e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512 (module) vs
E-m:a-i64:64-n32:64-S128-v256:256:256-v512:512:512 (jit)
The JIT layout is correct, but some IR module added to the JIT is using a
little-endian layout instead.
This commit disables the test on ppc64 until we can investigate further and
fix the bug.
[1] https://lab.llvm.org/staging/#/builders/126/builds/371
I've taken the following steps to add unwinding support from inline assembly:
1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax:
```
invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"()
to label %exit unwind label %uexit
```
2.) Add Bitcode writing/reading support + LLVM-IR parsing.
3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled.
4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind.
5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled.
6.) Don't allow unwinding callbr.
Reviewed By: Amanieu
Differential Revision: https://reviews.llvm.org/D95745
The original change was reverted because it was discovered
that clang mishandles thunks, and they receive wrong
attributes for their this/return types - the ones for the function
they will call, not the ones they have.
While i have tried to fix this in https://reviews.llvm.org/D100388
that patch has been up and stuck for a month now,
with little signs of progress.
So while it will be good to solve this for real,
for now we can simply avoid introducing the bug,
by not annotating this/return for thunks.
This reverts commit 6270b3a1ea,
relanding 0aa0458f14.
As it was discovered in post-commit feedback
for 0aa0458f14,
we handle thunks incorrectly, and end up annotating
their this/return with attributes that are valid
for their callees, not for thunks themselves.
While it would be good to fix this properly,
and keep annotating them on thunks,
i've tried doing that in https://reviews.llvm.org/D100388
with little success, and the patch is stuck for a month now.
So for now, as a stopgap measure, subj.
Add user-facing front end option to turn off power10 prefixed instructions.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D102191
This commit removes some redundant {insert,extract}_vector intrinsic
chains by implementing the following patterns as instsimplifies:
(insert_vector _, (extract_vector X, 0), 0) -> X
(extract_vector (insert_vector _, X, 0), 0) -> X
Reviewed By: peterwaller-arm
Differential Revision: https://reviews.llvm.org/D101986
Currently when including stdbool.h and altivec.h declaration of `vector bool` leads to
errors due to `bool` being expanded to '_Bool`. This patch allows the parser
to recognize `_Bool`.
Reviewed By: hubert.reinterpretcast, Everybody0523
Differential Revision: https://reviews.llvm.org/D102064
This was reverted to mitigate mitigate miscompiles caused by
the logical and/or to bitwise and/or fold. Reapply it now that
the underlying issue has been fixed by D101191.
-----
This patch folds more operations to poison.
Alive2 proof: https://alive2.llvm.org/ce/z/mxcb9G (it does not contain tests about div/rem because they fold to poison when raising UB)
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D92270
There are two reasons this shouldn't be restricted to Power8 and up:
1. For XL compatibility
2. Because clang will expand comparison operators to these intrinsics*
*Without this patch, the following causes a selection error:
int test(vector signed long a, vector signed long b) {
return a < b;
}
This patch provides the handling for the intrinsics in the back
end and removes the Power8 guards from the predicate functions
(vec_{all|any}_{eq|ne|gt|ge|lt|le}).
Original commit message:
In http://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html we have
mentioned our plans to make some of the incremental compilation facilities
available in llvm mainline.
This patch proposes a minimal version of a repl, clang-repl, which enables
interpreter-like interaction for C++. For instance:
./bin/clang-repl
clang-repl> int i = 42;
clang-repl> extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=42
clang-repl> quit
The patch allows very limited functionality, for example, it crashes on invalid
C++. The design of the proposed patch follows closely the design of cling. The
idea is to gather feedback and gradually evolve both clang-repl and cling to
what the community agrees upon.
The IncrementalParser class is responsible for driving the clang parser and
codegen and allows the compiler infrastructure to process more than one input.
Every input adds to the “ever-growing” translation unit. That model is enabled
by an IncrementalAction which prevents teardown when HandleTranslationUnit.
The IncrementalExecutor class hides some of the underlying implementation
details of the concrete JIT infrastructure. It exposes the minimal set of
functionality required by our incremental compiler/interpreter.
The Transaction class keeps track of the AST and the LLVM IR for each
incremental input. That tracking information will be later used to implement
error recovery.
The Interpreter class orchestrates the IncrementalParser and the
IncrementalExecutor to model interpreter-like behavior. It provides the public
API which can be used (in future) when using the interpreter library.
Differential revision: https://reviews.llvm.org/D96033
This reverts commit 44a4000181.
We are seeing build failures due to missing dependency to libSupport and
CMake Error at tools/clang/tools/clang-repl/cmake_install.cmake
file INSTALL cannot find
In http://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html we have
mentioned our plans to make some of the incremental compilation facilities
available in llvm mainline.
This patch proposes a minimal version of a repl, clang-repl, which enables
interpreter-like interaction for C++. For instance:
./bin/clang-repl
clang-repl> int i = 42;
clang-repl> extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=42
clang-repl> quit
The patch allows very limited functionality, for example, it crashes on invalid
C++. The design of the proposed patch follows closely the design of cling. The
idea is to gather feedback and gradually evolve both clang-repl and cling to
what the community agrees upon.
The IncrementalParser class is responsible for driving the clang parser and
codegen and allows the compiler infrastructure to process more than one input.
Every input adds to the “ever-growing” translation unit. That model is enabled
by an IncrementalAction which prevents teardown when HandleTranslationUnit.
The IncrementalExecutor class hides some of the underlying implementation
details of the concrete JIT infrastructure. It exposes the minimal set of
functionality required by our incremental compiler/interpreter.
The Transaction class keeps track of the AST and the LLVM IR for each
incremental input. That tracking information will be later used to implement
error recovery.
The Interpreter class orchestrates the IncrementalParser and the
IncrementalExecutor to model interpreter-like behavior. It provides the public
API which can be used (in future) when using the interpreter library.
Differential revision: https://reviews.llvm.org/D96033
For a type-constraint in a lambda signature, this makes the lambda
contain an unexpanded pack; for requirements in a requires-expressions
it makes the requires-expression contain an unexpanded pack; otherwise
it's invalid.
properly track that it has constraints.
Previously an instantiation of a constrained generic lambda would behave
as if unconstrained because we incorrectly cached a "has no constraints"
value that we computed before the constraints from 'auto' parameters
were attached.
Previously clang would print a binary blob into the bundled file
for amdgcn. With this patch, it will instead print textual IR as
expected.
Reviewed By: JonChesterfield, ronlieb
Differential Revision: https://reviews.llvm.org/D102065
Change-Id: I10c0127ab7357787769fdf9a2edd4b3071e790a1
It doesn't really make sense to emit language specific diagnostics
in a discarded statement, and suppressing these diagnostics results in a
programming pattern that many users will feel is quite useful.
Basically, this makes sure we only emit errors from the 'true' side of a
'constexpr if'.
It does this by making the ExprEvaluatorBase type have an opt-in option
as to whether it should visit discarded cases.
Differential Revision: https://reviews.llvm.org/D102251
Non-comprehensive list of cases:
* Dumping template arguments;
* Corresponding parameter contains a deduced type;
* Template arguments are for a DeclRefExpr that hadMultipleCandidates()
Type information is added in the form of prefixes (u8, u, U, L),
suffixes (U, L, UL, LL, ULL) or explicit casts to printed integral template
argument, if MSVC codeview mode is disabled.
Differential revision: https://reviews.llvm.org/D77598
This removed the pointless need for extension pragma since
it doesn't disable anything properly and it doesn't need to
enable anything that is not possible to disable.
The change doesn't break existing kernels since it allows to
compile more cases i.e. without pragma statements but the
pragma continues to be accepted.
Differential Revision: https://reviews.llvm.org/D100985
Currently clang does not emit device template variables
instantiated only in host functions, however, nvcc is
able to do that:
https://godbolt.org/z/fneEfferY
This patch fixes this issue by refactoring and extending
the existing mechanism for emitting static device
var ODR-used by host only. Basically clang records
device variables ODR-used by host code and force
them to be emitted in device compilation. The existing
mechanism makes sure these device variables ODR-used
by host code are added to llvm.compiler-used, therefore
they are guaranteed not to be deleted.
It also fixes non-ODR-use of static device variable by host code
causing static device variable to be emitted and registered,
which should not.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D102237
This commit brought build break in some f128 related tests. But that's
not the root cause. There exists some differences between Clang and
GCC's definition for 128-bit float types on PPC, so macros/functions in
glibc may not work with clang -mfloat128 well. We need to handle this
carefully and reland it.
I believe Clang's behavior is correct according to the standard here,
but this is an unusual situation for which we had no test coverage, so
I'm adding some.
These are GCC-compatible multilibs that use the generic Itanium C++ ABI
instead of the Fuchsia C++ ABI.
Differential Revision: https://reviews.llvm.org/D102030
Add front end diagnostics to report error for unimplemented TLS models set by
- compiler option `-ftls-model`
- attributes like `__thread int __attribute__((tls_model("local-exec"))) var_name;`
Reviewed by: aaron.ballman, nemanjai, PowerPC
Differential Revision: https://reviews.llvm.org/D102070
The OpenMP spec seems to require the compound operators be used for
+, *, &, |, and ^ reduction. So use these if a class has those operators.
If not try the simple operators as we did previously to limit the impact
to existing code.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=48584
Differential Revision: https://reviews.llvm.org/D101941
-fno-semantic-interposition (only effective with -fpic) can optimize default
visibility external linkage (non-ifunc-non-COMDAT) variable access and function
calls to avoid GOT/PLT, by using local aliases, e.g.
```
int var;
__attribute__((optnone)) int fun(int x) { return x * x; }
int test() { return fun(var); }
```
-fpic (var and fun are dso_preemptable)
```
test:
.LBB1_1:
auipc a0, %got_pcrel_hi(var)
ld a0, %pcrel_lo(.LBB1_1)(a0)
lw a0, 0(a0)
// fun is preemptible by default in ld -shared mode. ld will create a PLT.
tail fun@plt
```
vs -fpic -fno-semantic-interposition (var and fun are dso_local)
```
test:
.Ltest$local:
.LBB1_1:
auipc a0, %pcrel_hi(.Lvar$local)
addi a0, a0, %pcrel_lo(.LBB1_1)
lw a0, 0(a0)
// The assembler either resolves .Lfun$local at assembly time (-mno-relax
// -fno-function-sections), or produces a relocation referencing a non-preemptible
// local symbol (which can avoid PLT).
tail .Lfun$local
```
Note: Clang's default -fpic is more aggressive than GCC -fpic: interprocedural
optimizations (including inlining) are available but local aliases are not used.
-fpic -fsemantic-interposition can disable interprocedural optimizations.
Depends on D101875
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D101876
Summary:
Test and produce warning for subtracting a pointer from null or subtracting
null from a pointer. Reuse existing warning that this is undefined
behaviour. Also add unit test for both warnings.
Reformat to satisfy clang-format.
Respond to review comments: add additional test.
Respond to review comments: Do not issue warning for nullptr - nullptr
in C++.
Fix indenting to satisfy clang-format.
Respond to review comments: Add C++ tests.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: efriedma (Eli Friedman), nickdesaulniers (Nick Desaulniers)
Differential Revision: https://reviews.llvm.org/D98798
Simply use of extensions by allowing the use of supported
double types without the pragma. Since earlier standards
instructed that the pragma is used explicitly a new warning
is introduced in pedantic mode to indicate that use of
type without extension pragma enable can be non-portable.
This patch does not break backward compatibility since the
extension pragma is still supported and it makes the behavior
of the compiler less strict by accepting code without extra
pragma statements.
Differential Revision: https://reviews.llvm.org/D100980
This patch adds support for WebAssembly globals in LLVM IR, representing
them as pointers to global values, in a non-default, non-integral
address space. Instruction selection legalizes loads and stores to
these pointers to new WebAssemblyISD nodes GLOBAL_GET and GLOBAL_SET.
Once the lowering creates the new nodes, tablegen pattern matches those
and converts them to Wasm global.get/set of the appropriate type.
Based on work by Paulo Matos in https://reviews.llvm.org/D95425.
Reviewed By: pmatos
Differential Revision: https://reviews.llvm.org/D101608
These are required to be constants, this patch makes sure they
are in the accepted range of values.
These are usually created by wrappers in the riscv_vector.h header
which should always be correct. This patch protects against a user
using the builtin directly.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D102086
-fno-semantic-interposition (only effective with -fpic) can optimize default
visibility external linkage (non-ifunc-non-COMDAT) variable access and function
calls to avoid GOT/PLT, by using local aliases, e.g.
```
int var;
__attribute__((optnone)) int fun(int x) { return x * x; }
int test() { return fun(var); }
```
-fpic (var and fun are dso_preemptable)
```
test: // @test
adrp x8, :got:var
ldr x8, [x8, :got_lo12:var]
ldr w0, [x8]
// fun is preemptible by default in ld -shared mode. ld will create a PLT.
b fun
```
vs -fpic -fno-semantic-interposition (var and fun are dso_local)
```
test: // @test
.Ltest$local:
adrp x8, .Lvar$local
ldr w0, [x8, :lo12:.Lvar$local]
// The assembler either resolves .Lfun$local at assembly time, or produces a
// relocation referencing a non-preemptible section symbol (which can avoid PLT).
b .Lfun$local
```
Note: Clang's default -fpic is more aggressive than GCC -fpic: interprocedural
optimizations (including inlining) are available but local aliases are not used.
-fpic -fsemantic-interposition can disable interprocedural optimizations.
Depends on D101872
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D101873
Analogously to https://reviews.llvm.org/D98794 this patch uses the
`alignstack` attribute to fix incorrect passing of homogeneous
aggregate (HA) arguments on AArch32. The EABI/AAPCS was recently
updated to clarify how VFP co-processor candidates are aligned:
4488e34998
Differential Revision: https://reviews.llvm.org/D100853
Follow the more general patch for now, do not try to SPMDize the kernel
if the variable is used and local.
Differential Revision: https://reviews.llvm.org/D101911
Previously clang would print a binary blob into the bundled file
for amdgcn. With this patch, it will instead print textual IR as
expected.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D102065
Printing pass manager invocations is fairly verbose and not super
useful.
This allows us to remove DebugLogging from pass managers and PassBuilder
since all logging (aside from analysis managers) goes through
instrumentation now.
This has the downside of never being able to print the top level pass
manager via instrumentation, but that seems like a minor downside.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D101797
This addresses an issue introduced in D91559. We would invoke the
compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both
locations contain libraries with the same name, but we expect linker
to pick up the library in path/to/lib since that version is more
specialized. This was the case before D91559 where the sysroot path
would be ignored, but after that change linker would now pick up the
library from the sysroot which resulted in unexpected behavior.
The sysroot path should always come after any user provided library
paths, followed by compiler runtime paths. We want for libraries in user
provided library paths to always take precedence over sysroot libraries.
This matches the behavior of other toolchains used with other targets.
Differential Revision: https://reviews.llvm.org/D102049
Commit 5baea05601 set the CurCodeDecl
because it was needed to pass the assert in CodeGenFunction::EmitLValueForLambdaField,
But this was not right to do as CodeGenFunction::FinishFunction passes it to EmitEndEHSpec
and cause corruption of the EHStack.
Revert the part of the commit that changes the CurCodeDecl, and instead
adjust the assert to check for a null CurCodeDecl.
Differential Revision: https://reviews.llvm.org/D102027
This addresses an issue introduced in D91559. We would invoke the
compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both
locations contain libraries with the same name, but we expect linker
to pick up the library in path/to/lib since that version is more
specialized. This was the case before D91559 where the sysroot path
would be ignored, but after that change linker would now pick up the
library from the sysroot which resulted in unexpected behavior.
The sysroot path should always come after any user provided library
paths, followed by compiler runtime paths. We want for libraries in user
provided library paths to always take precedence over sysroot libraries.
This matches the behavior of other toolchains used with other targets.
Differential Revision: https://reviews.llvm.org/D102049
To improve hygiene, consistency, and usability, it would be good to replace all
the macro intrinsics in wasm_simd128.h with functions. The reason for using
macros in the first place was to enforce the use of constants for some arguments
using `_Static_assert` with `__builtin_constant_p`. This commit switches to
using functions and uses the `__diagnose_if__` attribute rather than
`_Static_assert` to enforce constantness.
The remaining macro intrinsics cannot be made into functions until the builtin
functions they are implemented with can be replaced with normal code patterns
because the builtin functions themselves require that their arguments are
constants.
This commit also fixes a bug with the const_splat intrinsics in which the f32x4
and f64x2 variants were incorrectly producing integer vectors.
Differential Revision: https://reviews.llvm.org/D102018
This change allows the use of identifiers for image types
from `cl_khr_gl_msaa_sharing` freely in the kernel code if
the extension is not supported since they are not in the
list of the reserved identifiers.
This change also removed the need for pragma for the types
in the extensions since the spec does not require the pragma
uses.
Differential Revision: https://reviews.llvm.org/D100983
Follow up on 431e3138a and complete the other possible combinations.
Besides enforcing the new behavior, it also mitigates TSAN false positives when
combining orders that used to be stronger.
The builtins were updated to take signed parameters in 627a526955, but the
intrinsics that use those builtins were not updated as well. The intrinsic test
did not catch this sign mismatch because it is only reported as an error under
-fno-lax-vector-conversions.
This commit fixes the type mismatch and adds -fno-lax-vector-conversions to the
test to catch similar problems in the future.
Differential Revision: https://reviews.llvm.org/D101979
This adds additional support for XL compatibility. There are a number
of functions in altivec.h that produce a single instruction (or a
very short sequence) for Power8 but can be done on Power7 without
scalarization. XL provides these implementations.
This patch adds the following overloads for doubleword vectors:
vec_add
vec_cmpeq
vec_cmpgt
vec_cmpge
vec_cmplt
vec_cmple
vec_sl
vec_sr
vec_sra
This patch simplifies the parser and makes the language semantics
consistent. There is no extension pragma requirement in the spec
for the subgroup functions in enqueue kernel or pipes and all other
builtin functions are available without the pragama.
Differential Revision: https://reviews.llvm.org/D100984
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
We do provide `operator delete(void*)` in `<new>` but it should be
available by default. This is mostly boilerplate to test it and the
unconditional include of `<new>` in the header we always in include
on the device.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D100620
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101849
This is a patch that disables the poison-unsafe select -> and/or i1 folding.
It has been blocking D72396 and also has been the source of a few miscompilations
described in llvm.org/pr49688 .
D99674 conditionally blocked this folding and successfully fixed the latter one.
The former one was still blocked, and this patch addresses it.
Note that a few test functions that has `_logical` suffix are now deoptimized.
These are created by @nikic to check the impact of disabling this optimization
by copying existing original functions and replacing and/or with select.
I can see that most of these are poison-unsafe; they can be revived by introducing
freeze instruction. I left comments at fcmp + select optimizations (or-fcmp.ll, and-fcmp.ll)
because I think they are good targets for freeze fix.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101191
specialization while substituting a partial template parameter pack,
don't try to extend the existing deduction.
This caused us to select the wrong partial specialization in some rare
cases. A recent change to libc++ caused this to happen in practice for
code using std::conjunction.
This patch renames the replace-function-regex to replace-value-regex to indicate that the existing regex replacement functionality can replace any IR value besides functions.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101934
These intrinsics do not correspond to their own underlying instruction, but are
a convenience for the common case of materializing a constant vector that has
the same value in each lane.
Differential Revision: https://reviews.llvm.org/D101885
Update the SIMD builtin load functions to take pointers to const data and update
the intrinsics themselves to not cast away constness.
Differential Revision: https://reviews.llvm.org/D101884
Make the inputs to all narrowing builtins signed, which is how they are
interpreted by the underlying instructions (only the result changes sign
between instructions).
Differential Revision: https://reviews.llvm.org/D101883
CGProfilePass is not always on, it will be disabled when using
non-intergrated assemblers.
// Only enable CGProfilePass when using integrated assembler, since
// non-integrated assemblers don't recognize .cgprofile section.
PMBuilder.CallGraphProfile = !CodeGenOpts.DisableIntegratedAS;
Add -fintegrate-as to make sure the output don't rely on the platform default.
Reviewed By: evgeny777
Differential Revision: https://reviews.llvm.org/D101918
The offload action is used in four different ways as explained
in Driver.cpp:4495. When -c is present, the final phase will be
assemble (linker when -c is not present). However, this phase
is skipped according to D96769 for amdgcn. So, offload action
arrives into following situation,
compile (device) ---> offload ---> offload
without -c the chain looks like,
compile (device) ---> offload ---> linker (device)
---> offload
The former situation creates an unhandled case which causes
problem. The solution presented in this patch delays the D96769
logic until job creation time. This keeps the offload action
in the 1 of the 4 specified situations.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D101901
Added __cl_clang_non_portable_kernel_param_types extension that
allows using non-portable types as kernel parameters. This allows
bypassing the portability guarantees from the restrictions specified
in C++ for OpenCL v1.0 s2.4.
Currently this only disables the restrictions related to the data
layout. The programmer should ensure the compiler generates the same
layout for host and device or otherwise the argument should only be
accessed on the device side. This extension could be extended to other
case (e.g. permitting size_t) if desired in the future.
Patch by olestrohm (Ole Strohm)!
https://reviews.llvm.org/D101168
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101849
We previously did not have tests demonstrating that the intrinsics in
wasm_simd128.h lower to reasonable LLVM IR. This commit adds such a test.
Differential Revision: https://reviews.llvm.org/D101805
This attempts to move driver tests out of Frontend and to Driver, separates
RUNs that should fail from RUNs that should succeed, and prevent creating
output files or dumping output.
Differential Revision: https://reviews.llvm.org/D101867
The script update_cc_test_checks runs all non-filechecked runlines before the filechecked ones. This creates problems since outputs of those non-filechecked runlines may conflict and that will fail the execution of update_cc_test_checks. This patch executes non-filechecked in the order specified in the test file to avoid this issue.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101683
When the target triple was an Apple platform `ToolChain::getOSLibName()`
(called by `getCompilerRTPath()`) would return the full OS name
including the version number (e.g. `darwin20.3.0`). This is not correct
because the library directory for all Apple platforms is `darwin`.
This in turn caused
* `-print-runtime-dir` to return a non-existant path.
* `-print-file-name=<any compiler-rt library>` to return the filename
instead of the full path to the library.
Two regression tests are included.
rdar://77417317
Differential Revision: https://reviews.llvm.org/D101682
This implements the flag proposed in RFC
http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.
The goal is to add a way to override the default target C++ ABI through a
compiler flag. This makes it easier to test and transition between different
C++ ABIs through compile flags rather than build flags.
In this patch:
- Store -fc++-abi= in a LangOpt. This isn't stored in a CodeGenOpt because
there are instances outside of codegen where Clang needs to know what the
ABI is (particularly through ASTContext::createCXXABI), and we should be
able to override the target default if the flag is provided at that point.
- Expose the existing ABIs in TargetCXXABI as values that can be passed
through this flag.
- Create a .def file for these ABIs to make it easier to check flag values.
- Add an error for diagnosing bad ABI flag values.
Differential Revision: https://reviews.llvm.org/D85802
If a return value is explicitly rounded to 64 bits, an additional zext
instruction is emitted, and in some cases it prevents tail call
optimization.
As discussed in D100225, this rounding is not necessary and can be
disabled.
Differential Revision: https://reviews.llvm.org/D100591
as used.
The problem only happens with constexpr variable, for constexpr variable,
variable is not marked during parser variable. This is because compiler
might find some var's associate expressions may not actully an odr-used
later, the variables get kept in MaybeODRUseExprs, in normal case, at
end of process fullExpr, the variable will be marked during the call to
CleanupVarDeclMarking(). Since we are processing expression of OpenMP
clauses, and the ActOnFinishFullExpr is not getting called that casue
variable is not get marked.
One way to fix this is to call CleanupVarDeclMarking() in EndOpenMPClause
for each omp directive.
This to fix https://bugs.llvm.org/show_bug.cgi?id=50206
Differential Revision: https://reviews.llvm.org/D101781
The first crash reported in the bug report 44338.
Condition `!isSat.hasValue() || isNotSat.getValue()` here should be
`!isNotSat.hasValue() || isNotSat.getValue()`.
`getValue()` here crashed when we used the static analyzer to analyze
postgresql-12.0.
Patch By: OikawaKirie
Reviewed By: steakhal, martong
Differential Revision: https://reviews.llvm.org/D83660
Pipe has not been a reserved keyword in the earlier OpenCL
standards. However we failed to allow its use as an identifier
in the original commit. This issues is fixed now and testing
is improved accordingly.
Differential Revision: https://reviews.llvm.org/D101052
Warn when a declaration uses an identifier that doesn't obey the reserved
identifier rule from C and/or C++.
Differential Revision: https://reviews.llvm.org/D93095
GlobalsAA is only created at the beginning of the inliner pipeline. If
an AAManager is cached from previous passes, it won't get rebuilt to
include the newly created GlobalsAA.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D101379
In matrix type casts, we were doing bitcast when the matrices had the same size. This was incorrect and this patch fixes that.
Also added some new CodeGen tests for signed <-> usigned conversions
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D101754
We weren't modifying the lock set when intersecting with one coming
from a break-terminated block. This is inconsistent, since break isn't a
back edge, and it leads to false negatives with scoped locks. We usually
don't warn for those when joining locksets aren't the same, we just
silently remove locks that are not in the intersection. But not warning
and not removing them isn't right.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D101202
MSVC has added some new flags. Although they're not supported, this adds
parsing support for them so clang-cl doesn't treat them as filenames.
Except for /fsanitize=address which we do support. (clang-cl already
exposes the -fsanitize= option, but this allows using the
MSVC-spelling with a slash.)
Differential revision: https://reviews.llvm.org/D101439