GetContainedInventedTypeParmVisitor would not account for the case where TemplateTypeParmType::getDecl() is
nullptr, causing bug #45102.
Add the nullptr check.
Summary:
The messages for two of the warnings are misleading:
* warn_for_range_const_reference_copy suggests that the initialization
of the loop variable results in a copy. But that's not always true,
we just know that some conversion happens, potentially invoking a
constructor or conversion operator. The constructor might copy, as in
the example that lead to this message [1], but it might also not.
However, the constructed object is bound to a reference, which is
potentially misleading, so we rewrite the message to emphasize that.
We also make sure that we print the reference type into the warning
message to clarify that this warning only appears when operator*
returns a reference.
* warn_for_range_variable_always_copy suggests that a reference type
loop variable initialized from a temporary "is always a copy". But
we don't know this, the range might just return temporary objects
which aren't copies of anything. (Assuming RVO a copy constructor
might never have been called.)
The message for warn_for_range_copy is a bit repetitive: the type of a
VarDecl and its initialization Expr are the same up to cv-qualifiers,
because Sema will insert implicit casts or constructor calls to make
them match.
[1] https://bugs.llvm.org/show_bug.cgi?id=32823
Reviewers: aaron.ballman, Mordante, rtrieu
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D75613
dependent constructs.
We previously assumed they were neither value- nor
instantiation-dependent under any circumstances, which would lead to
crashes and other misbehavior.
This doesn't match GCC's behavior (where statement expressions appear to
be treated as value-dependent if they appear in a dependent context),
but seems to be the best thing we can do in the short term: it turns out
to be remarkably difficult for us to correctly determine whether we are
in a dependent context (and it's not even possible in some cases, such
as in a generic lambda where we might not have seen the 'auto' yet).
Added codegen for update clause in depobj. Reads the number of the
elements from the first element and updates flags for each element in
the loop.
```
omp_depend_t x;
kmp_depend_info *base = (kmp_depend_info *)x;
intptr_t num = x[-1].base_addr;
kmp_depend_info *end = x + num;
kmp_depend_info *el = base;
do {
el.flags = new_flag;
el = &el[1];
} while (el != end);
```
in depobj object.
The first element in the list of the dependencies is used for internal
purposes to store the number of the elements in the provided list.
The first element now is skipped and depobj object poits exactly to the
list of dependencies.
Summary:
It's basically Doxygen's version of a link and can happen anywhere
inside of a paragraph. Fixes a bogus warning about empty paragraphs when
a parameter description starts with a link.
Reviewers: gribozavr2
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D75632
Block copy/destroy helpers are now linkonce_odr functions, meant to be uniqued, and thus attaching debug information from one translation unit (or even just from one instance of many inside one translation unit) would be misleading and wrong in the general case.
This effectively reverts commit 9c6b6826ce.
<rdar://problem/59137040>
Differential Revision: https://reviews.llvm.org/D75615
The IR hasn't switched the default yet, so explicitly add the ieee
attributes.
I'm still not really sure how the target default denormal mode should
interact with -fno-unsafe-math-optimizations. The target may have
selected the default mode to be non-IEEE based on the flags or based
on its true behavior, but we don't know which is the case. Since the
only users of a non-IEEE mode without a flag still support IEEE mode,
just reset to IEEE.
Added codegen for 'depend' clause in depobj directive. The depend clause
is emitted as kmp_depend_info <deps>[<number_of_items_in_clause> + 1]. The
first element in this array is reserved for storing the number of
elements in this array: <deps>[0].base_addr =
<number_of_items_in_clause>;
This extra element is required to implement 'update' and 'destroy'
clauses. It is required to know the size of array to destroy it
correctly and to update depency kind.
The output of subprocess.check_output is decode()'d through the rest of
this python program, but one appears to have been missed for the output
of the call to "clang -print-file-name=include".
On Windows, with Python 3.6, this leads to the 'args' array being a mix of
bytes and strings, which causes exceptions later down the line.
I can't easily test with python2 on Windows, but python2 on Ubuntu 18.04
was happy with this change.
Summary:
The VSHLC instruction performs a left shift of a whole vector register
by an immediate shift count up to 32, shifting in new bits at the low
end from a GPR and delivering the shifted-out bits from the high end
back into the same GPR.
Since the instruction produces two outputs (the shifted vector
register and the output GPR of shifted-out bits), it has to be
instruction-selected in C++ rather than Tablegen.
Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard
Reviewed By: miyuki
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75445
Summary:
These are exactly parallel to the existing `vadciq` intrinsics, which
we implemented last year as part of the original MVE intrinsics
framework setup.
Just like VADC/VADCI, the MVE VSBC/VSBCI instructions deliver two
outputs, both of which the intrinsic exposes: a modified vector
register and a carry flag. So they have to be instruction-selected in
C++ rather than Tablegen. However, in this case, that's trivial: the
same C++ isel routine we already have for VADC works unchanged, and
all we have to do is to pass it a different instruction id.
Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard
Reviewed By: miyuki
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75444
Summary: The new way of checking fix-its is `%check_analyzer_fixit`.
Reviewed By: NoQ, Szelethus, xazax.hun
Differential Revision: https://reviews.llvm.org/D73729
Summary:
This patch introduces a way to apply the fix-its by the Analyzer:
`-analyzer-config apply-fixits=true`.
The fix-its should be testable, therefore I have copied the well-tested
`check_clang_tidy.py` script. The idea is that the Analyzer's workflow
is different so it would be very difficult to use only one script for
both Tidy and the Analyzer, the script would diverge a lot.
Example test: `// RUN: %check-analyzer-fixit %s %t -analyzer-checker=core`
When the copy-paste happened the original authors were:
@alexfh, @zinovy.nis, @JonasToth, @hokein, @gribozavr, @lebedev.ri
Reviewed By: NoQ, alexfh, zinovy.nis
Differential Revision: https://reviews.llvm.org/D69746
Summary:
hip-pinned-shadow global var should remain in the final code object irrespective
of whether it is used or not within the code. Add it to used list, so that it
will not get eliminated when it is unused.
Reviewers: yaxunl, tra, hliao
Reviewed By: yaxunl
Subscribers: hliao, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75402
dependent contexts.
We previously assumed they were neither value- nor
instantiation-dependent under any circumstances, which would lead to
crashes and other misbehavior.
The -fsystem-module flag is used when explicitly building a module. It
forces the module to be treated as a system module. This is used when
converting an implicit build to an explicit build to match the
systemness the implicit build would have had for a given module.
Differential Revision: https://reviews.llvm.org/D75395
Lower priority of __tgt_register_lib in order to make sure that __tgt_register_requires is called before loading a libomptarget plugin.
We want to know beforehand which requirements the user has asked for so that upon loading the plugin libomptarget can report how many devices there are that can satisfy these requirements.
Differential Revision: https://reviews.llvm.org/D75223
Pickup the default crt and libs when the target is musl.
Resubmitting after updating the testcase.
Differential Revision: https://reviews.llvm.org/D75139
Summary:
This patch introduces the `clang_analyzer_isTainted` expression inspection
check for checking taint.
Using this we could query the analyzer whether the expression used as the
argument is tainted or not. This would be useful in tests, where we don't want
to issue warning for all tainted expressions in a given file
(like the `debug.TaintTest` would do) but only for certain expressions.
Example usage:
```lang=c++
int read_integer() {
int n;
clang_analyzer_isTainted(n); // expected-warning{{NO}}
scanf("%d", &n);
clang_analyzer_isTainted(n); // expected-warning{{YES}}
clang_analyzer_isTainted(n + 2); // expected-warning{{YES}}
clang_analyzer_isTainted(n > 0); // expected-warning{{YES}}
int next_tainted_value = n; // no-warning
return n;
}
```
Reviewers: NoQ, Szelethus, baloghadamsoftware, xazax.hun, boga95
Reviewed By: Szelethus
Subscribers: martong, rnkovacs, whisperity, xazax.hun,
baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, donat.nagy,
Charusso, cfe-commits, boga95, dkrupp, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74131
and follow-ups:
a2ca1c2d "build: disable zlib by default on Windows"
2181bf40 "[CMake] Link against ZLIB::ZLIB"
1079c68a "Attempt to fix ZLIB CMake logic on Windows"
This changed the output of llvm-config --system-libs, and more
importantly it broke stand-alone builds. Instead of piling on more fix
attempts, let's revert this to reduce the risk of more breakages.
This reverts commit 0a9fc9233e.
Going to look at the asan failures.
I find the failures in the test suite weird, because they look
like compile time test and I don't understand how that can be
failing, but will have a brief look at that too.
This makes -fno-common the default for all targets because this has performance
and code-size benefits and is more language conforming for C code.
Additionally, GCC10 also defaults to -fno-common and so we get consistent
behaviour with GCC.
With this change, C code that uses tentative definitions as definitions of a
variable in multiple translation units will trigger multiple-definition linker
errors. Generally, this occurs when the use of the extern keyword is neglected
in the declaration of a variable in a header file. In some cases, no specific
translation unit provides a definition of the variable. The previous behavior
can be restored by specifying -fcommon.
As GCC has switched already, we benefit from applications already being ported
and existing documentation how to do this. For example:
- https://gcc.gnu.org/gcc-10/porting_to.html
- https://wiki.gentoo.org/wiki/Gcc_10_porting_notes/fno_common
Differential revision: https://reviews.llvm.org/D75056
This patch adds support for dwarf emission/dumping part of debuginfo
generation for defaulted parameters.
Reviewers: probinson, aprantl, dblaikie
Reviewed By: aprantl, dblaikie
Differential Revision: https://reviews.llvm.org/D73462
When an implicitly generated decl was the first entry in the group, we
attempted to lookup comments with an empty FileID, leading to crashes. Avoid
this by trying to use the other declarations in the group, and then bailing out
if none are valid.
rdar://59919733
Differential revision: https://reviews.llvm.org/D75483
I'm making the CHECK lines vague enough that they pass at -O0.
If that is too vague (we really want to check the data flow
to verify that the variables are not mismatched, etc), then
we can adjust those lines again to more closely match the output
at -O0 rather than -O1.
This change is based on the post-commit comments for:
83f4372f3ahttp://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20200224/307888.html
There are no failures from the first set of RUN lines here,
so the CHECKs were already vague enough to not be affected
by optimizations. The final RUN line does induce some kind
of failure, so I'll try to fix that separately in a
follow-up.
This patch upstreams support for the ARM Armv8.1m cpu Cortex-M55.
In detail adding support for:
- mcpu option in clang
- Arm Target Features in clang
- llvm Arm TargetParser definitions
details of the CPU can be found here:
https://developer.arm.com/ip-products/processors/cortex-m/cortex-m55
Reviewers: chill
Reviewed By: chill
Subscribers: dmgreen, kristof.beyls, hiraditya, cfe-commits,
llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74966
Summary:
These instructions convert a vector of floats to a vector of integers
of the same size, with assorted non-default rounding modes.
Implemented in IR as target-specific intrinsics, because as far as I
can see there are no matches for that functionality in the standard IR
intrinsics list.
Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75255
Summary:
These instructions make a vector of `<4 x float>` by widening every
other lane of a vector of `<8 x half>`.
I wondered about representing these using standard IR, along the lines
of a shufflevector to extract elements of the input into a `<4 x half>`
followed by an `fpext` to turn that into `<4 x float>`. But it looks as
if that would take a lot of work in isel lowering to make it match any
pattern I could sensibly write in Tablegen, and also I haven't been
able to think of any other case where that pattern might be generated
in IR, so there wouldn't be any extra code generation win from doing
it that way.
Therefore, I've just used another target-specific intrinsic. We can
always change it to the other way later if anyone thinks of a good
reason.
(In order to put the intrinsic definition near similar things in
`IntrinsicsARM.td`, I've also lifted the definition of the
`MVEMXPredicated` multiclass higher up the file, without changing it.)
Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard
Reviewed By: miyuki
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75254
Summary:
These instructions work like VMOVN (narrowing a vector of wide values
to half size, and overwriting every other lane of an output register
with the result), except that the narrowing conversion is saturating.
They come in three signedness flavours: signed to signed, unsigned to
unsigned, and signed to unsigned. All are represented in IR by a
target-specific intrinsic that takes two separate 'unsigned' flags.
Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75252
This removes everything but int_x86_avx512_mask_vcvtph2ps_512 which provides the SAE variant, but even this can use the fpext generic if the rounding control is the default.
Differential Revision: https://reviews.llvm.org/D75162
Try again with an up-to-date version of D69471 (99317124 was a stale
revision).
---
Revise the coverage mapping format to reduce binary size by:
1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.
This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).
Rationale for changes to the format:
- With the current format, most coverage function records are discarded.
E.g., more than 97% of the records in llc are *duplicate* placeholders
for functions visible-but-not-used in TUs. Placeholders *are* used to
show under-covered functions, but duplicate placeholders waste space.
- We reached general consensus about giving (1) a try at the 2017 code
coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
duplicates is simpler than alternatives like teaching build systems
about a coverage-aware database/module/etc on the side.
- Revising the format is expensive due to the backwards compatibility
requirement, so we might as well compress filenames while we're at it.
This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).
See CoverageMappingFormat.rst for the details on what exactly has
changed.
Fixes PR34533 [2], hopefully.
[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533
Differential Revision: https://reviews.llvm.org/D69471
Revise the coverage mapping format to reduce binary size by:
1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.
This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).
Rationale for changes to the format:
- With the current format, most coverage function records are discarded.
E.g., more than 97% of the records in llc are *duplicate* placeholders
for functions visible-but-not-used in TUs. Placeholders *are* used to
show under-covered functions, but duplicate placeholders waste space.
- We reached general consensus about giving (1) a try at the 2017 code
coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
duplicates is simpler than alternatives like teaching build systems
about a coverage-aware database/module/etc on the side.
- Revising the format is expensive due to the backwards compatibility
requirement, so we might as well compress filenames while we're at it.
This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).
See CoverageMappingFormat.rst for the details on what exactly has
changed.
Fixes PR34533 [2], hopefully.
[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533
Differential Revision: https://reviews.llvm.org/D69471
Support only preferred spelling 'Modules/module.private.modulemap' and
not the deprecated 'module_private.map'.
rdar://problem/57715533
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D75311
Chen
Summary:
Base declaration in pointer arithmetic expression is determined by
binary search with type information. Take "int *a, *b; *(a+*b)" as an
example, we determine the base by checking the type of LHS and RHS. In
this case the type of LHS is "int *", the type of RHS is "int",
therefore, we know that we need to visit LHS in order to find base
declaration.
Reviewers: ABataev, jdoerfert
Reviewed By: ABataev
Subscribers: guansong, cfe-commits, sandoval, dreachem
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75077
This aims to fix a missed inlining case.
If there's a virtual call in the callee on an alloca (stack allocated object) in
the caller, and the callee is inlined into the caller, the post-inline cleanup
would devirtualize the virtual call, but if the next iteration of
DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is
based on a heuristic to determine whether to reiterate, we may miss inlining the
devirtualized call.
This enables inlining in clang/test/CodeGenCXX/member-function-pointer-calls.cpp.
This is a second commit after a revert
https://reviews.llvm.org/rG4569b3a86f8a4b1b8ad28fe2321f936f9d7ffd43 and a fix
https://reviews.llvm.org/rG41e06ae7ba91.
Differential Revision: https://reviews.llvm.org/D69591
Use UnaryOperator::CreateFNeg instead.
Summary:
With the introduction of the native fneg instruction, the
fsub -0.0, %x idiom is obsolete. This patch makes LLVM
emit fneg instead of the idiom in all places.
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D75130
WebAssembly enforces a rule that caller and callee signatures must
match. This means that the traditional technique of passing `main`
`argc` and `argv` even when it doesn't need them doesn't work.
Currently the backend renames `main` to `__original_main`, however this
doesn't interact well with LTO'ing libc, and the name isn't intuitive.
This patch allows us to transition to `__main_argc_argv` instead.
This implements the proposal in
https://github.com/WebAssembly/tool-conventions/pull/134
with a flag to disable it when targeting Emscripten, though this is
expected to be temporary, as discussed in the proposal comments.
Differential Revision: https://reviews.llvm.org/D70700
Summary:
User can select the version of SYCL the compiler will
use via the flag -sycl-std, similar to -cl-std.
The flag defines the LangOpts.SYCLVersion option to the
version of SYCL. The default value is undefined.
If driver is building SYCL code, flag is set to the default SYCL
version (1.2.1)
The preprocessor uses this variable to define CL_SYCL_LANGUAGE_VERSION macro,
which should be defined according to SYCL 1.2.1 standard.
Only valid value at this point for the flag is 1.2.1.
Co-Authored-By: David Wood <Q0KPU0H1YOEPHRY1R2SN5B5RL@david.davidtw.co>
Signed-off-by: Ruyman Reyes <ruyman@codeplay.com>
Subscribers: ebevhan, Anastasia, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72857
Signed-off-by: Alexey Bader <alexey.bader@intel.com>
The code in llvmorg-10-init-12188-g25ce33a6e4f is a breaking change for
users of older linkers who don't pass a version parameter, which
prevents a drop-in clang upgrade. Old tools can't know about what future
tools will do, so as a general principle the burden should be new tools
to be compatible by default. Also, for comparison, none of the other
tests of Version within AddLinkArgs add any new behaviors unless the
version is explicitly specified. Therefore, this patch changes the
-platform_version behavior from opt-out to opt-in.
Patch by David Major!
Differential revision: https://reviews.llvm.org/D74784
I made that file by pasting together several pieces, and forgot to
take out the #include <arm_mve.h> from the tops of the later ones, so
the test was pointlessly including the same header five times. NFC.
is ambiguous, but only one of the possible lookup results could possibly
be right.
Clang recently started diagnosing ambiguity in more cases, and this
broke the build of Firefox. GCC, ICC, MSVC, and previous versions of
Clang all accept some forms of ambiguity here (albeit different ones in
each case); this patch mostly accepts anything any of those compilers
accept.
DevirtSCCRepeatedPass iteration. Needs ReviewPublic
This aims to fix a missed inlining case.
If there's a virtual call in the callee on an alloca (stack allocated object) in
the caller, and the callee is inlined into the caller, the post-inline cleanup
would devirtualize the virtual call, but if the next iteration of
DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is
based on a heuristic to determine whether to reiterate, we may miss inlining the
devirtualized call.
This enables inlining in clang/test/CodeGenCXX/member-function-pointer-calls.cpp.
Summary:
This commit adds the predicated MVE intrinsics for the same set of
unary operations that I added in their unpredicated forms in
* D74333 (vrint)
* D74334 (vrev)
* D74335 (vclz, vcls)
* D74336 (vmovl)
* D74337 (vmovn)
but since the predicated versions are a lot more similar to each
other, I've kept them all together in a single big patch. Everything
here is done in the standard way we've been doing other predicated
operations: an IR intrinsic called `@llvm.arm.mve.foo.predicated` and
some isel rules that match that alongside whatever they accept for the
unpredicated version of the same instruction.
In order to write the isel rules conveniently, I've refactored the
existing isel rules for the affected instructions into multiclasses
parametrised by a vector-type class, in the usual way. All those
refactorings are intended to leave the existing isel rules unchanged:
the only difference should be that new ones for the predicated
intrinsics are introduced.
The only tiny infrastructure change I needed in this commit was to
change the implementation of `IntrinsicMX` in `arm_mve_defs.td` so
that the records it defines are anonymous rather than named (and use
`NameOverride` to set the output intrinsic name), which allows me to
call it twice in two multiclasses with the same `NAME` without a
tablegen-time error.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75165
Summary:
Right now we annotate C++'s `operator new` with `noalias` attribute,
which very much is healthy for optimizations.
However as per [[ http://eel.is/c++draft/basic.stc.dynamic.allocation | `[basic.stc.dynamic.allocation]` ]],
there are more promises on global `operator new`, namely:
* non-`std::nothrow_t` `operator new` *never* returns `nullptr`
* If `std::align_val_t align` parameter is taken, the pointer will also be `align`-aligned
* ~~global `operator new`-returned pointer is `__STDCPP_DEFAULT_NEW_ALIGNMENT__`-aligned ~~ It's more caveated than that.
Supplying this information may not cause immediate landslide effects
on any specific benchmarks, but it for sure will be healthy for optimizer
in the sense that the IR will better reflect the guarantees provided in the source code.
The caveat is `-fno-assume-sane-operator-new`, which currently prevents emitting `noalias`
attribute, and is automatically passed by Sanitizers ([[ https://bugs.llvm.org/show_bug.cgi?id=16386 | PR16386 ]]) - should it also cover these attributes?
The problem is that the flag is back-end-specific, as seen in `test/Modules/explicit-build-flags.cpp`.
But while it is okay to add `noalias` metadata in backend, we really should be adding at least
the alignment metadata to the AST, since that allows us to perform sema checks on it.
Reviewers: erichkeane, rjmccall, jdoerfert, eugenis, rsmith
Reviewed By: rsmith
Subscribers: xbolva00, jrtc27, atanasyan, nlopes, cfe-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D73380
Summary:
There was even a TODO for this.
The main motivation is to make use of call-site based
`__attribute__((alloc_align(param_idx)))` validation (D72996).
Reviewers: rsmith, erichkeane, aaron.ballman, jdoerfert
Reviewed By: rsmith
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73020
If we deduplicate OpenMP runtime calls we have multiple `ident_t*` that
represent information like source location. So far, we simply kept the
one used by the replacement call. However, as exposed by PR44893, that
can cause problems if we have stack allocated `ident_t` objects. While
we need to revisit the use of these as well, it is clear that we
eventually want to merge source location information in some way. With
this patch we add the infrastructure to do so but without doing the
actual merge. Instead we pick a global `ident_t` from the replaced
calls, if possible, or create a new one with an unknown location
instead.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D74925
Verifies that an argument passed to __builtin_frame_address or __builtin_return_address is within the range [0, 0xFFFF]
Differential revision: https://reviews.llvm.org/D66839
Re-committed after fixed: c93112dc4f
This patch fixes PR44896. For IR input files, option fdiscard-value-names
should be ignored as we need named values in loadModule().
Commit 60d3947922 sets this option after loadModule() where valued names
already created. This creates an inconsistent state in setNameImpl()
that leads to a seg fault.
This patch forces fdiscard-value-names to be false for IR input files.
This patch also emits a warning of "ignoring -fdiscard-value-names" if
option fdiscard-value-names is explictly enabled in the commandline for
IR input files.
Differential Revision: https://reviews.llvm.org/D74878
Compute and propagate conversion kind to diagnostics helper in C++
to provide more specific diagnostics about incorrect implicit
conversions in assignments, initializations, params, etc...
Duplicated some diagnostics as errors because C++ is more strict.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74116
So far we've been dropping coverage every time we've encountered
a CXXInheritedCtorInitExpr. This patch attempts to add some
initial support for it.
Constructors for arguments of a CXXInheritedCtorInitExpr are still
not fully supported.
Differential Revision: https://reviews.llvm.org/D74735
This flag is like /showIncludes, but it only includes user headers and
omits system headers (similar to MD and MMD). The motivation is that
projects that already track system includes though other means can use
this flag to get consistent behavior on Windows and non-Windows, and it
saves tools that output /showIncludes output (e.g. ninja) some work.
implementation-wise, this makes `HeaderIncludesCallback` honor the
existing `IncludeSystemHeaders` bit, and changes the three clients of
`HeaderIncludesCallback` (`/showIncludes`, `-H`, `CC_PRINT_HEADERS=1`)
to pass `-sys-header-deps` to set that bit -- except for
`/showIncludes:user`, which doesn't pass it.
Differential Revision: https://reviews.llvm.org/D75093
Exactly what it says on the tin! I decided not to merge this with the patch that
changes all these to a CallDescriptionMap object, so the patch is that much more
trivial.
Differential Revision: https://reviews.llvm.org/D68163
D68391 added tests that check scenarios where no RISC-V GCC toolchain is
supposed to be detected. When running the tests on RISC-V hosts the system's
GCC toolchain will be detected, and the tests will fail. This patch adds a
`--gcc-toolchain` option pointing to a path where no GCC toolchain is
present, ensuring that the tests are run under the expected conditions, and
therefore are able to pass in all test environments.
Differential Revision: https://reviews.llvm.org/D75061
Currently, using negative numbers in iterator operations (additions and
subractions) results in advancements with huge positive numbers due to
an error. This patch fixes it.
Differential Revision: https://reviews.llvm.org/D74760
Summary:
Clang's "asm goto" feature didn't initially support outputs constraints. That
was the same behavior as gcc's implementation. The decision by gcc not to
support outputs was based on a restriction in their IR regarding terminators.
LLVM doesn't restrict terminators from returning values (e.g. 'invoke'), so
it made sense to support this feature.
Output values are valid only on the 'fallthrough' path. If an output value's used
on an indirect branch, then it's 'poisoned'.
In theory, outputs *could* be valid on the 'indirect' paths, but it's very
difficult to guarantee that the original semantics would be retained. E.g.
because indirect labels could be used as data, we wouldn't be able to split
critical edges in situations where two 'callbr' instructions have the same
indirect label, because the indirect branch's destination would no longer be
the same.
Reviewers: jyknight, nickdesaulniers, hfinkel
Reviewed By: jyknight, nickdesaulniers
Subscribers: MaskRay, rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69876
Previously we emitted an fmadd and a fmadd+fneg and combined them with a shufflevector. But this doesn't follow the correct exception behavior for unselected elements so the backend can't merge them into the fmaddsub/fmsubadd instructions.
This patch restores the the fmaddsub intrinsics so we don't have two arithmetic operations. We lose out on optimization opportunity in the non-strict FP case, but I don't think this is a big loss. If someone gives us a test case we can look into adding instcombine/dagcombine improvements. I'd rather not have the frontend do completely different things for strict and non-strict.
This still has problems because target specific intrinsics don't support strict semantics yet. We also still have all of the problems with masking. But we at least generate the right instruction in constrained mode now.
Differential Revision: https://reviews.llvm.org/D74268
This PR enables "XL" C++ ABI in frontend AST to IR codegen. And it is driven by
static init work. The current kind in Clang by default is Generic Itanium, which
has different behavior on static init with IBM xlclang compiler on AIX.
Differential Revision: https://reviews.llvm.org/D74015
The syntax rules for ptr-operator allow attributes after *, &,
&&, therefore we should be able to parse the following:
void fn() {
void (*[[attr]] x)() = &fn;
void (&[[attr]] y)() = fn;
void (&&[[attr]] z)() = fn;
}
However the current logic in TryParsePtrOperatorSeq does not consider
the presence of attributes leading to unexpected parsing errors.
Moreover we should also consider _Atomic a possible qualifier that can
appear after the sequence of attribute specifiers.
In order to build the Linux kernel, the back chain must be supported with
packed-stack. The back chain is then stored topmost in the register save
area.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D74506
The diagnostic added in D72231 also shows a diagnostic when casting to a
_Bool. This is unwanted. This patch removes the diagnostic for _Bool types.
Differential Revision: https://reviews.llvm.org/D74860
Make Clang on aarch64 targets predefine `__AARCH64_CMODEL_SMALL__`
or `__AARCH64_CMODEL_TINY__`, etc. These are the names that GCC
uses for its predefines.
Reviewed By: tamur, MaskRay
Differential Revision: https://reviews.llvm.org/D75002
Similar to the rest of the command line that is recorded, the program
path must also have spaces and backslashes escaped. Without this
parsing the recorded command line becomes hard on platforms like
Windows where spaces and backslashes are common.
This was originally reverted in
577d9ce35532439203411c999deefc9c80e04c69; this version makes a test
agnostic to the presence of backslashes in paths on some platforms.
Patch By: Ravi Ramaseshan
Differential Revision: https://reviews.llvm.org/D74811
By default the RISC-V target doesn't have the atomics standard extension
enabled. The first RUN line in `clang/test/CodeGen/atomic_ops.c` didn't
specify a target triple, which meant that on RISC-V Linux hosts it would
target RISC-V, but because it used clang cc1 we didn't get the toolchain
driver functionality to automatically turn on the extensions implied by
the target triple (riscv64-linux includes atomics). This would cause the
test to fail on RISC-V hosts.
This patch changes the test to have RUN lines for two explicit targets,
one with native atomics and one without. To work around FileCheck
limitations and more accurately match the output, some tests now have
separate prefixes for the two cases.
Reviewers: jyknight, eli.friedman, lenary, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D74847
Similar to the rest of the command line that is recorded, the program
path must also have spaces and backslashes escaped. Without this
parsing the recorded command line becomes hard on platforms like
Windows where spaces and backslashes are common.
Patch By: Ravi Ramaseshan
Differential Revision: https://reviews.llvm.org/D74811
For tag typedefs like this one:
/*!
@class Foo
*/
typedef class { } Foo;
clang -Wdocumentation gives:
warning: '@class' command should not be used in a comment attached to a
non-struct declaration [-Wdocumentation]
... while doxygen seems fine with it.
Differential Revision: https://reviews.llvm.org/D74746
Summary:
libindex will canonicalize references to template instantiations:
- 1) reference to an explicit template specialization, report the specializatiion
- 2) otherwise, report the primary template
but 2) is not true for incomplete instantiations, this patch fixes this.
Fixes https://github.com/clangd/clangd/issues/287
Reviewers: kadircet
Subscribers: ilya-biryukov, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74830
Summary:
As @rsmith notes in https://reviews.llvm.org/D73020#inline-672219
while that is certainly UB land, it may not be actually reachable at runtime, e.g.:
```
template<int N> void *make() {
if ((N & (N-1)) == 0)
return operator new(N, std::align_val_t(N));
else
return operator new(N);
}
void *p = make<7>();
```
and we shouldn't really error-out there.
That being said, i'm not really following the logic here.
Which ones of these cases should remain being an error?
Reviewers: rsmith, erichkeane
Reviewed By: erichkeane
Subscribers: cfe-commits, rsmith
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73996
Summary:
This patch adds two families of ACLE intrinsics: vqdmullbq and
vqdmulltq (including vector-vector and vector-scalar variants) and the
corresponding LLVM IR intrinsics llvm.arm.mve.vqdmull and
llvm.arm.mve.vqdmull.predicated.
Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74845
This has no effect on how LLVM passes the arguments, but it prevents
rewriteWithInAlloca from thinking that these parameters should be part
of the inalloca pack.
Follow-up to D72114
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D74452
This commit removes the artificial types <512 x i1> and <1024 x i1>
from HVX intrinsics, and makes v512i1 and v1024i1 no longer legal on
Hexagon.
It may cause existing bitcode files to become invalid.
* Converting between vector predicates and vector registers must be
done explicitly via vandvrt/vandqrt instructions (their intrinsics),
i.e. (for 64-byte mode):
%Q = call <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32> %V, i32 -1)
%V = call <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1> %Q, i32 -1)
The conversion intrinsics are:
declare <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32>, i32)
declare <128 x i1> @llvm.hexagon.V6.vandvrt.128B(<32 x i32>, i32)
declare <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1>, i32)
declare <32 x i32> @llvm.hexagon.V6.vandqrt.128B(<128 x i1>, i32)
They are all pure.
* Vector predicate values cannot be loaded/stored directly. This directly
reflects the architecture restriction. Loading and storing or vector
predicates must be done indirectly via vector registers and explicit
conversions via vandvrt/vandqrt instructions.
This patch introduces a new helper class `OMPBuilderCBHelpers`,
which will contain all reusable C/C++ language specific function-
alities required by the `OMPIRBuilder`.
Initially, this helper class contains the body and finalization
codegen functionalities implemented using callbacks which were
moved here for reusability among the different directives
implemented in the `OMPIRBuilder`, along with RAIIs for preserving
state prior to emitting outlined and/or inlined OpenMP regions.
In the future this helper class will also contain all the different
call backs required by OpenMP clauses/variable privatization.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D74562
Some IRBuilder methods that were originally defined on
IRBuilderBase do not respect custom IRBuilder inserters/folders,
because those were not accessible prior to D73835. Fix this by
making use of existing (and now accessible) IRBuilder methods,
which will handle inserters/folders correctly.
There are some changes in OpenMP and Instrumentation tests, where
bitcasts now get constant folded. I've also highlighted one
InstCombine test which now finishes in two rather than three
iterations, thanks to new instructions being inserted into the
worklist.
Differential Revision: https://reviews.llvm.org/D74787
Summary:
This patch introduces a new checker:
`alpha.security.cert.pos.34c`
This checker is implemented based on the following rule:
https://wiki.sei.cmu.edu/confluence/x/6NYxBQ
The check warns if `putenv` function is
called with automatic storage variable as an argument.
Differential Revision: https://reviews.llvm.org/D71433
Summary:
Previously any symlinks would be ignored since the directory
traversal doesn't follow them.
With this change we now follow symlinks (via a `stat` call
in order to figure out the target type of the symlink if it
is valid).
Reviewers: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74790
Some IRBuilder methods that were originally defined on
IRBuilderBase do not respect custom IRBuilder inserters/folders,
because those were not accessible prior to D73835. Fix this by
making use of existing (and now accessible) IRBuilder methods,
which will handle inserters/folders correctly.
There are some changes in OpenMP tests, where bitcasts now get
constant folded. I've also highlighted one InstCombine test which
now finishes in two rather than three iterations, thanks to new
instructions being inserted into the worklist.
Differential Revision: https://reviews.llvm.org/D74787
Summary:
Some predicated MVE intrinsics return a vector with element size
different from the input vector element size. In this case the
predicate must type correspond to the output vector type.
The following intrinsics use the incorrect predicate type:
* llvm.arm.mve.mull.int.predicated
* llvm.arm.mve.mull.poly.predicated
* llvm.arm.mve.vshll.imm.predicated
This patch fixes the issue.
Reviewers: simon_tatham, dmgreen, ostannard, MarkMurrayARM
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74838
The `-fdeclare-opencl-builtins` option was accepting saturated
conversions for non-integer types, which contradicts both the OpenCL
specification (v2.0 s6.2.3) and Clang's opencl-c.h file.
Summary:
Depends on https://reviews.llvm.org/D71902.
The last in a series of six patches that ports the LLVM coroutines
passes to the new pass manager infrastructure.
This patch has Clang schedule the new coroutines passes when the
`-fexperimental-new-pass-manager` option is used. With this and the
previous 5 patches, Clang is capable of building and successfully
running the test suite of large coroutines projects such as
https://github.com/lewissbaker/cppcoro with
`ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=On`.
Reviewers: GorNishanov, lewissbaker, chandlerc, junparser
Subscribers: EricWF, cfe-commits, llvm-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71903
user interface and documentation, and update __cplusplus for C++20.
WG21 considers the C++20 standard to be finished (even though it still
has some more steps to pass through in the ISO process).
The old flag names are accepted for compatibility, as usual, and we
still have lots of references to C++2a in comments and identifiers;
those can be cleaned up separately.
Summary:
$ clang -O2 -pg -mfentry foo.c
was adding frame pointers to all functions. This was exposed via
compiling the Linux kernel for x86_64 with CONFIG_FUNCTION_TRACER
enabled.
-pg was unconditionally setting the equivalent of -fno-omit-frame-pointer,
regardless of the presence of -mfentry or optimization level. After this
patch, frame pointers will only be omitted at -O0 or if
-fno-omit-frame-pointer is explicitly set for -pg -mfentry.
See also:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=3c5273a96ba8dbf98c40bc6d9d0a1587b4cfedb2;hp=c9d75a48c4ea63ab27ccdb40f993236289b243f2#patch2
(modification to ix86_frame_pointer_required())
Fixes: pr/44934
Reviewers: void, manojgupta, dberris, MaskRay, hfinkel
Reviewed By: MaskRay
Subscribers: cfe-commits, llozano, niravd, srhines
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74698
and objects with mutable subobjects.
The standard wording doesn't really cover these cases; accepting all
such cases seems most in line with what we do in other cases and what
other compilers do. (Essentially this means we're assuming that objects
external to the evaluation are always in-lifetime.)
Summary:
This patch adds a new MVE intrinsics family, `vbrsrq`: vector bit
reverse and shift right. The intrinsics are compiled into the VBRSR
instruction. Two new LLVM IR intrinsics were also added: arm.mve.vbrsr
and arm.mve.vbrsr.predicated.
Reviewers: simon_tatham, dmgreen, ostannard, MarkMurrayARM
Reviewed By: simon_tatham
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74721
Use the more accurate location when emitting the location of the
function being called's prototype in diagnostics emitted when calling
a function with an incorrect number of arguments.
In particular, avoids showing a trace of irrelevant macro expansions
for "MY_EXPORT static int AwesomeFunction(int, int);". Fixes PR#23564.
This patch upstreams support for the AArch64 Armv8-A cpu Cortex-A34.
In detail adding support for:
- mcpu option in clang
- AArch64 Target Features in clang
- llvm AArch64 TargetParser definitions
details of the cpu can be found here:
https://developer.arm.com/ip-products/processors/cortex-a/cortex-a34
Reviewers: SjoerdMeijer
Reviewed By: SjoerdMeijer
Subscribers: SjoerdMeijer, kristof.beyls, hiraditya, cfe-commits,
llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74483
Change-Id: Ida101fc544ca183a0a0e61a1277c8957855fde0b
Change clang option -ffp-model=precise, the default, to select ffp-contract=on
The patch caused some problems for PowerPC but ibm has made
adjustments so I am resubmitting this patch. Additionally, Andy looked
at the performance regressions on LNT and it looks like a loop
unrolling decision that could be adjusted.
Reviewers: rjmccall, Andy Kaylor
Differential Revision: https://reviews.llvm.org/D74436
This patch enables the debug entry values feature.
- Remove the (CC1) experimental -femit-debug-entry-values option
- Enable it for x86, arm and aarch64 targets
- Resolve the test failures
- Leave the llc experimental option for targets that do not
support the CallSiteInfo yet
Differential Revision: https://reviews.llvm.org/D73534
Summary:
These are in some sense the inverse of vmovl[bt]q: they take a vector
of n wide elements and truncate each to half its width. So they only
write half a vector's worth of output data, and therefore they also
take an 'inactive' parameter to provide the other half of the data in
the output vector. So vmovnb overwrites the even lanes of 'inactive'
with the narrowed values from the main input, and vmovnt overwrites
the odd lanes.
LLVM had existing codegen which generates these MVE instructions in
response to IR that takes two vectors of wide elements, or two vectors
of narrow ones. But in this case, we have one vector of each. So my
clang codegen strategy is to narrow the input vector of wide elements
by simply reinterpreting it as the output type, and then we have two
narrow vectors and can represent the operation as a vector shuffle
that interleaves lanes from both of them.
Even so, not all the cases I needed ended up being selected as a
single MVE instruction, so I've added a couple more patterns that spot
combinations of the 'MVEvmovn' and 'ARMvrev32' SDNodes which can be
generated as a VMOVN instruction with operands swapped.
This commit adds the unpredicated forms only.
Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74337
Summary:
These intrinsics take a vector of 2n elements, and return a vector of
n wider elements obtained by sign- or zero-extending every other
element of the input vector. They're represented in IR as a
shufflevector that extracts the odd or even elements of the input,
followed by a sext or zext.
Existing LLVM codegen already matches this pattern and generates the
VMOVLB instruction (which widens the even-index input lanes). But no
existing isel rule was generating VMOVLT, so I've added some. However,
the new rules currently only work in little-endian MVE, because the
pattern they expect from isel lowering includes a bitconvert which
doesn't have the right semantics in big-endian.
The output of one existing codegen test is improved by those new
rules.
This commit adds the unpredicated forms only.
Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74336
Summary:
vclzq maps nicely to the existing target-independent @llvm.ctlz IR
intrinsic. But vclsq ('count leading sign bits') has no corresponding
target-independent intrinsic, so I've made up @llvm.arm.mve.vcls.
This commit adds the unpredicated forms only.
Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard
Reviewed By: miyuki
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74335
Summary:
These intrinsics just reorder the lanes of a vector, so the natural IR
representation is as a shufflevector operation. Existing LLVM codegen
already recognizes those particular shufflevectors and generates the
MVE VREV instruction.
This commit adds the unpredicated forms only.
Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74334
Summary:
This adds the unpredicated forms of six different MVE intrinsics which
all round a vector of floating-point numbers to integer values,
leaving them still in FP format, differing only in rounding mode and
exception settings.
Five of them map to existing target-independent intrinsics in LLVM IR,
such as @llvm.trunc and @llvm.rint. The sixth, mapping to the `vrintn`
instruction, is done by inventing a target-specific intrinsic.
(`vrintn` behaves the same as `vrintx` in terms of the output value:
the side effects on the FPSCR flags are the only difference between
the two. But ACLE specifies separate user-callable intrinsics for the
two, so the side effects matter enough to make sure we generate the
right one of the two instructions in each case.)
Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard
Reviewed By: miyuki
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74333
Summary:
This adds the unpredicated versions of the family of vcvtq intrinsics
that convert between a vector of floats and a vector of the same size
of integer. These are represented in IR using the standard fptosi,
fptoui, sitofp and uitofp operations, which existing LLVM codegen
already handles.
Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74332
Summary:
This commit adds the unpredicated intrinsics for the unary operations
vabsq (absolute value), vnegq (arithmetic negation), vmvnq (bitwise
complement), vqabsq and vqnegq (saturating versions of abs and neg for
signed integers, in the sense that they give INT_MAX if an input lane
is INT_MIN).
This is done entirely in clang: all of these operations have existing
isel patterns and existing tests for them on the LLVM side, so I've
just made clang emit the same IR that those patterns already match.
Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74331
This is useful for performing custom build system integration that works by appending '--analyze --analyzer-output html' to all clang build commands.
For such users there is now still a way to have the fancy index.html file
in the output.
Differential Revision: https://reviews.llvm.org/D74467
In the path-sensitive vfork() checker that keeps a list of operations
allowed after a successful vfork(), unforget to include execve() in the list.
Patch by Jan Včelák!
Differential Revision: https://reviews.llvm.org/D73629
Summary:
This patch adds vector-scalar variants to the following families of
MVE intrinsics:
* vaddq
* vsubq
* vmulq
* vqaddq
* vqsubq
* vhaddq
* vhsubq
* vqdmulhq
* vqrdmulhq
The vector-scalar variants perform a splat operation on the scalar
operand and then perform the same operations as their vector-vector
counterparts. Code generation is done accordingly (using LLVM IR 'insert'
and 'shuffle' operations which are later converted into an ARMvdup
SDNode).
Reviewers: simon_tatham, dmgreen, MarkMurrayARM, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74620
Summary:
This patch adds assembly-level support for a new Arm M-profile
architecture extension, Custom Datapath Extension (CDE).
A brief description of the extension is available at
https://developer.arm.com/architectures/instruction-sets/custom-instructions
The latest specification for CDE is currently a beta release and is
available at
https://static.docs.arm.com/ddi0607/aa/DDI0607A_a_armv8m_arm_supplement_cde.pdf
CDE allows chip vendors to add custom CPU instructions. The CDE
instructions re-use the same encoding space as existing coprocessor
instructions (such as MRC, MCR, CDP etc.). Each coprocessor in range
cp0-cp7 can be configured as either general purpose (GCP) or custom
datapath (CDEv1). This configuration is defined by the CPU vendor and
is provided to LLVM using 8 subtarget features: cdecp0 ... cdecp7.
The semantics of CDE instructions are implementation-defined, but the
instructions are guaranteed to be pure (that is, they are stateless,
they do not access memory or any registers except their explicit
inputs/outputs).
CDE requires the CPU to support at least Armv8.0-M mainline
architecture. CDE includes 3 sets of instructions:
* Instructions that operate on general purpose registers and NZCV
flags
* Instructions that operate on the S or D register file (require
either FP or MVE extension)
* Instructions that operate on the Q register file, require MVE
The user-facing names that can be specified on the command line are
the same as the 8 subtarget feature names. For example:
$ clang -target arm-none-none-eabi -march=armv8m.main+cdecp0+cdecp3
tells the compiler that the coprocessors 0 and 3 are configured as
CDEv1 and the remaining coprocessors are configured as GCP (which is
the default).
Reviewers: simon_tatham, ostannard, dmgreen, eli.friedman
Reviewed By: simon_tatham
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74044
This patch removes the explicit call graph for CUDA/HIP/OpenMP deferred
diagnostics generated during parsing since it is error prone due to
incomplete information about function declarations during parsing. In stead,
this patch does a post-parsing AST traverse and emits deferred diagnostics
based on the use graph implicitly generated during the traverse.
Differential Revision: https://reviews.llvm.org/D70172
norecurse function attr indicates the function is not called recursively
directly or indirectly.
Add norecurse to OpenCL functions, SYCL functions in device compilation
and CUDA/HIP kernels.
Although there is LLVM pass adding norecurse to functions, it only works
for whole-program compilation. Also FE adding norecurse can make that
pass run faster since functions with norecurse do not need to be checked
again.
Differential Revision: https://reviews.llvm.org/D73651
Add fixits for messaging self in MRR or using super, as the intent is
clear, and it turns out people do that a lot more than expected.
Allow for objc_direct_members on main interfaces, it's extremely useful
for internal only classes, and proves to be quite annoying for adoption.
Add some better warnings around properties direct/non-direct clashes (it
was done for methods but properties were a miss).
Add some errors when direct properties are marked @dynamic.
Radar-Id: rdar://problem/58355212
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
Differential Revision: https://reviews.llvm.org/D73755
Converting a pointer to an integer whose result cannot represented in the
integer type is undefined behavior is C and prohibited in C++. C++ already
has a diagnostic when casting. This adds a diagnostic for C.
Since this diagnostic uses the range of the conversion it also modifies
int-to-pointer-cast diagnostic to use a range.
Fixes PR8718: No warning on casting between pointer and non-pointer-sized int
Differential Revision: https://reviews.llvm.org/D72231
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
This patch implements an almost complete handling of OpenMP
contexts/traits such that we can reuse most of the logic in Flang
through the OMPContext.{h,cpp} in llvm/Frontend/OpenMP.
All but construct SIMD specifiers, e.g., inbranch, and the device ISA
selector are define in `llvm/lib/Frontend/OpenMP/OMPKinds.def`. From
these definitions we generate the enum classes `TraitSet`,
`TraitSelector`, and `TraitProperty` as well as conversion and helper
functions in `llvm/lib/Frontend/OpenMP/OMPContext.{h,cpp}`.
The above enum classes are used in the parser, sema, and the AST
attribute. The latter is not a collection of multiple primitive variant
arguments that contain encodings via numbers and strings but instead a
tree that mirrors the `match` clause (see `struct OpenMPTraitInfo`).
The changes to the parser make it more forgiving when wrong syntax is
read and they also resulted in more specialized diagnostics. The tests
are updated and the core issues are detected as before. Here and
elsewhere this patch tries to be generic, thus we do not distinguish
what selector set, selector, or property is parsed except if they do
behave exceptionally, as for example `user={condition(EXPR)}` does.
The sema logic changed in two ways: First, the OMPDeclareVariantAttr
representation changed, as mentioned above, and the sema was adjusted to
work with the new `OpenMPTraitInfo`. Second, the matching and scoring
logic moved into `OMPContext.{h,cpp}`. It is implemented on a flat
representation of the `match` clause that is not tied to clang.
`OpenMPTraitInfo` provides a method to generate this flat structure (see
`struct VariantMatchInfo`) by computing integer score values and boolean
user conditions from the `clang::Expr` we keep for them.
The OpenMP context is now an explicit object (see `struct OMPContext`).
This is in anticipation of construct traits that need to be tracked. The
OpenMP context, as well as the `VariantMatchInfo`, are basically made up
of a set of active or respectively required traits, e.g., 'host', and an
ordered container of constructs which allows duplication. Matching and
scoring is kept as generic as possible to allow easy extension in the
future.
---
Test changes:
The messages checked in `OpenMP/declare_variant_messages.{c,cpp}` have
been auto generated to match the new warnings and notes of the parser.
The "subset" checks were reversed causing the wrong version to be
picked. The tests have been adjusted to correct this.
We do not print scores if the user did not provide one.
We print spaces to make lists in the `match` clause more legible.
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim
Subscribers: merge_guards_bot, rampitec, mgorny, hiraditya, aheejin, fedor.sergeev, simoncook, bollu, guansong, dexonsmith, jfb, s.egerton, llvm-commits, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71830
Summary:
Zero-parameter K&R definitions specify that the function has no
parameters, but they are still not prototypes, so calling the function
with the wrong number of parameters is just a warning, not an error.
The C11 standard doesn't seem to directly define what a prototype is,
but it can be inferred from 6.9.1p7: "If the declarator includes a
parameter type list, the list also specifies the types of all the
parameters; such a declarator also serves as a function prototype
for later calls to the same function in the same translation unit."
This refers to 6.7.6.3p5: "If, in the declaration “T D1”, D1 has
the form
D(parameter-type-list)
or
D(identifier-list_opt)
[...]". Later in 6.11.7 it also refers only to the parameter-type-list
variant as prototype: "The use of function definitions with separate
parameter identifier and declaration lists (not prototype-format
parameter type and identifier declarators) is an obsolescent feature."
We already correctly treat an empty parameter list as non-prototype
declaration, so we can just take that information.
GCC also warns about this with -Wstrict-prototypes.
This shouldn't affect C++, because there all FunctionType's are
FunctionProtoTypes. I added a simple test for that.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D66919
This reverts commit 0a1123eb43.
Want to revert this because it's causing trouble for PowerPC
I also fixed test fp-model.c which was looking for an incorrect error message
Having tests that depend on clang inside llvm/ are not a good idea since
it can break incremental `ninja check-llvm`.
Fixes https://llvm.org/PR44798
Reviewed By: lebedev.ri, MaskRay, rsmith
Differential Revision: https://reviews.llvm.org/D74051
Summary: Adds the RedHat Linux triple to the list of 64-bit RISC-V triples.
Without this the gcc libraries wouldn't be found by clang on a redhat/fedora
system, as the search list included `/usr/lib/gcc/riscv64-redhat-linux-gnu`
but the correct path didn't include the `-gnu` suffix.
Reviewers: lenary, asb, dlj
Reviewed By: lenary
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74399
The code generation is exactly the same as it was.
But not that the special handling of untied tasks is still handled by
emitUntiedSwitch in clang.
Differential Revision: https://reviews.llvm.org/D69828
According to OpenMP 5.0, cancel and cancellation point constructs are
supported in taskloop directive. Added support for cancellation in
taskloop, master taskloop and parallel master taskloop.
Summary:
Both EOF and the max value of unsigned char is platform dependent. In this
patch we try our best to deduce the value of EOF from the Preprocessor,
if we can't we fall back to -1.
Reviewers: Szelethus, NoQ
Subscribers: whisperity, xazax.hun, kristof.beyls, baloghadamsoftware, szepet, rnkovacs, a.sidorin, mikhail.ramalh
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74473
When the clang baremetal driver selects the rt.builtins static library
it prefix with "-l" and appends ".a". The result is a nonsense option
which lld refuses to accept.
Differential Revision: https://reviews.llvm.org/D73904
Change-Id: Ic753b6104e259fbbdc059b68fccd9b933092d828
Finalization can introduce new blocks we need to outline as well so it
makes sense to identify the blocks that need to be outlined after
finalization happened. There was also a minor unit test adjustment to
account for the fact that we have a single outlined exit block now.
This is a longstanding bug that seems to have been hidden by
a combination of (1) the normal flow being to deserialize the
interface before deserializing its parameter and (2) a precise
ordering of work that was apparently recently disturbed,
perhaps by my abstract-serialization work or Bruno's ObjC
module merging work.
Fixes rdar://59153545.
As discussed in https://reviews.llvm.org/D74447, this patch disables integrated-cc1 behavior if there's more than one job to be executed. This is meant to limit memory bloating, given that currently jobs don't clean up after execution (-disable-free is always active in cc1 mode).
I see this behavior as temporary until release 10.0 ships (to ease merging of this patch), then we'll reevaluate the situation, see if D74447 makes more sense on the long term.
Differential Revision: https://reviews.llvm.org/D74490
This patch is a follow up to 878a24ee24. Name of bitfields
with value-dependent width should be set as type-dependent. This
patch adds the required value-dependency check and sets the
type-dependency accordingly.
Patch fixes PR44886
Differential revision: https://reviews.llvm.org/D72242
This is to avoid performance regressions when the default attribute
behavior is fixed to assume ieee.
I tested the default on x86_64 ubuntu, which seems to default to
FTZ/DAZ, but am guessing for x86 and PS4.
Summary:
GNU objdump prints the file format in lowercase, e.g. `elf64-x86-64`. llvm-objdump prints `ELF64-x86-64` right now, even though piping that into llvm-objcopy refuses that as a valid arch to use.
As an example of a problem this causes, see: https://github.com/ClangBuiltLinux/linux/issues/779
Reviewers: MaskRay, jhenderson, alexshap
Reviewed By: MaskRay
Subscribers: tpimh, sbc100, grimar, jvesely, nhaehnle, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74433
This reverts commit 99c5bcbce8.
Change clang option -ffp-model=precise to select ffp-contract=on
Including some small touch-ups to the original commit
Reviewers: rjmccall, Andy Kaylor
Differential Revision: https://reviews.llvm.org/D74436
We previously checked the constraints of instantiated function templates even in cases where
PartialOverloading was true and not all template arguments have been deduced, which caused crashes
in clangd (bug 44714).
We now check if all arguments have been deduced before checking constraints in partial overloading
scenarios.
Summary:
This is trying to implement the functionality proposed in:
http://lists.llvm.org/pipermail/cfe-dev/2017-April/053417.html
An exception can throw, but no cleanup is going to happen.
A module compiled with exceptions on, can catch the exception throws
from module compiled with -fignore-exceptions.
The use cases for enabling this option are:
1. Performance analysis of EH instrumentation overhead
2. The ability to QA non EH functionality when EH functionality is not available.
3. User of EH enabled headers knows the calls won't throw in their program and
wants the performance gain from ignoring EH construct.
The implementation tried to accomplish that by removing any landing pad code
that might get generated.
Reviewed by: aaron.ballman
Differential Revision: https://reviews.llvm.org/D72644
This patch enables the debug entry values feature.
- Remove the (CC1) experimental -femit-debug-entry-values option
- Enable it for x86, arm and aarch64 targets
- Resolve the test failures
- Leave the llc experimental option for targets that do not
support the CallSiteInfo yet
Differential Revision: https://reviews.llvm.org/D73534
This brings back 2af74e27ed and reverts
eaabaf7e04.
The changes were correct, the code that was broken contained an ODR
violation that assumed that these types are passed equivalently:
struct alignas(uint64_t) Wrapper { uint64_t P };
void f(uint64_t p);
void f(Wrapper p);
MSVC does not pass them the same way, and so clang-cl should not pass
them the same way either.
The function attributes xray-skip-entry, xray-skip-exit, and
xray-ignore-loops were only being applied if a function had an
xray-instrument attribute, but they should apply if xray is enabled
globally too.
Differential Revision: https://reviews.llvm.org/D73842
This patch adds the support required for using the __riscv_save and
__riscv_restore libcalls to implement a size-optimization for prologue
and epilogue code, whereby the spill and restore code of callee-saved
registers is implemented by common functions to reduce code duplication.
Logic is also included to ensure that if both this optimization and
shrink wrapping are enabled then the prologue and epilogue code can be
safely inserted into the basic blocks chosen by shrink wrapping.
Differential Revision: https://reviews.llvm.org/D62686
directive.
According to OpenMP 5.0, The atomic_default_mem_order clause specifies the default memory ordering behavior for atomic constructs that must be provided by an implementation. If the default memory ordering is specified as seq_cst, all atomic constructs on which memory-order-clause is not specified behave as if the seq_cst clause appears. If the default memory ordering is specified as relaxed, all atomic constructs on which memory-order-clause is not specified behave as if the relaxed clause appears.
If the default memory ordering is specified as acq_rel, atomic constructs on which memory-order-clause is not specified behave as if the release clause appears if the atomic write or atomic update operation is specified, as if the acquire clause appears if the atomic read operation is specified, and as if the acq_rel clause appears if the atomic captured update operation is specified.
Added a test for #pragma clang __debug llvm_fatal_error to test for the original issue.
Added llvm::sys::Process::Exit() and replaced ::exit() in places where it was appropriate. This new function would call the current CrashRecoveryContext if one is running on the same thread; or call ::exit() otherwise.
Fixes PR44705.
Differential Revision: https://reviews.llvm.org/D73742
Added restrictions for atomic directive.
1. If atomic-clause is read then memory-order-clause must not be acq_rel or release.
2. If atomic-clause is write then memory-order-clause must not be
acq_rel or acquire.
3. If atomic-clause is update or not present then memory-order-clause
must not be acq_rel or acquire.
The C++ rules briefly allowed this, but the rule changed nearly 10 years
ago and we never updated our implementation to match. However, we've
warned on this by default for a long time, and no other compiler accepts
(even as an extension).
-march=armv8.1-m.main+mve.fp+nomve -mfpu=none should disable FP
registers and instructions moving to/from FP registers.
This patch fixes the case when "+mve" (added to the feature list by
"+mve.fp"), is followed by "-mve" (added by "+nomve").
Differential Revision: https://reviews.llvm.org/D72633
Summary:
When renaming a class with template constructors, we are missing the
occurrences of the template constructors, because getUSRsForDeclaration doesn't
give USRs of the templated constructors (they are not in the normal `ctors()`
method).
Reviewers: kbobyrev
Subscribers: jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74216
Null-check and adjut a TypeLoc before casting it to a FunctionTypeLoc.
This fixes a crash in -fsanitize=nullability-return, and also makes the
location of the nonnull type available when the return type is adjusted.
rdar://59263039
Differential Revision: https://reviews.llvm.org/D74355
1) Fix a regression in llvmorg-11-init-2485-g0e3a4877840 that would
reject some cases where a class name is shadowed by a typedef-name
causing a destructor declaration to be rejected. Prefer a tag type over
a typedef in destructor name lookup.
2) Convert the "type in destructor declaration is a typedef" error to an
error-by-default ExtWarn to allow codebases to turn it off. GCC and MSVC
do not enforce this rule.
When specifying -march=arch[8|9|10], those CPU types do NOT support
the vector extension. In this case the vector ABI must be disabled.
The generated data layout should NOT contain 64-v128.
Reviewers: uweigand
Differential Revision: https://reviews.llvm.org/D74146
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.
Differential Revision: https://reviews.llvm.org/D68720
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.
Differential Revision: https://reviews.llvm.org/D68720
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
This a recommit of 39f50da2a3 with better option
handling and more portable testing
Differential Revision: https://reviews.llvm.org/D68720
LLVMgold.so tests are duplicated in several places. Deduplicate them.
Move the tests to lto.c and lto.cu
Specify -fuse-ld=bfd or -fuse-ld=gold.
In a future change, if -fuse-ld=lld or CLANG_DEFAULT_LINKER=lld without -fuse-ld=, we will remove -plugin /path/to/LLVMgold.so
Also add extension warnings for the cases that are disallowed by the
current rules for destructor name lookup, refactor and simplify the
lookup code, and improve the diagnostic quality when lookup fails.
The special case we previously supported for converting
p->N::S<int>::~S() from naming a class template into naming a
specialization thereof is subsumed by a more general rule here (which is
also consistent with Clang's historical behavior and that of other
compilers): if we can't find a suitable S in N, also look in N::S<int>.
The extension warnings are off by default, except for a warning when
lookup for p->N::S::~T() looks for T in scope instead of in N (or N::S).
That seems sufficiently heinous to warn on by default, especially since
we can't support it for a dependent nested-name-specifier.
These temporaries are only used in the callee, and their memory can be reused
after the call is complete.
rdar://58552124
Differential revision: https://reviews.llvm.org/D74094
Summary:
Due to a recent (but retroactive) C++ rule change, only sufficiently
C-compatible classes are permitted to be given a typedef name for
linkage purposes. Add an enabled-by-default warning for these cases, and
rephrase our existing error for the case where we encounter the typedef
name for linkage after we've already computed and used a wrong linkage
in terms of the new rule.
Reviewers: rjmccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74103
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
This a recommit of 39f50da2a3 with correct option
flags set.
Differential Revision: https://reviews.llvm.org/D68720
patch from Philippe Daouadi <blastrock@free.fr>
This is an attempt to fix
[PR#44368](https://bugs.llvm.org/show_bug.cgi?id=44368)
This effectively reverts [D1783](https://reviews.llvm.org/D1783). It
doesn't break the current tests and fixes the test that this commit
adds.
We now decide of a lambda linkage only depending on the visibility of
its parent context.
Differential Revision: https://reviews.llvm.org/D73701
Address space conversion changes pointer representation.
This commit disallows such conversions when they are not
legal i.e. for the nested pointers even with compatible
address spaces. Because the address space conversion in
the nested levels can't be generated to modify the pointers
correctly. The behavior implemented is as follows:
- Any implicit conversions of nested pointers with different
address spaces is rejected.
- Any conversion of address spaces in nested pointers in safe
casts (e.g. const_cast or static_cast) is rejected.
- Conversion in low level C-style or reinterpret_cast is accepted
but with a warning (this aligns with OpenCL C behavior).
Fixes PR39674
Differential Revision: https://reviews.llvm.org/D73360
This reverts commit 39f50da2a3.
The -fstack-clash-protection is being passed to the linker too, which
is not intended.
Reverting and fixing that in a later commit.
Summary:
Following the AAPCS, every store to a volatile bit-field requires to generate one load of that field, even if all the bits are going to be replaced.
This patch allows the user to opt-in in following such rule, whenever the a.
AAPCS Release 2019Q1.1 (https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, paragraph: Volatile bit-fields – preserving number and width of container accesses
```
When a volatile bit-field is written, and its container does not overlap with any non-bit-field member, its
container must be read exactly once and written exactly once using the access width appropriate to the
type of the container. The two accesses are not atomic.
```
Reviewers: lebedev.ri, ostannard, jfb, eli.friedman
Reviewed By: jfb
Subscribers: rsmith, rjmccall, dexonsmith, kristof.beyls, jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67399
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
Differential Revision: https://reviews.llvm.org/D68720
This reverts commits f41ec709d9 and 5fedc2b410. On some buildbots, Clang :: Driver/crash-report.c is broken with:
```
Command Output (stderr):
--
/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm-project/clang/test/Driver/crash-report.c:48:11: error: CHECK: expected string not found in input
// CHECK: Preprocessed source(s) and associated run script(s) are located at:
^
<stdin>:1:1: note: scanning from here
/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm-project/clang/test/Driver/crash-report.c:50:1: error: unknown type name 'BAZ'
```
Example: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/21321/steps/test-stage1-compiler/logs/stdio
With REQUIRES: x86-register-target added to the tests.
Also remove some unneeded FIXMEs
But add a FIXME for bad IR generation for FMADDSUB/FMSUBADD with
constrained FP.
Original patch by Kevin P. Neal
constant initialization.
Removing this zeroing regressed our code generation in a few cases, also
fixed here. We now compute whether a variable has constant destruction
even if it doesn't have a constant initializer, by trying to destroy a
default-initialized value, and skip emitting a trivial default
constructor for a variable even if it has non-trivial (but perhaps
constant) destruction.
This reverts commit 208470dd5d.
Tests fail:
error: unable to create target: 'No available targets are compatible with triple "x86_64-apple-darwin"'
This happens on clang-hexagon-elf, clang-cmake-armv7-quick, and
clang-cmake-armv7-quick bots.
If anyone has any suggestions on why then I'm all ears.
Differential Revision: https://reviews.llvm.org/D73570
Revert "[FPEnv][X86] Speculative fix for failures introduced by eda495426."
This reverts commit 80e17e5fcc.
The speculative fix didn't solve the test failures on Hexagon, ARMv6, and
MSVC AArch64.
We would incorrectly check whether the type-constraint had already been initialized, causing us
to ignore the invented template type constraints entirely.
Also, TemplateParameterList would store incorrect information about invented type parameters
when it observed them before their type-constraint was initialized, so we recreate it after
initializing the function type of an abbreviated template.
Previously, when using '-MF file.d' on the command line, 'file.d' would not be deleted after a compiler crash.
The code path in Compilation::initCompilationForDiagnostics() that was modifying 'TranslatedArgs' had no effect, because 'TCArgs' was already created after the crash.
This was covered by clang/test/Driver/output-file-cleanup.c, the test was succeeding by fluke because Driver::generateCompilationDiagnostics() would fail to launch the subsequent clang -E (see D74070 for a fix for this). So the test was only covering Driver.cpp, C.CleanupFileMap().
After this patch, both cleanup and removal of -MF are exercised.
Differential Revision: https://reviews.llvm.org/D74076
Previously, when the above '#pragma clang __debug' were used, Driver::generateCompilationDiagnostics() wouldn't work as expected.
The 'clang -E' process created for diagnostics would crash, because it would reach again the intended crash in Pragma.cpp, PragmaDebugHandler::HandlePragma() while preprocessing.
When generating crash diagnostics, we now disable the intended crashing behavior with a new cc1 flag -disable-pragma-debug-crash.
Notes:
- #pragma clang __debug llvm_report_fatal isn't currently tested by crash-report.c, because it needs exit() to be handled differently in -fintegrated-cc1 mode. See https://reviews.llvm.org/D73742 for an upcoming fix.
- This is also needed to further validate that -MF is removed from the 'clang -E ' crash diagnostic cmd-line (currently not the case). See https://reviews.llvm.org/D74076 for an upcoming fix.
Differential Revision: https://reviews.llvm.org/D74070
whether a call is to a builtin.
We already had a general mechanism to do this but for some reason
weren't using it. In passing, check for the other unary operators that
can intervene in a reasonably-direct function call (we already handled
'&' but missed '*' and '+').
This reverts commit aaae6b1b61,
reinstating af80b8ccc5, with a fix to
clang-tidy.
When constrained floating point is enabled the X86-specific builtins don't
use constrained intrinsics in some cases. Fix that.
Differential Revision: https://reviews.llvm.org/D73570
If the return block is unreachable, clang removes it in
CodeGenFunction::FinishFunction(). This removal can leave dangling
references to values defined in the return block if the return block has
successors, which it /would/ if UBSan's return value check is emitted.
In this case, as the UBSan check wouldn't be reachable, it's better to
simply not emit it.
rdar://59196131
Summary:
- Similar to other targets, instead of passing a toolchain, a driver
argument should be passed into `arm::getARMTargetFeatures`. Aslo, that
routine should honor the specified triple. Refactor
`arm::getARMFloatABI` with 2 separate interfaces. One has the original
parameters and the other uses the driver and the specified triple.
- That fixes an issue when target & features are queried during the
offload compilation, where the specified triple should be checked
instead of a effective triple. A previously failed test is re-enabled.
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74020
Summary:
As a first step this implementation enables compilation of the offload
code.
Reviewers: ABataev
Subscribers: ebevhan, Anastasia, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74048
Removed some #ifdefs specific to Windows handling of VFS paths. This
eliminates most of the differences between the Windows and non-Windows
code paths.
Making this work required some changes to account for the fact that VFS
file paths can be Posix style or Windows style, so you cannot just assume
that they use the host's native path style. In one case, this means
implementing our own version of make_absolute, since the filesystem code
in Support doesn't have styles in the sense that the path code does.
Differential Review: https://reviews.llvm.org/D71092
Currently when generating debug-info for a BlockDecl we are setting the Name to the mangled name and not setting the LinkageName.
This means we see the mangled name for block invcations ends up in DW_AT_Name and not in DW_AT_linkage_name.
This patch fixes this case so that we also set the LinkageName as well.
Differential Revision: https://reviews.llvm.org/D73282
STL Algorithms are usually implemented in a tricky for performance
reasons which is too complicated for the analyzer. Furthermore inlining
them is costly. Instead of inlining we should model their behavior
according to the specifications.
This patch is the first step towards STL Algorithm modeling. It models
all the `find()`-like functions in a simple way: the result is either
found or not. In the future it can be extended to only return success if
container modeling is also extended in a way the it keeps track of
trivial insertions and deletions.
Differential Revision: https://reviews.llvm.org/D70818
Summary:
For now, this ABI simply expands all possible aggregate arguments and
returns all possible aggregates directly. This ABI will change rapidly
as we prototype and benchmark a new ABI that takes advantage of
multivalue return and possibly other changes from the MVP ABI.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72972
We did not have a CXXThisScope around constraint checking of functions and
function template specializations, causing a crash when checking a constraint
that had a 'this' (bug 44689).
Recommit after fixing test.
We did not have a CXXThisScope around constraint checking of functions and
function template specializations, causing a crash when checking a constraint
that had a 'this' (bug 44689)
Add support for Flush in the OMPIRBuilder. This patch also adds changes
to clang to use the OMPIRBuilder when '-fopenmp-enable-irbuilder'
commandline option is used.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D70712
Summary:
- The device compilation needs to have a consistent source code compared
to the corresponding host compilation. If macros based on the
host-specific target processor is not properly populated, the device
compilation may fail due to the inconsistent source after the
preprocessor. So far, only the host triple is used to build the
macros. If a detailed host CPU target or certain features are
specified, macros derived from them won't be populated properly, e.g.
`__SSE3__` won't be added unless `+sse3` feature is present. On
Windows compilation compatible with MSVC, that missing macros result
in that intrinsics are not included and cause device compilation
failure on the host-side source.
- This patch addresses this issue by introducing two `cc1` options,
i.e., `-aux-target-cpu` and `-aux-target-feature`. If a specific host
CPU target or certain features are specified, the compiler driver will
append them during the construction of the offline compilation
actions. Then, the toolchain in `cc1` phase will populate macros
accordingly.
- An internal option `--gpu-use-aux-triple-only` is added to fall back
the original behavior to help diagnosing potential issues from the new
behavior.
Reviewers: tra, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73942
Summary:
Changes:
- Calls to consteval function are now evaluated in constant context but IR is still generated for them.
- Add diagnostic for taking address of a consteval function in non-constexpr context.
- Add diagnostic for address of consteval function accessible at runtime.
- Add tests
Reviewers: rsmith, aaron.ballman
Reviewed By: rsmith
Subscribers: mgrang, riccibruno, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63960
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
Linux commit
1cf5b23988 (diff-289313b9fec99c6f0acfea19d9cfd949)
uses "#pragma clang attribute push (__attribute__((preserve_access_index)),
apply_to = record)"
to apply CO-RE relocations to all records including the following pattern:
#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)
typedef struct {
int a;
} __t;
#pragma clang attribute pop
int test(__t *arg) { return arg->a; }
The current approach to use struct type in the relocation record will
result in an anonymous struct, which make later type matching difficult
in bpf loader. In fact, current BPF backend will fail the above program
with assertion:
clang: ../lib/Target/BPF/BPFAbstractMemberAccess.cpp:796: ...
Assertion `TypeName.size()' failed.
The patch use the base lvalue type for the "base" value to annotate
preservee_{struct,union}_access_index intrinsics. In the above example,
the type will be "__t" which preserved the type name.
Differential Revision: https://reviews.llvm.org/D73900
The tool is now looked for in the source directory rather than in the
install directory, which should exclude the problems with not being able
to find it.
The tests still aren't being run on Windows, but they hopefully will run
on other platforms that have shell, which hopefully also means Perl.
Differential Revision: https://reviews.llvm.org/D69781
Summary:
Clang -fpic defaults to -fno-semantic-interposition (GCC -fpic defaults
to -fsemantic-interposition).
Users need to specify -fsemantic-interposition to get semantic
interposition behavior.
Semantic interposition is currently a best-effort feature. There may
still be some cases where it is not handled well.
Reviewers: peter.smith, rnk, serge-sans-paille, sfertile, jfb, jdoerfert
Subscribers: dschuff, jyknight, dylanmckay, nemanjai, jvesely, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, arphaman, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73865
We previously instantiated type-constraints of template type parameters along with the type parameter itself,
this caused problems when the type-constraints created by abbreviated templates refreneced other parameters
in the abbreviated templates.
When encountering a template type parameter with a type constraint, if it is implicit, delay instantiation of
the type-constraint until the function parameter which created the invented template type parameter is
instantiated.
Reland after fixing bug caused by another flow reaching SubstParmVarDecl and instantiating the TypeConstraint
a second time.
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
We previously instantiated type-constraints of template type parameters along with the type parameter itself,
this caused problems when the type-constraints created by abbreviated templates refreneced other parameters
in the abbreviated templates.
When encountering a template type parameter with a type constraint, if it is implicit, delay instantiation of
the type-constraint until the function parameter which created the invented template type parameter is
instantiated.
Summary:
In big-endian MVE, the simple vector load/store instructions (i.e.
both contiguous and non-widening) don't all store the bytes of a
register to memory in the same order: it matters whether you did a
VSTRB.8, VSTRH.16 or VSTRW.32. Put another way, the in-register
formats of different vector types relate to each other in a different
way from the in-memory formats.
So, if you want to 'bitcast' or 'reinterpret' one vector type as
another, you have to carefully specify which you mean: did you want to
reinterpret the //register// format of one type as that of the other,
or the //memory// format?
The ACLE `vreinterpretq` intrinsics are specified to reinterpret the
register format. But I had implemented them as LLVM IR bitcast, which
is specified for all types as a reinterpretation of the memory format.
So a `vreinterpretq` intrinsic, applied to values already in registers,
would code-generate incorrectly if compiled big-endian: instead of
emitting no code, it would emit a `vrev`.
To fix this, I've introduced a new IR intrinsic to perform a
register-format reinterpretation: `@llvm.arm.mve.vreinterpretq`. It's
implemented by a trivial isel pattern that expects the input in an
MQPR register, and just returns it unchanged.
In the clang codegen, I only emit this new intrinsic where it's
actually needed: I prefer a bitcast wherever it will have the right
effect, because LLVM understands bitcasts better. So we still generate
bitcasts in little-endian mode, and even in big-endian when you're
casting between two vector types with the same lane size.
For testing, I've moved all the codegen tests of vreinterpretq out
into their own file, so that they can have a different set of RUN
lines to check both big- and little-endian.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73786
Summary:
These instructions generate a vector of consecutive elements starting
from a given base value and incrementing by 1, 2, 4 or 8. The `wdup`
versions also wrap the values back to zero when they reach a given
limit value. The instruction updates the scalar base register so that
another use of the same instruction will continue the sequence from
where the previous one left off.
At the IR level, I've represented these instructions as a family of
target-specific intrinsics with two return values (the constructed
vector and the updated base). The user-facing ACLE API provides a set
of intrinsics that throw away the written-back base and another set
that receive it as a pointer so they can update it, plus the usual
predicated versions.
Because the intrinsics return two values (as do the underlying
instructions), the isel has to be done in C++.
This is the first family of MVE intrinsics that use the `imm_1248`
immediate type in the clang Tablegen framework, so naturally, I found
I'd given it the wrong C integer type. Also added some tests of the
check that the immediate has a legal value, because this is the first
time those particular checks have been exercised.
Finally, I also had to fix a bug in MveEmitter which failed an
assertion when I nested two `seq` nodes (the inner one used to extract
the two values from the pair returned by the IR intrinsic, and the
outer one put on by the predication multiclass).
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73357
Summary:
The unpredicated case of this is trivial: the clang codegen just makes
a vector splat of the input, and LLVM isel is already prepared to
handle that. For the predicated version, I've generated a `select`
between the same vector splat and the `inactive` input parameter, and
added new Tablegen isel rules to match that pattern into a predicated
`MVE_VDUP` instruction.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73356
During the review of D73007 Aaron Puchert mentioned
`warn_for_range_variable_always_copy` shouldn't be part of -Wall since
some coding styles require `for(const auto &bar : bars)`. This warning
would cause false positives for these users. Based on Aaron's proposal
refactored the warnings:
* -Wrange-loop-construct warns about possibly unintended constructor
calls. This is part of -Wall. It contains
* warn_for_range_copy: loop variable A of type B creates a copy from
type C
* warn_for_range_const_reference_copy: loop variable A is initialized
with a value of a different type resulting in a copy
* -Wrange-loop-bind-reference warns about misleading use of reference
types. This is not part of -Wall. It contains
* warn_for_range_variable_always_copy: loop variable A is always a copy
because the range of type B does not return a reference
Differential Revision: https://reviews.llvm.org/D73434
Driver errors if -fomit-frame-pointer is used together with -pg.
useFramePointerForTargetByDefault() returns true if -pg is specified.
=>
(!OmitFP && useFramePointerForTargetByDefault(Args, Triple)) is true
=>
We cannot get FramePointerKind::None
When T is a class type, only nvsize(T) bytes need be accessible through
the reference. We had matching bugs in the application of the
dereferenceable attribute and in -fsanitize=undefined.
Summary: Just like templates, they are excepted from the ODR rule.
Reviewed By: aaron.ballman, rsmith
Differential Revision: https://reviews.llvm.org/D68923
types are needed to compute the return type of a defaulted operator<=>.
This raises the question of what to do if return type deduction fails.
The standard doesn't say, and implementations vary, so for now reject
that case eagerly to keep our options open.
isDeclarationSpecifiers did not handle some cases of placeholder-type-specifiers with
type-constraints, causing parsing bugs in abbreviated constructor templates.
Add comprehensive handling of type-constraints to isDeclarationSpecifier.
We previously would not correctly for the initial parameter mapping for variadic template parameters in Concepts.
Testing this lead to the discovery that with the normalization process we would need to substitute into already-substituted-into
template arguments, which means we need to add NonTypeTemplateParmExpr support to TemplateInstantiator.
We do that by substituting into the replacement and the type separately, and then re-checking the expression against the NTTP
with the new type, in order to form any new required implicit casts (for cases where the type of the NTTP was dependent).
First attempt at implementing -fsemantic-interposition.
Rely on GlobalValue::isInterposable that already captures most of the expected
behavior.
Rely on a ModuleFlag to state whether we should respect SemanticInterposition or
not. The default remains no.
So this should be a no-op if -fsemantic-interposition isn't used, and if it is,
isInterposable being already used in most optimisation, they should honor it
properly.
Note that it only impacts architecture compiled with -fPIC and no pie.
Differential Revision: https://reviews.llvm.org/D72829
Add fixits for messaging self in MRR or using super, as the intent is
clear, and it turns out people do that a lot more than expected.
Allow for objc_direct_members on main interfaces, it's extremely useful
for internal only classes, and proves to be quite annoying for adoption.
Add some better warnings around properties direct/non-direct clashes (it
was done for methods but properties were a miss).
Radar-Id: rdar://problem/58355212
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
For non direct methods, the codegen uses the type of the Implementation.
Because Objective-C rules allow some differences between the Declaration
and Implementation return types, when the Implementation is in this
translation unit, the type of the Implementation should be preferred to
emit the Function over the Declaration.
Radar-Id: rdar://problem/58797748
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
Differential Revision: https://reviews.llvm.org/D73208
A constrained function with an auto return type would have it's definition
instantiated in order to deduce the auto return type before the constraints
are checked.
Move the constraints check after the return type deduction.
when building a defaulted comparison.
As a convenient way of asking whether `x @ y` is valid and building it,
we previouly always performed overload resolution and built an
overloaded expression, which would both end up picking a builtin
operator candidate when given a non-overloadable type. But that's not
quite right, because it can result in our finding a user-declared
operator overload, which we should never do when applying operators
non-overloadable types.
Handle this more correctly: skip overload resolution when building
`x @ y` if the operands are not overloadable. But still perform overload
resolution (considering only builtin candidates) when checking validity,
as we don't have any other good way to ask whether a binary operator
expression would be valid.
We previously checked for containsUnexpandedParameterPack in CSEs by observing the property
in the converted arguments of the CSE. This may not work if the argument is an expanded
type-alias that contains a pack-expansion (see added test).
Check the as-written arguments when determining containsUnexpandedParameterPack and isInstantiationDependent.
Summary: With OpenMP offloading host compilation is done in two phases to capture host IR that is passed to all device compilations as input. But it turns out that we currently run entire LLVM optimization pipeline on host IR on both compilations which may have unpredictable effects on the resulting code. This patch fixes this problem by disabling LLVM passes on the first compilation, so the host IR that is passed to device compilations will be captured right after front end.
Reviewers: ABataev, jdoerfert, hfinkel
Reviewed By: ABataev
Subscribers: guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73721
regions.
If the lastprivate conditional is passed as shared in inner region, we
shall check if it was ever changed and use this updated value after exit
from the inner region as an update value.
Summary:
Fat object size has significantly increased after D65819 which changed bundler tool to add host object as a normal bundle to the fat output which almost doubled its size. That patch was fixing the following issues
1. Problems associated with the partial linking - global constructors were not called for partially linking objects which clearly resulted in incorrect behavior.
2. Eliminating "junk" target object sections from the linked binary on the host side.
The first problem is no longer relevant because we do not use partial linking for creating fat objects anymore. Target objects sections are now inserted into the resulting fat object with a help of llvm-objcopy tool.
The second issue, "junk" sections in the linked host binary, has been fixed in D73408 by adding "exclude" flag to the fat object's sections which contain target objects. This flag tells linker to drop section from the inputs when linking executable or shared library, therefore these sections will not be propagated in the linked binary.
Since both problems have been solved, we can revert D65819 changes to reduce fat object size and this patch essentially is doing that.
Reviewers: ABataev, alexshap, jdoerfert
Reviewed By: ABataev
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73642
The constrained fcmp intrinsics don't allow the TRUE/FALSE predicates.
Using them will assert. To workaround this I'm emitting the old X86 specific intrinsics that were never removed from the backend when we switched to using fcmp in IR. We have no way to mark them as being strict, but that's true of all target specific intrinsics so doesn't seem like we need to solve that here.
I've also added support for selecting between signaling and quiet.
Still need to support SAE which will require using a target specific
intrinsic. Also need to fix masking to not use an AND instruction
after the compare.
Differential Revision: https://reviews.llvm.org/D72906
Summary: This flag tells link editor to exclude section from linker inputs when linking executable or shared library.
Reviewers: ABataev, alexshap, jdoerfert
Reviewed By: ABataev
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73408
Iterator modeling depends on container modeling,
but not vice versa. This enables the possibility
to arrange these two modeling checkers into
separate layers.
There are several advantages for doing this: the
first one is that this way we can keep the
respective modeling checkers moderately simple
and small. Furthermore, this enables creation of
checkers on container operations which only
depend on the container modeling. Thus iterator
modeling can be disabled together with the
iterator checkers if they are not needed.
Since many container operations also affect
iterators, container modeling also uses the
iterator library: it creates iterator positions
upon calling the `begin()` or `end()` method of
a containter (but propagation of the abstract
position is left to the iterator modeling),
shifts or invalidates iterators according to the
rules upon calling a container modifier and
rebinds the iterator to a new container upon
`std::move()`.
Iterator modeling propagates the abstract
iterator position, handles the relations between
iterator positions and models iterator
operations such as increments and decrements.
Differential Revision: https://reviews.llvm.org/D73547
Summary:
Currently, sqdmulh_lane and friends from the ACLE (implemented in arm_neon.h),
are represented in LLVM IR as a (by vector) sqdmulh and a vector of (repeated)
indices, like so:
%shuffle = shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
%vqdmulh2.i = tail call <4 x i16> @llvm.aarch64.neon.sqdmulh.v4i16(<4 x i16> %a, <4 x i16> %shuffle)
When %v's values are known, the shufflevector is optimized away and we are no
longer able to select the lane variant of sqdmulh in the backend.
This defeats a (hand-coded) optimization that packs several constants into a
single vector and uses the lane intrinsics to reduce register pressure and
trade-off materialising several constants for a single vector load from the
constant pool, like so:
int16x8_t v = {2,3,4,5,6,7,8,9};
a = vqdmulh_laneq_s16(a, v, 0);
b = vqdmulh_laneq_s16(b, v, 1);
c = vqdmulh_laneq_s16(c, v, 2);
d = vqdmulh_laneq_s16(d, v, 3);
[...]
In one microbenchmark from libjpeg-turbo this accounts for a 2.5% to 4%
performance difference.
We could teach the compiler to recover the lane variants, but this would likely
require its own pass. (Alternatively, "volatile" could be used on the constants
vector, but this is a bit ugly.)
This patch instead implements the following LLVM IR intrinsics for AArch64 to
maintain the original structure through IR optmization and into instruction
selection:
- sqdmulh_lane
- sqdmulh_laneq
- sqrdmulh_lane
- sqrdmulh_laneq.
These 'lane' variants need an additional register class. The second argument
must be in the lower half of the 64-bit NEON register file, but only when
operating on i16 elements.
Note that the existing patterns for shufflevector and sqdmulh into sqdmulh_lane
(etc.) remain, so code that does not rely on NEON intrinsics to generate these
instructions is not affected.
This patch also changes clang to emit these IR intrinsics for the corresponding
NEON intrinsics (AArch64 only).
Reviewers: SjoerdMeijer, dmgreen, t.p.northover, rovka, rengolin, efriedma
Reviewed By: efriedma
Subscribers: kristof.beyls, hiraditya, jdoerfert, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71469
include Clang builtin headers even with -nostdinc
Some projects use -nostdinc, but need to access some intrinsics files when building specific files.
The new -ibuiltininc flag lets them use this flag when compiling these files to ensure they can
find Clang's builtin headers.
The use of -nobuiltininc after the -ibuiltininc flag does not add the builtin header
search path to the list of header search paths.
Differential Revision: https://reviews.llvm.org/D73500
When using -fno-builtin[-<name>], we don't attach the IR attributes to
function definitions with no Decl, like the ones created through
`CreateGlobalInitOrDestructFunction`.
This results in projects using -fno-builtin or -ffreestanding to start
seeing symbols like _memset_pattern16.
The fix changes the behavior to always add the attribute if LangOptions
requests it.
Differential Revision: https://reviews.llvm.org/D73495
This makes clang somewhat forward-compatible with new CUDA releases
without having to patch it for every minor release without adding
any new function.
If an unknown version is found, clang issues a warning (can be disabled
with -Wno-cuda-unknown-version) and assumes that it has detected
the latest known version. CUDA releases are usually supersets
of older ones feature-wise, so it should be sufficient to keep
released clang versions working with minor CUDA updates without
having to upgrade clang, too.
Differential Revision: https://reviews.llvm.org/D73231
Performing a cast where the result is ignored caused Clang to crash when
performing codegen for the conversion:
_Complex int a;
void fn1() { (_Complex double) a; }
This patch addresses the crash by not trying to emit the scalar conversions,
causing it to be a noop. Fixes PR44624.
This CL adds clang declarations of built-in functions for AMDGPU MFMA intrinsics and instructions.
OpenCL tests for new built-ins are included.
Differential Revision: https://reviews.llvm.org/D72723
Currently device lib path set by environment variable HIP_DEVICE_LIB_PATH
does not work due to extra "-L" added to each entry.
This patch fixes that by allowing argument name to be empty in addDirectoryList.
Differential Revision: https://reviews.llvm.org/D73299
Summary:
The 'z' length modifier, signalling that an integer format specifier
takes a `size_t` sized integer, is only supported by the C library of
MSVC 2015 and later. Earlier versions don't recognize the 'z' at all,
and respond to `printf("%zu", x)` by just printing "zu".
So, if the MS compatibility version is set to a value earlier than
MSVC2015, it's useful to warn about 'z' modifiers in printf format
strings we check.
Reviewers: aaron.ballman, lebedev.ri, rnk, majnemer, zturner
Reviewed By: aaron.ballman
Subscribers: amccarth, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73457
This required some fixes to the generic code for two issues:
1. -fsanitize=safe-stack is default on x86_64-fuchsia and is *not* incompatible with -fsanitize=leak on Fuchisa
2. -fsanitize=leak and other static-only runtimes must not be omitted under -shared-libsan (which is the default on Fuchsia)
Patch By: mcgrathr
Differential Revision: https://reviews.llvm.org/D73397
With LLVM_APPEND_VC_REV=NO, Modules/merge-lifetime-extended-temporary.cpp
would fail if it ran before a0f50d7316 (which changed
the serialization format) and then after, for these reasons:
1. With LLVM_APPEND_VC_REV=NO, the module hash before and after the
change was the same.
2. Modules/merge-lifetime-extended-temporary.cpp is the only test
we have that uses -fmodule-cache-path=%t that
a) actually writes to the cache path
b) doesn't do `rm -rf %t` at the top of the test
So the old run would write a module file, and then the new run would
try to load it, but the serialized format changed.
Do several things to fix this:
1. Include clang::serialization::VERSION_MAJOR/VERSION_MINOR in
the module hash, so that when the AST format changes (...and
we remember to bump these), we use a different module cache dir.
2. Bump VERSION_MAJOR, since a0f50d7316 changed the
on-disk format in a way that a gch file written before that change
can't be read after that change.
3. Add `rm -rf %t` to all tests that pass -fmodule-cache-path=%t.
This is unnecessary from a correctness PoV after 1 and 2,
but makes it so that we don't amass many cache dirs over time.
(Arguably, it also makes it so that the test suite doesn't catch
when we change the serialization format but don't bump
clang::serialization::VERSION_MAJOR/VERSION_MINOR; oh well.)
Differential Revision: https://reviews.llvm.org/D73202
whether a call is to a builtin.
We already had a general mechanism to do this but for some reason
weren't using it. In passing, check for the other unary operators that
can intervene in a reasonably-direct function call (we already handled
'&' but missed '*' and '+').
These are mostly trivial additions as both of them are reusing existing
PThreadLockChecker logic. I only needed to add the list of functions to
check and do some plumbing to make sure that we display the right
checker name in the diagnostic.
Differential Revision: https://reviews.llvm.org/D73376
regions with reductions, lastprivates or linears clauses.
If the lastprivate conditional variable is updated in inner parallel
region with reduction, lastprivate or linear clause, the value must be
considred as a candidate for lastprivate conditional. Also, tracking in
inner parallel regions is not required.
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.
The bot failure should be fixed by D73418, committed as
af954e441a.
I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
See
https://docs.google.com/document/d/1xMkTZMKx9llnMPgso0jrx3ankI4cv60xeZ0y4ksf4wc/preview
for background discussion.
This adds a warning, flags and pragmas to limit the number of
pre-processor tokens either at a certain point in a translation unit, or
overall.
The idea is that this would allow projects to limit the size of certain
widely included headers, or for translation units overall, as a way to
insert backstops for header bloat and prevent compile-time regressions.
Differential revision: https://reviews.llvm.org/D72703
Summary:
The MicrosoftCXXABI uses a separate mechanism for emitting vtable
type metadata, and thus didn't pick up the change from D71907
to emit the vcall_visibility metadata under -fwhole-program-vtables.
I believe this is the cause of a Windows bot failure when I committed
follow on change D71913 that required a revert. The failure occurred
in a CFI test that was expecting to not abort because it expected a
devirtualization to occur, and without the necessary vcall_visibility
metadata we would not get devirtualization.
Note in the equivalent code in CodeGenModule::EmitVTableTypeMetadata
(used by the ItaniumCXXABI), we also emit the vcall_visibility metadata
when Virtual Function Elimination is enabled. Since I am not as familiar
with the details of that optimization, I have marked that as a TODO and
am only inserting under -fwhole-program-vtables.
Reviewers: evgeny777
Subscribers: Prazek, ostannard, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73418
The code for parsing of type-constraints in compound-requirements was not adapted for the new TryAnnotateTypeConstraint which
caused compound-requirements with scope specifiers to ignore them.
Also add regression tests for scope specifiers in type-constraints in more contexts.
We would previously try to evaluate atomic constraints of non-template functions as-is,
and since they are now unevaluated at first, this would cause incorrect evaluation (bugs #44657, #44656).
Substitute into atomic constraints of non-template functions as we would atomic constraints
of template functions, in order to rebuild the expressions in a constant-evaluated context.
Implement a pessimistic evaluator of the minimal required size for a buffer
based on the format string, and couple that with the fortified version to emit a
warning when the buffer size is lower than the lower bound computed from the
format string.
Differential Revision: https://reviews.llvm.org/D71566
When used as qualified names, pseudo-destructors are always named as if
they were members of the type, never as members of the namespace
enclosing the type.
Summary:
As discussed in http://lists.llvm.org/pipermail/cfe-dev/2019-October/063459.html
the overflow of the souce locations (limited to 2^31 chars) can generate all sorts of
weird things (bogus warnings, hangs, crashes, miscompilation and correct compilation).
In debug mode this assert would fail. So it might be a good start, as in PR42301,
to detect the failure and exit with a proper error message.
Reviewers: rsmith, thakis, miyuki
Reviewed By: miyuki
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70183
Since 2009 (in r63846) we've been `#define`-ing OBJC_NEW_PROPERTIES all
the time on Darwin, but this macro only makes sense for `-x objective-c`
and `-x objective-c++`. Restrict it to those cases (for which there is
already separate logic).
https://reviews.llvm.org/D72970
rdar://problem/10050342
Summary:
This adds the reference types target feature. This does not enable any
more functionality in LLVM/clang for now, but this is necessary to embed
the info in the target features section, which is used by Binaryen and
Emscripten. It turned out that after D69832 `-fwasm-exceptions` crashed
because we didn't have the reference types target feature.
Reviewers: tlively
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73320
There is llvm::Value::MaximumAlignment, which is numerically
equivalent to these constants, but we can't use it directly
because we can't include llvm IR headers in clang Sema.
So instead, copy-paste the constant, and fixup the places to use it.
This was initially reviewed in https://reviews.llvm.org/D72998
Summary:
For `__builtin_assume_aligned()`, we do validate that the alignment
is not greater than `536870912` (D68824), but we don't do that for
`__attribute__((assume_aligned(N)))` attribute.
I suspect we should.
This was initially committed in a4cfb15d15
but reverted in 210f0882c9 due to
suspicious bot failures.
Reviewers: erichkeane, aaron.ballman, hfinkel, rsmith, jdoerfert
Reviewed By: erichkeane
Subscribers: cfe-commits, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D72994
Summary:
`alloc_align` attribute takes parameter number, not the alignment itself,
so given **just** the attribute/function declaration we can't do any
sanity checking for said alignment.
However, at call site, given the actual `Expr` that is passed
into that parameter, we //might// be able to evaluate said `Expr`
as Integer Constant Expression, and perform the sanity checks.
But since there is no requirement for that argument to be an immediate,
we may fail, and that's okay.
However if we did evaluate, we should enforce the same constraints
as with `__builtin_assume_aligned()`/`__attribute__((assume_aligned(imm)))`:
said alignment is a power of two, and is not greater than our magic threshold
This was initially committed in c2a9061ac5
but reverted in 00756b1823 because of
suspicious bot failures.
Reviewers: erichkeane, aaron.ballman, hfinkel, rsmith, jdoerfert
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72996
Summary:
This was reverted in e45fcfc3aa due to
libcxx build failure. This revision addresses that case.
Original commit message:
This patch will provide support for auto return type for the C++ member
functions.
This patch includes clang side implementation of this feature.
Patch by: Awanish Pandey <Awanish.Pandey@amd.com>
Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D70524
If we do, then the property_list_t length is wrong
and class_getProperty gets very sad.
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
Radar-Id: rdar://problem/58804805
Differential Revision: https://reviews.llvm.org/D73219
Sending a message to `self` when it is const and within a class method
is safe because we know that `self` is the Class itself.
We can only relax this warning in ARC.
Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
Radar-Id: rdar://problem/58581965
Differential Revision: https://reviews.llvm.org/D72747
Differential Revision: https://reviews.llvm.org/D70268
This is a recommit of f978ea4983 with a fix for the PowerPC failure.
The issue was that:
* `CompilerInstance::ExecuteAction` calls
`getTarget().adjust(getLangOpts());`.
* `PPCTargetInfo::adjust` changes `LangOptions::HasAltivec`.
* This happens after the first few calls to `getModuleHash`.
There’s even a FIXME saying:
```
// FIXME: We shouldn't need to do this, the target should be immutable once
// created. This complexity should be lifted elsewhere.
```
This only showed up on PowerPC because it's one of the few targets that
almost always changes a hashed langopt.
I looked into addressing the fixme, but that would be a much larger
change, and it's not the only thing that happens in `ExecuteAction` that
can change the module context hash. Instead I changed the code to not
call `getModuleHash` until after it has been modified in `ExecuteAction`.
As per P1980R0, constraint expressions are unevaluated operands, and their constituent atomic
constraints only become constant evaluated during satisfaction checking.
Change the evaluation context during parsing and instantiation of constraints to unevaluated.
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html
This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.
Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.
Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.
Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.
I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.
Depends on D71907 and D71911.
Reviewers: pcc, evgeny777, steven_wu, espindola
Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71913
MSVC 2013 would refuse to pass highly aligned things (typically vectors
and aggregates) by value. Users would receive this error:
t.cpp(11) : error C2719: 'w': formal parameter with __declspec(align('32')) won't be aligned
t.cpp(11) : error C2719: 'q': formal parameter with __declspec(align('32')) won't be aligned
However, in MSVC 2015, this behavior was changed, and highly aligned
things are now passed indirectly. To avoid breaking backwards
incompatibility, objects that do not have a *required* high alignment
(i.e. double) are still passed directly, even though they are not
naturally aligned. This change implements the new behavior of passing
things indirectly.
The new behavior is:
- up to three vector parameters can be passed in [XYZ]MM0-2
- remaining arguments with required alignment greater than 4 bytes are
passed indirectly
Previously, MSVC never passed things truly indirectly, meaning clang
would always apply the byval attribute to indirect arguments. We had to
go to the trouble of adding inalloca so that non-trivially copyable C++
types could be passed in place without copying the object
representation. When inalloca was added, we asserted that all arguments
passed indirectly must use byval. With this change, that assert no
longer holds, and I had to update inalloca to handle that case. The
implicit sret pointer parameter was already handled this way, and this
change generalizes some of that logic to arguments.
There are two cases that this change leaves unfixed:
1. objects that are non-trivially copyable *and* overaligned
2. vectorcall + inalloca + vectors
For case 1, I need to touch C++ ABI code in MicrosoftCXXABI.cpp, so I
want to do it in a follow-up.
For case 2, my fix is one line, but it will require updating IR tests to
use lots of inreg, so I wanted to separate it out.
Related to D71915 and D72110
Fixes most of PR44395
Reviewed By: rjmccall, craig.topper, erichkeane
Differential Revision: https://reviews.llvm.org/D72114
Now with concepts support merged and mostly complete, we do not need -fconcepts-ts
(which was also misleading as we were not implementing the TS) and can enable
concepts features under C++2a. A warning will be generated if users still attempt
to use -fconcepts-ts.
Using the same strategy as c38e42527b.
D69825 revealed (introduced?) a problem when building with ASan, and
some memory leaks somewhere. More details are available in the original
patch.
Looks like we missed one failing tests, this patch adds the workaround
to this test as well.
Proper ExpressionEvaluationContext were not being entered when instantiating constraint
expressions, which caused assertion failures in certain cases, including bug #44614.
Summary:
Second patch in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html
Summarize vcall_visibility metadata in ThinLTO global variable summary.
Depends on D71907.
Reviewers: pcc, evgeny777, steven_wu
Subscribers: mehdi_amini, Prazek, inglorion, hiraditya, dexonsmith, arphaman, ostannard, llvm-commits, cfe-commits, davidxl
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71911
Summary:
I kind-of understand why it is restricted to integer-typed arguments,
for general enum's the value passed is not nessesairly the alignment implied,
although one might say that user would know best.
But we clearly should whitelist `std::align_val_t`,
which is just a thin wrapper over `std::size_t`,
and is the C++ standard way of specifying alignment.
Reviewers: erichkeane, rsmith, aaron.ballman, jdoerfert
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73019
Summary:
Much like with the previous patch (D73005) with `AssumeAlignedAttr`
handling, results in mildly more readable IR,
and will improve test coverage in upcoming patch.
Note that in `AllocAlignAttr`'s case, there is no requirement
for that alignment parameter to end up being an I-C-E.
Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73006
Summary:
This should be mostly NFC - we still lower the same alignment
knowledge to the IR. The main reasoning here is that
this somewhat improves readability of IR like this,
and will improve test coverage in upcoming patch.
Even though the alignment is guaranteed to always be an I-C-E,
we don't always materialize it as llvm's Alignment Attribute because:
1. There may be a non-zero offset
2. We may be sanitizing for alignment
Note that if there already was an IR alignment attribute
on return value, we union them, and thus the alignment
only ever rises.
Also, there is a second relevant clang attribute `AllocAlignAttr`,
so that is why `AbstractAssumeAlignedAttrEmitter` is templated.
Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73005
Summary:
`alloc_align` attribute takes parameter number, not the alignment itself,
so given **just** the attribute/function declaration we can't do any
sanity checking for said alignment.
However, at call site, given the actual `Expr` that is passed
into that parameter, we //might// be able to evaluate said `Expr`
as Integer Constant Expression, and perform the sanity checks.
But since there is no requirement for that argument to be an immediate,
we may fail, and that's okay.
However if we did evaluate, we should enforce the same constraints
as with `__builtin_assume_aligned()`/`__attribute__((assume_aligned(imm)))`:
said alignment is a power of two, and is not greater than our magic threshold
Reviewers: erichkeane, aaron.ballman, hfinkel, rsmith, jdoerfert
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72996
Summary:
For `__builtin_assume_aligned()`, we do validate that the alignment
is not greater than `536870912` (D68824), but we don't do that for
`__attribute__((assume_aligned(N)))` attribute.
I suspect we should.
Reviewers: erichkeane, aaron.ballman, hfinkel, rsmith, jdoerfert
Reviewed By: erichkeane
Subscribers: cfe-commits, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D72994
Summary:
First patch to support Safe Whole Program Devirtualization Enablement,
see RFC here: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html
Always emit !vcall_visibility metadata under -fwhole-program-vtables,
and not just for -fvirtual-function-elimination. The vcall visibility
metadata will (in a subsequent patch) be used to communicate to WPD
which vtables are safe to devirtualize, and we will optionally convert
the metadata to hidden visibility at link time. Subsequent follow on
patches will help enable this by adding vcall_visibility metadata to the
ThinLTO summaries, and always emit type test intrinsics under
-fwhole-program-vtables (and not just for vtables with hidden
visibility).
In order to do this safely with VFE, since for VFE all vtable loads must
be type checked loads which will no longer be the case, this patch adds
a new "Virtual Function Elim" module flag to communicate to GlobalDCE
whether to perform VFE using the vcall_visibility metadata.
One additional advantage of using the vcall_visibility metadata to drive
more WPD at LTO link time is that we can use the same mechanism to
enable more aggressive VFE at LTO link time as well. The link time
option proposed in the RFC will convert vcall_visibility metadata to
hidden (aka linkage unit visibility), which combined with
-fvirtual-function-elimination will allow it to be done more
aggressively at LTO link time under the same conditions.
Reviewers: pcc, ostannard, evgeny777, steven_wu
Subscribers: mehdi_amini, Prazek, hiraditya, dexonsmith, davidxl, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71907
This patch implements P1141R2 "Yet another approach for constrained declarations".
General strategy for this patch was:
- Expand AutoType to include optional type-constraint, reflecting the wording and easing the integration of constraints.
- Replace autos in parameter type specifiers with invented parameters in GetTypeSpecTypeForDeclarator, using the same logic
previously used for generic lambdas, now unified with abbreviated templates, by:
- Tracking the template parameter lists in the Declarator object
- Tracking the template parameter depth before parsing function declarators (at which point we can match template
parameters against scope specifiers to know if we have an explicit template parameter list to append invented parameters
to or not).
- When encountering an AutoType in a parameter context we check a stack of InventedTemplateParameterInfo structures that
contain the info required to create and accumulate invented template parameters (fields that were already present in
LambdaScopeInfo, which now inherits from this class and is looked up when an auto is encountered in a lambda context).
Resubmit after fixing MSAN failures caused by incomplete initialization of AutoTypeLocs in TypeSpecLocFiller.
Differential Revision: https://reviews.llvm.org/D65042
If local allocator was declared and used in the allocate clause, it was
not captured in inner region. It leads to a compiler crash, need to
capture the allocator declarator.
Summary:
CodeCompletion was not being triggered after successfully parsed
initializer lists, e.g.
```cpp
void foo(int, bool);
void bar() {
foo({1}^, false);
}
```
CodeCompletion would suggest the function foo as an overload candidate up until
the point marked with `^` but after that point we do not trigger signature help
since parsing succeeds.
This patch handles that case by failing in parsing expression lists whenever we
see a codecompletion token, in addition to getting an invalid subexpression.
Reviewers: sammccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73177
Summary:
Apparently nobody has tried this in months of development. It turns
out that `FunctionDecl::getBuiltinID` will never consider a function
to be a builtin if it is in C++ and not extern "C". So none of the
function declarations in <arm_mve.h> are recognized as builtins when
clang is compiling in C++ mode: it just emits calls to them as
ordinary functions, which then turn out not to exist at link time.
The trivial fix is to wrap most of arm_mve.h in an extern "C".
Added a test in clang/test/CodeGen/arm-mve-intrinsics which checks
basic functioning of the MVE header file in C++ mode. I've filled it
with copies of existing test functions from other files in that
directory, including a few moderately tricky cases of overloading (in
particular one that relies on the strict-polymorphism attribute added
in D72518).
(I considered making //every// test in that directory compile in both
C and C++ mode and check the code generation was identical. But I
think that would increase testing time by more than the value it adds,
and also update_cc_test_checks gets confused when the output function
name varies between RUN lines.)
Reviewers: LukeGeeson, MarkMurrayARM, miyuki, dmgreen
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73268
Summary:
Immediate vmvnq is code-generated as a simple vector constant in IR,
and left to the backend to recognize that it can be created with an
MVE VMVN instruction. The predicated version is represented as a
select between the input and the same constant, and I've added a
Tablegen isel rule to turn that into a predicated VMVN. (That should
be better than the previous VMVN + VPSEL: it's the same number of
instructions but now it can fold into an adjacent VPT block.)
The unpredicated forms of VBIC and VORR are done by enabling the same
isel lowering as for NEON, recognizing appropriate immediates and
rewriting them as ARMISD::VBICIMM / ARMISD::VORRIMM SDNodes, which I
then instruction-select into the right MVE instructions (now that I've
also reworked those instructions to use the same MC operand encoding).
In order to do that, I had to promote the Tablegen SDNode instance
`NEONvorrImm` to a general `ARMvorrImm` available in MVE as well, and
similarly for `NEONvbicImm`.
The predicated forms of VBIC and VORR are represented as a vector
select between the original input vector and the output of the
unpredicated operation. The main convenience of this is that it still
lets me use the existing isel lowering for VBICIMM/VORRIMM, and not
have to write another copy of the operand encoding translation code.
This intrinsic family is the first to use the `imm_simd` system I put
into the MveEmitter tablegen backend. So, naturally, it showed up a
bug or two (emitting bogus range checks and the like). Fixed those,
and added a full set of tests for the permissible immediates in the
existing Sema test.
Also adjusted the isel pattern for `vmovlb.u8`, which stopped matching
because lowering started turning its input into a VBICIMM. Now it
recognizes the VBICIMM instead.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72934
Profile TypeConstraints in ProfileTemplateParameterList so we can distinguish
between partial specializations which differ in their TemplateParameterList
type constraints.
Recommit, now profiling the IDC so that we can deal with situations where the
TemplateArgsAsWritten are nullptr (happens when canonicalizing type constraints).
Profile TypeConstraints in ProfileTemplateParameterList so we can distinguish
between partial specializations which differ in their TemplateParameterList
type constraints
After rGb4a99a061f517e60985667e39519f60186cbb469, passing a response file such as -Wp,@a.rsp wasn't working anymore because .rsp expansion happens inside clang's main() function.
This patch adds response file expansion in the -cc1 tool.
Differential Revision: https://reviews.llvm.org/D73120
This patch implements P1141R2 "Yet another approach for constrained declarations".
General strategy for this patch was:
- Expand AutoType to include optional type-constraint, reflecting the wording and easing the integration of constraints.
- Replace autos in parameter type specifiers with invented parameters in GetTypeSpecTypeForDeclarator, using the same logic
previously used for generic lambdas, now unified with abbreviated templates, by:
- Tracking the template parameter lists in the Declarator object
- Tracking the template parameter depth before parsing function declarators (at which point we can match template
parameters against scope specifiers to know if we have an explicit template parameter list to append invented parameters
to or not).
- When encountering an AutoType in a parameter context we check a stack of InventedTemplateParameterInfo structures that
contain the info required to create and accumulate invented template parameters (fields that were already present in
LambdaScopeInfo, which now inherits from this class and is looked up when an auto is encountered in a lambda context).
Resubmit after incorrect check in NonTypeTemplateParmDecl broke lldb.
Differential Revision: https://reviews.llvm.org/D65042
When using in-process cc1, the Clang Interface Stubs pipeline setup
exposes an ASAN bug. I am still investigating this issue but want to
green the bots for now. I don't think this is a huge issue since the
Clang Interface Stubs Driver setup code is the only code path that sets
up such a pipeline (ie N cc1's for N c files followed by another N cc1's
for to generate stub files for the same N c files).
This issue is being discussed in https://reviews.llvm.org/D69825.
If a resolution is not found soon, a bugzilla filling will be in order.
Add a simple cache for constraint satisfaction results. Whether or not this simple caching
would be permitted in final C++2a is currently being discussed but it is required for
acceptable performance so we use it in the meantime, with the possibility of adding some
cache invalidation mechanisms later.
Differential Revision: https://reviews.llvm.org/D72552
This patch implements P1141R2 "Yet another approach for constrained declarations".
General strategy for this patch was:
- Expand AutoType to include optional type-constraint, reflecting the wording and easing the integration of constraints.
- Replace autos in parameter type specifiers with invented parameters in GetTypeSpecTypeForDeclarator, using the same logic
previously used for generic lambdas, now unified with abbreviated templates, by:
- Tracking the template parameter lists in the Declarator object
- Tracking the template parameter depth before parsing function declarators (at which point we can match template
parameters against scope specifiers to know if we have an explicit template parameter list to append invented parameters
to or not).
- When encountering an AutoType in a parameter context we check a stack of InventedTemplateParameterInfo structures that
contain the info required to create and accumulate invented template parameters (fields that were already present in
LambdaScopeInfo, which now inherits from this class and is looked up when an auto is encountered in a lambda context).
Differential Revision: https://reviews.llvm.org/D65042
Summary:
We previously listed first declared members, then implicit operator=,
then implicit operator==, then implicit destructors. Per discussion on
https://github.com/itanium-cxx-abi/cxx-abi/issues/88, put the implicit
equality comparison operators at the very end, after all special member
functions.
This reinstates add2b7e44a, reverted in
commit 89e43f04ba, with a fix for 32-bit
targets.
Reviewers: rjmccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72897
The issue was reported by @xazax.hun here: https://reviews.llvm.org/D69825#1827826
"This patch (D69825) breaks scan-build-py which parses the output of "-###" to get -cc1 command. There might be other tools with the same problems. Could we either remove (in-process) from CC1Command::Print or add a line break?
Having the last line as a valid invocation is valuable and there might be tools relying on that."
Differential Revision: https://reviews.llvm.org/D72982
When Wrange-loop-analysis issues a diagnostic on a dependent type in a
template the diagnostic may not be valid for all instantiations. Therefore
the diagnostic is suppressed during the instantiation. Non dependent types
still issue a diagnostic.
The same can happen when using macros. Therefore the diagnostic is
disabled for macros.
Fixes https://bugs.llvm.org/show_bug.cgi?id=44556
Differential Revision: https://reviews.llvm.org/D73007
Summary:
We shouldn't be just giving up if we find one of them
(like we currently do with `AssumeAlignedAttr`),
we should emit them all.
As the tests show, even if we materialized good knowledge
from `__attribute__((assume_aligned(32)`, it doesn't mean
`__attribute__((alloc_align([...])))` info won't be useful.
It might be, but that isn't given.
Reviewers: erichkeane, jdoerfert, aaron.ballman
Reviewed By: erichkeane
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72979
When constrained floating point is enabled the SystemZ-specific builtins
don't use constrained intrinsics in some cases. Fix that.
Differential Revision: https://reviews.llvm.org/D72722
Summary:
This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the widht of their
declarative type. In such case:
```
struct S1 {
short a : 1;
}
```
should be accessed using load and stores of the width
(sizeof(short)), where now the compiler does only load
the minimum required width (char in this case).
However, as discussed in D16586,
that could overwrite non-volatile bit-fields, which
conflicted with C and C++ object models by creating
data race conditions that are not part of the bit-field,
e.g.
```
struct S2 {
short a;
int b : 16;
}
```
Accessing `S2.b` would also access `S2.a`.
The AAPCS Release 2019Q1.1
(https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:
```
This ABI does not place any restrictions on the access widths
of bit-fields where the container overlaps with a non-bit-field member.
This is because the C/C++ memory model defines these as being separate
memory locations, which can be accessed by two threads
simultaneously. For this reason, compilers must be permitted to use a
narrower memory access width (including splitting the access
into multiple instructions) to avoid writing to a different memory location.
```
I've updated the patch D16586 to follow such behavior by verifying that we
only change volatile bit-field access when:
- it won't overlap with any other non-bit-field member
- we only access memory inside the bounds of the record
Regarding the number of memory accesses, that should be preserved, that will
be implemented by D67399.
Reviewers: rsmith, rjmccall, eli.friedman, ostannard
Subscribers: ostannard, kristof.beyls, cfe-commits, carwil, olista01
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72932
This patch broke the Sanitizer buildbots. Please see the commit's
differential revision for more information
(https://reviews.llvm.org/D67678).
This reverts commit b72a8c65e4.
Target regions have implicit outer region which may erroneously capture
some globals when it should not. It may lead to a compiler crash at the
compile time.
Summary:
Printing policy was not propogated to functiondecls when creating a
completion string which resulted in canonical template parameters like
`foo<type-parameter-0-0>`. This patch propogates printing policy to those as
well.
Fixes https://github.com/clangd/clangd/issues/76
Reviewers: ilya-biryukov
Subscribers: jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72715
Summary:
We previously listed first declared members, then implicit operator=,
then implicit operator==, then implicit destructors. Per discussion on
https://github.com/itanium-cxx-abi/cxx-abi/issues/88, put the implicit
equality comparison operators at the very end, after all special member
functions.
Reviewers: rjmccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72897
a temporary.
We previously failed to materialize a temporary when performing an
implicit conversion to a reference type, resulting in our thinking the
argument was a value rather than a reference in some cases.
Implement support for C++2a requires-expressions.
Re-commit after compilation failure on some platforms due to alignment issues with PointerIntPair.
Differential Revision: https://reviews.llvm.org/D50360