For example:
svint32_t svget4(svint32x4_t tuple, uint64_t imm_index)
returns the subvector at `index`, which must be in range `0..3`.
svint32x3_t svset3(svint32x3_t tuple, uint64_t index, svint32_t vec)
returns a tuple vector with `vec` inserted into `tuple` at `index`,
which must be in range `0..2`.
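The get/set semantics above can be modelled in portable C++. This is only a sketch: `Vec`, `Tuple`, `tuple_get` and `tuple_set` are hypothetical stand-ins, not the ACLE intrinsics, and the fixed four-element vector is an assumption made so the example compiles without SVE support.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for an SVE vector and a tuple of N such vectors.
using Vec = std::array<int, 4>;
template <std::size_t N> using Tuple = std::array<Vec, N>;

// Model of svgetN: return the subvector at Index. The index is a
// compile-time constant and is range-checked, like the immediate operand.
template <std::size_t Index, std::size_t N>
Vec tuple_get(const Tuple<N> &tuple) {
  static_assert(Index < N, "index out of range for this tuple");
  return tuple[Index];
}

// Model of svsetN: return a copy of `tuple` with `vec` inserted at Index.
// The input tuple is left unchanged.
template <std::size_t Index, std::size_t N>
Tuple<N> tuple_set(Tuple<N> tuple, const Vec &vec) {
  static_assert(Index < N, "index out of range for this tuple");
  tuple[Index] = vec;
  return tuple;
}
```

An out-of-range index such as `tuple_get<4>` on a four-element tuple fails to compile in this model.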
Reviewers: c-rhodes, efriedma
Reviewed By: c-rhodes
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81464
This patch adds __builtin_matrix_column_major_load to Clang,
as described in clang/docs/MatrixTypes.rst. In the initial version,
the stride is not optional yet.
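As a rough illustration of what a strided column-major load does, here is a portable sketch. `column_major_load` is a hypothetical helper, not the builtin, and the `stride >= rows` check is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Load a rows x cols matrix from `ptr` into a dense column-major vector.
// Consecutive columns start `stride` elements apart in memory.
std::vector<double> column_major_load(const double *ptr, std::size_t rows,
                                      std::size_t cols, std::size_t stride) {
  // Assumption of this sketch: columns must not overlap in memory.
  assert(stride >= rows && "stride must cover a whole column");
  std::vector<double> result(rows * cols);
  for (std::size_t c = 0; c < cols; ++c)
    for (std::size_t r = 0; r < rows; ++r)
      result[c * rows + r] = ptr[c * stride + r];
  return result;
}
```

With a stride larger than the row count, the elements between the end of one column and the start of the next are skipped.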
Reviewers: rjmccall, rsmith, jfb, Bigcheese
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D72781
Summary:
Add a flag to omit the xray_fn_idx to cut size overhead and relocations
roughly in half at the cost of reduced performance for single function
patching. Minor additions to compiler-rt support per-function patching
without the index.
Reviewers: dberris, MaskRay, johnislarry
Subscribers: hiraditya, arphaman, cfe-commits, #sanitizers, llvm-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D81995
This patch upstreams support for BFloat Matrix Multiplication Intrinsics
and Code Generation from __bf16 to AArch64. This includes IR intrinsics.
Unit tests are provided as needed. AArch32 Intrinsics + CodeGen will come
after this patch.
This patch is part of a series implementing the Bfloat16 extension of
the Armv8.6-a architecture, as detailed here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
The bfloat type and its properties are specified in the Arm Architecture
Reference Manual:
https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
The following people contributed to this patch:
- Luke Geeson
- Momchil Velikov
- Mikhail Maltsev
- Luke Cheeseman
Reviewers: SjoerdMeijer, t.p.northover, sdesmalen, labrinea, miyuki,
stuij
Reviewed By: miyuki, stuij
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits,
llvm-commits, miyuki, chill, pbarrio, stuij
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80752
Change-Id: I174f0fd0f600d04e3799b06a7da88973c6c0703f
parameters of non-trivial C struct special functions
This removes the need to pass std::array of Addresses to getFunction,
which were overwritten in the function.
Summary:
If a record has a mix of relative pointers and other fields they
wouldn't necessarily be the same.
Fallout from D77592.
rdar://64309883
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81857
_ExtInt types
- Fix computed size for _ExtInt types passed to checked arithmetic
builtins.
- Emit diagnostic when signed _ExtInt larger than 128-bits is passed
to __builtin_mul_overflow.
- Change Sema checks for builtins to accept placeholder types.
Differential Revision: https://reviews.llvm.org/D81420
Prevent IR-gen from emitting consteval declarations
Summary: With this patch, instead of emitting calls to consteval functions, IR-gen emits a store of the already computed result.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76420
When checking for an enum function attribute, use hasFnAttribute()
rather than hasAttribute() at FunctionIndex, because it is
significantly faster (and more concise to boot).
This patch adds new SVE types to Clang that describe tuples of SVE
vectors. For example `svint32x2_t` which maps to the twice-as-wide
vector `<vscale x 8 x i32>`. Similarly, `svint32x3_t` will map to
`<vscale x 12 x i32>`.
It also adds builtins to return an `undef` vector for a given
SVE type.
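The element counts quoted above follow from simple arithmetic. This is a sketch only: `min_elements` is a hypothetical helper, and it relies on the fact that one SVE vector covers `vscale` x 128 bits.

```cpp
#include <cassert>

// Minimum (vscale == 1) element count of a tuple of SVE vectors.
// One SVE vector holds 128 / element_bits elements per vscale unit, and a
// tuple of `tuple_size` vectors maps to one vector that many times wider.
constexpr unsigned min_elements(unsigned tuple_size, unsigned element_bits) {
  return tuple_size * (128u / element_bits);
}
```

For 32-bit elements this gives 8 for a tuple of two and 12 for a tuple of three, matching `<vscale x 8 x i32>` and `<vscale x 12 x i32>`.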
Reviewers: c-rhodes, david-arm, ctetreau, efriedma, rengolin
Reviewed By: c-rhodes
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81459
As pointed out in PR45708, -ffine-grained-bitfield-accesses doesn't
trigger in all cases you think it might for RISC-V. The logic in
CGRecordLowering::accumulateBitFields checks OffsetInRecord is a legal
integer according to the datalayout. RISC targets will typically only
have the native width as a legal integer type so this check will fail
for OffsetInRecord of 8 or 16 when you would expect the transformation
is still worthwhile.
This patch changes the logic to check for an OffsetInRecord of at
least 1 byte that fits in a legal integer and is a power of 2. We
would prefer to query whether native load/store operations are
available, but I don't believe that is possible.
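The new check can be sketched as a small predicate. The names below are hypothetical and this is not the code in CGRecordLowering. It assumes the offset is measured in bits and that "fits in a legal integer" means "no wider than the widest legal integer type".

```cpp
#include <cassert>
#include <cstdint>

constexpr bool is_power_of_two(std::uint64_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// offset_bits: size in bits of the bit-field run accumulated so far.
// max_legal_int_bits: width of the widest legal integer for the target
// (assumption: 32 on RV32, 64 on RV64).
constexpr bool is_separable_run(std::uint64_t offset_bits,
                                std::uint64_t max_legal_int_bits) {
  return offset_bits >= 8 &&                  // at least 1 byte
         offset_bits <= max_legal_int_bits && // fits in a legal integer
         is_power_of_two(offset_bits);        // 8, 16, 32, ...
}
```

Under this predicate, runs of 8 or 16 bits qualify on a target whose only legal integer is 32 bits wide, which the old "is a legal integer" check rejected.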
Differential Revision: https://reviews.llvm.org/D79155
Rather than pushing inactive cleanups for the block captures at the
entry of a full expression and activating them during the creation of
the block literal, just call pushLifetimeExtendedDestroy to ensure the
cleanups are popped at the end of the scope enclosing the block
expression.
rdar://problem/63996471
Differential Revision: https://reviews.llvm.org/D81624
Functions can have local pragmas that override the global settings.
We set the flags eagerly based on global settings, but if we emit
an expression under the influence of a pragma, we clear the
appropriate flags from the function.
In order to avoid doing a ton of redundant work whenever we emit
an FP expression, configure the IRBuilder to default to global
settings, and only reconfigure it when we see an FP expression
that's not using the global settings.
Patch by Michele Scandale!
https://reviews.llvm.org/D80462
This patch contains all of the clang changes from D72959.
- Generalize the relative vtables ABI such that it can be used by other targets.
- Add an enum VTableComponentLayout which controls whether components in the
vtable should be pointers to other structs or relative offsets to those structs.
Other ABIs can change this enum to restructure how components in the vtable
are laid out/accessed.
- Add methods to ConstantInitBuilder for inserting relative offsets to a
specified position in the aggregate being constructed.
- Fix failing tests under new PM and ASan and MSan issues.
See D72959 for background info.
Differential Revision: https://reviews.llvm.org/D77592
Summary:
Added codegen for use_device_addr clause. The components of the list
items are mapped as a kind of RETURN components and then the returned
base address is used instead of the real address of the base declaration
used in the use_device_addr expressions.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80730
This reverts commit 2e009dbcb3.
Reverting since there were some test failures on buildbots that used the
new pass manager. ASan and MSan are also finding some bugs in this that
I'll need to address.
This patch contains all of the clang changes from D72959.
- Generalize the relative vtables ABI such that it can be used by other targets.
- Add an enum VTableComponentLayout which controls whether components in the
vtable should be pointers to other structs or relative offsets to those structs.
Other ABIs can change this enum to restructure how components in the vtable
are laid out/accessed.
- Add methods to ConstantInitBuilder for inserting relative offsets to a
specified position in the aggregate being constructed.
See D72959 for background info.
Differential Revision: https://reviews.llvm.org/D77592
Summary:
As specified in https://github.com/WebAssembly/simd/pull/232. These
instructions are implemented as LLVM intrinsics for now rather than
normal ISel patterns to make these instructions opt-in. Once the
instructions are merged to the spec proposal, the intrinsics will be
replaced with proper ISel patterns.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D81222
Summary:
__builtin_amdgcn_atomic_inc32(int *Ptr, int Val, unsigned MemoryOrdering, const char *SyncScope)
__builtin_amdgcn_atomic_inc64(int64_t *Ptr, int64_t Val, unsigned MemoryOrdering, const char *SyncScope)
__builtin_amdgcn_atomic_dec32(int *Ptr, int Val, unsigned MemoryOrdering, const char *SyncScope)
__builtin_amdgcn_atomic_dec64(int64_t *Ptr, int64_t Val, unsigned MemoryOrdering, const char *SyncScope)
The first and second arguments are passed transparently to the amdgcn atomic
inc/dec intrinsic. The fifth argument of the intrinsic is set to true if the
first argument of the builtin is a volatile pointer. The third argument of
the builtin is one of the memory-ordering specifiers ATOMIC_ACQUIRE,
ATOMIC_RELEASE, ATOMIC_ACQ_REL, or ATOMIC_SEQ_CST, following C++11 memory
model semantics. It is mapped to the corresponding LLVM atomic memory ordering
for the atomic inc/dec instruction using the Clang atomic C ABI. The fourth
argument is an AMDGPU-specific synchronization scope, given as a string.
Reviewers: arsenm, sameerds, JonChesterfield, jdoerfert
Reviewed By: arsenm, sameerds
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, kerbowa, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80804
Check that getDebugInfo() is not null, as in the first revision, before
calling getDebugInfo()->addHeapAllocSiteMetadata().
Otherwise, a new expression in a default argument would cause a crash.
---
Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.
Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.
Differential Revision: https://reviews.llvm.org/D80966
This patch adds __builtin_matrix_transpose to Clang, as described in
clang/docs/MatrixTypes.rst.
Reviewers: rjmccall, jfb, rsmith, Bigcheese
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D72778
Summary:
Add -ftrivial-auto-var-init-stop-after= to limit the number of times
stack variables are initialized when -ftrivial-auto-var-init= is used to
initialize stack variables to zero or a pattern. This flag can be used
to bisect uninitialized uses of a stack variable exposed by automatic
variable initialization, such as http://crrev.com/c/2020401.
Reviewers: jfb, vitalybuka, kcc, glider, rsmith, rjmccall, pcc, eugenis, vlad.tsyrklevich
Reviewed By: jfb
Subscribers: phosek, hubert.reinterpretcast, srhines, MaskRay, george.burgess.iv, dexonsmith, inglorion, gbiv, llozano, manojgupta, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77168
Summary:
This fixes pr33372.cpp under the new pass manager.
ASan adds padding to globals. For example, it will change a {i32, i32, i32} to a {{i32, i32, i32}, [52 x i8]}. However, when loading from the {i32, i32, i32}, InstCombine may (after various optimizations) end up loading 16 bytes instead of 12, likely because it thinks the [52 x i8] padding is ok to load from. But ASan checks that padding should not be loaded from.
Ultimately this is an issue of *San passes wanting to be run after all optimizations. This change moves the module passes right next to the corresponding function passes.
Also remove comment that's no longer relevant, this is the last ASan/MSan/TSan failure under the NPM (hopefully...).
As mentioned in https://reviews.llvm.org/rG1285e8bcac2c54ddd924ffb813b2b187467ac2a6, NPM doesn't support LTO + sanitizers, so modified some tests that test for that.
Reviewers: leonardchan, vitalybuka
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81323
With a change to use `CGM.getCodeGenOpts().getDebugInfo() != codegenoptions::NoDebugInfo`
instead of `getDebugInfo()`,
to fix `Profile-<arch> :: instrprof-gcov-multithread_fork.test`
See CodeGenModule::CodeGenModule, `EmitGcovArcs || EmitGcovNotes` can
set `clang::CodeGen::CodeGenModule::DebugInfo`.
---
Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.
Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.
Differential Revision: https://reviews.llvm.org/D80966
This patch implements the * binary operator for values of
MatrixType. It adds support for matrix * matrix, scalar * matrix and
matrix * scalar.
For the matrix, matrix case, the number of columns of the first operand
must match the number of rows of the second. For the scalar,matrix variants,
the element type of the matrix must match the scalar type.
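The shape and type rules above can be modelled with ordinary C++ templates. This is a sketch, not the Clang matrix extension: `Matrix` is a hypothetical type, and the column-major storage is an assumption of the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical dense matrix with column-major storage, zero-initialized.
template <typename T, std::size_t Rows, std::size_t Cols>
struct Matrix {
  std::array<T, Rows * Cols> data{};
  T &at(std::size_t r, std::size_t c) { return data[c * Rows + r]; }
  const T &at(std::size_t r, std::size_t c) const { return data[c * Rows + r]; }
};

// matrix * matrix: the shared dimension K is deduced from both operands,
// so a column/row mismatch is a compile-time error.
template <typename T, std::size_t R, std::size_t K, std::size_t C>
Matrix<T, R, C> operator*(const Matrix<T, R, K> &a, const Matrix<T, K, C> &b) {
  Matrix<T, R, C> out;
  for (std::size_t r = 0; r < R; ++r)
    for (std::size_t c = 0; c < C; ++c)
      for (std::size_t k = 0; k < K; ++k)
        out.at(r, c) += a.at(r, k) * b.at(k, c);
  return out;
}

// matrix * scalar: T is deduced from both operands, so the scalar type
// must be exactly the element type.
template <typename T, std::size_t R, std::size_t C>
Matrix<T, R, C> operator*(const Matrix<T, R, C> &m, T s) {
  Matrix<T, R, C> out;
  for (std::size_t i = 0; i < R * C; ++i)
    out.data[i] = m.data[i] * s;
  return out;
}

// scalar * matrix: same rule, operands swapped.
template <typename T, std::size_t R, std::size_t C>
Matrix<T, R, C> operator*(T s, const Matrix<T, R, C> &m) {
  return m * s;
}
```

In this model, multiplying a 2x3 matrix by a 2x3 matrix, or an `int` matrix by a `double` scalar, is rejected at compile time because template deduction fails.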
Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D76794
Summary:
This transformation is correct for a builtin call to 'free(p)', but not
for 'operator delete(p)'. There is no guarantee that a user replacement
'operator delete' has no effect when called on a null pointer.
However, the principle behind the transformation *is* correct, and can
be applied more broadly: a 'delete p' expression is permitted to
unconditionally call 'operator delete(p)'. So do that in Clang under
-Oz where possible. We do this whether or not 'p' has trivial
destruction, since the destruction might turn out to be trivial after
inlining, and even for a class-specific (but non-virtual,
non-destroying, non-array) 'operator delete'.
Reviewers: davide, dnsampaio, rjmccall
Reviewed By: dnsampaio
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D79378
Clang marks calls to operator new as heap allocation sites, but the
operator declared at global scope returns a void pointer. There is no
explicit cast in the code, so the compiler has to write down the
allocated type itself.
Also generalize a cast to use CallBase, so that we mark heap alloc sites
when exceptions are enabled.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D80966
Summary:
This patch upstreams support for a new storage only bfloat16 C type.
This type is used to implement primitive support for bfloat16 data, in
line with the Bfloat16 extension of the Armv8.6-a architecture, as
detailed here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:
https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
In detail this patch:
- introduces an opaque, storage-only C-type __bf16, which introduces a new bfloat IR type.
This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.
The following people contributed to this patch:
- Luke Cheeseman
- Momchil Velikov
- Alexandros Lamprineas
- Luke Geeson
- Simon Tatham
- Ties Stuij
Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, fpetrogalli
Reviewed By: SjoerdMeijer
Subscribers: labrinea, majnemer, asmith, dexonsmith, kristof.beyls, arphaman, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76077
Summary:
If a variable must be globalized in OpenMP mode (a local automatic
variable in GPU compilation mode that may escape its declaration
context by reference or by pointer), it should not be considered
an NRVO candidate. Otherwise, the return value of the function might
not be updated correctly.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80936
This is a re-revert with a corrected test.
This patch adds a test for the PowerPC fma compiler builtins, some variations
of which negate inputs and outputs. The code to generate IR for these
builtins was untested before this patch.
Originally, the code used the outdated method of subtracting floating point
values from -0.0 as floating point negation. This patch remedies that.
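For background, the sketch below contrasts the subtraction idiom with direct negation, and shows why the idiom needed -0.0 rather than +0.0. The function names are hypothetical, and the example assumes default IEEE 754 semantics (round-to-nearest, no fast-math flags).

```cpp
#include <cassert>
#include <cmath>

// Older idiom: negate by subtracting the value from negative zero.
inline double negate_by_sub(double x) { return -0.0 - x; }

// Direct negation, which simply flips the sign bit.
inline double negate_direct(double x) { return -x; }

// Subtracting from +0.0 is not a negation: for x == +0.0 the result is
// +0.0, so the sign of zero is lost.
inline double sub_from_positive_zero(double x) { return 0.0 - x; }
```

The two negations agree on ordinary values and on signed zeros, while `sub_from_positive_zero(0.0)` yields +0.0 instead of -0.0.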
Patch by: Drew Wock <drew.wock@sas.com>
Differential Revision: https://reviews.llvm.org/D76949
Summary:
If the data member is mapped as an array section, need to emit the
pointer to the last element of this array section and use this pointer
as the highest element in partial struct data.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81037
Summary:
During CodeGen for AArch64 Neon intrinsics, Clang was incorrectly
assuming all the pointers from which loads were being generated for vld1
intrinsics were aligned according to the intrinsics result type, causing
alignment faults on the code generated by the backend.
This patch updates vld1 intrinsics' CodeGen to properly capture the
correct load alignment based on the type of the pointer provided as
input for the intrinsic.
Reviewers: t.p.northover, ostannard, pcc, efriedma
Reviewed By: ostannard, efriedma
Subscribers: echristo, plotfi, nickdesaulniers, efriedma, kristof.beyls, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79721
When sampleFDO is enabled, people may expect they can use
-fno-profile-sample-use to opt out of using the sample profile for a certain
file. That could be either for debugging or for performance tuning.
However, when thinlto is enabled, if a function in file A compiled with
-fno-profile-sample-use is imported to another file B compiled with
-fprofile-sample-use, the inlined copy of the function in file B may still
get its profile annotated.
The inconsistency may even introduce profile unused warning because if the
target is not compiled with explicit debug information flag, the function
in file A won't have its debug information enabled (debug information will
be enabled implicitly only when -fprofile-sample-use is used). After it is
imported into file B which is compiled with -fprofile-sample-use, profile
annotation for the outline copy of the function will fail because the
function has no debug information, and that will trigger profile unused
warning.
We add a new attribute, use-sample-profile, to control whether a function
uses its sample profile, for both its outlined and inlined copies.
That makes the behavior of -fno-profile-sample-use consistent.
Differential Revision: https://reviews.llvm.org/D79959
This lets us remove !stack-safe metadata and
better control when to perform StackSafety
analysis.
Reviewers: eugenis
Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80771
Extension vectors can now be used in an element-wise conditional selector.
For example:
```
R[i] = C[i]? A[i] : B[i]
```
This feature was previously only enabled in OpenCL C. Now it's also
available in C. Note that it behaves differently from GNU vectors
(i.e. __vector_size__). Extension vectors select on the signedness of the
vector. GNU vectors, on the other hand, do normal bool conversions. Also,
this feature is not available in C++.
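The difference between the two selection rules can be modelled in portable C++ without the vector extensions themselves. This sketch reads "selects on signedness" as "selects on the sign of each condition element", which is an interpretation. `Int4` and both functions are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Int4 = std::array<int, 4>;

// Extension-vector rule as read here: pick a[i] when c[i] is negative
// (sign bit set), otherwise b[i].
inline Int4 select_by_sign(const Int4 &c, const Int4 &a, const Int4 &b) {
  Int4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = c[i] < 0 ? a[i] : b[i];
  return r;
}

// GNU-vector rule: ordinary bool conversion, any non-zero c[i] picks a[i].
inline Int4 select_by_bool(const Int4 &c, const Int4 &a, const Int4 &b) {
  Int4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = c[i] != 0 ? a[i] : b[i];
  return r;
}
```

The two rules only disagree for positive, non-zero condition elements.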
Differential Revision: https://reviews.llvm.org/D80574
Summary:
Added initial codegen for 'affinity' clauses on task directives.
Emits the following code:
```
kmp_task_affinity_info_t affs[<num_elems>];
void *td = __kmpc_task_alloc(..);
affs[<i>].base = &data_i;
affs[<i>].size = sizeof(data_i);
__kmpc_omp_reg_task_with_affinity(&loc, <gtid>, td, <num_elems>, affs);
```
The result returned by the call to the `__kmpc_omp_reg_task_with_affinity`
function is currently ignored, since the runtime currently ignores the args
and returns 0 unconditionally.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, llvm-commits, cfe-commits, caomhin
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80240
This patch adds clang options:
-fbasic-block-sections={all,<filename>,labels,none} and
-funique-basic-block-section-names.
LLVM Support for basic block sections is already enabled.
+ -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic
block sections for all or a subset of basic blocks. "labels" only enables
basic block symbols.
+ -funique-basic-block-section-names: Enables unique section names for
basic block sections, disabled by default.
Differential Revision: https://reviews.llvm.org/D68049
Canonicalize on storing FP options in LangOptions instead of
redundantly in CodeGenOptions. Incorporate -ffast-math directly
into the values of those LangOptions rather than considering it
separately when building FPOptions. Build IR attributes from
those options rather than a mix of sources.
We should really simplify the driver/cc1 interaction here and have
the driver pass down options that cc1 directly honors. That can
happen in a follow-up, though.
Patch by Michele Scandale!
https://reviews.llvm.org/D80315
Summary: This changes Clang's generation of OpenMP runtime functions to use the types and functions defined in OpenMPKinds and OpenMPConstants. New OpenMP runtime function information should now be added to OMPKinds.def. This patch also changed the definitions of __kmpc_push_num_teams and __kmpc_copyprivate to match those found in the runtime.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: jfb, AndreyChurbanov, openmp-commits, fghanim, hiraditya, sstefan1, cfe-commits, llvm-commits
Tags: #openmp, #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80222
This patch implements matrix index expressions
(matrix[RowIdx][ColumnIdx]).
It does so by introducing a new MatrixSubscriptExpr(Base, RowIdx, ColumnIdx).
MatrixSubscriptExprs are built in 2 steps in ActOnMatrixSubscriptExpr. First,
if the base of a subscript is of matrix type, we create an incomplete
MatrixSubscriptExpr(base, idx, nullptr). Second, if the base is an incomplete
MatrixSubscriptExpr, we create a complete
MatrixSubscriptExpr(base->getBase(), base->getRowIdx(), idx).
Similar to vector elements, it is not possible to take the address of
a MatrixSubscriptExpr.
For CodeGen, a new MatrixElt type is added to LValue, which is very
similar to VectorElt. The only difference is that we may need to cast
the type of the base from an array to a vector type when accessing it.
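The two-step construction has a loose analogue in library code: a first `operator[]` yields an incomplete subscript holding the base and row index, and a second one completes it with the column index. This is only an analogy with hypothetical names, not the Sema implementation. It assumes column-major storage, and unlike a real MatrixSubscriptExpr it returns an ordinary reference whose address could be taken.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template <typename T, std::size_t Rows, std::size_t Cols>
class Matrix {
  std::array<T, Rows * Cols> elems{}; // column-major storage (assumed)

public:
  // Step 1 result: base and row index are known, column index is missing.
  class IncompleteSubscript {
    Matrix &base;
    std::size_t row;

  public:
    IncompleteSubscript(Matrix &b, std::size_t r) : base(b), row(r) {}

    // Step 2: supplying the column index completes the subscript and
    // selects one element of the flattened storage.
    T &operator[](std::size_t col) { return base.elems[col * Rows + row]; }
  };

  IncompleteSubscript operator[](std::size_t row) {
    return IncompleteSubscript(*this, row);
  }
};
```

With this model, `m[1][2]` first builds the incomplete subscript for row 1 and then resolves it against column 2.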
Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D76791
Summary:
Forked from:
https://reviews.llvm.org/D80242
Use the getter for access to DebugInfo consistently.
Use break in switch in CodeGenModule::EmitTopLevelDecl consistently.
Reviewers: dblaikie
Reviewed By: dblaikie
Subscribers: cfe-commits, srhines
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80840
After D70350, retainedTypes: is no longer used for call site debug
info for extern calls, so it is safe to delete it from the IR
representation.
We are also adding a test to ensure the subprogram isn't stored within
the retainedTypes: from corresponding DICompileUnit.
Differential Revision: https://reviews.llvm.org/D80369
This patch implements the + and - binary operators for values of
MatrixType. It adds support for matrix +/- matrix, scalar +/- matrix and
matrix +/- scalar.
For the matrix, matrix case, the types must initially be structurally
equivalent. For the scalar,matrix variants, the element type of the
matrix must match the scalar type.
Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D76793
Summary:
This was attempted once before in https://reviews.llvm.org/D79698, but
was reverted due to the coverage pass running in the wrong part of the
pipeline. This commit puts it in the same place as the other sanitizers.
This changes PassBuilder.OptimizerLastEPCallbacks to work on a
ModulePassManager instead of a FunctionPassManager. That is because
SanitizerCoverage cannot (easily) be split into a module pass and a
function pass like some of the other sanitizers since in its current
implementation it conditionally inserts module constructors based on
whether or not it successfully modified functions.
This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass
manager (last check-msan test).
Currently sanitizers + LTO don't work together under the new pass
manager, so I removed tests that checked that this combination works for
sancov.
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80692
Summary:
This was attempted once before in https://reviews.llvm.org/D79698, but
was reverted due to the coverage pass running in the wrong part of the
pipeline. This commit puts it in the same place as the other sanitizers.
This changes PassBuilder.OptimizerLastEPCallbacks to work on a
ModulePassManager instead of a FunctionPassManager. That is because
SanitizerCoverage cannot (easily) be split into a module pass and a
function pass like some of the other sanitizers since in its current
implementation it conditionally inserts module constructors based on
whether or not it successfully modified functions.
This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass
manager (last check-msan test).
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80692
Summary:
D76801 caused some regressions in debuginfo compatibility by changing how
certain functions were named.
For CodeView we try to mirror MSVC exactly: this was fixed in a549c0d004
For DWARF the situation is murkier. Per David Blaikie:
> In general DWARF doesn't specify this at all.
> [...]
> This isn't the only naming divergence between GCC and Clang
Nevertheless, including the space seems to provide better compatibility with
GCC and GDB. E.g. cpexprs.cc in the GDB testsuite requires this formatting.
And there was no particular desire to change the printing of names in debug
info in the first place (just in diagnostics and other more user-facing text).
Fixes PR46052
Reviewers: dblaikie, labath
Subscribers: aprantl, cfe-commits, dyung
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80554
This patch upgrades DISubrange to support fortran requirements.
Summary:
Below are the updates/addition of fields.
lowerBound - Now accepts signed integer or DIVariable or DIExpression,
earlier it accepted only signed integer.
upperBound - This field is now added and accepts signed integer or
DIVariable or DIExpression.
stride - This field is now added and accepts signed integer or
DIVariable or DIExpression.
This is required to describe bounds of array which are known at runtime.
Testing:
unit test cases added (hand-written)
check clang
check llvm
check debug-info
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D80197
Unlike other platforms using ItaniumCXXABI, Darwin does not allow the
creation of a thread-wrapper function for a variable in the TU of
users. Because of this, it can set the linkage of the thread-local
symbol to internal, with the assumption that no TUs other than the one
defining the variable will need it.
However, constinit thread_local variables do not require the use of
the thread-wrapper call, so users reference the variable
directly. Thus, it must not be converted to internal, or users will
get a link failure.
This was a regression introduced by the optimization in
00223827a9.
Differential Revision: https://reviews.llvm.org/D80417
And bump its version number accordingly.
This is a patched recommit of 7c298c104b
Previous hash implementation was incorrectly passing an uint64_t, that got converted
to an uint8_t, to finalize the hash computation. This led to different functions
having the same hash if they only differ by the remaining statements, which is
incorrect.
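The effect of that narrowing can be reproduced in isolation. The sketch below is not the profile hash itself: `mix`, `finalize_truncated` and `finalize_full` are hypothetical, and the multiplier is an arbitrary FNV-style constant chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mixing step, for illustration only.
inline std::uint64_t mix(std::uint64_t state, std::uint64_t value) {
  return state * 1099511628211ull + value;
}

// Buggy shape: the pending 64-bit word is narrowed to 8 bits before it is
// mixed in, so only its low byte influences the result.
inline std::uint64_t finalize_truncated(std::uint64_t state,
                                        std::uint64_t pending) {
  return mix(state, static_cast<std::uint8_t>(pending));
}

// Fixed shape: the whole 64-bit word is mixed in.
inline std::uint64_t finalize_full(std::uint64_t state,
                                   std::uint64_t pending) {
  return mix(state, pending);
}
```

Two pending words that share a low byte, such as 0x1234 and 0xAB34, collide under `finalize_truncated` but not under `finalize_full`.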
Added a new test case that trivially tests that a small function change is
reflected in the hash value.
Note that because this patch fixes the hash computation, it invalidates all
hashes computed before it is applied, which is why we bumped the version number.
Update profile data hash entries due to hash function update, except for binary
version, in which case we keep the buggy behavior for backward compatibility.
Differential Revision: https://reviews.llvm.org/D79961
The os_log helper functions are linkonce_odr and supposed to be
uniqued across TUs, so attaching a DW_AT_decl_line to them is highly
misleading. By setting the function decl to implicit, CGDebugInfo
properly marks the functions as artificial and uses a default file /
line 0 location for the function.
rdar://problem/63450824
Differential Revision: https://reviews.llvm.org/D80463
Summary:
During CodeGen for AArch64 Neon intrinsics, Clang was incorrectly
assuming all the pointers from which loads were being generated for vld1
intrinsics were aligned according to the intrinsics result type, causing
alignment faults on the code generated by the backend.
This patch updates vld1 intrinsics' CodeGen to properly capture the
correct load alignment based on the type of the pointer provided as
input for the intrinsic.
Reviewers: t.p.northover, ostannard, pcc
Reviewed By: ostannard
Subscribers: kristof.beyls, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79721
-fno-semantic-interposition is currently the CC1 default. (The opposite
disables some interprocedural optimizations.) However, it does not infer
dso_local: on most targets accesses to ExternalLinkage functions/variables
defined in the current module still need PLT/GOT.
This patch makes explicit -fno-semantic-interposition infer dso_local,
so that PLT/GOT can be eliminated if targets implement local aliases
for AsmPrinter::getSymbolPreferLocal (currently only x86).
Currently we check whether the module flag "SemanticInterposition" is 0.
If yes, infer dso_local. In the future, we can infer dso_local unless
"SemanticInterposition" is 1: frontends other than clang will also
benefit from the optimization if they don't bother setting the flag.
(There will be risks if they do want ELF interposition: they need to set
"SemanticInterposition" to 1.)
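The current and the possible future rule differ only in how a missing flag is treated. This is a sketch with hypothetical names, not the LLVM implementation. The three-state enum models the module flag being absent, 0, or 1.

```cpp
#include <cassert>

// State of the "SemanticInterposition" module flag (modelled, not LLVM API).
enum class InterpositionFlag { Absent, Zero, One };

// Current rule: infer dso_local only when the flag is explicitly 0.
inline bool infer_dso_local_current(InterpositionFlag f) {
  return f == InterpositionFlag::Zero;
}

// Possible future rule: infer dso_local unless the flag is explicitly 1,
// so frontends that never set the flag also benefit.
inline bool infer_dso_local_future(InterpositionFlag f) {
  return f != InterpositionFlag::One;
}
```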
Previous implementation was incorrectly passing an uint64_t, that got converted
to an uint8_t, to finalize the hash computation. This led to different functions
having the same hash if they only differ by the remaining statements, which is
incorrect.
Added a new test case that trivially tests that a small function change is
reflected in the hash value.
Note that because this patch fixes the hash computation, it invalidates all
hashes computed before it is applied, which could be an issue for large build
systems that pre-compute the profile data and let clients download it as part
of the build process.
Differential Revision: https://reviews.llvm.org/D79961
Fixes PR45753
When a program that contains a loop to which both `omp parallel for`
pragma and `clang loop` pragma are associated is compiled with the
-fopenmp option, `clang loop` pragma did not take effect. The example
below should not be vectorized by the `clang loop` pragma but it was
actually vectorized. The cause is that `llvm.loop.vectorize.width`
was not output to the IR when -fopenmp is specified.
The fix attaches attributes if they exist for the loop.
[example.c]
```
int a[100], b[100];
void foo() {
#pragma omp parallel for
#pragma clang loop vectorize(disable)
for (int i = 0; i < 100; i++)
a[i] += b[i] * i;
}
```
[compile]
```
$ clang -O2 -fopenmp example.c -c -Rpass=vect
example.c:3:11: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
#pragma omp parallel for
^
```
[IR with -fopenmp]
```
$ clang -O2 example.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fopenmp | grep 'vectorize\.width'
```
[IR with -fno-openmp]
```
$ clang -O2 example.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fno-openmp | grep 'vectorize\.width'
!7 = !{!"llvm.loop.vectorize.width", i32 1}
```
Differential Revision: https://reviews.llvm.org/D79921
Summary:
In D80061 we added a warning for exception specifications with types (such
as `throw(int)`), but it was enabled whenever the target was wasm,
which means it warned about (and ignored) exception specifications even if
wasm EH was not used. This patch fixes that: the warning is now emitted only
when `-fwasm-exceptions` is enabled.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80362
NoDebug attr does not totally eliminate debug info about a function when
inlining is enabled. This is inconsistent with when inlining is disabled.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D79967
Summary:
No need to generate inlined OpenMP region for variables captured in
lambdas or block decls, only for implicitly captured variables in the
OpenMP region.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79966
If we're going to assume references are dereferenceable, we should also
assume they're aligned: otherwise, we can't actually dereference them.
See also D80072.
Differential Revision: https://reviews.llvm.org/D80166
D35259 introduced a case where complex types of non-long-double would
result in FI.getReturnInfo() not being initialized properly. This
resulted in a crash under some very specific circumstances when
dereferencing the LLVMContext.
This patch makes sure that these types have the intended getReturnInfo
initialization.
Summary:
Created AIXABIInfo and AIXTargetCodeGenInfo for AIX ABI.
Reviewed By: Xiangling_L, ZarkoCA
Differential Revision: https://reviews.llvm.org/D79035
Summary:
Wasm currently does not fully handle exception specifications. Rather
than crashing,
- This treats `throw()` in the same way as `noexcept`.
- This ignores and prints a warning for `throw(type, ..)`, as a
temporary measure. This warning is controlled by
`-Wwasm-exception-spec`, which is on by default. You can suppress the
warning by using `-Wno-wasm-exception-spec`.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80061
Summary:
This is needed in Swift for C++ interop -- see here for the corresponding Swift change:
https://github.com/apple/swift/pull/30630
As part of this change, I've had to make some changes to the interface of CGCXXABI to return the additional parameters separately rather than adding them directly to a `CallArgList`.
Reviewers: rjmccall
Reviewed By: rjmccall
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79942
This operator is intended for casting between
pointers to objects in different address spaces
and follows similar logic as const_cast in C++.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60193
Summary:
This patch fixes an error in counting the remaining FPRs. Complex floating-point
values should be passed in two FPRs under the hard-float ABI. If two FPRs are
not available, the value should be passed via a 64-bit GPR (fp+fp). Previously,
`ArgFPRsLeft` was decreased by only one when the type was complex floating-point,
which caused the two floating-point values in the complex to be passed
separately in two GPRs.
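The accounting rule can be sketched as follows. This is a simplified model rather than the code in clang's RISC-V ABI lowering: `classify_complex_fp`, `complex_arg_class` and `count_passed_in_fprs` are hypothetical names, and the model covers only complex floating-point arguments.
```c
/* How one complex floating-point argument is passed in this model. */
typedef enum { PASS_IN_FPR_PAIR, PASS_IN_GPR } complex_arg_class;

/* Classify one complex argument and update the remaining FPR budget.
 * The real and imaginary parts each need an FPR, so the budget drops
 * by two, not one. */
static complex_arg_class classify_complex_fp(int *fprs_left) {
    if (*fprs_left >= 2) {
        *fprs_left -= 2;
        return PASS_IN_FPR_PAIR;
    }
    /* Fewer than two FPRs remain: fall back to the GPR path. */
    return PASS_IN_GPR;
}

/* Count how many of num_complex_args end up in FPR pairs. */
static int count_passed_in_fprs(int fprs_available, int num_complex_args) {
    int in_fprs = 0;
    for (int i = 0; i < num_complex_args; ++i) {
        if (classify_complex_fp(&fprs_available) == PASS_IN_FPR_PAIR)
            ++in_fprs;
    }
    return in_fprs;
}
```
With the eight argument FPRs that RISC-V provides, `count_passed_in_fprs(8, 5)` places four complex arguments in FPR pairs and sends the fifth down the GPR path.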
Reviewers: asb, luismarques, lenary
Reviewed By: asb
Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, s.egerton, pzheng, sameer.abuasal, apazos, evandro, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79770
I've also made a stab at imposing some more order on where and how we add
attributes; this part should be NFC. I wasn't sure whether the CUDA use
case for libdevice should propagate CPU/features attributes, so there's a
bit of unnecessary duplication.
This reverts commit bca347508c.
This broke clang/test/Misc/warning-flags.c, because the newly added
warning option in this commit didn't have a matching flag.
Summary:
Wasm currently does not fully handle exception specifications. Rather
than crashing, this treats `throw()` in the same way as `noexcept`, and
ignores and prints a warning for `throw(type, ..)`, as a temporary
measure.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79655
This is D77454, except for stores. All the infrastructure work was done
for loads, so the remaining changes necessary are relatively small.
Differential Revision: https://reviews.llvm.org/D79968
The "null-pointer-is-valid" attribute needs to be checked by many
pointer-related combines. To make the check more efficient, convert
it from a string into an enum attribute.
In the future, this attribute may be replaced with data layout
properties.
Differential Revision: https://reviews.llvm.org/D78862
Such a builtin function is mostly useful to preserve btf type id
for non-global data. For example,
extern void foo(..., void *data, int size);
int test(...) {
struct t { int a; int b; int c; } d;
d.a = ...; d.b = ...; d.c = ...;
foo(..., &d, sizeof(d));
}
The function "foo" above only sees raw data and does not
know the type of the data. In certain cases, e.g., logging,
the additional type information helps with pretty printing.
This patch implements a BPF-specific builtin
u32 btf_type_id = __builtin_btf_type_id(param, flag)
which will return a btf type id for the "param".
flag == 0 indicates a BTF local relocation,
which means the btf type_id is only adjusted when the bpf program's BTF changes.
flag == 1 indicates a BTF remote relocation,
which means the btf type_id is adjusted against the Linux kernel or
other future entities.
Differential Revision: https://reviews.llvm.org/D74668
This reverts commit 7d5bb94d78.
Reverting since this leads to a linker error we're seeing on Fuchsia.
The underlying issue seems to be that inlining is run after sanitizers
and causes different comdat groups instrumented by Sancov to reference
non-key symbols defined in other comdat groups.
Will re-land this patch after a fix for that is landed.
Summary:
Predefined allocators should not be mapped at all (they are just enumeration
constants). For user-defined allocators, only the traits need to be mapped, as
firstprivates; the allocator itself is private.
At the beginning of the target region the user-defined allocators must
be created, and then destroyed at the end of the target region:
```
omp_allocator_handle_t my_allocator = __kmpc_init_allocator(<gtid>,
/*default memhandle*/ 0, <number_of_traits>, &<traits>);
...
call void @__kmpc_destroy_allocator(<gtid>, my_allocator);
```
Reviewers: jdoerfert, aaron.ballman
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79257
An earlier change eliminated spaces between the close brackets of nested
template lists. Unfortunately that prevents the Windows debuggers from
matching some types to their corresponding visualizers (e.g., std::map).
This selects the SeparateTemplateClosers flag when generating CodeView.
Note that we were already making formatting adjustments under similar
circumstances for similar reasons.
This wasn't caught by existing tests because they were using only
-std=c++98.
Differential Revision: https://reviews.llvm.org/D79274
In D49466, sys::path::replace_path_prefix was used instead of startswith for -f[macro/debug/file]-prefix-map options.
However those were reverted later (commit rG3bb24bf25767ef5bbcef958b484e7a06d8689204) due to broken Windows tests.
This patch restores those replace_path_prefix calls.
It also modifies the prefix matching to be case-insensitive under Windows.
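A minimal sketch of case-insensitive prefix replacement, written in plain C for illustration. It is not LLVM's `replace_path_prefix`: the helper names `has_prefix` and `remap_prefix` are hypothetical, and separator normalization is not modeled.
```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if path starts with prefix, optionally ignoring ASCII case. */
static bool has_prefix(const char *path, const char *prefix, bool ignore_case) {
    for (; *prefix != '\0'; ++path, ++prefix) {
        unsigned char a = (unsigned char)*path;
        unsigned char b = (unsigned char)*prefix;
        if (ignore_case) {
            a = (unsigned char)tolower(a);
            b = (unsigned char)tolower(b);
        }
        /* Also handles a path shorter than the prefix: '\0' never matches. */
        if (a != b)
            return false;
    }
    return true;
}

/* Write path with old_prefix replaced by new_prefix into out.
 * Returns false if the prefix does not match or out is too small. */
static bool remap_prefix(const char *path, const char *old_prefix,
                         const char *new_prefix, bool ignore_case,
                         char *out, size_t out_size) {
    if (!has_prefix(path, old_prefix, ignore_case))
        return false;
    int n = snprintf(out, out_size, "%s%s", new_prefix,
                     path + strlen(old_prefix));
    return n >= 0 && (size_t)n < out_size;
}
```
Passing `ignore_case = true` corresponds to the Windows behavior described above, so a map entry written as `c:\src` also matches a path that begins with `C:\Src`.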
Differential Revision: https://reviews.llvm.org/D76869
Summary:
As proposed in https://github.com/WebAssembly/simd/pull/122. Since
these instructions are not yet merged to the SIMD spec proposal, this
patch makes them entirely opt-in by surfacing them only through LLVM
intrinsics and clang builtins. If these instructions are made
official, these intrinsics and builtins should be replaced with simple
instruction patterns.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D79742
gcov 4.8 (r189778) moved the exit block from the last to the second.
The .gcda format is compatible with 4.7 but
* decoding libgcov 4.7 produced .gcda with gcov [4.7,8) can mistake the
exit block, emit bogus `%s:'%s' has arcs from exit block\n` warnings,
and print wrong `" returned %s` for branch statistics (-b).
* decoding libgcov 4.8 produced .gcda with gcov 4.7 has similar issues.
Also, rename "return block" to "exit block" because the latter is the
appropriate term.
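The block numbering change can be summarized in a few lines. This sketch assumes a hypothetical integer version encoding (major * 100 + minor, so 4.7 becomes 407) and a function with at least two blocks; `exit_block_index` is an illustrative name, not a function in gcov or LLVM.
```c
/* Index of the exit block for a function with num_blocks basic blocks.
 * Before 4.8 the exit block is the last block; from 4.8 on it is the
 * second block (index 1). Assumes num_blocks >= 2. */
static unsigned exit_block_index(unsigned version, unsigned num_blocks) {
    return version >= 408 ? 1u : num_blocks - 1u;
}
```
A reader and a writer that disagree on this index will treat an ordinary block as the exit block, which is the source of the bogus warnings listed above.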
Summary:
This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass manager (final check-msan test!).
Under the old pass manager, the coverage pass would run before the MSan pass. The opposite happened under the new pass manager. The MSan pass adds extra basic blocks, changing the number of coverage callbacks.
Reviewers: vitalybuka, leonardchan
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79698
This patch adds a matrix type to Clang as described in the draft
specification in clang/docs/MatrixSupport.rst. It introduces a new option
-fenable-matrix, which can be used to enable the matrix support.
The patch adds new MatrixType and DependentSizedMatrixType types along
with the plumbing required. Loads of and stores to pointers to matrix
values are lowered to memory operations on 1-D IR arrays. After loading,
the loaded values are cast to a vector. This ensures matrix values use
the alignment of the element type, instead of LLVM's large vector
alignment.
The operators and builtins described in the draft spec will be added in
follow-up patches.
Reviewers: martong, rsmith, Bigcheese, anemet, dexonsmith, rjmccall, aaron.ballman
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D72281
Summary:
Although using `__builtin_shufflevector` and the `shufflevector`
instruction works fine, they are not opaque to the optimizer. As a
result, DAGCombine can potentially reduce the number of shuffles and
change the shuffle masks. This is unexpected behavior for users of the
WebAssembly SIMD intrinsics who have crafted their shuffles to
optimize the code generated by engines. This patch solves the problem
by adding a new shuffle intrinsic that is opaque to the optimizers in
line with the decision of the WebAssembly SIMD contributors at
https://github.com/WebAssembly/simd/issues/196#issuecomment-622494748. In
the future we may implement custom DAG combines to properly optimize
shuffles and replace this solution.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D66983
These builtins are expanded in CGBuiltin to use intrinsics
for (signed/unsigned) shift left long top/bottom.
Reviewers: efriedma, SjoerdMeijer
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D79579
Defaulting to -Xclang -coverage-version='407*' makes .gcno/.gcda
compatible with gcov [4.7,8)
In addition, delete clang::CodeGenOptionsBase::CoverageExtraChecksum and GCOVOptions::UseCfgChecksum.
We can infer the information from the version.
With this change, .gcda files produced by `clang --coverage a.o` linked executable can be read by gcov 4.7~7.
We don't need other -Xclang -coverage* options.
There may be a mismatching version warning, though.
(Note, GCC r173147 "split checksum into cfg checksum and line checksum"
made gcov 4.7 incompatible with previous versions.)
rL144865 incorrectly wrote function names for GCOV_TAG_FUNCTION
(this might be part of the reasons the header says
"We emit files in a corrupt version of GCOV's "gcda" file format").
rL176173 and rL177475 realized the problem and introduced -coverage-no-function-names-in-data
to work around the issue. (However, the description is wrong.
libgcov never writes function names, even before GCC 4.2).
In reality, the linker command line has to look like:
clang --coverage -Xclang -coverage-version='407*' -Xclang -coverage-cfg-checksum -Xclang -coverage-no-function-names-in-data
Failing to pass -coverage-no-function-names-in-data can make gcov 4.7~7
either produce wrong results (for one gcov-4.9 program, I see "No executable lines")
or segfault (gcov-7).
(gcov-8 uses an incompatible format.)
This patch deletes -coverage-no-function-names-in-data and the related
function names support from libclang_rt.profile
https://reviews.llvm.org/D63616 added `-fsanitize-coverage-whitelist`
and `-fsanitize-coverage-blacklist` for clang.
However, it was done only for the legacy pass manager.
This patch enables them for the new pass manager as well.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D79653
This is a standalone patch and this would help Propeller do a better job of code
layout as it can accurately attribute the profiles to the right internal linkage
function.
This also helps SampledFDO/AutoFDO correctly associate sampled profiles to the
right internal function. Currently, if there is more than one internal symbol
foo, their profiles are aggregated by SampledFDO.
This patch adds a new clang option, -funique-internal-funcnames, to generate
unique names for functions with internal linkage. This patch appends the md5
hash of the module name to the function symbol as a best effort to generate a
unique name for symbols with internal linkage.
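The naming scheme can be illustrated with a short sketch. Two things here are assumptions made to keep the example self-contained: FNV-1a stands in for MD5, which the C standard library does not provide, and the `name.hash` output format is hypothetical rather than the exact spelling clang emits. `fnv1a64` and `make_unique_name` are illustrative names.
```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* 64-bit FNV-1a, used here only as a stand-in for the MD5 hash. */
static uint64_t fnv1a64(const char *s) {
    uint64_t h = 14695981039346656037ULL;
    for (; *s != '\0'; ++s) {
        h ^= (unsigned char)*s;
        h *= 1099511628211ULL;
    }
    return h;
}

/* Build "<func>.<hash of module name>" in out.
 * Returns 1 on success, 0 if out is too small. */
static int make_unique_name(const char *func, const char *module,
                            char *out, size_t out_size) {
    int n = snprintf(out, out_size, "%s.%016" PRIx64, func, fnv1a64(module));
    return n >= 0 && (size_t)n < out_size;
}
```
Two internal functions named `foo` in modules `a.c` and `b.c` then receive different symbol names, so a profiler can attribute samples to each one separately.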
Differential Revision: https://reviews.llvm.org/D73307
Summary:
Properly forward TrackOrigins and Recover user options to the MSan pass under the new pass manager.
This makes the number of check-msan failures when ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER is TRUE go from 52 to 2.
Based on https://reviews.llvm.org/D77249.
Reviewers: nemanjai, vitalybuka, leonardchan
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79445
Summary:
The omp.h header file defines omp_null_allocator as a predefined allocator,
so it must also be treated as a predefined allocator.
Reviewers: jdoerfert
Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79186
passed to __builtin_os_log_format to extend its lifetime to the end of
its enclosing block
Extend only lifetimes of pointers returned by function calls or message
sends instead. In the long term, we should lifetime-extend pointers in
more complex expressions and non-ARC objects (e.g., C++ temporaries)
too.
rdar://problem/61846261
This is the result of an audit of all of the ABIs in clang to implement
and enable the type for those targets.
Additionally, the audit found an issue with integer-promotion passing on a
few platforms when using an _ExtInt narrower than int. This patch also corrects
that, which results in signext/zeroext being applied to params of those types
on some platforms.
Differential Revision: https://reviews.llvm.org/D79118
Summary:
- If the coerced type is still a pointer, it should be set with proper
parameter attributes, such as `noalias`, `nonnull`, etc. Hoist
that (pointer) parameter attribute setting so that the coerced pointer
parameter can be marked properly.
Depends on D79394
Reviewers: rjmccall, kerbowa, yaxunl
Subscribers: jvesely, nhaehnle, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79395
Summary:
- Skip copying function arguments and unnecessary casting by using them
directly.
Reviewers: rjmccall, kerbowa, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79394
Summary:
This change fixes an aarch64-specific bug in the generation of the NDS and WDS values used to compute the signature of the vector functions out of OpenMP directives like `declare simd`. When the directive is used in conjunction with the `linear` clause, the size of the pointee must be used instead of the size of the pointer to compute NDS and WDS.
The code-fix is strictly related to the behavior for `linear`, but given that the only way we have to test the NDS and WDS values is to check the resulting `<vlen>` token in the mangled name of the vector function, the tests have been extended to cover all the possible values of WDS and NDS as defined in the ABI at https://github.com/ARM-software/abi-aa/tree/master/vfabia64.
Reviewers: ABataev, jdoerfert, andwar
Reviewed By: jdoerfert
Subscribers: yaxunl, kristof.beyls, guansong, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78969
The reinterpret builtins are generated separately because they
need the cross product of all types, 121 functions in total,
which is inconvenient to specify in the arm_sve.td file.
Reviewers: SjoerdMeijer, efriedma, ctetreau, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78756
* svdupq builtins that duplicate scalars to every quadword of a vector
are defined using builtins for svld1rq (load and replicate quadword).
* svdupq builtins that duplicate boolean values to fill a predicate vector
are defined using `svcmpne`.
Reviewers: SjoerdMeijer, efriedma, ctetreau
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78750
Summary:
As described in https://github.com/WebAssembly/simd/pull/209. This is
the final reorganization of the SIMD opcode space before
standardization. It has been landed in concert with corresponding
changes in other projects in the WebAssembly SIMD ecosystem.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79224
Summary:
The linear parameter token in the mangling function must be multiplied
by the pointee size in bytes when the parameter is a pointer.
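As a worked instance of that rule, the helper below computes the byte step for a pointer parameter. It is a sketch of the arithmetic only: `linear_step_in_bytes` is a hypothetical name, and the surrounding mangling logic is not modeled.
```c
#include <stddef.h>

/* For a declaration such as
 *     #pragma omp declare simd linear(p : 2)
 *     void f(double *p);
 * the step is given in elements, while the value used in the mangled
 * name is in bytes, so it is scaled by the size of the pointee. */
static long linear_step_in_bytes(long step_in_elements, size_t pointee_size) {
    return step_in_elements * (long)pointee_size;
}
```
For example, a step of 3 on a pointer to a 4-byte element gives a byte step of 12.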
Reviewers: ABataev, andwar, jdoerfert
Subscribers: yaxunl, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78965
test cases
Add support for #pragma float_control
Reviewers: rjmccall, erichkeane, sepavloff
Differential Revision: https://reviews.llvm.org/D72841
This reverts commit 85dc033cac, and makes
corrections to the test cases that failed on buildbots.